Bin indexing system

Introduction

The binning index system used in the genome browser works together with MySQL indexes to speed up the selection of rows whose genome coordinates overlap a given region. This type of search is sometimes called a range request. The system as first used in the genome browser is described in "The Human Genome Browser at UCSC", Kent et al., Genome Research 2002, 12:996-1006 (see Figure 7), quoted here:

"We settled on a binning scheme suggested by Lincoln Stein and Richard Durbin. A simple version of this scheme is shown in Figure 7. In the browser itself, we use five different sizes of bins: 128 kb, 1 Mb, 8 Mb, 64 Mb, and 512 Mb."

That initial implementation has since been extended with an additional level of bins to allow items of size up to 4 Gb (in practice only up to 2 Gb, given integer size limits). The old and new systems coexist. An item with a chromEnd coordinate less than or equal to 512 Mb is assigned a bin number in the old system; an item with a chromEnd coordinate greater than 512 Mb is assigned a bin number in the new system.

Since all bin sizes are powers of two, the bin number can be calculated by simple bit shifting of the chromStart and chromEnd coordinates. The C code for the bin calculation can be seen in the kent source tree in src/lib/binRange.c.
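The bit shifting can be sketched in Python as follows. This is a minimal illustration modeled on the scheme described here, not the code from src/lib/binRange.c. The function name bin_from_range and the constant names are assumptions made for this example, and coordinates are assumed to be zero-based and half-open, so the last base covered by an item is chromEnd - 1.

```python
# Sketch only: names are hypothetical, constants follow the bin sizes on this page.
BIN_FIRST_SHIFT = 17  # 2**17 bases = 128 kb, the smallest bin size
BIN_NEXT_SHIFT = 3    # each coarser level is 8 times larger

# First bin number of each level, smallest bins first.
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0]
BIN_OFFSETS_EXTENDED = [4096 + 512 + 64 + 8 + 1, 512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0]

OLD_TO_EXTENDED = 4681             # extended numbers begin after the last old bin (4680)
MAX_END_512M = 512 * 1024 * 1024   # 2**29, the limit of the old scheme


def bin_from_range(start, end):
    """Return the smallest bin that fully contains the half-open range [start, end)."""
    if end <= MAX_END_512M:
        offsets, base = BIN_OFFSETS, 0
    else:
        offsets, base = BIN_OFFSETS_EXTENDED, OLD_TO_EXTENDED

    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in offsets:
        # When both ends fall in the same bin at this level, the item fits.
        if start_bin == end_bin:
            return base + offset + start_bin
        # Otherwise move up one level, where bins are 8 times larger.
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError(f"start {start}, end {end} out of range")
```

Under these assumptions, bin_from_range(0, 100) gives 585, the first 128 kb bin, while an item that straddles the first 128 kb boundary, such as bin_from_range(131000, 132000), moves up to the 1 Mb bin 73.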

Initial implementation

Used when chromEnd is less than or equal to 536,870,912 = 2²⁹

level	#bins	start bin	end bin	bin size
0	1	0	0	512 Mb
1	8	1	8	64 Mb
2	64	9	72	8 Mb
3	512	73	584	1 Mb
4	4096	585	4680	128 kb
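To show how the bin column supports a range request, here is a hedged Python sketch. For each level of the initial scheme it lists the bins a query region can overlap, then turns them into a SQL condition. The helper names overlapping_bin_ranges and bin_where_clause are hypothetical, the column name bin is assumed, and the sketch only handles regions ending at or below 512 Mb. A real query would also need the usual chromosome and chromStart/chromEnd comparisons.

```python
# Sketch only: helper names and the "bin" column name are assumptions.
BIN_FIRST_SHIFT = 17                 # 128 kb bins at the finest level
BIN_NEXT_SHIFT = 3                   # each coarser level is 8 times larger
BIN_OFFSETS = [585, 73, 9, 1, 0]     # first bin number per level, 128 kb level first


def overlapping_bin_ranges(start, end):
    """Return one (first_bin, last_bin) pair per level for the region [start, end)."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    ranges = []
    for offset in BIN_OFFSETS:
        ranges.append((offset + start_bin, offset + end_bin))
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    return ranges


def bin_where_clause(start, end):
    """Build a SQL condition selecting every bin that may hold an overlapping item."""
    parts = []
    for first, last in overlapping_bin_ranges(start, end):
        if first == last:
            parts.append(f"bin = {first}")
        else:
            parts.append(f"bin BETWEEN {first} AND {last}")
    return "(" + " OR ".join(parts) + ")"
```

For the region 1,000,000 to 1,200,000 this yields three 128 kb bins (592 to 594), two 1 Mb bins (73 and 74), and one bin at each coarser level, so the database inspects a handful of bins instead of scanning every row on the chromosome.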

Extended implementation

Used when chromEnd is greater than 536,870,912 = 2²⁹ and less than 2,147,483,647 = 2³¹ - 1

level	#bins	start bin	end bin	bin size
0	1	4681	4681	2 Gb
1	8	4683	4685	512 Mb
2	64	4698	4721	64 Mb
3	512	4818	5009	8 Mb
4	4,096	5778	7313	1 Mb
5	32,768	13458	25745	128 kb
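The start and end bin numbers of the extended scheme can be derived with a short Python sketch. It rests on three assumptions: extended bin numbers begin at 4681, directly after the last original bin (4680); each level's offset follows the same pattern as in the original scheme; and only positions from 2²⁹ up to 2³¹ - 2 (the last base of an item ending at 2³¹ - 1) can occur. The function name extended_level_ranges and the constant names are hypothetical.

```python
# Sketch only: names are hypothetical, values derived from the scheme on this page.
OLD_TO_EXTENDED = 4681                            # first bin number of the extended scheme
BIN_OFFSETS_EXTENDED = [4681, 585, 73, 9, 1, 0]   # per-level offsets, 128 kb level first
BIN_FIRST_SHIFT = 17                              # 128 kb bins at the finest level
BIN_NEXT_SHIFT = 3                                # each coarser level is 8 times larger

FIRST_EXTENDED_POS = 2**29   # first base beyond the 512 Mb limit of the old scheme
LAST_POS = 2**31 - 2         # last base of an item whose chromEnd is 2**31 - 1


def extended_level_ranges():
    """Return (first_bin, last_bin) in use at each level, 128 kb level first."""
    rows = []
    shift = BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS_EXTENDED:
        base = OLD_TO_EXTENDED + offset
        rows.append((base + (FIRST_EXTENDED_POS >> shift), base + (LAST_POS >> shift)))
        shift += BIN_NEXT_SHIFT
    return rows
```

Under these assumptions the 128 kb level runs from 13458 to 25745 and the 512 Mb level from 4683 to 4685, and the single top-level bin comes out as 4681. This also suggests why the listed ranges hold fewer bins than the nominal #bins column: bins covering only the first 512 Mb are never assigned in the extended scheme, and coordinates stop at 2 Gb.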
