TandemDups

From genomewiki
Revision as of 01:27, 21 May 2017 by Hiram (talk | contribs) (update with specifics about these tandem dup procedures)

methods

These are measurements of pairs of identical sequences separated by at least one nucleotide. The smallest sequence in this survey is 30 bases; the largest separation considered is 20,000 bases.

The procedure to find these sequences is as follows:

  • generate 29-base kmers for the entire genome, allowing only kmers made of the bases A, C, G, T; no N's allowed.
  • pair up identical kmers separated by at least one base and at most 20,000 bases. This separating sequence is referred to as the gap in this document.
  • collapse overlapping kmer pairs when they have the same sequence size and the same spacing between the pairs. This procedure preserves the definition of duplicated identical pairs.
  • the resulting pairs can now be longer sequences with a smaller separation than the constituent pairs.
  • the final result selects pairs whose sequence size is 30 bases or more, with at least one base remaining as a separation gap.
  • collapsed pairs that close the gap are discarded. When this happens, they appear to indicate simple repeat sequences. It would be interesting to have this result, but it is not available at this time.
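
The steps above can be sketched in a few lines of Python. This is only an illustration for short sequences held in memory, not the code used to build the tables: the function name `find_tandem_dups` and its parameters are hypothetical, and pairing every occurrence of a kmer with every later occurrence within range is an assumption about how the pairing is done.

```python
from collections import defaultdict

def find_tandem_dups(seq, k=29, min_size=30, max_gap=20000):
    """Return sorted (start, size, gap) tuples for identical paired sequences."""
    seq = seq.upper()
    # step 1: index every k-mer made only of A, C, G, T by its start position
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            positions[kmer].append(i)
    # step 2: pair identical k-mers whose gap is 1..max_gap bases,
    # grouping the left-hand start positions by the offset between copies
    by_offset = defaultdict(list)
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                offset = starts[b] - starts[a]
                gap = offset - k
                if gap > max_gap:
                    break  # starts are ascending, later ones are even further
                if gap >= 1:
                    by_offset[offset].append(starts[a])
    # step 3: collapse runs of consecutive left starts that share an offset;
    # each extra k-mer in a run lengthens the dup by one and shrinks the gap
    results = []
    for offset, lefts in by_offset.items():
        lefts.sort()
        run_start = prev = lefts[0]
        for p in lefts[1:] + [None]:
            if p is not None and p == prev + 1:
                prev = p
                continue
            size = k + (prev - run_start)
            gap = offset - size
            # steps 5 and 6: keep size >= min_size, drop pairs with a closed gap
            if size >= min_size and gap >= 1:
                results.append((run_start, size, gap))
            if p is not None:
                run_start = prev = p
    return sorted(results)
```

For example, a 40-base sequence followed by one spacer base and a second copy of the same 40 bases gives a result list that includes `(0, 40, 1)`: twelve overlapping 29-base kmer pairs collapse into one 40-base pair with a one-base gap.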

The procedure starts with 29-base kmer pairs and then selects results of at least 30 bases because this yields a reasonable number of 30-base pairs. If the procedure starts with 30-base kmers, it produces far too many 30-base kmer pairs for a reasonable count.

The measurements are taken from the resulting db.tandemDups.bed[.gz] file in /hive/data/genomes/<db>/bed/tandemDups/db.tandemDups.bed[.gz]. The score column in the bed file (column 5) is the size of the duplicated sequence. Each item spans both copies plus the gap, so the gap size between the duplicated sequences is calculated as: end - start - 2 * score
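
As a small worked example of the gap calculation, the sketch below takes one BED line and subtracts both copies of the duplicated sequence from the item span, i.e. gap = end - start - 2 * score. The function name `gap_size` is hypothetical. The sample line is modelled on the acaChl1 example item in the table below (KK830956:17007-17903, dup size 448); its name field is made up, and the 0-based start of 17006 is assumed from the 1-based browser position.

```python
def gap_size(bed_line):
    """Return the gap between the two copies of one tandemDups BED item."""
    fields = bed_line.split()
    # BED columns: chrom, start, end, name, score (column 5 = dup size)
    start, end, score = int(fields[1]), int(fields[2]), int(fields[4])
    # the item spans copy + gap + copy, so remove both copies from the span
    return (end - start) - 2 * score

# hypothetical line; name field invented, 0-based start assumed
line = "KK830956\t17006\t17903\tdup1\t448"
print(gap_size(line))  # prints 1
```

The result of 1 agrees with the gap size listed for that example item.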

This data is also available in the tandemDups table for sequences with genome browser databases. Not all sequences measured here have genome browser databases.

The item total is the sum of the sizes of the duplicated sequences, counting only one side of each pair. This indicates how much sequence is duplicated. Multiply it by 2 to get the total amount of sequence involved in these repeats on both sides.

The gap total is the sum of the sizes of all the separation gaps between the paired sequences.
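
The item and gap statistics can be sketched as follows, assuming the BED layout described above (score in column 5, item spanning both copies plus the gap). The function name `summarize` and the dictionary keys are hypothetical; they simply mirror the table column names.

```python
from statistics import median

def summarize(bed_lines):
    """Compute item/gap count, median and total from tandemDups BED lines."""
    sizes, gaps = [], []
    for line in bed_lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        start, end, score = int(fields[1]), int(fields[2]), int(fields[4])
        sizes.append(score)                    # one side of the pair only
        gaps.append(end - start - 2 * score)   # separation between the copies
    return {
        "item count": len(sizes),
        "item median": median(sizes) if sizes else None,
        "item total": sum(sizes),
        "gap median": median(gaps) if gaps else None,
        "gap total": sum(gaps),
    }
```

To read an actual file, the lines could be supplied with `gzip.open(path, "rt")` for the `.gz` form or a plain `open(path)` otherwise. Doubling `"item total"` gives the amount of sequence involved on both sides.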

table features

The table columns can be sorted by clicking the up/down arrow icon in the column header. The 'year' is the assembly date recorded in the dbDb table, as taken from the assembly information files. A few assemblies do not have dates (set to 1880) and do not have genome browser databases. The example item is a worst-case example, where the ratio of dup sequence size to gap size is the highest, i.e. the smallest gap with the largest dup size.
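
One way to pick such a worst-case example item is sketched below. The function name `worst_case` is hypothetical, and two details are assumptions: the first item wins when ratios tie, and the link position is written as a 1-based start with the BED end.

```python
def worst_case(bed_lines):
    """Return (dup size, gap size, position) for the highest dup/gap ratio."""
    best = None
    for line in bed_lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        score = int(fields[4])
        gap = end - start - 2 * score
        if gap < 1:
            continue  # the procedure only keeps items with a gap of >= 1
        ratio = score / gap
        if best is None or ratio > best[0]:
            # assumed link format: 1-based start, as in browser positions
            best = (ratio, score, gap, f"{chrom}:{start + 1}-{end}")
    return None if best is None else best[1:]
```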

tandemDups table statistics

count | year | dbName | ncbiAsmId | assembly method | item count | item median | item total | gap median | gap total | example item (dup size, gap size, link) | scatter plot (dup size vs. gap size)

019 2014 acaChl1 GCF_000695815.1 SOAPdenovo v. 1.6 69222 38 3687159   479 270396956 448, 1, KK830956:17007-17903 plot acaChl1
020 2012 aciBauTYTH_1 GCF_000302575.1 tbd 175 43 11650   1165 777735 35, 1, chr_CP003856:2714351-2714421 plot aciBauTYTH_1
021 2006 afrOth13 tbd tbd 63170 39 3109998   1833 319093167 126, 1, 3-3:1498405-1498657 plot afrOth13
022 2013 amaVit1 GCA_000332375.1 Ray software v. 3 37254 46 10717950   256 63766115 5218, 14, KB272082:90533-100982 plot amaVit1
023 2013 anaPla1 GCF_000355885.1 SOAPdenovo Release v. 1.03 27792 43 2580234   401 91882502 1258, 1, KB745735:2510-5026 plot anaPla1
024 2014 ancCey1 GCA_000688135.1 Velvet v. 1.2.05; BGI GapCloser v. 1.12 (release_2011); HaploMerger v. 20111230; ERANGE v. 3.2 55040 40 3358224   389.5 157542766 4183, 1, JARK01001394v1:846761-855127 plot ancCey1
025 2014 angJap1 GCA_000470695.1 CLC NGS Cell v. 3.2; CLC NGS Cell v. 4.0beta 546983 44 32365612   800 580570188 1181, 1, KI307522:56201-58563 plot angJap1
026 2007 anoCar1 tbd tbd 751544 38 40990303   2923 4158391303 351, 1, scaffold_31:2142294-2142996 plot anoCar1
028 2003 anoGam1 tbd tbd 75957 43 5353295   2383 356254048 1175, 1, chrX:14717656-14720006 plot anoGam1
029 2014 apaVit1 GCF_000703405.1 SOAPdenovo v. 1.6 24291 35 1216678   112 56084043 453, 1, KL385068:59340-60246 plot apaVit1
030 2004 apiMel1 tbd tbd 83149 38 3897116   280 81345069 203, 1, GroupUn.6971:896-1302 plot apiMel1
031 2005 apiMel2 tbd tbd 91284 38 4467103   295 110415978 234, 1, Group8:8120781-8121249 plot apiMel2
032 2005 apiMel3 tbd tbd 100380 38 5142600   318 146199642 234, 1, Group8:9788059-9788527 plot apiMel3
033 2010 apiMel4 GCF_000002195.4 Atlas assembly system v. before 2011 101842 38 5017659   328 182902657 234, 1, Group8:11123745-11124213 plot apiMel4
034 2008 aplCal1 tbd tbd 362965 33 14931570   1162 1691576982 254, 1, scaffold_486:233208-233716 plot aplCal1
035 2014 aptFor1 GCF_000699145.1 SOAPdenovo v. 1.6 107781 59 13547646   120 162744750 2370, 1, KL225745:613720-618460 plot aptFor1
036 2015 aptMan1 GCF_001039765.1 tbd 282012 72 84603242   159 396269019 13772, 1, NW_013987314v1:1452350-1479894 plot aptMan1
037 2014 aquChr1 GCA_000696035.1 ABySS v. 1.3.6 91231 42 4160856   43 70439237 98, 1, KK848043:4630547-4630743 plot aquChr1
038 2014 aquChr2 GCA_000766835.1 AllPaths-LG v. August 2013 27433 39 1833575   548 117223799 6351, 4, KN265844v1:264833-277538 plot aquChr2
039 1880 araTha1 GCF_000001735.3 tbd 42547 40 2336394   4141 237921318 415, 1, chr3:12226092-12226922 plot araTha1
040 2012 ascSuu1 GCA_000298755.1 SOAPdenovo v. 1.04 26863 62 2864530   136 30068625 680, 1, JH878990v1:516922-518282 plot ascSuu1
041 2013 astMex1 GCF_000372685.1 AllPathsLG v. Jan-2013 300773 36 13069886   344 878056894 647, 1, KB872443:19678-20972 plot astMex1
042 2014 balPav1 GCA_000709895.1 SOAPdenovo v. 1.6 17941 37 1027192   127 51713556 375, 1, KL478702:45795-46545 plot balPav1
043 2008 braFlo2 tbd tbd 335984 40 21237833   1088 1244816809 1512, 1, Bf_V2_32:3091839-3094863 plot braFlo2
044 1880 braRap1 GCF_000309985.1 SOAPdenovo v. 1.04 74411 41 5348931   2750 314357742 1288, 2, chrA5:13328835-13331412 plot braRap1
045 2007 bruMal1 tbd tbd 70008 37 3314327   264 46555408 321, 1, Bmal_supercontigDegenerate10576:240-882 plot bruMal1
046 2014 bruMal2 tbd tbd 72743 38 3759073   402 111607550 453, 1, Bmal_v3_scaffold8088:119-1025 plot bruMal2
047 2014 bucRhi1 GCF_000710305.1 SOAPdenovo v. 1.6 43210 56 3393282   73 48672532 715, 1, KL533494:44624-46054 plot bucRhi1
049 2011 burXyl1 tbd tbd 12719 36 772590   1694 51888226 134, 1, scaffold01254:876286-876554 plot burXyl1
050 2010 caeAng1 GCA_000165025.1 Velvet v. 0.7.56 100045 42 4176946   305 156635287 71, 1, scafRNAPATHr22140:12806-12948 plot caeAng1
051 2012 caeAng2 tbd tbd 148180 41 6380576   369 202631392 131, 1, Cang_2012_03_13_00262:54689-54951 plot caeAng2
052 2008 caeJap1 tbd tbd 54911 37 2702441   669 184995823 179, 1, chrUn:91344286-91344644 plot caeJap1
053 2009 caeJap2 tbd tbd 61394 38 3566172   1128.5 219090212 153, 1, chrUn:143662847-143663153 plot caeJap2
054 1880 caeJap2a tbd tbd 59499 38 3480262   973 192672580 153, 1, Cjap_Contig3098:12088-12394 plot caeJap2a
055 2010 caeJap3 tbd tbd 47394 36 2103165   360 75708219 176, 1, ABLE03028834:844-1196 plot caeJap3
056 2010 caeJap4 GCA_000147155.1 Celera assembler v. 6.0 66567 37 3336966   815 226847582 176, 1, Scaffold17893:329482-329834 plot caeJap4
057 2007 caePb1 tbd tbd 67100 38 3590813   2009 281616954 168, 1, chrUn:161968878-161969214 plot caePb1
058 2008 caePb2 tbd tbd 71710 39 3958202   3788.5 420907669 239, 1, chrUn:97561553-97562031 plot caePb2
059 2010 caePb3 GCA_000143925.2 PCAP v. 9/3/04 71721 39 3951441   3770 420080585 239, 1, Scfld02_132:346628-347106 plot caePb3
060 2005 caeRem1 tbd tbd 78288 40 4264984   1786 318215569 193, 1, SuperCont3184:2552-2938 plot caeRem1
061 2006 caeRem2 tbd tbd 102926 41 5484074   859 350685995 193, 1, chrUn:145434398-145434784 plot caeRem2
062 2007 caeRem3 tbd tbd 69306 40 3736990   2408 318941600 181, 1, chrUn:147913992-147914354 plot caeRem3
063 2007 caeRem4 GCF_000149515.1 tbd 70702 39 3779779   2320 319257206 181, 1, Crem_Contig169:93478-93840 plot caeRem4
064 2010 caeSp111 GCA_000186765.1 Celera assembler v. 6.0 16576 36 783303   2230.5 70961296 161, 1, Scaffold630:3047861-3048183 plot caeSp111
065 2012 caeSp51 tbd tbd 20559 36 877191   1869 57741956 109, 1, Csp5_scaffold_04217:6885-7103 plot caeSp51
066 2010 caeSp91 tbd tbd 67737 36 3110776   1418 221671479 195, 1, Scaffold7109:118818-119208 plot caeSp91
067 2014 calAnn1 GCF_000699085.1 SOAPdenovo v. 1.6 115073 39 9049123   590 450885170 1104, 1, KL218440:2851016-2853224 plot calAnn1
068 2013 calMil1 GCF_000165045.1 Celera v. 6.1 365912 35 15637123   1428 1794921679 144, 1, KI635985:586597-586885 plot calMil1
069 2005 canHg12 tbd tbd 33765 35 1344552   617 139440788 121, 1, 12:76878573-76878815 plot canHg12
070 2014 capCar1 GCF_000700745.1 SOAPdenovo v. 1.6 63810 36 3017510   389.5 221999357 1265, 2, KL360999:16916-19447 plot capCar1
071 2014 carCri1 GCF_000690535.1 SOAPdenovo v. 1.6 20461 37 1163337   68 30549622 529, 1, KK515247:46620-47678 plot carCri1
072 2002 cb1 tbd tbd 36051 37 2442691   558 98718219 191, 1, chrUn:90311348-90311730 plot cb1
073 2005 cb2 tbd tbd 35978 37 2444306   560 98588417 317, 1, chrIII:126724-127358 plot cb2
074 2007 cb3 tbd tbd 35990 37 2451574   568.5 99618994 317, 1, chrIII:11646590-11647224 plot cb3
075 2011 cb4 tbd tbd 36155 37 2519414   578 100960462 317, 1, chrIII:76433-77067 plot cb4
076 2010 ce10 tbd tbd 33806 38 1760308   427 82769023 1500, 1, chrIV:5554976-5557976 plot ce10
077 2013 ce11 GCF_000002985.6 tbd 33816 38 1760641   427 82800067 1500, 1, chrIV:5554985-5557985 plot ce11
078 2004 ce2 tbd tbd 33799 38 1759889   427 82752389 1500, 1, chrIV:5554978-5557978 plot ce2
079 2005 ce3 tbd tbd 33799 38 1759889   427 82752398 1500, 1, chrIV:5554972-5557972 plot ce3
080 2007 ce4 tbd tbd 33792 38 1759610   427 82743753 1500, 1, chrIV:5554972-5557972 plot ce4
081 2007 ce5 tbd tbd 33794 38 1759781   427 82743927 1500, 1, chrIV:5554972-5557972 plot ce5
082 2008 ce6 tbd tbd 33794 38 1759781   427 82743927 1500, 1, chrIV:5554972-5557972 plot ce6
083 2009 ce7 tbd tbd 33806 38 1760308   427 82769007 1500, 1, chrIV:5554972-5557972 plot ce7
084 2009 ce8 tbd tbd 33806 38 1760308   427 82769007 1500, 1, chrIV:5554972-5557972 plot ce8
085 2010 ce9 tbd tbd 33806 38 1760308   427 82769007 1500, 1, chrIV:5554972-5557972 plot ce9
086 2014 chaVoc1 GCF_000708025.1 SOAPdenovo v. 1.6 53439 41 4896801   1813 256127930 1729, 2, KL408233:1252791-1256250 plot chaVoc1
087 2014 chaVoc2 GCF_000708025.1 SOAPdenovo v. 1.6 53430 41 4896157   1812 256095280 1729, 2, KL871243:1252791-1256250 plot chaVoc2
088 2014 chlUnd1 GCF_000695195.1 SOAPdenovo v. 1.6 28454 43 1824188   165 94696650 497, 1, KK750077:105999-106993 plot chlUnd1
089 2002 ci1 GCA_000183065.1 tbd 48663 39 2653274   626 122189601 486, 1, Scaffold_604:30085-31057 plot ci1
090 2005 ci2 tbd tbd 119965 43 8287326   1961 566064667 358, 1, scaffold_83:159982-160698 plot ci2
091 2011 ci3 GCF_000224145.1 tbd 48178 39 2574684   692 149951145 486, 1, chrUn_NW_004190340v1:65099-66071 plot ci3
092 2003 cioSav1 tbd tbd 123843 39 6363601   2503 618224769 189, 1, ps_297:30448-30826 plot cioSav1
093 2005 cioSav2 tbd tbd 157468 38 7731855   2875 819500559 280, 1, reftig_238:125039-125599 plot cioSav2
094 2013 colLiv1 GCF_000337935.1 SOAPdenovo v. 2.0 139510 33 6394287   112 419571584 3080, 2, KB375367:1029739-1035900 plot colLiv1
095 2014 colStr1 GCF_000690715.1 SOAPdenovo v. 1.6 55406 40 3066780   131 154687137 309, 1, KK533057:6873-7491 plot colStr1
096 2012 conCri1 GCF_000260355.1 AllPaths v. 2012 253120 35 11854934   7807 2109955223 265, 1, JH655888:40209348-40209878 plot conCri1
097 2014 corBra1 GCF_000691975.1 SOAPdenovo v. 1.6 91630 37 6371852   297 248570100 1583, 1, KK718913:5901493-5904659 plot corBra1
098 2014 corCor1 GCF_000738735.1 AllPaths v. Allpaths-LG version 41687 58556 36 3043376   370 184023111 1935, 1, KL997525:15617964-15621834 plot corCor1
099 2013 cotJap1 GCA_000511605.1 Soapdenovo v. 1.0.5b; bwa v. 0.5.9; SSPACE v. 1.2 7329 33 280548   75 2914174 214, 1, DF262918:84572-85000 plot cotJap1
100 2014 cucCan1 GCF_000709325.1 SOAPdenovo v. 1.6 126008 42 16101261   2142 633238955 2278, 1, KL448309:4464943-4469499 plot cucCan1
101 2014 cynSem1 GCF_000523025.1 SOAPdenovo v. April-2011 83655 37 6369004   261 184971677 1536, 1, chr1:16715796-16718868 plot cynSem1
102 2014 cypVar1 GCA_000732505.1 AllPaths v. May 2014 136313 39 9823682   1299 502219020 2138, 1, KL652705:564642-568918 plot cypVar1
103 2014 danRer10 GCF_000002035.5 tbd 830199 36 39955973   1138 3506165088 829, 1, chr7:19536660-19538318 plot danRer10
104 2007 danRer5 tbd tbd 878638 36 43432292   1103 3637100049 829, 1, chr7:21951036-21952694 plot danRer5
105 2008 danRer6 tbd tbd 1054071 36 55306251   1381 4517471825 2083, 1, chr10:31284766-31288932 plot danRer6
106 2010 danRer7 GCF_000002035.4 tbd 910457 36 43398798   1289 3925402743 829, 1, chr7:20946224-20947882 plot danRer7
107 2014 dicLab1 GCA_000689215.1 tbd 204757 36 11489728   924 691437386 841, 1, HG916851:32290203-32291885 plot dicLab1
108 2013 dirImm1 tbd tbd 3309 36 316992   58 1820327 1613, 1, nDi_2_2_scaf00284:19002-22228 plot dirImm1
109 2003 dm1 tbd tbd 12199 50 1260685   3445 60460276 3882, 2, chr2L:1894810-1902575 plot dm1
110 2004 dm2 tbd tbd 13213 51 1372092   3723 67342596 3882, 2, chr2L:1893145-1900910 plot dm2
111 2006 dm3 tbd tbd 113222 40 6787673   2412 622002007 3882, 2, chr2L:1893145-1900910 plot dm3
112 2014 dm6 GCF_000001215.4 tbd 48031 48 4403254   3448 254474999 3882, 2, chr2L:1893145-1900910 plot dm6
113 2003 dp2 tbd tbd 16790 44 1110948   926 42217528 227, 1, Contig7446_Contig2444:1979445-1979899 plot dp2
114 2004 dp3 tbd tbd 20334 44 1389766   1239 62703720 312, 1, chrU:9357988-9358612 plot dp3
115 2006 dp4 tbd tbd 53060 46 3495255   2437 228127096 312, 1, Unknown_singleton_2460:32411-33035 plot dp4
116 2012 droAlb1 GCA_000298335.1 SOAPdenovo v. 1.04 126521 30 3970849   70 35343627 76, 1, JH853217:889-1041 plot droAlb1
117 2004 droAna1 tbd tbd 67882 40 3659056   697 196198368 572, 1, 2446670:645-1789 plot droAna1
118 2005 droAna2 tbd tbd 248263 42 14624595   918 748771139 572, 1, scaffold_13499:1095908-1097052 plot droAna2
119 2006 droAna3 GCF_000005115.1 tbd 246334 42 14515690   927 745835855 572, 1, scaffold_13499:1092668-1093812 plot droAna3
120 2013 droBia2 GCA_000233415.2 Celera Assembler v. 6.1; BWA v. 0.6.0; Samtools v. 0.1.14; GATK v. 1.1-9; Indel_call_and_upgrade.pl v. 1.0 46906 42 2807408   2591 175612579 2241, 3, AFFD02006372:54233-58717 plot droBia2
121 2013 droBip2 GCA_000236285.2 Celera Assembler v. 6.1; BWA v. 0.6.0; Samtools v. 0.1.14; GATK v. 1.1-9; Indel_calland_upgrade.pl v. 1.0 54693 39 2946335   1371 204354268 179, 1, KB463958:131929-132287 plot droBip2
122 2013 droEle2 GCA_000224195.2 Celera Assembler v. 6.1; BWA v. 0.6.0; Samtools v. 0.1.14; GATKv. 1.1-9; Indel_call_and_upgrade.pl v. 1.0 34862 41 2092385   2187 135919965 270, 1, KB458613:1953986-1954526 plot droEle2
123 2005 droEre1 tbd tbd 96336 44 5585722   674 191777516 359, 1, scaffold_1301:371-1089 plot droEre1
124 2006 droEre2 GCF_000005135.1 tbd 95081 44 5535640   676 190524603 359, 1, scaffold_1301:371-1089 plot droEre2
125 2013 droEug2 GCA_000236325.2 Celera Assembler v. 6.1; BWA v. 0.6.0; Samtools v. 0.1.14; GATK v. 1.1-9; Indel_call_and_upgrade.pl v. 1.0 59111 40 3495099   1518 181127987 141, 1, KB464979:6084-6366 plot droEug2
126 2013 droFic2 GCA_000220665.2 Celera Assembler v. 6.1; BWA v. 0.6.0; Samtools v. 0.1.14; GATK v. 1.1-9; Indel_call_and_upgrade.pl v. 1.0 21964 44 1418464   2380.5 83742114 190, 1, AFFG02001364:4041-4421 plot droFic2
127 2005 droGri1 tbd tbd 458551 40 21432909   509 538034457 491, 1, scaffold_2211:899-1881 plot droGri1
128 2006 droGri2 GCF_000005155.2 tbd 302510 40 14467041   522 418546843 188, 1, scaffold_6592:1167-1543 plot droGri2
129 2013 droKik2 GCA_000224215.2 Celera Assembler v. 6.1; BWA v. 0.6.0; Samtools v. 0.1.14; GATK v. 1.1-9; Indel_call_and_upgrade.pl v. 1.0 31358 41 1855833   1533.5 112074726 117, 1, KB459586:778466-778700 plot droKik2
130 2013 droMir2 GCA_000269505.2 Newbler v. 2.6 18368 45 1204033   1747 75101816 140, 1, chr2:3040735-3041015 plot droMir2
131 2004 droMoj1 tbd tbd 69928 38 3220037   310 78711524 225, 1, contig_34282:247-697 plot droMoj1
132 2005 droMoj2 tbd tbd 102140 41 5658627   1086 363630258 202, 1, scaffold_6540:14391223-14391627 plot droMoj2
133 2006 droMoj3 GCF_000005175.2 tbd 101230 41 5606832   1114 361818537 202, 1, scaffold_6540:14384339-14384743 plot droMoj3
134 2005 droPer1 GCF_000005195.2 tbd 75046 45 5536401   2595 325634621 580, 2, super_62:246420-247581 plot droPer1
135 2013 droPse3 GCF_000001765.3 PBJelly v. 12.8.2; Atlas genome assembly 53481 45 3494617   2480 231367590 312, 1, chrUn_CH674897_1:32411-33035 plot droPse3
136 2013 droRho2 GCA_000236305.2 Celera Assembler v. 6.1; BWA v. 0.6.0; Samtools v. 0.1.14; GATK v. 1.1-9; Indel_call_and_upgrade.pl v. 1.0 68118 41 4002283   1672 235074913 205, 1, AFPP02028413:1419-1829 plot droRho2
137 2005 droSec1 GCA_000005215.1 tbd 127886 38 6652375   465 177925676 360, 1, super_6483:1086-1806 plot droSec1
138 2005 droSim1 tbd tbd 47915 42 2874159   1400 171101666 217, 1, chr3R_random:168062-168496 plot droSim1
139 2014 droSim2 GCF_000754195.2 Velvet v. 1.1.04 10385 40 551061   466 26120972 217, 1, chrUn_NW_015496898v1:4674-5108 plot droSim2
140 2013 droSuz1 GCA_000472105.1 SOAPdenovo v. 2 117859 46 12342639   2277 517597655 4939, 1, KI419149:2637663-2647541 plot droSuz1
141 2013 droTak2 GCA_000224235.2 Celera Assembler v. 6.1; BWA v. 0.6.0; Samtools v. 0.1.14; GATK v. 1.1-9; Indel_call_and_upgrade.pl v. 1.0 48870 41 2816693   2013 202395837 306, 1, AFFI02002878:4290-4902 plot droTak2
142 2004 droVir1 tbd tbd 147648 35 6630839   429 239695710 244, 1, scaffold_0:5707381-5707869 plot droVir1
143 2005 droVir2 tbd tbd 432783 30 17539885   481 757905036 244, 1, scaffold_13049:18877549-18878037 plot droVir2
144 2006 droVir3 GCF_000005245.1 tbd 407188 30 16757243   495 747769004 244, 1, scaffold_13049:18848863-18849351 plot droVir3
145 2006 droWil1 GCF_000005925.1 tbd 118355 42 7234894   1695 472632964 954, 1, scaffold_181130:9135849-9137757 plot droWil1
146 2006 droWil2 GCF_000005925.1 tbd 118240 42 7228999   1695 472142234 954, 1, CH964272:9135849-9137757 plot droWil2
147 2004 droYak1 tbd tbd 81563 43 6228480   3752 459910434 851, 1, chr3L:24830341-24832043 plot droYak1
148 2005 droYak2 tbd tbd 93441 45 6941572   3518 501527798 1122, 2, chrU:731511-733756 plot droYak2
149 2006 droYak3 GCF_000005975.2 tbd 85136 46 6497582   2641 404384429 1122, 2, chrUn_CH892674_1:731511-733756 plot droYak3
152 2014 egrGar1 GCF_000687185.1 SOAPdenovo v. 1.6 74371 43 9954381   893 271113520 2791, 1, KK500861:718810-724392 plot egrGar1
153 2014 esoLuc1 GCA_000721915.1 AllPaths v. 43500 309026 36 12717997   109 633934424 1803, 1, KL593524:286555-290161 plot esoLuc1
154 2006 euaGli13 tbd tbd 85054 35 3561209   981 360128068 95, 1, 4-7:24024058-24024248 plot euaGli13
155 2014 eurHel1 GCF_000690775.1 SOAPdenovo v. 1.6 48033 41 2646849   3690 271468700 462, 1, KK561721:27808-28732 plot eurHel1
156 2006 eutHer13 tbd tbd 75114 37 3312319   1220 345926827 125, 1, 5-6:21163986-21164236 plot eutHer13
157 2013 falChe1 GCF_000337975.1 SOAPdenovo v. 1.4 33448 36 3028131   144 71160661 1589, 1, KB397979:787306-790484 plot falChe1
158 2013 falPer1 GCF_000337955.1 SOAPdenovo v. 1.4 32904 38 2747346   156 69981752 1888, 1, KB390863:6987566-6991342 plot falPer1
159 2002 fr1 tbd tbd 65564 35 3413218   165 124234021 304, 1, chrUn:169005183-169005791 plot fr1
160 2004 fr2 tbd tbd 104956 35 5519798   156 261084207 151, 1, chrUn:356162839-356163141 plot fr2
161 2011 fr3 GCF_000180615.1 tbd 105765 35 5702900   157 257018230 151, 1, HE592488:202-504 plot fr3
162 2014 fulGla1 GCF_000690835.1 SOAPdenovo v. 1.6 20338 37 1197675   93 36399474 576, 1, KK595554:8470-9622 plot fulGla1
163 2010 gadMor1 GCA_000231765.1 tbd 523023 30 17042881   24 573821154 131, 1, HE571852:62524-62786 plot gadMor1
164 2004 galGal2 tbd tbd 157813 51 15224008   3723 846924205 208, 1, chr2:94743278-94743694 plot galGal2
165 2006 galGal3 tbd tbd 243371 62 25066425   3004 1138149811 208, 1, chr2:97262115-97262531 plot galGal3
166 2011 galGal4 GCF_000002315.3 Celera Assembler v. 5.4 78961 39 4732239   2968 493569103 17591, 16, chrZ:21320544-21355741 plot galGal4
167 2015 galGal5 GCF_000002315.4 MHAP/PBcR v. 8.2beta 985564 36 47990278   1925 3490824745 17591, 16, chrZ:21461674-21496871 plot galGal5
168 2006 gasAcu1 tbd tbd 131262 39 7836949   3635 770629287 296, 1, chrUn:59780312-59780904 plot gasAcu1
169 1880 gasAsc0 GCA_000180675.1 tbd 116031 39 6613325   1417 518945407 316, 1, contig_16726:674-1306 plot gasAsc0
170 2014 gavSte1 GCF_000690875.1 SOAPdenovo v. 1.6 24295 36 1352476   75 47417993 302, 1, KK611813:2739-3343 plot gavSte1
171 2012 geoFor1 GCF_000277835.1 SOAPdenovo v. 2.01 113187 37 6785044   204 268468600 1013, 1, JH739970:2776008-2778034 plot geoFor1
172 2006 gliRes13 tbd tbd 31976 36 1385928   1661 143989628 95, 1, 4-7:15697982-15698172 plot gliRes13
173 2016 gorGor5 tbd tbd 3555119 33 133654191   4102 21270991545 338, 1, CYUI01005848v1:13590-14266 plot gorGor5
174 2009 haeCon1 tbd tbd 109810 42 5802860   1251.5 319005845 196, 1, Hcon_Contig0025586:3955-4347 plot haeCon1
175 2013 haeCon2 tbd tbd 53818 43 3892613   570 187299643 2147, 1, scaffold_1557:10532-14826 plot haeCon2
176 2014 halAlb1 GCF_000691405.1 SOAPdenovo v. 1.6 13685 36 754065   301 41720730 548, 2, KK653364:30569-31666 plot halAlb1
177 2014 halLeu1 GCF_000737465.1 SOAPdenovo2 v. May 2014 16099 40 1241068   4586 98353830 79, 1, KL869431:1084034-1084192 plot halLeu1
178 2011 hapBur1 GCF_000239415.1 ALLPATHS-LG v. R35951 34584 37 2342045   3524.5 202958431 10390, 20, JH425331:1373378-1394177 plot hapBur1
179 2011 hetBac1 GCA_000223415.1 Celera assembler v. 6.0 3302 51 353946   673 12041725 317, 1, GL996135v1:102345-102979 plot hetBac1
180 1880 homNea0 tbd tbd 148 30 4725   14.5 2669 37, 1, 151586_3339_2553:20-94 plot homNea0
182 2006 lauRas13 tbd tbd 43311 36 1891314   1493 210427636 125, 1, 5-6:22761072-22761322 plot lauRas13
183 2014 lepDis1 GCF_000691785.1 SOAPdenovo v. 1.6 16795 33 749994   94 31438992 199, 1, JJRK01010260:4452-4850 plot lepDis1
184 2011 lepOcu1 GCF_000242695.1 AllPaths v. R38293 77276 37 4579890   1084 350156295 488, 2, chrLG5:14840992-14841969 plot lepOcu1
185 2013 letCam1 GCA_000466285.1 Newbler v. 2.7 768187 35 32420899   153 1525909318 207, 1, KE997215:997-1411 plot letCam1
186 1880 linHum0 GCF_000217595.1 CABOG v. 5.3 42080 41 2313367   5018 259661061 101, 1, NW_012160424:64875-65077 plot linHum0
187 2012 loaLoa1 GCA_000183805.3 Newbler v. 2.1-PreRelease-4/28/2009 15889 37 1294459   188 16478676 109, 1, JH717180v1:404-622 plot loaLoa1
188 2014 manVit1 GCF_000692015.1 SOAPdenovo v. 1.6 75103 35 4570292   206 245456943 5268, 1, KK733349:4046063-4056599 plot manVit1
190 2012 mayZeb1 GCF_000238955.1 AllPaths v. R37043 48899 38 3205287   4159 307830483 9605, 105, JH720664:938440-957754 plot mayZeb1
191 2009 melGal1 tbd tbd 25253 45 1794885   2825 133788059 169, 1, chr3:54352580-54352918 plot melGal1
192 2014 melGal5 GCF_000146605.2 MaSuRCA v. 1.9.2 101903 36 4643970   197 188019328 173, 1, chrUn_NW_011217171v1:1-347 plot melGal5
193 2008 melHap1 GCA_000172435.1 tbd 12515 41 832837   375 21806223 157, 1, MhA1_Contig2844:850-1164 plot melHap1
194 2008 melInc1 GCA_000180415.1 tbd 19330 40 1244828   558 47252172 183, 1, Minc_Contig6373:3669-4035 plot melInc1
195 2008 melInc2 tbd tbd 22743 41 1594354   1067 83086394 183, 1, MiV1ctg2756:3669-4035 plot melInc2
196 2011 melUnd1 GCF_000238935.1 Celera v. 6.1 99875 41 5463070   6271 754338919 120, 1, JH556605:5210251-5210491 plot melUnd1
197 2014 merNub1 GCF_000691845.1 SOAPdenovo v. 1.6 37279 43 2280064   174 103616988 543, 1, KK705997:21022-22108 plot merNub1
198 2014 mesUni1 GCF_000695765.1 SOAPdenovo v. 1.6 52863 36 2585696   257 120590969 271, 1, JJRI01098248:16372-16914 plot mesUni1
199 2013 musDom2 GCF_000371365.1 AllPathsLG v. September 2012 575203 36 27589830   1941 2513509860 2028, 1, KB856326:64184-68240 plot musDom2
200 2013 necAme1 GCF_000507365.1 Newbler v. MapAsmResearch-04/19/2010-patch-08/17/2010 30870 43 1735538   1580 123448969 93, 1, KI659398v1:132-318 plot necAme1
201 2007 nemVec1 tbd tbd 540729 40 30645450   589 1501673463 353, 1, scaffold_201:423580-424286 plot nemVec1
202 2011 neoBri1 GCF_000239395.1 ALLPATHS-LG v. R36800 62939 46 6697998   1655 274117853 8242, 20, JH422273:8382583-8399086 plot neoBri1
203 2014 nipNip1 GCF_000708225.1 SOAPdenovo v. 1.6 32623 43 3676893   251 104452416 1496, 1, KL409846:432608-435600 plot nipNip1
204 2014 notCor1 GCF_000735185.1 Celera Assembler v. 7.0 164483 34 7109993   199 393266432 407, 1, KL665414:596304-597118 plot notCor1
205 2013 oncVol1 GCA_000499405.1 tbd 5247 39 423838   2124 28089581 739, 1, HG738137v1:12037947-12039425 plot oncVol1
206 2014 opiHoa1 GCF_000692075.1 SOAPdenovo v. 1.6 75782 40 6783674   4083 456471575 1119, 1, KK734928:246587-248825 plot opiHoa1
207 2011 oreNil1 tbd tbd 80799 39 4892346   6067 586359236 9755, 32, GL831139:3510855-3530396 plot oreNil1
208 2006 oryLat1 tbd tbd 191645 40 12487530   1210 620191379 379, 1, chr9:5041681-5042439 plot oryLat1
209 2005 oryLat2 tbd tbd 189087 40 12356700   1234 592910738 379, 1, chr9:5041681-5042439 plot oryLat2
210 2013 panRed1 GCA_000341325.1 Velvet v. 1.2.07 23300 42 1084666   1396 66559726 101, 1, KB454925:8492-8694 plot panRed1
211 2014 pelCri1 GCF_000687375.1 SOAPdenovo v. 1.6 42891 37 2053042   265 127612523 574, 1, KK474798:35527-36675 plot pelCri1
212 2007 petMar1 tbd tbd 855653 36 38339527   201 932728649 362, 1, Contig99174:237-961 plot petMar1
213 2010 petMar2 GCA_000148955.1 Arachne v. 3.2 836219 37 38092404   692 2324558773 363, 1, GL498477:1987-2713 plot petMar2
214 2014 phaCar1 GCF_000708925.1 SOAPdenovo v. 1.6 41541 34 1992939   36 26355615 779, 1, KL418261:30029-31587 plot phaCar1
215 2014 phaLep1 GCF_000687285.1 SOAPdenovo v. 1.6 26611 38 1718609   216 85779441 1077, 1, KK455387:47187-49341 plot phaLep1
216 2014 phoRub1 GCA_000687265.1 SOAPdenovo v. 1.6 17140 43 1055248   135 39934970 526, 1, KK424259:7588-8640 plot phoRub1
217 2014 picPub1 GCF_000699005.1 SOAPdenovo v. 1.6 397798 36 17968999   7796 3299119050 4915, 3, KL215520:252741-262573 plot picPub1
218 2013 poeFor1 GCF_000485575.1 AllPaths-LG v. July 2013 161806 52 28441594   4650 990605574 1886, 1, KI520679:7484-11256 plot poeFor1
219 2014 poeRet1 tbd tbd 71291 38 4816409   1511 343849361 8790, 4, chrLG5:27506035-27523618 plot poeRet1
220 2014 priExs1 tbd tbd 33878 44 2639333   1128 98936342 1626, 1, scaffold830:51430-54682 plot priExs1
221 2007 priPac1 tbd tbd 29257 43 1929264   448 80615899 500, 1, chrUn:71534792-71535792 plot priPac1
222 2008 priPac2 GCA_000180635.1 tbd 19318 39 1050200   259 25863820 500, 1, ABKE01002096:3239-4239 plot priPac2
223 2014 priPac3 tbd tbd 36759 40 2176431   321 82136541 500, 1, Ppa_Contig5:941324-942324 plot priPac3
224 2013 pseHum1 GCF_000331425.1 SOAPdenovo v. 1.5 140033 36 7836448   130 200423267 7517, 10, KB221191:4083820-4098863 plot pseHum1
225 2014 pteGut1 GCF_000699245.1 SOAPdenovo v. 1.6 41594 36 2007893   100 106772284 534, 1, JMFR01060883:1891-2959 plot pteGut1
226 2011 punNye1 GCF_000239375.1 ALLPATHS-LG v. R37016 36249 38 2385815   3630 214952201 5131, 20, JH419262:1608400-1618681 plot punNye1
227 2014 pygAde1 GCA_000699105.1 SOAPdenovo v. 1.6 38534 46 4145677   2165.5 202245057 2147, 1, KL225502:93242-97536 plot pygAde1
228 2013 pytBiv1 GCF_000186305.1 Soap deNovo v. March 2012 167510 47 14323626   217 396468667 235, 1, KE953144:259222-259692 plot pytBiv1
229 2012 repBase0 tbd tbd 54 39 2426   81.5 7619 70, 1, MER51A:232-372 plot repBase0
230 2012 repBase1 tbd tbd 73 39 3249   77 9691 70, 1, MER51A:232-372 plot repBase1
231 1880 repBase2 tbd tbd 51 40 2203   79 6907 70, 1, MER51A:232-372 plot repBase2
233 1880 ricCom1 GCF_000151685.1 tbd 430308 35 18638247   2948 2156357980 460, 1, EQ974418:17730-18650 plot ricCom1
234 2016 rouAeg1 GCF_001466805.2 SparseAssembler v. OCTOBER-2015; DBG2OLC v. OCTOBER-2015; LINKS v. 1.5.1; L_RNA_Scaffolder v. OCTOBER-2015; SSPACE v. 3.0 74090 36 4591683   5517 522939887 410, 2, NW_015493119v1:122550-123371 plot rouAeg1
235 2003 sacCer1 tbd tbd 666 45 76615   1914.5 2711249 50, 1, chr7:519107-519207 plot sacCer1
236 2008 sacCer2 tbd tbd 669 45 76812   1868 2711600 71, 1, chrX:120898-121040 plot sacCer2
237 2011 sacCer3 GCF_000146045.2 tbd 669 45 78716   1868 2711582 1988, 10, chrVIII:212266-216251 plot sacCer3
238 2013 sebNig1 GCA_000475235.1 tbd 344468 43 18020844   94 249080272 492, 1, AUPR01114153:357-1341 plot sebNig1
239 2013 sebRub1 GCA_000475215.1 SOAPdenovo v. 1.05 299208 38 14417679   139 352997550 408, 1, KI445670:61530-62346 plot sebRub1
240 2014 stePar1 GCF_000690725.1 ALLPATHS-LG v. August 2013 96483 40 8151514   3673 549500672 2624, 1, KK581067:134955-140203 plot stePar1
241 1880 strCam0 tbd tbd 23662 45 2730974   666 94888209 1134, 1, superscaffold9:6014874-6017142 plot strCam0
242 2014 strCam1 GCF_000698965.1 SOAPdenovo v. 1.6 23599 45 2718563   646 93976596 1134, 1, KL206988:717376-719644 plot strCam1
243 2005 strPur1 tbd tbd 658932 40 35574440   847.5 1824767726 956, 1, Scaffold18311:2619-4531 plot strPur1
244 2006 strPur2 tbd tbd 611722 39 30968979   1453 2323533345 956, 1, Scaffold47464:201872-203784 plot strPur2
245 2009 strPur3 tbd tbd 689739 40 35781206   1763 2733632009 956, 1, Scaffold85:237230-239142 plot strPur3
246 2011 strPur4 GCF_000002235.3 Atlas v. WGS for Sanger Assembly, Atlas-Link and Atlas-GapFill for SOLiD and Illumina improvement 888256 42 54558021   2360 3918297070 956, 1, Scaffold382:244159-246071 plot strPur4
247 2011 strRat1 tbd tbd 9910 40 540468   1077 26058414 113, 1, RATTI_contig_57682:4110-4336 plot strRat1
248 2014 strRat2 GCA_001040885.1 tbd 8546 41 482721   2233 37502032 67, 1, chrUn_LN609483v1:243-377 plot strRat2
250 1880 taeGut0 tbd tbd 597003 49 43955858   3833 2936709220 209, 1, Contig47:5328655-5329073 plot taeGut0
251 2008 taeGut1 tbd tbd 602657 49 44658308   3557 2849213000 209, 1, chrZ:28813941-28814359 plot taeGut1
252 2013 taeGut2 GCF_000151805.1 PCAP v. 2008 602028 49 44591081   3541 2830875406 209, 1, chrZ:28813941-28814359 plot taeGut2
254 2013 takFla1 GCA_000400755.1 HAPs v. 0.2.2 97724 36 6776388   184 251435705 503, 1, KE121297:329-1335 plot takFla1
255 2014 tauEry1 GCF_000709365.1 SOAPdenovo v. 1.6 58294 35 2621924   2194 291694166 227, 1, JNOY01082112:3268-3722 plot tauEry1
256 2004 tetNig1 tbd tbd 132260 38 9178877   3055 661840431 413, 1, chrUn_random:43732955-43733781 plot tetNig1
257 2007 tetNig2 tbd tbd 130250 38 9072570   3013 657839855 413, 1, chrUn_random:35610230-35611056 plot tetNig2
258 2014 tinGut1 GCF_000705375.1 SOAPdenovo v. 1.6 43221 38 2742653   279 110109614 416, 1, KL400833:106660-107492 plot tinGut1
259 2014 tinGut2 GCF_000705375.1 SOAPdenovo v. 1.6 43210 38 2741340   279 110101482 416, 1, KL895505:106660-107492 plot tinGut2
260 2005 triCas1 tbd tbd 78796 41 4117576   794 170185880 307, 1, Reptig797:115-729 plot triCas1
261 2005 triCas2 tbd tbd 77889 41 4122780   885 196279053 192, 1, singleUn_1374:29986-30370 plot triCas2
262 2011 triSpi1 GCF_000181795.1 PCAP v. January 12, 2007 11343 54 1176023   4572 74727382 98, 1, GL622792v1:5540185-5540381 plot triSpi1
263 2014 triSui1 GCA_000701005.1 SOAPdenovo v. 2 17372 45 1593967   2042.5 75416120 501, 1, KL363185v1:221782-222784 plot triSui1
264 2014 tytAlb1 GCF_000687205.1 SOAPdenovo v. 1.6 15140 33 776246   81 23744286 199, 1, JJRD01024771:5513-5911 plot tytAlb1
265 2004 xenTro1 tbd tbd 1073717 40 65548900   216 2854287515 347, 1, scaffold_3418:6326-7020 plot xenTro1
266 2005 xenTro2 tbd tbd 1034566 40 64677537   219 2820134241 548, 1, scaffold_68:2539823-2540919 plot xenTro2
267 2009 xenTro3 tbd tbd 1034357 40 64662644   219 2819484140 548, 1, GL172704:2539823-2540919 plot xenTro3
268 2012 xenTro7 GCF_000004195.2 ARACHNE v. 20071016_modified 1100470 39 53428011   229 2808691507 3560, 2, KB021656:75384527-75391648 plot xenTro7
269 2016 xenTro9 GCF_000004195.3 Meraculous v. May-2013 967829 39 47452193   241 2585384269 3560, 2, chr6:35780447-35787568 plot xenTro9
270 2012 xipMac1 GCF_000241075.1 PCAP v. 3/30/09; Newbler v. MapAsmResearch-02/17/2010 39708 35 1764516   506 136874274 119, 1, JH557910:3615-3853 plot xipMac1
272 2013 zonAlb1 GCF_000385455.1 Allpaths-LG v. Feb-2013 220251 35 9228906   369 407868591 2060, 1, KB913045:8123897-8128017 plot zonAlb1

assemblies with zero duplicate gap sequences

count year dbName ncbiAsmId number of gaps assembly method