Where is the annotation data: Difference between revisions

Latest revision as of 13:59, 17 June 2020

Sometimes you know the name of the track, e.g. clinvarMain or est, from clicking around in the UI, and you need the name of the database table or file. This is not as straightforward as it seems.

1) First you can look at the trackDb files. in kent/src/hg/makeDb/trackDb, "grep -r <trackName> *" to find the trackDb.ra file that defines your track. If it has a bigDataUrl line, then the data is in a bigBed/bigWig file referenced here. The path MUST be /gbdb/<db>/xxxxx, we have a habit of storing big files under /gbdb/<db>/bbi, but not always. Either way, use the bigBedToBed to bigBedInfo -as or similar tools for bigWig to look at the data.

2) In all other cases, the data is in a database table with the same name as the track. You can use hgSql or 'mysql --no-defaults -h genome-mysql.cse.ucsc.edu -u genome -A' and then

  use hg19
  select * from clinvarMain;

This either shows (A) the data itself as rows or (B) a pointer to the data file. If it shows the data itself, case 2A, you can go to Mysql, and do a "SELECT * from <tableName> limit 10" to get an idea of the data itself (appending \G if your screen is too small) or "DESCRIBE <tableName>" to get the table schema.

Sometimes the table contains only a single row e.g. /gbdb/hg19/bbi/clinvar/clinvarMain.bb, this is case 2B. So this is non-MySQL data, you know where to look and can display the data with "bigBedToBed /gbdb/hg19/bbi/clinvar/clinvarMain.bb" and "bigBedInfo -as /gbdb/hg19/bbi/clinvar/clinvarMain.bb" to get the data schema.

3) other cases are obscure and very very rare these days. E.g. in hg18 there are some tables that are split by chromosome. You can check with "SHOW TABLES LIKE '%tableName%'" in mysql if there is a table like "chr4_clinvarMain" (there isn't, for this example). For very few and very old tracks, the table name is just different from the track name and the table browser is your friend, or the source code or your trusted older colleagues.

Where is the annotation data: Difference between revisions

Latest revision as of 13:59, 17 June 2020

Navigation menu

Page actions

Page actions

Personal tools

Navigation

Search

related sites

hosted projects

Tools

@@ Line 2: / Line 2: @@
 This is not as straightforward as it seems.
-) First you can look at the trackDb files. in kent/src/hg/makeDb/trackDb, "grep -r <trackName> *" to find the trackDb.ra file that defines your track. If it has a bigDataUrl line, then the data is in a bigBed/bigWig file referenced here. The path MUST be /gbdb/<db>/xxxxx, we have a habit of storing big files under /gbdb/<db>/bbi, but not always
+) First you can look at the trackDb files. in kent/src/hg/makeDb/trackDb, "grep -r <trackName> *" to find the trackDb.ra file that defines your track. If it has a bigDataUrl line, then the data is in a bigBed/bigWig file referenced here. The path MUST be /gbdb/<db>/xxxxx, we have a habit of storing big files under /gbdb/<db>/bbi, but not always. Either way, use the bigBedToBed to bigBedInfo -as or similar tools for bigWig to look at the data.
-) In most cases, the data is in a database table with the same name as the track. You can use hgSql or 'mysql --no-defaults -h genome-mysql.cse.ucsc.edu -u genome -A' and then
+) In all other cases, the data is in a database table with the same name as the track. You can use hgSql or 'mysql --no-defaults -h genome-mysql.cse.ucsc.edu -u genome -A' and then
     use hg19
     select * from clinvarMain;
-This either shows (1) the data itself as rows or (2) a pointer to the data file. If it shows the data itself, you can go to Mysql, and do a "SELECT * from <tableName> limit 10" to get an idea or "DESCRIBE <tableName>" to get the schema.
+This either shows (A) the data itself as rows or (B) a pointer to the data file. If it shows the data itself, case 2A, you can go to Mysql, and do a "SELECT * from <tableName> limit 10" to get an idea of the data itself (appending \G if your screen is too small) or "DESCRIBE <tableName>" to get the table schema.
-Sometimes the table contains only a single row e.g. /gbdb/hg19/bbi/clinvar/clinvarMain.bb, this is case (2). So this is non-MySQL data, you know where to look and can display the data with "bigBedToBed  /gbdb/hg19/bbi/clinvar/clinvarMain.bb" and "bigBedInfo -as /gbdb/hg19/bbi/clinvar/clinvarMain.bb" to get the data schema.
+Sometimes the table contains only a single row e.g. /gbdb/hg19/bbi/clinvar/clinvarMain.bb, this is case 2B. So this is non-MySQL data, you know where to look and can display the data with "bigBedToBed  /gbdb/hg19/bbi/clinvar/clinvarMain.bb" and "bigBedInfo -as /gbdb/hg19/bbi/clinvar/clinvarMain.bb" to get the data schema.
-) other cases are obscure and very very rare these days. E.g. in hg18 there are some tables that are split by chromosome. You can check with "SHOW TABLES LIKE '%tableName%'" in mysql if there is a table like "chr4_clinvarMain" (there isn't). For very few and very old tracks, the table name different from the track name.
+) other cases are obscure and very very rare these days. E.g. in hg18 there are some tables that are split by chromosome. You can check with "SHOW TABLES LIKE '%tableName%'" in mysql if there is a table like "chr4_clinvarMain" (there isn't, for this example). For very few and very old tracks, the table name is just different from the track name and the table browser is your friend, or the source code or your trusted older colleagues.