Frequently asked mailing list questions

From Genecats

This FAQ page collects previously answered questions from the Genome and Genome-Mirror mailing lists so that the answers can be reused for repeat questions.

Genome

Helpful Items


If a user is looking for human or mouse genome updates, point them to:

To report errors in the human or mouse assemblies:

For users looking for help identifying effects of their novel SNPs, send them to (thanks Angie):

To check a user's Browser session in hgcentral:

  • From a shell prompt, enter a command similar to the following (depending on the username or session name you are searching for):
$ hgsql -h genome-centdb -e "select userName,sessionName,shared,firstUse,lastUse,useCount from namedSessionDb where userName like 'Gunnar%'" hgcentral
+------------------------+--------------------+--------+---------------------+---------------------+----------+
| userName               | sessionName        | shared | firstUse            | lastUse             | useCount |
+------------------------+--------------------+--------+---------------------+---------------------+----------+
| Gunnar%20H.            | 2Lfullgenom        |      1 | 2011-07-19 23:37:37 | 2011-09-27 13:42:47 |       43 |
| Gunnar%20H.            | 2Rfullgenom        |      1 | 2011-07-20 01:42:25 | 2011-08-20 05:17:27 |       28 |
| Gunnar%20H.            | 3Lfullgenom        |      1 | 2011-07-20 03:43:01 | 2011-09-27 06:43:31 |       31 |
| Gunnar%20H.            | 3Rfullgenom        |      1 | 2011-07-19 13:22:43 | 2011-08-20 05:18:21 |       31 |
| Gunnar%20H.            | 4fullgenom         |      1 | 2011-07-20 03:49:44 | 2011-09-27 07:34:00 |       25 |
| Gunnar%20H.            | dm3cov             |      1 | 2011-07-05 17:22:22 | 2011-08-25 05:02:24 |       22 |
| Gunnar%20H.            | dm3sub             |      1 | 2011-07-19 11:12:51 | 2011-07-27 00:22:07 |        4 |
| Gunnar%20H.            | Split              |      1 | 2011-09-03 05:00:42 | 2011-09-05 08:18:29 |        4 |
| Gunnar%20H.            | Xfullgenom         |      1 | 2011-07-20 05:21:20 | 2011-09-27 07:33:21 |       23 |
| Gunnar.thor.sigurdsson | mm9_test_session_1 |      1 | 2010-09-29 03:34:38 | 2010-09-29 03:34:38 |        0 |
+------------------------+--------------------+--------+---------------------+---------------------+----------+
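
When the same lookup is repeated for different users, the query above can be wrapped in a small shell function. This is only a sketch: `session_list_sql` is a hypothetical helper name (not part of the kent tools), and it assumes the user name prefix contains no single quotes. Judging from the output above, user names appear to be stored URL-encoded (for example `Gunnar%20H.`), so a prefix that stops before the first space is the simplest thing to search on.

```shell
# Hypothetical helper: print the session-listing SQL for a userName prefix.
# Assumption: the prefix contains no single quotes.
session_list_sql() {
    printf "select userName,sessionName,shared,firstUse,lastUse,useCount from namedSessionDb where userName like '%s%%'\n" "$1"
}

# On a host with hgsql and access to genome-centdb, it would be used as:
#   hgsql -h genome-centdb -e "$(session_list_sql Gunnar)" hgcentral
session_list_sql Gunnar
```

Printing the statement first makes it easy to check the pattern before running it against hgcentral.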

To save a user's session to a file:

  • From a shell prompt, enter a command similar to the following:
$ hgsql -h genome-centdb -Ne "select contents from namedSessionDb where userName like 'Gunnar%' and sessionName='2Lfullgenom'" hgcentral > gunnarSession
  • The session can then be restored from this file on the hgSession page in the "Restore Settings" section

Questions

I have a list of Gene Symbols and I would like to get corresponding sequences for them.

Sharing custom tracks

Help me create a Custom Track

Is there a size limit for custom tracks?

How do I find non-protein-coding genes?

I have a list of identifiers, how do I find the coordinates?

Format of chain, chainLink and net tables

How do I get a table of restriction enzymes?

Note that the findCutters utility is a better choice than oligoMatch: oligoMatch has no good way of finding enzymes with an ambiguous base in the recognition site, such as AsuI (G'GnC_C). It is possible, but oligoMatch would need to be run four times (GGACC, GGCCC, GGGCC, GGTCC) and the outputs combined. findCutters does it all in one command.
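
To make the four-run workaround concrete, the loop below expands the ambiguous base n in the AsuI site into the four sequences listed above. It is a sketch only: `expand_asui` is a hypothetical helper name, and each printed sequence would still need its own oligoMatch run, with the outputs combined afterwards.

```shell
# Hypothetical helper: expand the ambiguous base in GGNCC (AsuI)
# into the four concrete recognition sequences.
expand_asui() {
    for base in A C G T; do
        printf 'GG%sCC\n' "$base"
    done
}

expand_asui
```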

GO

How do I find orthologous genes (using TransMap)

How do I find telomeres and centromeres?


Questions about SNPs?

Instructions for downloading jksrc

I want to compare species A with species B

To tell a user we would be willing to add a permanent custom track

Is multiwig functionality available for custom tracks?

Why do some genes have startCodon = stopCodon (thickStart = thickEnd)?

How do I get a list of SNPs that correspond to my gene?

How do I cross-reference UCSC gene names to RefSeq gene names?

Genome-Mirror

ENCODE

Suggest that the person check the ENCODE Resources and FAQ page:
http://genome.ucsc.edu/ENCODE/FAQ/index.html

Helpful resources

Questions

1) How do I display ENCODE data at GEO in the genome browser?

  • Not by loading a custom track! Basically all ENCODE data at GEO are already hosted in tracks at UCSC. Use Track Search and enter the GEO sample accession (GSM). A great answer by Pauline is here: [1]

2) Which cell types are used by ENCODE? Did XXX ENCODE track use standard ENCODE cell protocols? What was the ENCODE growth protocol for cell type YYY?

  • See the Cell Types page on the portal. All ENCODE tracks use protocols registered on this page. Click the 'Documents' link to see the growth protocol. If you have further questions, contact the lab that registered the protocol. [2] [3]

3) Has transcription factor XXX been mapped by ENCODE? How do I find overlaps between my own ChIP-seq regions and ENCODE transcription factors?

  • Use ChIP-seq Experiment Matrix to show mapped TFs. Use Table Browser to intersect ENCODE Regulation Txn Factor clusters with custom track of user regions. [4]

4) What is represented by field NN in ENCODE bed files?

  • See File Formats page for descriptions of ENCODE 'peak' file formats. See track descriptions for how scores and values were derived.

5) What is the difference between file XX and files XXV2? Why is file XX not displayed in the browser?

  • Versioned files are often revoked and so not viewable in the browser, though still available for download. Revoke status is shown in metadata (files.txt).

6) How do I extract information about an ENCODE experiment from the filename?

  • Don't do it! Filenames have some metadata embedded, but can only be relied on to be unique. Use file metadata, available in the following places:
    • Downloads directories: files.txt file
    • Track UI: down-arrow next to subtrack
    • Track/File search
    • genome-mysql, table browser: metaDb table

Example Galaxy Answers

  • Introduce Galaxy, refer user to Galaxy Help

The bioinformatics tools at Galaxy may be of help to you: http://usegalaxy.org. The "Galaxy 101" tutorial features getting data from the UCSC table browser: http://wiki.galaxyproject.org/Learn/Screencasts. If you plan to use Galaxy's tools, please address questions to Galaxy: http://wiki.galaxyproject.org/Support. They should be able to help you with whatever questions you may have about their website.

  • Example of Galaxy Join

If you don't have a good way to accomplish a join of the tables, you could use Galaxy: https://main.g2.bx.psu.edu/. You would need to first fetch each of the tables separately using the "UCSC Main table browser" link (under "Get Data"), and then join them on the xxxrefGene.name/gbStatus.accxxx fields using the "Join two Datasets" link (under "Join, Subtract and Group").
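
For users who are comfortable at a command line, the same kind of join can be sketched locally with the standard sort and join utilities. This illustrates the idea rather than what Galaxy does internally: `join_on_first_column` is a hypothetical helper, the inputs are assumed to be tab-separated table downloads with the shared accession in column 1, and the sample rows are made up.

```shell
# Hypothetical helper: join two tab-separated files on their first column.
# join(1) needs both inputs sorted on the key, so sorted copies are made first.
join_on_first_column() {
    tab=$(printf '\t')
    tmp=$(mktemp -d)
    sort -t "$tab" -k1,1 "$1" > "$tmp/a"
    sort -t "$tab" -k1,1 "$2" > "$tmp/b"
    join -t "$tab" "$tmp/a" "$tmp/b"
    rm -rf "$tmp"
}

# Made-up rows standing in for the two downloaded tables.
demo=$(mktemp -d)
printf 'NM_EXAMPLE2\tchr2\nNM_EXAMPLE1\tchr1\n' > "$demo/refGene.tsv"
printf 'NM_EXAMPLE1\tReviewed\nNM_EXAMPLE2\tValidated\n' > "$demo/gbStatus.tsv"
join_on_first_column "$demo/refGene.tsv" "$demo/gbStatus.tsv"
rm -rf "$demo"
```

Rows whose key appears in only one of the files are dropped, which is the default behavior of join.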

  • Using Galaxy to find SNPs near genes

One way to identify the distance from each SNP to its nearest gene is to send the gene track of interest to Galaxy via the Table Browser, then use the "Fetch closest non-overlapping feature for every interval" tool under the "Operate on Genomic Intervals" menu.
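
The idea behind a closest non-overlapping feature search can be sketched with awk. This is an illustration under stated assumptions, not Galaxy's implementation: `closest_gene` is a hypothetical helper, both inputs are assumed to be four-column, tab-separated BED files (chrom, start, end, name) with 0-based half-open coordinates, and Galaxy may handle ties and strand differently.

```shell
# Hypothetical helper: closest_gene GENES.bed SNPS.bed
# For each SNP, print its name, the nearest gene on the same chromosome
# that does not overlap it, and the gap between them in bases.
# Brute force over all genes, so only suitable for small inputs.
closest_gene() {
    awk -F '\t' -v OFS='\t' '
        NR == FNR {                      # first file: remember every gene
            chrom[NR] = $1; gstart[NR] = $2 + 0; gend[NR] = $3 + 0; gname[NR] = $4
            n = NR
            next
        }
        {                                # second file: one SNP per line
            s = $2 + 0; e = $3 + 0
            best = ""; bestDist = -1
            for (i = 1; i <= n; i++) {
                if (chrom[i] != $1) continue
                if (gend[i] <= s) dist = s - gend[i]          # gene ends before the SNP
                else if (gstart[i] >= e) dist = gstart[i] - e # gene starts after the SNP
                else continue                                 # overlapping gene: skip
                if (bestDist < 0 || dist < bestDist) { bestDist = dist; best = gname[i] }
            }
            if (best != "") print $4, best, bestDist
        }
    ' "$1" "$2"
}
```

A SNP that lies inside a gene is reported against the nearest other gene, which matches the "non-overlapping" wording of the tool name.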

  • Using Galaxy to filter RNA from MAF

Galaxy has GTF/GFF tools under the "Filter and Sort" menu. If you have questions about these tools you will need to contact them.

  • Using Galaxy to reverse-complement MAF data

According to this page: http://g2.trac.bx.psu.edu/wiki/MAFanalysis there is a tool to "Reverse complement a MAF file". https://lists.soe.ucsc.edu/pipermail/genome/2008-November/017512.html
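
For reference, the sequence part of a reverse complement can be sketched in a few lines of shell. `revcomp` is a hypothetical helper and handles only the alignment text, one sequence per line. A complete MAF conversion also has to update the strand and start fields of each sequence line, which this sketch does not attempt, so the Galaxy tool remains the better recommendation for whole files.

```shell
# Hypothetical helper: reverse-complement one sequence per input line.
# Gap characters (-) are kept as-is; only A, C, G, T (either case) are swapped.
revcomp() {
    awk '{ out = ""; for (i = length($0); i > 0; i--) out = out substr($0, i, 1); print out }' |
        tr 'ACGTacgt' 'TGCAtgca'
}

printf 'ACG-TT\n' | revcomp   # prints AA-CGT
```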

  • Using Galaxy to convert a FASTA file for analysis in Excel and Access

Galaxy (http://main.g2.bx.psu.edu/) has some data manipulation tools that should be of help. Go to their site, and on the left select "FASTA manipulation" and select "FASTA-to-Tabular converter": <http://main.g2.bx.psu.edu/tool_runner?tool_id=fasta2tab>
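
The conversion itself is simple enough to sketch with awk for users who have a command line available. `fasta_to_tab` is a hypothetical helper that writes one line per record, with the header text (without the leading ">") in the first column and the concatenated sequence in the second. The exact column layout of Galaxy's converter may differ.

```shell
# Hypothetical helper: convert FASTA to two tab-separated columns
# (header text, sequence joined onto one line).
fasta_to_tab() {
    awk '
        /^>/ { if (name != "") print name "\t" seq; name = substr($0, 2); seq = ""; next }
        { seq = seq $0 }
        END { if (name != "") print name "\t" seq }
    ' "$@"
}

printf '>seq1 demo\nACGT\nTTAA\n>seq2\nGG\n' | fasta_to_tab
```

The tab-separated output can be opened directly in a spreadsheet or imported into a database table.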

  • Using Galaxy to create your own multiple sequence alignments

Galaxy has several tools that look like they might be useful to you. See "Filter MAF blocks by Species," "Extract MAF blocks given a set of genomic intervals," and "Stitch Gene blocks given a set of coding exon intervals" on the left-hand side of the page under the "Fetch Alignments" header. https://lists.soe.ucsc.edu/pipermail/genome/2011-May/026067.html

  • Using Galaxy to convert wig to bigWig

We don't supply any executables for running wigToBigWig on Windows. An easier solution would be to use the Galaxy website (usegalaxy.org). In the Tools menu on the left-hand side of the page, select "Convert Formats" and then "Wig/BedGraph-to-bigWig converter."

  • Using Galaxy to divide whole genome into 1 Mb regions

One way you could do this would be to make a second custom track consisting of 1 Mb regions and then intersect that with your first custom track. Or select your custom track in the Table Browser and then check the box to "Send output to Galaxy". Galaxy (http://main.g2.bx.psu.edu/) has some additional tools to help manipulate data. The "Regional Variation" menu on the left-hand side of the page has a "Make windows" tool and a "Feature coverage" tool that look especially useful.
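
If the user would rather build the 1 Mb regions locally, the windows for the second custom track can be generated with awk. This is a sketch under assumptions: `make_windows` is a hypothetical helper, and its input is assumed to be a two-column, tab-separated file of chromosome names and sizes. The output is three-column BED (0-based, half-open), with the last window on each chromosome truncated at the chromosome end.

```shell
# Hypothetical helper: make_windows CHROM_SIZES [WINDOW_SIZE]
# Prints BED windows of WINDOW_SIZE bases (default 1,000,000) per chromosome.
make_windows() {
    awk -F '\t' -v OFS='\t' -v size="${2:-1000000}" '
        {
            for (s = 0; s < $2; s += size) {
                e = s + size
                if (e > $2) e = $2       # clip the final window
                print $1, s, e
            }
        }
    ' "$1"
}

# Made-up chromosome of 2.5 Mb: yields windows of 1 Mb, 1 Mb and 0.5 Mb.
demo=$(mktemp -d)
printf 'chrTest\t2500000\n' > "$demo/chrom.sizes"
make_windows "$demo/chrom.sizes"
rm -rf "$demo"
```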

  • Using Galaxy to join, example GTF files with geneSymbol

There is no way to alter GTF output in the Table Browser, but Galaxy offers an extensive set of tools that work in conjunction with the Genome Browser and can help with data manipulations like this. Use the "Join two Queries side by side on a specified field" tool under the "Join, Subtract and Group" header on the left-hand side of the page, and perhaps the "Text Manipulation" tools.

  • Using Galaxy to join, example derive Ensembl name from GNF atlas data

Use Galaxy's "Get Data" to load the knownToGnf1m table and then the ensGene table. Once both tables are loaded into Galaxy, click on the "Join, Subtract and Group" link on the left-hand side of the page. Then click on "Join two Queries" and join the tables on column 1.