Details pages -- conventions

From Genecats


General

Details pages should be consistent from track to track. To ease the burden on QA, developers (and QA) are requested to:

  1. Keep lines in the source page to 100 characters or fewer (occasional exceptions are OK, such as for long links). The 100-character limit comes from the coding standards for the entire kent/src tree and is intended to improve code readability.
  2. Use lowercase HTML: <a>...</a> rather than upper-case <A>...</A>.
  3. Quotes, ampersands, less-than and greater-than signs, and degree symbols should be represented with their HTML entity names. For example, &quot; for ", &amp; for &, &lt; for <, &gt; for >, and &deg; for °.
  4. Email addresses should go through Hiram's sanitizer (encodeEmail.pl). It turns the address into an encoded "mailto:" href that is harder for spammers to harvest.
  5. A track description page's "Display Conventions and Configuration" section should cover all track types.
  6. For ENCODE track description pages, the page should include contact information for the submitting lab.
  7. Links to pages outside of the Genome Browser should open in a new window (target="_blank").
  8. Do not include punctuation in link names. For example: target="_blank">link name</a>. NOT target="_blank">link name.</a>
  9. Links to our own SOE pages should use ".soe." in the URL, not ".cse." E.g., http://hgdownload.soe.ucsc.edu/
  10. When referencing user interface settings, match the wording used in the UI and bold the text. For instance, the Wiggle Help page bolds "Draw indicator lines:" so that it looks the way it does on the UI page.
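
Some of the rules above lend themselves to an automated check before a page goes to QA. The following is a minimal Python sketch, not an existing kent tool: the function names are hypothetical, and treating every absolute URL whose host is outside ucsc.edu as "outside the Genome Browser" is an assumption.

```python
import re

MAX_LEN = 100  # line-length limit from the kent/src coding standards

# An opening <a ...> tag; [^>]* also matches newlines, so multi-line tags are found.
ANCHOR_RE = re.compile(r"<a\s[^>]*>", re.IGNORECASE)
HREF_RE = re.compile(r'href\s*=\s*"([^"]*)"', re.IGNORECASE)
TARGET_RE = re.compile(r'target\s*=\s*"?_blank"?', re.IGNORECASE)
HOST_RE = re.compile(r"https?://([^/:]+)", re.IGNORECASE)


def long_lines(html, max_len=MAX_LEN):
    """Return the 1-based numbers of lines longer than max_len characters."""
    return [n for n, line in enumerate(html.splitlines(), start=1) if len(line) > max_len]


def is_external(href):
    """Assumption: relative links and *.ucsc.edu hosts count as internal."""
    match = HOST_RE.match(href)
    if match is None:
        return False  # relative link, stays inside the Genome Browser
    host = match.group(1).lower()
    return not (host == "ucsc.edu" or host.endswith(".ucsc.edu"))


def external_links_missing_target(html):
    """Return the hrefs of external links that lack target="_blank"."""
    missing = []
    for tag in ANCHOR_RE.findall(html):
        href = HREF_RE.search(tag)
        if href and is_external(href.group(1)) and not TARGET_RE.search(tag):
            missing.append(href.group(1))
    return missing
```

Both functions only report findings; long links remain a permitted exception to the line-length rule, so the output still needs a human look.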

See the general style recommendations entry for similar related information.
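
The entity rule in the list above can be applied mechanically to plain text before it is pasted into a page. This is a small Python sketch with a hypothetical function name; it is meant for text content only, not for strings that already contain HTML tags or entities.

```python
import html


def escape_for_details_page(text):
    """Replace &, <, >, double quotes and degree signs with named HTML entities."""
    escaped = html.escape(text, quote=False)  # handles &, < and > (ampersand first)
    return escaped.replace('"', "&quot;").replace("\u00b0", "&deg;")
```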

Grammar/punctuation notes:

  1. Units should be separated from the numbers by a space: "200 bp", not "200bp".
  2. We are using data as a plural noun. "Data are" not "Data is."
  3. "It's" means "It is." "Its" is possessive.
  4. Use American spellings: analyze, minimize, color, etc.
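
The unit-spacing rule is easy to check with a regular expression. The sketch below is illustrative only; the function name and the list of units (bp, kb, Mb, Gb) are assumptions and can be extended.

```python
import re

# Digits directly followed by a unit, e.g. "200bp". The unit list is an assumption.
UNSPACED_UNIT_RE = re.compile(r"\b\d+(?:bp|kb|Mb|Gb)\b")


def unspaced_units(text):
    """Return every number-plus-unit token that is missing the separating space."""
    return UNSPACED_UNIT_RE.findall(text)
```

Names such as snp142 or hg38 are not reported, because the pattern requires the digits to start at a word boundary.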

References

  1. All references to websites not in our domain require the target="_blank" that will open a new tab for the reference.
  2. Papers are referenced with outlinks in one place at the bottom of the page only, not in the text above. In the text we simply say something like "Euskirchen, et al., 2007". Particularly if the paper is in preprint or Epub, we then only have to change the ref in one place.
  3. Refs are in alphabetical order by first author's last name.
  4. Refs are in PubMed format (see CBSE_citation_format). The script /cluster/bin/scripts/getTrackReferences generates this format from a provided PMID, which you can find by searching PubMed.
    1. Refs with many authors are truncated to 10 authors followed by "et al" (which should be italicized).
    2. Example run of the getTrackReferences script on a Browser paper with PMID 23155063. The resulting HTML between <p> ... </p> can be pasted into a track description page.
$ getTrackReferences 23155063
Accessing http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=Pubmed&id=23155063

<p>
Meyer LR, Zweig AS, Hinrichs AS, Karolchik D, Kuhn RM, Wong M, Sloan CA, Rosenbloom KR, Roe G, Rhead
B <em>et al</em>.
<a href="http://nar.oxfordjournals.org/cgi/pmidlookup?view=long&pmid=23155063" target="_blank">
The UCSC Genome Browser database: extensions and updates 2013</a>.
<em>Nucleic Acids Res</em>. 2013 Jan;41(Database issue):D64-9.
PMID: <a href="http://www.ncbi.nlm.nih.gov/pubmed/23155063" target="_blank">23155063</a>; PMC: <a
href="http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3531082/" target="_blank">PMC3531082</a>
</p>

Note for pubs.html: the default should be to use the expanded author list from verbose mode (getTrackReferences -v 23155063) so that our team is mentioned (the hope is that the papers on pubs.html are mostly by our team).
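
The author truncation rule can be sketched as follows. This is not the getTrackReferences implementation; the function name is hypothetical, and ending a short author list with a plain period is an assumption. Only the truncated form is modeled on the sample output above.

```python
def format_authors(authors, limit=10):
    """Join author names; truncate long lists to `limit` names plus an italic et al."""
    if len(authors) > limit:
        return ", ".join(authors[:limit]) + " <em>et al</em>."
    return ", ".join(authors) + "."
```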

Referring to BLAT

  • The tool = “BLAT” (if you can substitute the phrase “the BLAT tool”)
    • Thus, the target database of BLAT is not a set of GenBank sequences,...
    • ...note that BLAT results are limited to 16 results per chromosome strand...
  • The noun = “Blat”
    • Due to the high demand on our Blat servers…
    • Blat source and executables are freely available...
  • The verb = “blat”
    • When blatting on the command line...
    • (unless it starts a sentence) “Blat the sequence...”
  • The commandline = “blat”
    • blat -stepSize=5 -repMatch=2253 -minScore=0 -minIdentity=0 database.2bit query.fa output.psl
  • The hgBlat CGI = “*hgBlat*”
    • ...some small differences between the hgBlat/gfServer and...

Data Access Templates

The Data Access section should include links to the Table Browser, Data Integrator, REST API, and Variant Annotation Integrator (if applicable). Below are templates from a few tracks on hg38 and mm10 that can be altered to fit the track being created.

GENCODE GENES TEMPLATE: (example of pointing to the blog)

<h2>Data access</h2>
<p>
GENCODE Genes and its associated tables can be explored interactively using the
<a href="/goldenPath/help/api.html">REST API</a>, the
<a href="/cgi-bin/hgTables">Table Browser</a> or the
<a href="/cgi-bin/hgIntegrator">Data Integrator</a>. 
The genePred format files for hg38 are available from our 
<a href="https://hgdownload.soe.ucsc.edu/goldenPath/hg38/database/">downloads directory</a>
or in our
<a href="https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/genes/">
GTF download directory</a>. 
All the tables can also be queried directly from our public MySQL
servers, with more information available on our
<a href="/goldenPath/help/mysql.html">help page</a>
as well as on
<a href="http://genome.ucsc.edu/blog/tag/mysql/">our blog</a>.
</p>

NCBI RefSeq template: (example of archive server)

<p>
The raw data for these tracks can be accessed in multiple ways. They can be explored interactively
using the <a href="/goldenPath/help/api.html" target="_blank">REST API</a>,
<a href="../cgi-bin/hgTables" target="_blank">Table Browser</a> or
<a href="../cgi-bin/hgIntegrator"
target="_blank">Data Integrator</a>. The tables can also be accessed programmatically through our
<a href="../../goldenPath/help/mysql.html"
target="_blank">public MySQL server</a> or downloaded from our
<a href="https://hgdownload.soe.ucsc.edu/goldenPath/$db/database/"
target="_blank">downloads server</a> for local processing. The previous track versions are available
in the <a href="https://hgdownload.soe.ucsc.edu/goldenPath/archive/$db/ncbiRefSeq/"
target="_blank">archives</a> of our downloads server. You can also access any RefSeq table
entries in JSON format through our <a href="http://genome.ucsc.edu/goldenPath/help/api.html">
JSON API</a>.</p>

<p>
Previous versions of the ncbiRefSeq set of tracks can be found on our
<a href="https://hgdownload.soe.ucsc.edu/goldenPath/archive/$db/ncbiRefSeq">archive download
server</a>.
</p>
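
The $db placeholder in the template above stands for the assembly name. To preview a template for one assembly outside the browser, the placeholder can be filled locally. This is a minimal sketch with a hypothetical function name; it says nothing about how the Genome Browser itself resolves $db.

```python
from string import Template


def fill_assembly(template_html, db):
    """Substitute $db; any other $-sequences in the text are left untouched."""
    return Template(template_html).safe_substitute(db=db)
```

safe_substitute is used instead of substitute so that a stray dollar sign elsewhere in the page does not raise an error.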

dbSNP 142 template: (example of pointing to our mailing list)

<p>
The raw data can be explored interactively with the
<a href="/goldenPath/help/api.html" target="_blank">REST API</a>,
<a href="../../cgi-bin/hgTables" target="_blank">Table Browser</a>,
<a href="../../cgi-bin/hgIntegrator" target="_blank">Data Integrator</a>,
or <a href="../../cgi-bin/hgVai" target="_blank">Variant Annotation Integrator</a>.
For automated analysis, the genome annotation can be downloaded from the
<a href="https://hgdownload.soe.ucsc.edu/goldenPath/mm10/database/" target="_blank">
downloads server</a> (snp142*.txt.gz) or the
<a href="../goldenPath/help/mysql.html" target="_blank">public MySQL server</a>.
Please refer to our
<a href="https://groups.google.com/a/soe.ucsc.edu/forum/?hl=en&fromgroups#!search/download+snps"
target="_blank">mailing list archives</a> for questions and example queries, or our
<a href="../FAQ/FAQdownloads.html#download36" target="_blank">Data Access FAQ</a>
for more information.
</p>

ENCODE cCREs: (example of using the bigBedToBed tool)

<p>
The ENCODE accession numbers of the constituent datasets at the
<a target="_blank" href="https://encodeproject.org">ENCODE Portal</a>
are available from the cCRE details page.
</p>
<p>
The data in this track can be interactively explored with the
<a href="../cgi-bin/hgTables">Table Browser</a> or the
<a href="../cgi-bin/hgIntegrator">Data Integrator</a>.
The data can be accessed from scripts through our
<a href="https://api.genome.ucsc.edu">API</a>; the track name is "encodeCcreCombined".
</p>

<p>
For automated download and analysis, this annotation is stored in a bigBed file that
can be downloaded from <a href="https://hgdownload.soe.ucsc.edu/gbdb/$db/encode3/ccre/"
target="_blank">our download server</a>. The file for this track is called <tt>encodeCcreCombined.bb</tt>.
Individual regions or the whole genome annotation can be obtained using our tool
<tt>bigBedToBed</tt> which can be compiled from the source code or downloaded as a precompiled
binary for your system.
Instructions for downloading source code and binaries can be found
<a href="https://hgdownload.soe.ucsc.edu/downloads.html#utilities_downloads">here</a>.
The tool can also be used to obtain only features within a given range, e.g.<br><br>
<tt>bigBedToBed https://hgdownload.soe.ucsc.edu/gbdb/mm10/encode3/ccre/encodeCcreCombined.bb -chrom=chr1 -start=0 -end=100000000 stdout</tt></p>
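
For scripted use, the region query shown in the template can be assembled as an argument list. The sketch below uses a hypothetical helper name and only the options that appear in the template (-chrom, -start, -end); the chromosome and coordinates are placeholders.

```python
import shlex


def bigbedtobed_command(bigbed_url, chrom, start, end, output="stdout"):
    """Build the argument list for a region-restricted bigBedToBed call."""
    return [
        "bigBedToBed",
        bigbed_url,
        f"-chrom={chrom}",
        f"-start={start}",
        f"-end={end}",
        output,
    ]


cmd = bigbedtobed_command(
    "https://hgdownload.soe.ucsc.edu/gbdb/mm10/encode3/ccre/encodeCcreCombined.bb",
    "chr1", 0, 100000000,
)
print(shlex.join(cmd))  # shell-ready form of the command
# subprocess.run(cmd, check=True) would execute it if bigBedToBed is on the PATH.
```

Passing a list rather than a single string avoids shell quoting problems when the URL or file name contains special characters.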

Tabula Muris: (example of pointing to the /gbdb fileserver)

<p>
The merged BAM files, coverage bigWig files and splice junctions in bigBed format can be
downloaded from the <a href="https://hgdownload.soe.ucsc.edu/gbdb/mm10/tabulamuris/"
target="_blank">/gbdb fileserver</a>.</p>

<p>
Since the splice junction .bigBed files have their scores capped at 1000, the original
IntronProspector .bed files are available in the same
<a href="https://cells.ucsc.edu/hubs/tabulamuris/" target="_blank">track hub directory</a>. You can
also find there *.calls.tsv files with more details about each junction, e.g. the number of
uniquely mapping reads.</p>

OMIM Genes: (example of restricted access)

<p>
Because OMIM has only allowed data queries within individual chromosomes, no download files are
available from the Genome Browser. Full genome datasets can be downloaded directly from the
<a href="https://omim.org/downloads/" target="_blank">OMIM Downloads page</a>.
All genome-wide downloads are freely available from OMIM after registration.</p>
<p>
If you need the OMIM data in exactly the format of the UCSC Genome Browser,
for example if you are running a UCSC Genome Browser local installation (a partial "mirror"),
please create a user account on omim.org and contact OMIM via
<a href="https://omim.org/contact" target="_blank">https://omim.org/contact</a>. Send them your OMIM
account name and request access to the UCSC Genome Browser "entitlement". They will
then grant you access to a MySQL/MariaDB data dump that contains all UCSC
Genome Browser OMIM tables.</p>
<p>
UCSC offers queries within chromosomes from
<a href="hgTables" target="_blank">Table Browser</a> that include a variety
of filtering options and cross-referencing other datasets using our
<a href="hgIntegrator" target="_blank">Data Integrator</a> tool.
UCSC also has an <a href="../goldenPath/help/api.html" target="_blank">API</a>
that can be used to retrieve data in JSON format from a particular chromosome range.</p>
<p>
Please refer to our searchable
<a href="https://groups.google.com/a/soe.ucsc.edu/forum/?hl=en&fromgroups#!search/download+snps"
target="_blank">mailing list archives</a>
for more questions and example queries, or our
<a href="../FAQ/FAQdownloads.html#download36" target="_blank">Data Access FAQ</a>