Assembly Release QA Steps: Difference between revisions

From Genecats
Revision as of 16:34, 18 October 2016

Welcome to the Assembly Release: QA Guide 😀

Home: Assembly_Release_QA_Steps
  1. Assembly QA Part 1: DEV Steps
  2. Assembly QA Part 2: BETA Steps
  3. Assembly QA Part 3: RR Steps
  4. Assembly QA Part 4: Post Release Steps

Page created Fall 2016 by Cath, Jairo, and ChrisV.
This page is currently a draft in progress.
For now, use Releasing_an_assembly instead.


Introduction

When a developer is ready for a new assembly to be released, the QA team (usually one member of it) will QA and release the assembly. This wiki section is a guide to the assembly QA and release process.


Change happens

Collaboration rocks. Keep me updated!


For the UCSC Genome Browser QA Team, there are two types of genome assemblies:

  1. New species: an assembly for a species that does not yet have a browser.
  2. New version for an existing species: a new assembly version for a species that already has a browser.

When a new or updated assembly is ready for QA, the QA team should perform the steps outlined in this guide.


Do storks bring new assemblies in?

Although GenBank and RefSeq assemblies can be described as 'identical', that only means they contain the same sequence. The names for everything are different. For example, aptMan1 has contig names of the format NW_013982187v1, which is a RefSeq identifier.
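To make the naming point concrete, here is a minimal Python sketch that splits a contig name such as NW_013982187v1 into its accession and version. It assumes the trailing "v1" stands in for the ".1" version suffix of the underlying accession; the function names and the regular expression are illustrative and are not taken from any UCSC tool.

```python
import re

# Assumed shape of a browser contig name built from an accession:
# two uppercase letters, an underscore, digits, then "v" plus a version.
CONTIG_RE = re.compile(r"^([A-Z]{2}_\d+)v(\d+)$")


def split_contig_name(name):
    """Return (accession, version) for names like NW_013982187v1, else None."""
    match = CONTIG_RE.match(name)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def to_dotted_accession(name):
    """Rebuild the dotted form (e.g. NW_013982187.1), or None if no match."""
    parts = split_contig_name(name)
    if parts is None:
        return None
    accession, version = parts
    return f"{accession}.{version}"


if __name__ == "__main__":
    print(split_contig_name("NW_013982187v1"))
    print(to_dotted_accession("NW_013982187v1"))
```

A name that does not follow this pattern (for example, chr1) returns None, so the helper can be used to spot which sequences in an assembly carry accession-style names.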


RefSeq assemblies:
  use accession IDs beginning with GCF, such as GCF_000002315.4 (galGal5)
  are delivered with chrMt (if one exists)
  are delivered with NCBI gene predictions

GenBank assemblies:
  use accession IDs beginning with GCA, such as GCA_000001305.2
  are delivered without a chrMt
  do not have gene predictions
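
The accession prefix is enough to tell the two sources apart during QA. The sketch below is a hypothetical helper, not part of any existing QA script; it assumes accessions have the form prefix, underscore, nine digits, dot, version, as in the two examples above.

```python
import re

# Assumed accession layout: GCA or GCF, underscore, nine digits, ".", version.
ACCESSION_RE = re.compile(r"^(GCA|GCF)_(\d{9})\.(\d+)$")

# GCF marks RefSeq assemblies and GCA marks GenBank assemblies.
SOURCE_BY_PREFIX = {"GCF": "RefSeq", "GCA": "GenBank"}


def assembly_source(accession):
    """Return 'RefSeq' or 'GenBank' for an assembly accession.

    Raises ValueError if the string does not look like an assembly accession.
    """
    match = ACCESSION_RE.match(accession.strip())
    if match is None:
        raise ValueError(f"not an assembly accession: {accession!r}")
    return SOURCE_BY_PREFIX[match.group(1)]


if __name__ == "__main__":
    for acc in ("GCF_000002315.4", "GCA_000001305.2"):
        print(acc, assembly_source(acc))
```

Knowing the source up front tells the QA engineer what to expect from the list above: a RefSeq assembly should normally come with chrMt (if one exists) and NCBI gene predictions, while a GenBank assembly should not.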

For the UCSC Genome Browser, it is preferable to use RefSeq assemblies, in part because they come with more data (chrMt and gene predictions). This is a "learn as we go" direction; historically, GenBank was preferred.



🔵 Ready to get started? Let's go to Assembly QA Part 1: DEV Steps