KnownGene build: Difference between revisions

Revision as of 18:33, 23 August 2021

Build UniProt and Protein databases

I haven't been doing this recently. We need to look into whether the work Max has done with UniProt should replace this.

Initialize work directory

Create and cd into a work directory of the form /hive/data/genomes/$db/bed/gencode$GENCODE_VERSION/build.
Start a screen session.
Copy buildEnv.sh from the previous build on this db.
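
The steps above could be wrapped in a small shell helper. This is a minimal sketch, not part of the build scripts: the function name init_build_dir, the HIVE_ROOT override, and the example values in the comments (mm39, VM27, VM26) are assumptions for illustration.

```shell
# Sketch only. HIVE_ROOT and init_build_dir are hypothetical names;
# the directory layout follows the form given above.
HIVE_ROOT="${HIVE_ROOT:-/hive/data/genomes}"

# Usage: init_build_dir <db> <gencodeVersion> [previousBuildDir]
# Creates the work directory, copies buildEnv.sh from the previous build
# directory if one is given, and prints the new directory path.
init_build_dir() {
    local db="$1" gencodeVersion="$2" prevBuild="$3"
    local buildDir="$HIVE_ROOT/$db/bed/gencode$gencodeVersion/build"

    mkdir -p "$buildDir" || return 1
    if [ -n "$prevBuild" ]; then
        if [ ! -f "$prevBuild/buildEnv.sh" ]; then
            echo "no buildEnv.sh in $prevBuild" >&2
            return 1
        fi
        cp "$prevBuild/buildEnv.sh" "$buildDir/" || return 1
    fi
    echo "$buildDir"
}

# Example (values are illustrative):
#   cd "$(init_build_dir mm39 VM27 /hive/data/genomes/mm39/bed/gencodeVM26/build)"
#   screen -S knownGene
```

The copied buildEnv.sh still has to be edited by hand afterwards, as described in the next section.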

Setting environment variables

The environment variables used in the build are set in the script buildEnv.sh. All the other scripts assume that this script has been sourced in the current shell. You have to edit buildEnv.sh by hand.
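
Because the other scripts assume buildEnv.sh has already been sourced, a quick check after sourcing can catch a missed edit. The sketch below is an assumption-laden illustration: check_build_env is a hypothetical helper, and the variable list contains only the names that appear on this page ($db, $GENCODE_VERSION, $curVer); the real buildEnv.sh may set others.

```shell
# Sketch only: check_build_env is a hypothetical helper. The variable list
# comes from names used on this page; the real buildEnv.sh may set others.
check_build_env() {
    local name value missing=0
    for name in db GENCODE_VERSION curVer; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "buildEnv.sh: $name is not set" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Typical use, from the build directory:
#   . ./buildEnv.sh && check_build_env
```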

Running the build

To run the build, execute hg/utils/otto/knownGene/buildKnown.sh. It does the following steps:

Extracting Gencode data
Building initial knownGene table
Adding primary reference tables
Building final knownGene core tables
Building bigGenePred
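
One way to run the steps listed above and keep a record of the output is sketched here. KENT_SRC (the location of the kent source checkout), the function name run_known_build, and the log file name are assumptions, not part of the documented process.

```shell
# Sketch only: KENT_SRC, run_known_build, and the log file name are assumptions.
KENT_SRC="${KENT_SRC:-$HOME/kent/src}"

# Usage: run_known_build [logFile]
# Sources buildEnv.sh from the current directory, runs buildKnown.sh, and
# records stdout and stderr in the log file. Returns the script's exit status.
run_known_build() {
    local logFile="${1:-buildKnown.log}"
    if [ ! -f ./buildEnv.sh ]; then
        echo "buildEnv.sh not found in $PWD" >&2
        return 1
    fi
    # The build scripts assume buildEnv.sh was sourced in the current shell.
    . ./buildEnv.sh || return 1
    "$KENT_SRC/hg/utils/otto/knownGene/buildKnown.sh" > "$logFile" 2>&1
}

# Example, from the build directory inside the screen session:
#   run_known_build || tail buildKnown.log
```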

Adding trackDb entry

Adding IsPcr server

After buildCore.sh, which runs at the beginning of the process, has built /gbdb/$db/targetDb/${db}KgSeq${curVer}.2bit, ask cluster-admin to start an untranslated gfServer with -stepSize=5 on that file. For example:

 to cluster-admin

 Hey my friends,
 
 Could you please start an untranslated -stepSize=5 production gfserver
 with this 2bit file?
 
 hgwdev:/gbdb/mm39/targetDb/mm39KgSeq13.2bit
 
 thanks!
 brian

On hgwdev, insert new records into blatServers and targetDb, using the host (field 2) and port (field 3) specified by cluster-admin. Identify the blatServer by the name ${db}KgSeq${curVer}, for example mm39KgSeq13.

cluster-admin will say something like this:

 Starting untrans gfServer for mm39KgSeq13 on host blat1c port 17921
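
The name, host, and port can be pulled out of a reply like the one above with sed. This is a sketch under the assumption that the reply follows the example wording; parse_gfserver_msg is a hypothetical helper, and real replies may be phrased differently.

```shell
# Sketch only: parse_gfserver_msg is a hypothetical helper, and the pattern
# matches only the example wording shown above.
# Prints "<name> <host> <port>", or fails if the pattern is not found.
parse_gfserver_msg() {
    printf '%s\n' "$1" | sed -n \
        's/.*gfServer for \([^ ]*\) on host \([^ ]*\) port \([0-9][0-9]*\).*/\1 \2 \3/p' |
        grep .
}

# Example: split the result into positional parameters.
#   set -- $(parse_gfserver_msg "$msg")
#   name=$1 host=$2 port=$3
```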

Add this info to the blatServers and targetDb tables in hgcentraltest:

  hgsql hgcentraltest -e \
     'INSERT INTO blatServers VALUES ("mm39KgSeq13", "blat1c", 17921, 0, 1, "");'
  hgsql hgcentraltest -e \
     'INSERT INTO targetDb VALUES ("mm39KgSeq13", "GENCODE Genes",
          "mm39", "kgTargetAli", "", "",
          "/gbdb/mm39/targetDb/mm39KgSeq13.2bit", 1, now(), "");'

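To avoid retyping the name in several places, the two INSERT statements can be generated from the database, version, host, and port. This is a minimal sketch: kg_blat_sql is a hypothetical helper, and the fixed column values are copied from the mm39 example above rather than from a table definition, so check them against the existing rows before use.

```shell
# Sketch only: kg_blat_sql is a hypothetical helper. The fixed column values
# mirror the mm39 example above; only db, version, host, and port vary.
# Usage: kg_blat_sql <db> <curVer> <host> <port>
kg_blat_sql() {
    local db="$1" curVer="$2" host="$3" port="$4"
    local name="${db}KgSeq${curVer}"
    cat <<EOF
INSERT INTO blatServers VALUES ("$name", "$host", $port, 0, 1, "");
INSERT INTO targetDb VALUES ("$name", "GENCODE Genes", "$db", "kgTargetAli", "", "", "/gbdb/$db/targetDb/$name.2bit", 1, now(), "");
EOF
}

# Review the output first, then pipe it to hgsql on hgwdev:
#   kg_blat_sql "$db" "$curVer" "$host" "$port"
#   kg_blat_sql "$db" "$curVer" "$host" "$port" | hgsql hgcentraltest
```

Printing the statements before running them makes it easy to confirm that the host and port match what cluster-admin reported.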
@@ Line 1: / Line 1: @@
 == Build UniProt and Protein databases ==
+I haven't been doing this recently.  We need to look into whether the work Max has done with uniprot should replace this.
@@ Line 11: / Line 13: @@
 == Setting environment variables ==
-The environment variables used in the build are set in the script buildEnv.sh.   All the other scripts assume that this script has been sourced in the current shell.
+The environment variables used in the build are set in the script buildEnv.sh.   All the other scripts assume that this script has been sourced in the current shell.   You have to edit this by hand.
+== Running the build ==
+To run the build execute hg/utils/otto/knownGene/buildKnown.sh.   It builds into It does the following steps:
+* Extracting Gencode data
+* Building initial knownGene table
+* Adding primary reference tables
+* Building final knownGene core tables
+* Building bigGenePred
-== Extracting Gencode data ==
-== Building initial knownGene table ==
-== Adding primary reference tables ==
-== Building final knownGene core tables ==
-== Building bigGenePred ==
 == Adding trackDb entry ==
 == Adding IsPcr server ==
