HowTo: Syntenic Net or Reciprocal Best

From genomewiki
Jump to: navigation, search

This page is meant to be a general "How To" for performing a syntenic net, a reciprocal best, or a chain swap procedure.

You may be interested in these other pages for more information on LiftOver or creating your own Whole Genome Alignments:

Syntenic Net

Below is a script used to create the hg38 to oviAri3 syntenic net file. If you would like to create a syntenic net file for your pairwise alignment, you can use the script as a template. You will need to change file paths and file names to match your input.

#!/bin/csh -efx
# This script was automatically generated by /cluster/bin/scripts/doBlastzChainNet.pl
# from /hive/data/genomes/hg38/bed/lastzOviAri3.2015-01-21/DEF
# It is to be executed on hgwdev in /hive/data/genomes/hg38/bed/lastzOviAri3.2015-01-21/axtChain .
# It filters the net for synteny and creates syntenic net MAF files for
# multiz. Use this option when the query genome is high-coverage and not
# too distant from the reference.  Suppressed unless -syntenicNet is included.
# This script will fail if any of its commands fail.

cd /hive/data/genomes/hg38/bed/lastzOviAri3.2015-01-21/axtChain

netFilter -syn hg38.oviAri3.net.gz | gzip -c > hg38.oviAri3.syn.net.gz
netToAxt hg38.oviAri3.syn.net.gz hg38.oviAri3.all.chain.gz \
    /hive/data/genomes/hg38/hg38.2bit /hive/data/genomes/oviAri3/oviAri3.2bit stdout \
  | axtSort stdin stdout \
  | axtToMaf -tPrefix=hg38. -qPrefix=oviAri3. stdin \
    /hive/data/genomes/hg38/chrom.sizes /hive/data/genomes/oviAri3/chrom.sizes \
    stdout \
| gzip -c > hg38.oviAri3.synNet.maf.gz
md5sum hg38.oviAri3.syn.net.gz hg38.oviAri3.synNet.maf.gz > synNet.md5sum.txt
mkdir -p /usr/local/apache/htdocs-hgdownload/goldenPath/hg38/vsOviAri3
cd /usr/local/apache/htdocs-hgdownload/goldenPath/hg38/vsOviAri3
ln -s /hive/data/genomes/hg38/bed/lastzOviAri3.2015-01-21/axtChain/hg38.oviAri3.synNet.maf.gz .
cat /hive/data/genomes/hg38/bed/lastzOviAri3.2015-01-21/axtChain/synNet.md5sum.txt >> md5sum.txt
sort -u md5sum.txt > tmp.sum
cat tmp.sum > md5sum.txt
rm -f tmp.sum

Reciprocal Best

Below is a script used to create the hg38 to oviAri3 reciprocal best file. If you would like to create a reciprocal best file for your pairwise alignment, you can use the script as a template. You will need to change file paths and file names to match your input. You can also see an example in the mailing list archives involving doRecipBest.pl involving hg19 & mm10 where the resulting files were placed here.

#!/bin/csh -efx
# This script was automatically generated by /cluster/bin/scripts/doRecipBest.pl
# It is to be executed on ku in /hive/data/genomes/hg38/bed/lastzOviAri3.2015-01-21/axtChain .
# It nets in both directions to get reciprocal best chains and nets.
# This script will fail if any of its commands fail.

cd /hive/data/genomes/hg38/bed/lastzOviAri3.2015-01-21/axtChain

# Swap hg38-best chains to be oviAri3-referenced:
chainStitchId hg38.oviAri3.over.chain.gz stdout \
| chainSwap stdin stdout \
| chainSort stdin oviAri3.hg38.tBest.chain

# Net those on oviAri3 to get oviAri3-ref'd reciprocal best net:
chainPreNet oviAri3.hg38.tBest.chain \
  /hive/data/genomes/{oviAri3,hg38}/chrom.sizes stdout \
| chainNet -minSpace=1 -minScore=0 \
  stdin /hive/data/genomes/{oviAri3,hg38}/chrom.sizes stdout /dev/null \
| netSyntenic stdin stdout \
| gzip -c > oviAri3.hg38.rbest.net.gz

# Extract oviAri3-ref'd reciprocal best chain:
netChainSubset oviAri3.hg38.rbest.net.gz oviAri3.hg38.tBest.chain stdout \
| chainStitchId stdin stdout \
| gzip -c > oviAri3.hg38.rbest.chain.gz

# Swap to get hg38-ref'd reciprocal best chain:
chainSwap oviAri3.hg38.rbest.chain.gz stdout \
| chainSort stdin stdout \
| gzip -c > hg38.oviAri3.rbest.chain.gz

# Net those on hg38 to get hg38-ref'd reciprocal best net:
chainPreNet hg38.oviAri3.rbest.chain.gz \
  /hive/data/genomes/{hg38,oviAri3}/chrom.sizes stdout \
| chainNet -minSpace=1 -minScore=0 \
  stdin /hive/data/genomes/{hg38,oviAri3}/chrom.sizes stdout /dev/null \
| netSyntenic stdin stdout \
| gzip -c > hg38.oviAri3.rbest.net.gz

# Clean up the one temp file and make md5sum:
rm oviAri3.hg38.tBest.chain
md5sum *.rbest.*.gz > md5sum.rbest.txt

# Create files for testing coverage of *.rbest.*.
netToBed -maxGap=1 oviAri3.hg38.rbest.net.gz oviAri3.hg38.rbest.net.bed
netToBed -maxGap=1 hg38.oviAri3.rbest.net.gz hg38.oviAri3.rbest.net.bed
chainToPsl oviAri3.hg38.rbest.chain.gz \
  /hive/data/genomes/{oviAri3,hg38}/chrom.sizes \
  /hive/data/genomes/oviAri3/oviAri3.2bit /hive/data/genomes/hg38/hg38.2bit \
  oviAri3.hg38.rbest.chain.psl
chainToPsl hg38.oviAri3.rbest.chain.gz \
  /hive/data/genomes/{hg38,oviAri3}/chrom.sizes \
  /hive/data/genomes/hg38/hg38.2bit /hive/data/genomes/oviAri3/oviAri3.2bit \
  hg38.oviAri3.rbest.chain.psl

# Verify that all coverage figures are equal:
set tChCov = `awk '{print $19;}' hg38.oviAri3.rbest.chain.psl | sed -e 's/,/\n/g' | awk 'BEGIN {N = 0;} {N += $1;} END {printf "%d\n", N;}'`
set qChCov = `awk '{print $19;}' oviAri3.hg38.rbest.chain.psl | sed -e 's/,/\n/g' | awk 'BEGIN {N = 0;} {N += $1;} END {printf "%d\n", N;}'`
set tNetCov = `awk 'BEGIN {N = 0;} {N += ($3 - $2);} END {printf "%d\n", N;}' hg38.oviAri3.rbest.net.bed`
set qNetCov = `awk 'BEGIN {N = 0;} {N += ($3 - $2);} END {printf "%d\n", N;}' oviAri3.hg38.rbest.net.bed`
if ($tChCov != $qChCov) then
  echo "Warning: hg38 rbest chain coverage $tChCov != oviAri3 $qChCov"
endif
if ($tNetCov != $qNetCov) then
  echo "Warning: hg38 rbest net coverage $tNetCov != oviAri3 $qNetCov"
endif
if ($tChCov != $tNetCov) then
  echo "Warning: hg38 rbest chain coverage $tChCov != net cov $tNetCov"
endif

mkdir experiments
mv *.bed *.psl experiments
# Make rbest net axt's download
mkdir ../axtRBestNet
netToAxt hg38.oviAri3.rbest.net.gz hg38.oviAri3.rbest.chain.gz \
    /hive/data/genomes/hg38/hg38.2bit /hive/data/genomes/oviAri3/oviAri3.2bit stdout \
    | axtSort stdin stdout \
    | gzip -c > ../axtRBestNet/hg38.oviAri3.rbest.axt.gz
# Make rbest mafNet for multiz
mkdir ../mafRBestNet
axtToMaf -tPrefix=hg38. -qPrefix=oviAri3. ../axtRBestNet/hg38.oviAri3.rbest.axt.gz \
        /hive/data/genomes/{hg38,oviAri3}/chrom.sizes  \
                stdout \
      | gzip -c > ../mafRBestNet/hg38.oviAri3.rbest.maf.gz
cd ../mafRBestNet
md5sum *.maf.gz > md5sum.txt
cd ../axtRBestNet
md5sum *.axt.gz > md5sum.txt

Chain Swap

Here is a sample command that will swap an existing chain alignment:

chainSwap hg38.oviAri3.all.chain.gz stdout | nice chainSort stdin stdout \ | nice gzip -c > oviAri3.hg38.all.chain.gz