HowTo: Syntenic Net or Reciprocal Best

From genomewiki
Jump to navigationJump to search
The printable version is no longer supported and may have rendering errors. Please update your browser bookmarks and please use the default browser print function instead.

This page is meant to be a general "How To" for performing a syntenic net, a reciprocal best, or a chain swap procedure.

You may be interested in these other pages for more information on LiftOver or creating your own Whole Genome Alignments:

Syntenic Net

Below is a script used to create the hg38 to oviAri3 syntenic net file. If you would like to create a syntenic net file for your pairwise alignment, you can use the script as a template. You will need to change file paths and file names to match your input.

#!/bin/csh -efx
# This script was automatically generated by /cluster/bin/scripts/doBlastzChainNet.pl
# from /hive/data/genomes/hg38/bed/lastzOviAri3.2015-01-21/DEF
# It is to be executed on hgwdev in /hive/data/genomes/hg38/bed/lastzOviAri3.2015-01-21/axtChain .
# It filters the net for synteny and creates syntenic net MAF files for
# multiz. Use this option when the query genome is high-coverage and not
# too distant from the reference.  Suppressed unless -syntenicNet is included.
# This script will fail if any of its commands fail.

cd /hive/data/genomes/hg38/bed/lastzOviAri3.2015-01-21/axtChain

netFilter -syn hg38.oviAri3.net.gz | gzip -c > hg38.oviAri3.syn.net.gz
netToAxt hg38.oviAri3.syn.net.gz hg38.oviAri3.all.chain.gz \
    /hive/data/genomes/hg38/hg38.2bit /hive/data/genomes/oviAri3/oviAri3.2bit stdout \
  | axtSort stdin stdout \
  | axtToMaf -tPrefix=hg38. -qPrefix=oviAri3. stdin \
    /hive/data/genomes/hg38/chrom.sizes /hive/data/genomes/oviAri3/chrom.sizes \
    stdout \
| gzip -c > hg38.oviAri3.synNet.maf.gz
md5sum hg38.oviAri3.syn.net.gz hg38.oviAri3.synNet.maf.gz > synNet.md5sum.txt
mkdir -p /usr/local/apache/htdocs-hgdownload/goldenPath/hg38/vsOviAri3
cd /usr/local/apache/htdocs-hgdownload/goldenPath/hg38/vsOviAri3
ln -s /hive/data/genomes/hg38/bed/lastzOviAri3.2015-01-21/axtChain/hg38.oviAri3.synNet.maf.gz .
cat /hive/data/genomes/hg38/bed/lastzOviAri3.2015-01-21/axtChain/synNet.md5sum.txt >> md5sum.txt
sort -u md5sum.txt > tmp.sum
cat tmp.sum > md5sum.txt
rm -f tmp.sum

Reciprocal Best

Below is a script used to create the hg38 to oviAri3 reciprocal best file. If you would like to create a reciprocal best file for your pairwise alignment, you can use the script as a template. You will need to change file paths and file names to match your input. You can also see an example in the mailing list archives involving doRecipBest.pl involving hg19 & mm10 where the resulting files were placed here.

#!/bin/csh -efx
# This script was automatically generated by /cluster/bin/scripts/doRecipBest.pl
# It is to be executed on ku in /hive/data/genomes/hg38/bed/lastzOviAri3.2015-01-21/axtChain .
# It nets in both directions to get reciprocal best chains and nets.
# This script will fail if any of its commands fail.

cd /hive/data/genomes/hg38/bed/lastzOviAri3.2015-01-21/axtChain

# Swap hg38-best chains to be oviAri3-referenced:
chainStitchId hg38.oviAri3.over.chain.gz stdout \
| chainSwap stdin stdout \
| chainSort stdin oviAri3.hg38.tBest.chain

# Net those on oviAri3 to get oviAri3-ref'd reciprocal best net:
chainPreNet oviAri3.hg38.tBest.chain \
  /hive/data/genomes/{oviAri3,hg38}/chrom.sizes stdout \
| chainNet -minSpace=1 -minScore=0 \
  stdin /hive/data/genomes/{oviAri3,hg38}/chrom.sizes stdout /dev/null \
| netSyntenic stdin stdout \
| gzip -c > oviAri3.hg38.rbest.net.gz

# Extract oviAri3-ref'd reciprocal best chain:
netChainSubset oviAri3.hg38.rbest.net.gz oviAri3.hg38.tBest.chain stdout \
| chainStitchId stdin stdout \
| gzip -c > oviAri3.hg38.rbest.chain.gz

# Swap to get hg38-ref'd reciprocal best chain:
chainSwap oviAri3.hg38.rbest.chain.gz stdout \
| chainSort stdin stdout \
| gzip -c > hg38.oviAri3.rbest.chain.gz

# Net those on hg38 to get hg38-ref'd reciprocal best net:
chainPreNet hg38.oviAri3.rbest.chain.gz \
  /hive/data/genomes/{hg38,oviAri3}/chrom.sizes stdout \
| chainNet -minSpace=1 -minScore=0 \
  stdin /hive/data/genomes/{hg38,oviAri3}/chrom.sizes stdout /dev/null \
| netSyntenic stdin stdout \
| gzip -c > hg38.oviAri3.rbest.net.gz

# Clean up the one temp file and make md5sum:
rm oviAri3.hg38.tBest.chain
md5sum *.rbest.*.gz > md5sum.rbest.txt

# Create files for testing coverage of *.rbest.*.
netToBed -maxGap=1 oviAri3.hg38.rbest.net.gz oviAri3.hg38.rbest.net.bed
netToBed -maxGap=1 hg38.oviAri3.rbest.net.gz hg38.oviAri3.rbest.net.bed
chainToPsl oviAri3.hg38.rbest.chain.gz \
  /hive/data/genomes/{oviAri3,hg38}/chrom.sizes \
  /hive/data/genomes/oviAri3/oviAri3.2bit /hive/data/genomes/hg38/hg38.2bit \
  oviAri3.hg38.rbest.chain.psl
chainToPsl hg38.oviAri3.rbest.chain.gz \
  /hive/data/genomes/{hg38,oviAri3}/chrom.sizes \
  /hive/data/genomes/hg38/hg38.2bit /hive/data/genomes/oviAri3/oviAri3.2bit \
  hg38.oviAri3.rbest.chain.psl

# Verify that all coverage figures are equal:
set tChCov = `awk '{print $19;}' hg38.oviAri3.rbest.chain.psl | sed -e 's/,/\n/g' | awk 'BEGIN {N = 0;} {N += $1;} END {printf "%d\n", N;}'`
set qChCov = `awk '{print $19;}' oviAri3.hg38.rbest.chain.psl | sed -e 's/,/\n/g' | awk 'BEGIN {N = 0;} {N += $1;} END {printf "%d\n", N;}'`
set tNetCov = `awk 'BEGIN {N = 0;} {N += ($3 - $2);} END {printf "%d\n", N;}' hg38.oviAri3.rbest.net.bed`
set qNetCov = `awk 'BEGIN {N = 0;} {N += ($3 - $2);} END {printf "%d\n", N;}' oviAri3.hg38.rbest.net.bed`
if ($tChCov != $qChCov) then
  echo "Warning: hg38 rbest chain coverage $tChCov != oviAri3 $qChCov"
endif
if ($tNetCov != $qNetCov) then
  echo "Warning: hg38 rbest net coverage $tNetCov != oviAri3 $qNetCov"
endif
if ($tChCov != $tNetCov) then
  echo "Warning: hg38 rbest chain coverage $tChCov != net cov $tNetCov"
endif

mkdir experiments
mv *.bed *.psl experiments
# Make rbest net axt's download
mkdir ../axtRBestNet
netToAxt hg38.oviAri3.rbest.net.gz hg38.oviAri3.rbest.chain.gz \
    /hive/data/genomes/hg38/hg38.2bit /hive/data/genomes/oviAri3/oviAri3.2bit stdout \
    | axtSort stdin stdout \
    | gzip -c > ../axtRBestNet/hg38.oviAri3.rbest.axt.gz
# Make rbest mafNet for multiz
mkdir ../mafRBestNet
axtToMaf -tPrefix=hg38. -qPrefix=oviAri3. ../axtRBestNet/hg38.oviAri3.rbest.axt.gz \
        /hive/data/genomes/{hg38,oviAri3}/chrom.sizes  \
                stdout \
      | gzip -c > ../mafRBestNet/hg38.oviAri3.rbest.maf.gz
cd ../mafRBestNet
md5sum *.maf.gz > md5sum.txt
cd ../axtRBestNet
md5sum *.axt.gz > md5sum.txt

Chain Swap

Here is a sample command that will swap an existing chain alignment:

chainSwap hg38.oviAri3.all.chain.gz stdout | nice chainSort stdin stdout \ | nice gzip -c > oviAri3.hg38.all.chain.gz