HowTo: Syntenic Net or Reciprocal Best: Difference between revisions
Matt Speir (talk | contribs) (Created page with "This page is meant to be a general "How To" for performing a syntenic net, a reciprocal best, or a chain swap procedure. You may be interested in these other pages for more info...") |
(→Reciprocal Best: adding link to MLQ and goldenPath/hg19/vsMm10/reciprocalBest/ as more references.) |
||
Line 42: | Line 42: | ||
== Reciprocal Best == | == Reciprocal Best == | ||
Below is a script used to create the hg38 to oviAri3 reciprocal best file. If you would like to create a reciprocal best file for your pairwise alignment, you can use the script as a template. You will need to change file paths and file names to match your input. | Below is a script used to create the hg38 to oviAri3 reciprocal best file. If you would like to create a reciprocal best file for your pairwise alignment, you can use the script as a template. You will need to change file paths and file names to match your input. You can also see an example in the mailing list archives involving [https://groups.google.com/a/soe.ucsc.edu/d/msg/genome/bYaydHLNUb0/DovSn8gs6j8J doRecipBest.pl involving hg19 & mm10] where the resulting files were placed [http://hgdownload-test.cse.ucsc.edu/goldenPath/hg19/vsMm10/reciprocalBest/ here]. | ||
<pre> | <pre> |
Latest revision as of 19:00, 12 January 2016
This page is meant to be a general "How To" for performing a syntenic net, a reciprocal best, or a chain swap procedure.
You may be interested in these other pages for more information on LiftOver or creating your own Whole Genome Alignments:
Syntenic Net
Below is a script used to create the hg38 to oviAri3 syntenic net file. If you would like to create a syntenic net file for your pairwise alignment, you can use the script as a template. You will need to change file paths and file names to match your input.
#!/bin/csh -efx # This script was automatically generated by /cluster/bin/scripts/doBlastzChainNet.pl # from /hive/data/genomes/hg38/bed/lastzOviAri3.2015-01-21/DEF # It is to be executed on hgwdev in /hive/data/genomes/hg38/bed/lastzOviAri3.2015-01-21/axtChain . # It filters the net for synteny and creates syntenic net MAF files for # multiz. Use this option when the query genome is high-coverage and not # too distant from the reference. Suppressed unless -syntenicNet is included. # This script will fail if any of its commands fail. cd /hive/data/genomes/hg38/bed/lastzOviAri3.2015-01-21/axtChain netFilter -syn hg38.oviAri3.net.gz | gzip -c > hg38.oviAri3.syn.net.gz netToAxt hg38.oviAri3.syn.net.gz hg38.oviAri3.all.chain.gz \ /hive/data/genomes/hg38/hg38.2bit /hive/data/genomes/oviAri3/oviAri3.2bit stdout \ | axtSort stdin stdout \ | axtToMaf -tPrefix=hg38. -qPrefix=oviAri3. stdin \ /hive/data/genomes/hg38/chrom.sizes /hive/data/genomes/oviAri3/chrom.sizes \ stdout \ | gzip -c > hg38.oviAri3.synNet.maf.gz md5sum hg38.oviAri3.syn.net.gz hg38.oviAri3.synNet.maf.gz > synNet.md5sum.txt mkdir -p /usr/local/apache/htdocs-hgdownload/goldenPath/hg38/vsOviAri3 cd /usr/local/apache/htdocs-hgdownload/goldenPath/hg38/vsOviAri3 ln -s /hive/data/genomes/hg38/bed/lastzOviAri3.2015-01-21/axtChain/hg38.oviAri3.synNet.maf.gz . cat /hive/data/genomes/hg38/bed/lastzOviAri3.2015-01-21/axtChain/synNet.md5sum.txt >> md5sum.txt sort -u md5sum.txt > tmp.sum cat tmp.sum > md5sum.txt rm -f tmp.sum
Reciprocal Best
Below is a script used to create the hg38 to oviAri3 reciprocal best file. If you would like to create a reciprocal best file for your pairwise alignment, you can use the script as a template. You will need to change file paths and file names to match your input. You can also see an example in the mailing list archives involving doRecipBest.pl involving hg19 & mm10 where the resulting files were placed here.
#!/bin/csh -efx # This script was automatically generated by /cluster/bin/scripts/doRecipBest.pl # It is to be executed on ku in /hive/data/genomes/hg38/bed/lastzOviAri3.2015-01-21/axtChain . # It nets in both directions to get reciprocal best chains and nets. # This script will fail if any of its commands fail. cd /hive/data/genomes/hg38/bed/lastzOviAri3.2015-01-21/axtChain # Swap hg38-best chains to be oviAri3-referenced: chainStitchId hg38.oviAri3.over.chain.gz stdout \ | chainSwap stdin stdout \ | chainSort stdin oviAri3.hg38.tBest.chain # Net those on oviAri3 to get oviAri3-ref'd reciprocal best net: chainPreNet oviAri3.hg38.tBest.chain \ /hive/data/genomes/{oviAri3,hg38}/chrom.sizes stdout \ | chainNet -minSpace=1 -minScore=0 \ stdin /hive/data/genomes/{oviAri3,hg38}/chrom.sizes stdout /dev/null \ | netSyntenic stdin stdout \ | gzip -c > oviAri3.hg38.rbest.net.gz # Extract oviAri3-ref'd reciprocal best chain: netChainSubset oviAri3.hg38.rbest.net.gz oviAri3.hg38.tBest.chain stdout \ | chainStitchId stdin stdout \ | gzip -c > oviAri3.hg38.rbest.chain.gz # Swap to get hg38-ref'd reciprocal best chain: chainSwap oviAri3.hg38.rbest.chain.gz stdout \ | chainSort stdin stdout \ | gzip -c > hg38.oviAri3.rbest.chain.gz # Net those on hg38 to get hg38-ref'd reciprocal best net: chainPreNet hg38.oviAri3.rbest.chain.gz \ /hive/data/genomes/{hg38,oviAri3}/chrom.sizes stdout \ | chainNet -minSpace=1 -minScore=0 \ stdin /hive/data/genomes/{hg38,oviAri3}/chrom.sizes stdout /dev/null \ | netSyntenic stdin stdout \ | gzip -c > hg38.oviAri3.rbest.net.gz # Clean up the one temp file and make md5sum: rm oviAri3.hg38.tBest.chain md5sum *.rbest.*.gz > md5sum.rbest.txt # Create files for testing coverage of *.rbest.*. netToBed -maxGap=1 oviAri3.hg38.rbest.net.gz oviAri3.hg38.rbest.net.bed netToBed -maxGap=1 hg38.oviAri3.rbest.net.gz hg38.oviAri3.rbest.net.bed chainToPsl oviAri3.hg38.rbest.chain.gz \ /hive/data/genomes/{oviAri3,hg38}/chrom.sizes \ /hive/data/genomes/oviAri3/oviAri3.2bit /hive/data/genomes/hg38/hg38.2bit \ oviAri3.hg38.rbest.chain.psl chainToPsl hg38.oviAri3.rbest.chain.gz \ /hive/data/genomes/{hg38,oviAri3}/chrom.sizes \ /hive/data/genomes/hg38/hg38.2bit /hive/data/genomes/oviAri3/oviAri3.2bit \ hg38.oviAri3.rbest.chain.psl # Verify that all coverage figures are equal: set tChCov = `awk '{print $19;}' hg38.oviAri3.rbest.chain.psl | sed -e 's/,/\n/g' | awk 'BEGIN {N = 0;} {N += $1;} END {printf "%d\n", N;}'` set qChCov = `awk '{print $19;}' oviAri3.hg38.rbest.chain.psl | sed -e 's/,/\n/g' | awk 'BEGIN {N = 0;} {N += $1;} END {printf "%d\n", N;}'` set tNetCov = `awk 'BEGIN {N = 0;} {N += ($3 - $2);} END {printf "%d\n", N;}' hg38.oviAri3.rbest.net.bed` set qNetCov = `awk 'BEGIN {N = 0;} {N += ($3 - $2);} END {printf "%d\n", N;}' oviAri3.hg38.rbest.net.bed` if ($tChCov != $qChCov) then echo "Warning: hg38 rbest chain coverage $tChCov != oviAri3 $qChCov" endif if ($tNetCov != $qNetCov) then echo "Warning: hg38 rbest net coverage $tNetCov != oviAri3 $qNetCov" endif if ($tChCov != $tNetCov) then echo "Warning: hg38 rbest chain coverage $tChCov != net cov $tNetCov" endif mkdir experiments mv *.bed *.psl experiments # Make rbest net axt's download mkdir ../axtRBestNet netToAxt hg38.oviAri3.rbest.net.gz hg38.oviAri3.rbest.chain.gz \ /hive/data/genomes/hg38/hg38.2bit /hive/data/genomes/oviAri3/oviAri3.2bit stdout \ | axtSort stdin stdout \ | gzip -c > ../axtRBestNet/hg38.oviAri3.rbest.axt.gz # Make rbest mafNet for multiz mkdir ../mafRBestNet axtToMaf -tPrefix=hg38. -qPrefix=oviAri3. ../axtRBestNet/hg38.oviAri3.rbest.axt.gz \ /hive/data/genomes/{hg38,oviAri3}/chrom.sizes \ stdout \ | gzip -c > ../mafRBestNet/hg38.oviAri3.rbest.maf.gz cd ../mafRBestNet md5sum *.maf.gz > md5sum.txt cd ../axtRBestNet md5sum *.axt.gz > md5sum.txt
Chain Swap
Here is a sample command that will swap an existing chain alignment:
chainSwap hg38.oviAri3.all.chain.gz stdout | nice chainSort stdin stdout \ | nice gzip -c > oviAri3.hg38.all.chain.gz