KNETFILE HOOKS
Latest revision as of 08:01, 1 September 2018
New and improved!
As of code release v275 (end of October 2012), the patching described below is no longer necessary! Instead of patching samtools and tabix, just download and build UCSC's samtabix package as follows:
    git clone http://genome-source.soe.ucsc.edu/samtabix.git samtabix
    cd samtabix
    make
and set two environment variables, SAMTABIXDIR and USE_SAMTABIX=1. See Build Environment Variables for more details.
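As a minimal sketch of that environment setup, assuming a POSIX shell and that the commands are run from the directory containing the samtabix checkout (the path is an assumption for illustration, adjust it to your layout):

```shell
# Assumption: ./samtabix is the checkout created by the git clone above.
SAMTABIXDIR="$(pwd)/samtabix"
USE_SAMTABIX=1
# Export both so that a later make of the UCSC source can see them.
export SAMTABIXDIR USE_SAMTABIX
```

These settings last only for the current shell session; add them to your shell startup file if you rebuild often.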
Obsolete but retained for backwards compatibility
Description
This is a description of patch files for the samtools C library and tabix C library (both available from samtools.sourceforge.net) that add a layer of indirection to the external functions in knetfile.c, samtools'/tabix's network code used to fetch data from URLs of BAM or tabix-compressed .gz files (http://.../x.bam, ftp://.../x.bam, http://.../x.vcf.gz).
Samtools and tabix contain separate knetfile code; if you link against both libraries, then both (or neither) must be patched using their respective patch files.
The purpose of the new "hooks" into knetfile.c is to enable alternate implementations of knetfile functionality to be substituted for the samtools (and/or tabix) knetfile implementation. The UCSC Genome Browser source code has network code that supports https and authentication in addition to basic http and ftp (net.c), and URL data caching code (udc.c) that saves accessed portions of URLs on local disk, while checking for remote file updates. When samtools is rebuilt with this patch, the UCSC source code can be rebuilt with environment variable KNETFILE_HOOKS=1 (in addition to USE_BAM=1; see Build Environment Variables) to take advantage of the new hooks in knetfile. Samtools calls knetfile functions as usual, but now those knetfile functions call functions from a UCSC wrapper on udc functionality (knetUdc.c).
A pull request has been submitted to the samtools github repository (https://github.com/lh3/samtools/pull/5) in hopes that this modification will be included in a future release of samtools. If it is, then patching will no longer be necessary.
UCSC source code
UCSC Genome Browser code is available via git; see Getting_Started_With_Git.
Patch files for samtools
File:Knetfile hooks.0.1.7.patch
File:Knetfile hooks.0.1.8.patch
File:Knetfile hooks.0.1.9.patch
File:Knetfile hooks.0.1.10.patch
File:Knetfile hooks.0.1.11.patch
File:Knetfile hooks.0.1.16.patch
File:Knetfile hooks.0.1.17.patch
File:Knetfile hooks.0.1.18.patch
Patch files for tabix
File:Knetfile hooks tabix-0.2.3.patch
File:Knetfile hooks tabix-0.2.5.patch
File:Knetfile hooks tabix-0.2.6.patch
NOTE: the samtools patch files include one other UCSC-specific change which you may or may not want to keep: they disable the check for an empty BGZF record at the end of the BAM file (bgzf_check_EOF). The check adds an extra socket open delay (or for FTP, two extra socket opens). If you want to keep the EOF check, edit the patch file to remove the bam.c section.
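Removing the bam.c section can be done by hand in a text editor. As a rough sketch of automating it, the hypothetical helper below (drop_bam_c is not part of samtools or the UCSC code) filters a patch with awk. It assumes that every per-file section of the patch begins with a line starting with "diff " whose last field is the file path; if the patch file you downloaded is laid out differently, edit it manually instead.

```shell
# Hypothetical helper: print a patch with the bam.c section removed.
# Assumption: each per-file section starts with a "diff " line whose
# last field is the path of the patched file.
drop_bam_c() {
    awk '
        # On each section header, decide whether to skip that section.
        /^diff / { skip = ($NF ~ /(^|\/)bam\.c$/) }
        # Print every line that is not inside a skipped section.
        !skip
    ' "$1"
}
```

For example, drop_bam_c Knetfile_hooks.0.1.18.patch > keep_eof_check.patch writes a filtered copy (keep_eof_check.patch is just an illustrative name) and leaves the original untouched. Inspect the result before applying it.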
NOTE: the tabix patch files include another set of changes to allow linking against both samtools and tabix libraries: tabix's bgzf_* functions are renamed to ti_bgzf_* in order to avoid linking conflicts with samtools' bgzf_* functions, which are nearly, but not completely, identical.
Apply Patch
To apply the patch, start with a clean samtools (or tabix) build directory, and run these commands:
    # samtools:
    patch -p1 < Knetfile_hooks.0.1.18.patch
    make
(Replace 0.1.18 with the appropriate version number if necessary.)
    # tabix:
    patch -p1 < Knetfile_hooks_tabix-0.2.6.patch
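If you want to confirm that a patch applies cleanly before it touches the build directory, one option is a dry run first. The sketch below assumes GNU patch (which provides --dry-run); apply_if_clean is a hypothetical helper name, not something shipped with samtools, tabix, or the UCSC code.

```shell
# Hypothetical helper: apply a patch only if a dry run succeeds.
# Assumption: GNU patch, which supports the --dry-run option.
apply_if_clean() {
    patch_file="$1"
    if patch -p1 --dry-run < "$patch_file" > /dev/null; then
        # The dry run reported no failed hunks, so apply for real.
        patch -p1 < "$patch_file"
    else
        echo "dry run failed for $patch_file; nothing was changed" >&2
        return 1
    fi
}
```

Run it from the clean samtools (or tabix) build directory, for example apply_if_clean Knetfile_hooks.0.1.18.patch, and then run make as usual.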
samtools version notes
samtools-0.1.7
File:Knetfile hooks.0.1.7.patch has been tested with samtools SVN versions 537 (~Feb '10) and 574 (latest as of May 11 2010). In SVN 574, some of the bam.c line numbers have changed by 2 since the patch file was created, so patch prints out these messages:
    patching file bam.c
    Hunk #1 succeeded at 72 with fuzz 2 (offset 1 line).
    Hunk #2 succeeded at 80 with fuzz 2.
-- but the patch is successful and the code builds correctly.
samtools-0.1.8
The knetfile code is fairly stable; however, bam_index has ongoing work which necessitated File:Knetfile hooks.0.1.8.patch.
samtools-0.1.9
A new subdirectory bcftools was added, and bcftools/main.c uses the macro knet_fileno(), which is not supported by KNETFILE_HOOKS code. The "bcftools cat" command is implemented only when _USE_KNETFILE is defined and KNETFILE_HOOKS is *not* defined. The corresponding patch is File:Knetfile hooks.0.1.9.patch.
samtools-0.1.10
File:Knetfile hooks.0.1.10.patch is identical to 0.1.9 but is provided as a separate file for convenience.
samtools-0.1.11 and beyond
File:Knetfile hooks.0.1.11.patch and subsequent patches are nearly identical to 0.1.9 and 0.1.10 except for some line numbers.