KNETFILE HOOKS

From genomewiki
Revision as of 18:12, 15 September 2010 by Hiram (talk | contribs) (fixup reference to knetUdc.c)

Description

This is a description of patch files for the samtools C library (samtools.sourceforge.net) that add a layer of indirection to the external functions in knetfile.c, samtools' network code used to fetch data from URLs of BAM files (http://.../x.bam, ftp://.../x.bam).

The purpose of the new "hooks" into knetfile.c is to allow alternate implementations of knetfile functionality to be substituted for the samtools implementation. The UCSC Genome Browser source code has network code (net.c) that supports https and authentication in addition to basic http and ftp, and URL data caching code (udc.c) that saves accessed portions of URLs on local disk while checking for remote file updates. When samtools is rebuilt with this patch, the UCSC source code can be rebuilt with the environment variable KNETFILE_HOOKS=1 (in addition to USE_BAM=1; see Build Environment Variables) to take advantage of the new hooks in knetfile. Samtools calls knetfile functions as usual, but those functions now call into a UCSC wrapper around udc functionality (knetUdc.c).
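The layer of indirection described above can be pictured as a function pointer that the library consults before falling back to its built-in code. The following is only a minimal sketch of that general technique. Every name in it (demo_file, demo_open, demo_set_open_hook, demo_caching_open) is hypothetical and is not taken from the patch files or from knetfile.c, and the "open" functions merely record which implementation handled the call.

```c
#include <stddef.h>

/* Hypothetical names throughout; these are NOT the identifiers used by
 * the patch or by samtools. The sketch only illustrates the idea of a hook. */
enum demo_backend { DEMO_BACKEND_BUILTIN, DEMO_BACKEND_CACHING };

typedef struct {
    const char *url;            /* URL that was requested */
    enum demo_backend backend;  /* which implementation handled the open */
} demo_file;

/* Signature that an alternate "open" implementation must match. */
typedef demo_file *(*demo_open_fn)(const char *url);

/* A single static handle keeps the sketch free of memory allocation. */
static demo_file demo_slot;

/* NULL means "no hook installed, use the built-in implementation". */
static demo_open_fn demo_open_hook = NULL;

/* Stand-in for the library's own network code. */
static demo_file *demo_builtin_open(const char *url)
{
    demo_slot.url = url;
    demo_slot.backend = DEMO_BACKEND_BUILTIN;
    return &demo_slot;
}

/* Called by the host application to install or clear (NULL) the hook. */
void demo_set_open_hook(demo_open_fn fn)
{
    demo_open_hook = fn;
}

/* Public entry point: callers use this as usual and never see the hook. */
demo_file *demo_open(const char *url)
{
    if (demo_open_hook != NULL)
        return demo_open_hook(url);
    return demo_builtin_open(url);
}

/* Stand-in for an alternate implementation, e.g. one backed by a cache. */
demo_file *demo_caching_open(const char *url)
{
    demo_slot.url = url;
    demo_slot.backend = DEMO_BACKEND_CACHING;
    return &demo_slot;
}
```

In this sketch, a host application would call demo_set_open_hook(demo_caching_open) once at startup. Code that calls demo_open() is unchanged, which mirrors how samtools keeps calling knetfile functions as usual.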

UCSC source code

UCSC Genome Browser code is available via git; see Getting_Started_With_Git.

Patch files

File:Knetfile hooks.0.1.7.patch
File:Knetfile hooks.0.1.8.patch

NOTE: the patch files include one other UCSC-specific change, which you may or may not want to keep: they disable the check for an empty BGZF record at the end of the BAM file (bgzf_check_EOF). The check adds an extra socket open delay (or, for FTP, two extra socket opens). If you want to keep the EOF check, edit the patch file to remove the bam.c section.
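For context, the empty BGZF record is a fixed 28-byte block that the SAM/BAM specification defines as an end-of-file marker. An EOF check therefore has to look at the last 28 bytes of the file, which for a remote file presumably means a separate fetch from the end of the URL. The sketch below is not the samtools implementation of bgzf_check_EOF. It only shows the comparison on a buffer already in memory, and the function name demo_has_bgzf_eof is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* The 28-byte empty BGZF block defined as the EOF marker in the SAM/BAM
 * specification. */
static const unsigned char DEMO_BGZF_EOF[28] = {
    0x1f, 0x8b, 0x08, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0xff,
    0x06, 0x00, 0x42, 0x43, 0x02, 0x00, 0x1b, 0x00, 0x03, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
};

/* Hypothetical helper (not part of samtools): returns 1 if the buffer
 * ends with the EOF marker, 0 otherwise. */
int demo_has_bgzf_eof(const unsigned char *buf, size_t len)
{
    if (buf == NULL || len < sizeof DEMO_BGZF_EOF)
        return 0;
    /* Compare only the trailing 28 bytes against the marker. */
    return memcmp(buf + len - sizeof DEMO_BGZF_EOF,
                  DEMO_BGZF_EOF, sizeof DEMO_BGZF_EOF) == 0;
}
```

A file that was truncated during transfer would normally lack this trailing block, which is what the check is meant to catch.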

Apply Patch

To apply the patch, start with a clean samtools build directory, and run these commands:

patch < knetfile_hooks.0.1.8.patch
make

samtools-0.1.7

File:Knetfile hooks.0.1.7.patch has been tested with samtools SVN revisions 537 (~Feb '10) and 574 (latest as of May 11 2010). In SVN 574, some of the bam.c line numbers have changed by 2 since the patch file was created, so patch prints these messages:

patching file bam.c
Hunk #1 succeeded at 72 with fuzz 2 (offset 1 line).
Hunk #2 succeeded at 80 with fuzz 2.

Despite these messages, the patch applies successfully and the code builds correctly.

samtools-0.1.8

The knetfile code is fairly stable; however, ongoing work on bam_index necessitated a separate patch, File:Knetfile hooks.0.1.8.patch.