BedTotalSize: Difference between revisions

From genomewiki
No edit summary
(deleted the useless Python thing; Hiram's version is of course better and faster as well)
 
<pre>
#!/usr/bin/env python
# Sum the lengths of BED-style features read from stdin.

import sys

# Print usage when the script is called with exactly one argument.
if len(sys.argv) == 2:
    print(" Will read bed-style features from stdin")
    print(" Will add all feature lengths together")
    print("")
    print(" SYNTAX: ")
    print(" totalSize ")
    sys.exit()

total = 0
for line in sys.stdin:
    fields = line.split()
    if not fields:
        continue  # skip blank lines instead of failing on them
    start = int(fields[1])
    stop = int(fields[2])
    # The +1 treats stop as an inclusive coordinate.
    total += (stop - start + 1)

print("Total length of all features: " + str(total))
</pre>


<pre>
#  you could also do this in awk with the single line statement:
#
awk '{sum += $3-$2}END{printf "total size: %d\n",sum}' file.bed
#
#  Plus, I don't think you want to add 1 to your stop-start calculation.
#  This relates to the subtle nature of the "0-relative" vs. "1-relative"
#  coordinate systems.  BED is 0-relative with an exclusive end, so the
#  length is simply stop-start and you don't need the + or - 1's anywhere.
</pre>
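
To make the coordinate remark concrete, here is a minimal Python sketch. The function names (bed_length, closed_length, bed_to_closed) are made up for illustration and do not come from any library. It assumes the BED convention of a 0-based start and an exclusive end.

```python
def bed_length(start, end):
    """Length of a feature in 0-based, half-open coordinates (BED style)."""
    return end - start


def closed_length(start, end):
    """Length of a feature in 1-based, closed coordinates (both ends included)."""
    return end - start + 1


def bed_to_closed(start, end):
    """Convert a 0-based half-open interval to the same bases in 1-based closed form."""
    return start + 1, end


# The first 100 bases of a chromosome: BED writes this as start=0, end=100.
bed_start, bed_end = 0, 100
one_start, one_end = bed_to_closed(bed_start, bed_end)

# Both conventions give the same length when each uses its own formula.
same_length = bed_length(bed_start, bed_end) == closed_length(one_start, one_end)
```

Applying the 1-relative formula to BED coordinates would report 101 bases for this feature, which is the off-by-one the comment above warns about.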




[[Category:User Developed Scripts]]

Latest revision as of 09:16, 15 September 2006

You can do this in awk with a single-line statement:

awk '{sum += $3-$2}END{printf "total size: %d\n",sum}' file.bed
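
If the total is needed inside a script rather than on the command line, the same sum can be sketched in Python. This is only an illustration of what the awk statement computes ($3-$2 summed over all lines). The function name total_size and the three-feature sample data are hypothetical, and skipping lines with fewer than three fields is an added assumption that the awk version does not make.

```python
import io


def total_size(lines):
    """Sum end - start over BED lines, mirroring the awk statement's $3-$2."""
    total = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # assumption: silently skip blank or short lines
        total += int(fields[2]) - int(fields[1])
    return total


# Hypothetical three-feature BED file, held in memory for the demonstration.
example = io.StringIO("chr1\t0\t100\nchr1\t200\t250\nchr2\t10\t20\n")
result = total_size(example)
print("total size: %d" % result)  # prints "total size: 160"
```

For real input, passing sys.stdin or an open file object to total_size should work the same way, since the function only iterates over lines.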