Visualizing Coordinates

From genomewiki
Regarding strand coordinates, there are generally two ways in which this can be done:

#1. Specify the coordinates on the positive strand, and then note
afterwards whether the item is actually on the negative strand.  We
use this one most of the time, probably because it makes it easier to
compare coordinates, especially if you don't care which strand it is on.

#2. Specify the strand first, and then use the coordinates of that strand.

Both are in general use, in different places.
If #2 is used and the item is on the negative strand, people say
that it is in "negative strand coordinates."

Cases that I can remember that use #2 are the chain files.
Also, bizarrely enough, the psl format mixes the two: the
main start and end coordinates are in positive strand coords
(probably to allow rapid coordinate compares while looking
for overlaps at the whole-gene level),
but the actual block starts, and their order, are in negative strand
coordinates.
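
As a rough illustration of the psl case, the sketch below maps the block starts of a negative strand query alignment back to positive strand coordinates. The function name psl_query_blocks_to_positive is hypothetical (it is not part of any UCSC tool), and the sketch assumes zero-based half-open block coordinates and the psl fields qSize, blockSizes and qStarts.

```python
def psl_query_blocks_to_positive(q_size, block_sizes, q_starts):
    """Map '-' strand query blocks to positive strand (start, end) pairs.

    Hypothetical helper for illustration only. Assumes q_starts are
    zero-based starts in negative strand coordinates and that each
    block is half-open: [start, start + size).
    """
    blocks = []
    for q_start, size in zip(q_starts, block_sizes):
        neg_end = q_start + size
        # Same rule as for a single interval: start2 = size - end1, end2 = size - start1
        blocks.append((q_size - neg_end, q_size - q_start))
    # Blocks are listed in negative strand order, so reverse them to get
    # ascending positive strand order.
    blocks.reverse()
    return blocks


# Made-up example: a 61-base query with two 18-base blocks.
example_blocks = psl_query_blocks_to_positive(61, [18, 18], [0, 39])
print(example_blocks)  # [(4, 22), (43, 61)]
```

The smallest start and largest end of the converted blocks (4 and 61 here) are what would appear as the main start and end in positive strand coords.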

To convert from #1 to #2, you generally take:
start2 = chromSize - end1
end2 = chromSize - start1

To make my graph easier to draw in text,
let's say that S and E are start1 and end1 in pos strand coords,
and s and e are start2 and end2 in neg strand coords.

                        e      s                 ...210  (neg strand coords)
                         YYYYYYY
eziSmorhc=Cnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
           ppppppppppppppppppppppppppppppppppppppppppppC=chromSize
                         XXXXXXX
           012...        S      E                        (pos strand coords)

With our zero-based half-open coordinates, the positive strand coordinates
run from 0 to chromSize-1, that is [0,chromSize), which is the same as the
closed range [0,chromSize-1].
Negative strand coordinates have the same range, of course.

So
s = C - E
e = C - S
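
The conversion above can be sketched in Python as follows. This is a minimal illustration, assuming zero-based half-open coordinates; the function name flip_strand and the example numbers are made up for this page.

```python
def flip_strand(start, end, chrom_size):
    """Convert a zero-based half-open interval to the other strand's coordinates.

    Illustrative helper only. The same formula works in both directions
    (pos -> neg and neg -> pos).
    """
    if not 0 <= start <= end <= chrom_size:
        raise ValueError("interval must satisfy 0 <= start <= end <= chrom_size")
    return chrom_size - end, chrom_size - start


C = 44          # chromSize (made-up value)
S, E = 14, 21   # pos strand coords of a 7-base item
s, e = flip_strand(S, E, C)
print(s, e)     # 23 30

# Flipping twice returns the original coordinates.
assert flip_strand(s, e, C) == (S, E)
```

Note that the length of the item is unchanged by the conversion: e - s equals E - S.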

With form #1, we say it is at S,E but by the way, it is really on the neg strand (-).
With form #2, we say it is on the negative strand (-), at coordinates s,e.

So, do you want the coordinates first, or the strand? Either way can work.

---------

Note that if you use one-based closed coordinates, then the coordinate
range on both strands is [1,chromSize] and the picture looks like this:
<pre>
                          e     s                ...321  (neg strand coords)
 eziSmorhc=C              YYYYYYY
           nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
           pppppppppppppppppppppppppppppppppppppppppppp
                          XXXXXXX                     C=chromSize
           123...         S     E                        (pos strand coords)
</pre>

s = C - E + 1
e = C - S + 1

So in these coordinates, there is usually some extra +1 or -1 that is needed 
in coordinate calculations.
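
A minimal Python sketch of the one-based closed version, for comparison. The function name flip_strand_one_based and the example numbers are made up for illustration.

```python
def flip_strand_one_based(start, end, chrom_size):
    """Convert a one-based closed interval to the other strand's coordinates.

    Illustrative helper only. Note the extra +1 compared with the
    zero-based half-open formula.
    """
    if not 1 <= start <= end <= chrom_size:
        raise ValueError("interval must satisfy 1 <= start <= end <= chrom_size")
    return chrom_size - end + 1, chrom_size - start + 1


C1 = 44           # chromSize (made-up value)
S1, E1 = 15, 21   # one-based closed pos strand coords of a 7-base item
s1, e1 = flip_strand_one_based(S1, E1, C1)
print(s1, e1)     # 24 30

# In closed coordinates the length also needs a +1.
length = E1 - S1 + 1
assert length == e1 - s1 + 1 == 7
```

The same seven bases in zero-based half-open coordinates would be [14,21) on the positive strand and [23,30) on the negative strand, with no +1 in either the conversion or the length.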