Visualizing Coordinates

From genomewiki

Revision as of 20:25, 13 February 2014

Regarding strand coordinates, there are generally two ways in which this can be done:

  1. Specify the coordinates on the positive strand, and then after the fact, note whether the feature is actually on the negative strand. We typically use this form most often, probably because it makes it easier to compare coordinates, especially if you don't care what strand it is on.

  2. Specify the strand first, and then use the coordinates of that strand.

Both forms are in use, in different places. If #2 is used and the feature is on the negative strand, people say that it is in "negative strand coordinates."

Cases that I can remember that do this are the chain files. Also, bizarrely enough, the psl format: although the main start and end coordinates are in positive strand coords (probably to allow rapid coordinate comparisons while looking for overlaps at the whole-gene level), the actual block starts, and their order, are in negative strand coordinates.

To convert from #1 to #2, you generally take:

  start2 = chromSize - end1
  end2 = chromSize - start1

To make my graph easier to draw in text, let's say that S and E are start1 and end1 in pos strand coords, and s and e are start and end in neg strand coords.

                        e      s                 ...210  (neg strand coords)
                         YYYYYYY
eziSmorhc=Cnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
           ppppppppppppppppppppppppppppppppppppppppppppC=chromSize
                         XXXXXXX
           012...        S      E                        (pos strand coords)

With our zero-based half-open coordinates, the positive strand coordinate runs from 0 to chromSize-1, that is [0,chromSize), which written as a closed range is [0,chromSize-1]. Negative strand coordinates have the same range on the negative strand, of course.

So:

  s = C - E
  e = C - S
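The conversion above can be sketched in a few lines of Python. This is only an illustration: the function name flip_strand and the example numbers (a 7-base feature on a 100-base chromosome) are made up here and do not come from any particular tool.

```python
def flip_strand(start, end, chrom_size):
    """Re-express a zero-based half-open interval [start, end) in the
    coordinates of the opposite strand (hypothetical helper)."""
    if not 0 <= start <= end <= chrom_size:
        raise ValueError("need 0 <= start <= end <= chrom_size")
    # s = C - E, e = C - S
    return chrom_size - end, chrom_size - start


# Hypothetical example values, chosen only for illustration.
S, E, C = 25, 32, 100
s, e = flip_strand(S, E, C)
print(s, e)                   # 68 75
print(E - S, e - s)           # 7 7 (the length is unchanged)
print(flip_strand(s, e, C))   # (25, 32) (flipping twice gets you back)
```

Note that the same formula works in both directions, so there is no need for a separate "neg to pos" function.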

With form #1, we say it is at S,E but by the way, it is really on the neg strand (-). With form #2, we say it is on the negative strand (-), at coordinates s,e.

So, do you want the coordinates first, or the strand? Either way can work.


Note that if you use one-based closed coordinates, then the coordinate range on both strands is [1,chromSize] and the picture looks like this:

                          e     s                ...321  (neg strand coords)
 eziSmorhc=C              YYYYYYY
           nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn
           pppppppppppppppppppppppppppppppppppppppppppp
                          XXXXXXX                     C=chromSize
           123...         S     E                        (pos strand coords)

  s = C - E + 1
  e = C - S + 1

So in these coordinates, there is usually some extra +1 or -1 that is needed in coordinate calculations.
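As a rough sketch of where the extra +1 shows up, here is the one-based closed version in Python. The function name and the example numbers are again hypothetical, chosen so that the feature covers the same seven bases as a zero-based half-open interval [25,32) would on a 100-base chromosome.

```python
def flip_strand_one_based(start, end, chrom_size):
    """Re-express a one-based closed interval [start, end] in the
    coordinates of the opposite strand (hypothetical helper)."""
    if not 1 <= start <= end <= chrom_size:
        raise ValueError("need 1 <= start <= end <= chrom_size")
    # s = C - E + 1, e = C - S + 1
    return chrom_size - end + 1, chrom_size - start + 1


# Hypothetical example values, chosen only for illustration.
S, E, C = 26, 32, 100
s, e = flip_strand_one_based(S, E, C)
print(s, e)                      # 69 75
print(E - S + 1, e - s + 1)      # 7 7 (length also needs a +1 here)
```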