LinkOut from NCBI

From Genecats

How To Create LinkOuts from the NCBI Site to UCSC Genome Browser:


What is LinkOut?

The LinkOut links on the NCBI site are created by the data provider (e.g. UCSC) using an "automated" process. Basically, we create a set of XML files and upload them to NCBI's server, NCBI processes them automatically, and voilà, the links appear! NCBI has lots of help on its website, but it's still a bit of a chore to figure out. You can learn what was done previously by reading this file: /cluster/home/qateam/links/howTo

Introductory Email

The UCSC Genome Browser login to NCBI's FTP site, from an email sent by Scott Federhen <federhen@ncbi.nlm.nih.gov>:

We've set up your LinkOut ftp account -

 account: ucscgb
 password: <find it in the file>

You can update your files whenever you want - they are in the 'holdings' directory of your ftp site.

ftp ftp-private.ncbi.nih.gov

ftp> cd holdings

ftp> ls -lt

200 PORT command successful.
150 Opening ASCII mode data connection for file list.
-rw-rw-rw-   1 4369     pmman         630 Jul 24 15:12 providerinfo.xml
-rw-rw-rw-   1 4369     pmman        1243 Jul 24 15:12 resource.xml
-rw-rw-rw-   1 4369     pmman        1035 Jul 24 15:12 species.xml

I'll be happy to help whenever you've got any more questions.

Cheers,
Scott
federhen@ncbi.nlm.nih.gov
GenBank Taxonomy & LinkOut

How to edit the files

To make any changes to the files, first download them from that FTP site, make the changes, then upload them back. Do not delete anything from the files -- only add to what's already there.
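
One way to check the "only add" rule before uploading is to diff the edited file against an untouched copy of the download. The sketch below assumes you saved such a copy yourself and that a standard diff is installed. The helper name only_additions and the file names in the usage line are made up for illustration; none of this is NCBI tooling.

```shell
# Hypothetical helper: succeeds only if the edited file keeps every line of
# the original, in order, i.e. diff reports no deleted or changed lines.
only_additions() {
    original=$1
    edited=$2
    # Return 2 if either file is missing or unreadable.
    [ -r "$original" ] && [ -r "$edited" ] || return 2
    # In diff's default output, lines taken away from the original start with "<".
    if diff "$original" "$edited" | grep -q '^<'; then
        return 1
    fi
    return 0
}
```

Usage, assuming the untouched copy was saved as resource.xml.orig:

 only_additions resource.xml.orig resource.xml && echo "only additions, OK to upload"

Note that a line that was merely moved or re-indented counts as a deletion here, which errs on the cautious side.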

You can FTP them down like so:

wget --ftp-user=ucscgb --ftp-password=<find it in the file> ftp://ftp-private.ncbi.nih.gov/holdings/resource.xml
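
The same wget call can be wrapped to fetch all three files at once and to push an edited file back. This is only a sketch: it assumes wget and curl are installed, that the password has been exported in an environment variable named LINKOUT_PASSWORD (a made-up name), and that the account is allowed to overwrite files in holdings. The function names are hypothetical.

```shell
LINKOUT_HOST="ftp-private.ncbi.nih.gov"
LINKOUT_DIR="holdings"
LINKOUT_USER="ucscgb"
LINKOUT_FILES="providerinfo.xml resource.xml species.xml"

# Print the FTP URL of one file in the holdings directory.
linkout_url() {
    printf 'ftp://%s/%s/%s\n' "$LINKOUT_HOST" "$LINKOUT_DIR" "$1"
}

# Download all three files into the current directory and keep an
# untouched .orig copy of each for later comparison.
fetch_all() {
    for f in $LINKOUT_FILES; do
        wget -O "$f" --ftp-user="$LINKOUT_USER" \
             --ftp-password="$LINKOUT_PASSWORD" "$(linkout_url "$f")" || return 1
        cp -p "$f" "$f.orig" || return 1
    done
}

# Upload one edited file back to the holdings directory under the same name.
upload_one() {
    curl --user "$LINKOUT_USER:$LINKOUT_PASSWORD" \
         -T "$1" "$(linkout_url "$(basename "$1")")"
}
```

Usage:

 fetch_all
 (edit resource.xml)
 upload_one resource.xml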

What are the files?

FILE: providerinfo.xml
The providerinfo.xml file contains the metadata describing who the UCSC Genome Browser is. It will likely not need to be changed.
HELP for that file: http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helplinkout&part=nonbib#nonbib.File_Preparation_Ide

FILE: species.xml
The species.xml file appears to contain links from the Taxonomy database back to our site.

FILE: resource.xml
The resource.xml file is the main file that you will add new LinkOuts to.
HELP for that file: http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helplinkout&part=nonbib#nonbib._File_Preparation_Res

We also have a graphical image (the one that folks click on at the NCBI site to get to our browser). It lives here: http://genome.ucsc.edu/images/ucsc.gif

Example of Linkout at NCBI

You can see an example of how NCBI uses LinkOut by following these steps:

  1. Go to the NCBI entry for NR_051979.1: https://www.ncbi.nlm.nih.gov/nuccore/NR_051979.1
  2. Search the page for "UCSC" and you will find a "LinkOut to external resources" section
  3. Click the "UCSC Genome Browser" link and you will arrive on our site via the LinkOut:
    1. http://genome.ucsc.edu/cgi-bin/hgTracks?org=human&position=NR_051979
  • This is defined in the resource.xml by the following <Link> ...span... </Link> block:
    <Link> 
    <LinkId>1</LinkId> 
    <ProviderId>3715</ProviderId> 
    <IconUrl>&icon.url;</IconUrl> 
    <ObjectSelector> 
        <Database>Nucleotide</Database> 
        <ObjectList> 
            <Query>(Homo sapiens [orgn] NOT HTG[prop] NOT HTC[prop] NOT STS[prop]) AND (NR_000000:NR_999999[pacc] OR NM_000000:NM_999999[pacc])</Query> 
        </ObjectList> 
    </ObjectSelector> 
    <ObjectUrl> 
        <Base>&base.url;</Base> 
        <Rule>org=human&amp;position=&lo.pacc;</Rule> 
    </ObjectUrl> 
    </Link> 
  • The <ProviderId>3715</ProviderId> element refers back to the entry in the providerinfo.xml file:
   <ProviderId>3715</ProviderId>
   <Name>UCSC Genome Browser</Name> ...more info...
   <Url>http://genome.ucsc.edu/</Url> ..more info...
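
To see how the <Base> and <Rule> pieces combine into the final link, here is a small sketch that rebuilds the example URL by hand. The value used for the &base.url; entity is an assumption inferred from the example URL above (the entity definition itself is not shown on this page), and the code assumes &lo.pacc; expands to the accession without its version suffix, as the example suggests. The function name linkout_target is hypothetical.

```shell
# Assumed expansion of &base.url; (inferred from the example URL, not
# copied from resource.xml).
BASE_URL="http://genome.ucsc.edu/cgi-bin/hgTracks?"

# Build the link target for one accession by filling the
# "org=human&position=<accession>" rule.
linkout_target() {
    # Drop a trailing ".version" (NR_051979.1 becomes NR_051979).
    pacc=${1%%.*}
    printf '%sorg=human&position=%s\n' "$BASE_URL" "$pacc"
}
```

Usage:

 linkout_target NR_051979.1

which prints the same URL as in step 3 above.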

Where to get help

NCBI's help pages for LinkOut: http://www.ncbi.nlm.nih.gov/projects/linkout/

(we are "other resource providers") http://www.ncbi.nlm.nih.gov/projects/linkout/doc/nonbiblinkout.html

Also helpful is the DTD: http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=helplinkout&part=files#files.LinkOut_DTD

Generic LinkOut email address: linkout@ncbi.nlm.nih.gov

Internal Notes: /cluster/home/qateam/links/howTo LinkOut_from_NCBI

Access Rights and Whitelisting

The LinkOut team at NCBI performs yearly automatic link checking on the links you submitted to LinkOut. The link checking program will access one article URL every 5 seconds. The requests can be identified by the following details and should be allowed access or whitelisted.

IP proxy: 130.14.25.148 or 130.14.254.25 or 130.14.254.26
User agent: "LinkOut Link Check Utility"
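
If a request ever needs to be checked by hand (for example, when reading a server access log), the two identifiers above can be tested with a small helper. This is a sketch only: the function name is made up, it requires both the IP address and the user agent to match (a deliberately strict choice), and it accepts the user agent as a substring in case the real header carries extra text.

```shell
# Hypothetical helper: succeeds if a request's client IP and User-Agent
# match the LinkOut link checker details listed above.
is_linkout_checker() {
    ip=$1
    user_agent=$2
    # The client IP must be one of the three listed proxy addresses.
    case "$ip" in
        130.14.25.148|130.14.254.25|130.14.254.26) ;;
        *) return 1 ;;
    esac
    # The User-Agent must contain the link checker's name.
    case "$user_agent" in
        *"LinkOut Link Check Utility"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Usage:

 is_linkout_checker 130.14.254.25 "LinkOut Link Check Utility" && echo "allow"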