Web Services & Javascript

Introduction and Motivation

We're thinking of reworking the genome browser so that Javascript is responsible for the layout of the page at a high level rather than the C scripts on the server. As part of this we'll be moving the C scripts to more of a web-services oriented architecture, where they'll be returning information in JSON rather than in HTML and GIF images. For the near term at least they probably will return GIF images as well. The goal of this rearchitecting is to make it easier to be more interactive, and to take advantage of the good Javascript user-interface (UI) oriented libraries like jQuery.

As is often the case with software development, the devil is in the details. Nonetheless, it is helpful to start with a clear, simple high-level design, if only to keep the messy details in one part of the code from breeding with the messy details in other parts and multiplying exponentially.

Architectural Overview

The overall architecture is based on the flow of data through server-side scripts in C at genome.ucsc.edu into Javascript-controlled web pages. The major objects in the system (some of which have expressions on both the C and the Javascript side) include "user," "dataSource," "genomeAssembly," "track," "item," and "relationship." The user object handles passwords and other security, and the user interface settings. A dataSource represents a place that the system queries for live data. It may be a file or database on the server, a remote SQL database or file set, a web site that the C scripts know how to query via a web-services interface of some sort, or the user's custom tracks. A genomeAssembly includes a particular genome assembly and an NCBI taxon number. A track is conceptually just a table that _may_ include genome position fields that are hooked to a genome assembly. Tracks may be implemented in a variety of ways, not necessarily as SQL tables. An item is conceptually a row in a track table. A relationship links together fields in the track/table rows.

Abstract notion of a data track

Server Side in C

hgHub

The hgHub server is the core of the system. Each JavaScript client connects with a hub. The hub knows what data is available locally, and how to reach other data sources, which may include other hubs.

Cart Contents Query

Input
- userId (in cookie) - a 16-character randomly generated alphanumeric string specifying the user
- sessionId (in cgi-var) - a 16-character randomly generated alphanumeric string specifying the session
- namePattern - regular expression for the names of the variables to return
Output
- JSON struct of var/val pairs: {"db": "hg18", "pixels": 1000}
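
As a rough sketch of how a client might use this query, the JavaScript below builds a request URL and mirrors the variable selection that namePattern implies. The endpoint path, the "query=cart" selector, and the helper names are assumptions made for illustration; nothing above fixes them. In the design the filtering happens on the server, so filterCart here only demonstrates the intended semantics.

```javascript
// Hypothetical endpoint; the real path and parameter names are not specified.
const CART_QUERY_PATH = "/cgi-bin/hgHub";

// Build the URL for a cart contents query. The userId travels in a cookie,
// so only sessionId and namePattern appear as CGI variables.
function cartQueryUrl(sessionId, namePattern) {
  const params = [
    "query=cart", // assumed way of selecting the cart query
    "sessionId=" + encodeURIComponent(sessionId),
    "namePattern=" + encodeURIComponent(namePattern),
  ];
  return CART_QUERY_PATH + "?" + params.join("&");
}

// Mirror of the filtering the server is expected to do: keep only the
// variables whose names match the regular expression.
function filterCart(cart, namePattern) {
  const re = new RegExp(namePattern);
  const out = {};
  for (const name of Object.keys(cart)) {
    if (re.test(name)) out[name] = cart[name];
  }
  return out;
}

// Made-up cart contents for the example.
const exampleCart = { db: "hg18", pixels: 1000, hgt_guidelines: "on" };
const subset = filterCart(exampleCart, "^(db|pixels)$");
console.log(JSON.stringify(subset)); // {"db":"hg18","pixels":1000}
```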

Page Setup Query

To minimize the round trips between the web server and the jQuery client, the initial query to the server returns quite a bit of information. The only parameter to the initial query is the user ID, which is stored in a cookie. The query returns an HTML page which includes the skeleton of the main layout elements, and some JavaScript variables and arrays that the client will interpret to fill in the HTML skeleton. The JavaScript data can be broken into these main parts:

What is the user's state

This is essentially the "cart" data. We should do performance tests to see if it's acceptable just to send the whole cart, which would simplify things. We might need to just send parts of the cart.

Organism/assembly/position

These three variables are basic to controlling what the user sees.

Various UI preferences

Things like whether they want to see the guidelines, the next item arrows, etc.

What data is available to the user

We need to build into the system from the beginning the concept that different users have access to different data.

What organisms/assemblies

The genome browser definitely needs to know what genomes are available.

What tracks

Given a user and a genome, there will be a set of available tracks.

What links in the toolbar

Hopefully the toolbar will be generated somewhat more dynamically and explicitly.

List of displayed tracks

Additional information on tracks that are actually turned on by the user.
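
To make the three parts concrete, here is one possible shape for the JavaScript data embedded in the skeleton page. Every field name and value below is hypothetical (the track names, position, and toolbar link are made-up sample data); it is a sketch of the grouping described above, not a settled format.

```javascript
// Hypothetical page setup data, as the server might embed it in the page.
const pageSetup = {
  // What is the user's state
  userState: {
    organism: "Human",
    assembly: "hg18",
    position: "chr7:1000-2000",
    prefs: { guidelines: true, nextItemArrows: true },
  },
  // What data is available to the user
  available: {
    assemblies: [{ organism: "Human", assembly: "hg18" }],
    tracks: ["knownGene", "snp", "rmsk"],
    toolbarLinks: [{ label: "Home", url: "/index.html" }],
  },
  // List of displayed tracks, with the extra per-track information
  displayedTracks: [
    { name: "knownGene", visibility: "pack" },
    { name: "snp", visibility: "dense" },
  ],
};

// Sanity check a client could run: every displayed track should also be
// in the set of tracks available to this user.
function unavailableDisplayedTracks(setup) {
  const available = new Set(setup.available.tracks);
  return setup.displayedTracks
    .map((t) => t.name)
    .filter((name) => !available.has(name));
}

console.log(unavailableDisplayedTracks(pageSetup).length); // 0
```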

Queries to Fetch Data for Genomic Regions

Item list

Summary/samples

"Browser" query

Image

Map box

List of items in non-dense tracks with just enough info to generate the map-boxes.
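
A minimal sketch of how the client could turn such an item list into horizontal map-box extents follows. It assumes each item carries a name plus start and end coordinates treated as a half-open interval, and that the image spans the whole region; those field names and conventions are assumptions, and the vertical placement of each box is left out.

```javascript
// Convert items overlapping a region into horizontal pixel extents.
// items:  [{ name, start, end }] in base coordinates (assumed half-open)
// region: { start, end } for the displayed window
// pixels: width of the track image in pixels
function itemsToMapBoxes(items, region, pixels) {
  const scale = pixels / (region.end - region.start);
  return items
    .filter((it) => it.end > region.start && it.start < region.end)
    .map((it) => {
      // Clip the item to the visible window before scaling.
      const start = Math.max(it.start, region.start);
      const end = Math.min(it.end, region.end);
      return {
        name: it.name,
        x1: Math.floor((start - region.start) * scale),
        x2: Math.ceil((end - region.start) * scale),
      };
    });
}

const boxes = itemsToMapBoxes(
  [
    { name: "geneA", start: 1200, end: 1400 },
    { name: "geneB", start: 5000, end: 6000 }, // outside the window, dropped
  ],
  { start: 1000, end: 2000 },
  1000
);
console.log(JSON.stringify(boxes)); // [{"name":"geneA","x1":200,"x2":400}]
```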

Queries to See What Data Is Available

What Organisms/Assemblies given user

What Tracks given user

Track description

Track metadata

Queries to Fetch Data for One Item

User Settings - The Cart

Client Side in Javascript

Initialization

The starting point of the Javascript will be the canonical jQuery ready handler, <script>$( function() { init-code-goes-here } ) </script>. This function will be responsible for converting lists of tools and tracks to a toolbar and the track controls, and also for setting up the layout for the displayed tracks, and initiating the asynchronous queries to fetch the displayed tracks.

Displaying Tracks

Each track will be in its own div, and the div will be filled in asynchronously.
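
The sketch below shows one way the initialization step could set up a div per displayed track and fill each one when its data arrives. The displayTracks helper, the "#tracks" container, the "track_" id prefix, and the endpoint with its parameters are all assumptions for illustration. The core logic takes its div factory and fetcher as arguments so it can be exercised without a browser; the jQuery wiring runs only when jQuery is loaded.

```javascript
// Create one placeholder per displayed track, then fill each in when its
// data arrives.
// makeDiv(name) returns an object with a setContent(html) method.
// fetchTrack(name, callback) is asynchronous and calls callback(html).
function displayTracks(trackNames, makeDiv, fetchTrack) {
  const divs = {};
  for (const name of trackNames) {
    divs[name] = makeDiv(name);
    divs[name].setContent("Loading " + name + "...");
  }
  // Start all fetches only after the layout is in place.
  for (const name of trackNames) {
    fetchTrack(name, (html) => divs[name].setContent(html));
  }
  return divs;
}

// Browser wiring with jQuery; skipped when jQuery is not present.
if (typeof jQuery !== "undefined") {
  jQuery(function ($) {
    const makeDiv = (name) => {
      const $div = $("<div>").attr("id", "track_" + name).appendTo("#tracks");
      return { setContent: (html) => $div.html(html) };
    };
    // Placeholder URL and parameters, not a defined API.
    const fetchTrack = (name, done) =>
      $.get("/cgi-bin/hgHub", { query: "browser", track: name }, done);
    // In practice the names would come from the page setup data.
    displayTracks(["knownGene", "snp"], makeDiv, fetchTrack);
  });
}
```

Because each fetch completes independently, a slow track delays only its own div and the rest of the page stays usable.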

Interaction with cart server

The client will get a partial cart (just global, displayed-track, and hgt_ variables) passed in as a Javascript array. The client can also send cart updates and requests for additional cart variables. When the user changes a control, the client sends an update to the cart server. When the user displays a track, the client queries the cart server for the cart variables associated with the newly opened track.
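
The exchange described above could be wrapped in a small client-side cart object along these lines. The makeClientCart name, the message shapes, and the assumption that a track's variables are named with the track name followed by "." or "_" are all hypothetical, and track names are assumed to contain no regular-expression metacharacters. The transport is passed in as a send function so the sketch does not commit to a particular request format.

```javascript
// Client-side view of the cart.
// initialVars: the partial cart passed in with the page.
// send(message, reply): transport to the cart server; reply is optional and
// receives the fetched variables for "fetch" messages.
function makeClientCart(initialVars, send) {
  const vars = Object.assign({}, initialVars);
  return {
    get: (name) => vars[name],

    // Called when the user changes a control.
    set(name, value) {
      vars[name] = value;
      send({ type: "update", vars: { [name]: value } });
    },

    // Called when the user displays a track: ask the server for the
    // variables belonging to that track and merge them in.
    openTrack(trackName, onVars) {
      const namePattern = "^" + trackName + "[._]"; // assumed naming scheme
      send({ type: "fetch", namePattern: namePattern }, (fetched) => {
        Object.assign(vars, fetched);
        if (onVars) onVars(fetched);
      });
    },
  };
}
```

A caller would create the cart once at initialization, then call set from control handlers and openTrack from the code that turns a track on.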

Drag and Drop Reordering

The initial order of tracks is determined by where they are in the great tree of tracks. If the user drags a track, though, the dragged order overrides the initial order. Things get a little tricky when new tracks are introduced on the server after the user has rearranged the existing tracks on the client. The new tracks may pop up in surprising places in the rearranged display, but at least they do pop up.
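
One possible merge policy, sketched below, keeps the user's arrangement and inserts each new track right after the track that precedes it in the server's order (or at the front if there is none). This is an illustrative choice rather than a decided rule, and mergeTrackOrder is a hypothetical helper name.

```javascript
// Merge the server's track order with the order the user produced by dragging.
// serverOrder: track names in tree order, possibly including new tracks.
// userOrder:   track names in the order the user last arranged them.
function mergeTrackOrder(serverOrder, userOrder) {
  const onServer = new Set(serverOrder);
  // Keep the user's arrangement, dropping tracks the server no longer offers.
  const result = userOrder.filter((name) => onServer.has(name));
  const placed = new Set(result);

  let prev = null; // previous track in server order; always placed by now
  for (const name of serverOrder) {
    if (!placed.has(name)) {
      // New track: insert it just after its server-order predecessor.
      const at = prev === null ? 0 : result.indexOf(prev) + 1;
      result.splice(at, 0, name);
      placed.add(name);
    }
    prev = name;
  }
  return result;
}

// "new" sits between "b" and "c" on the server, but the user moved "c" first.
const merged = mergeTrackOrder(["a", "b", "new", "c"], ["c", "a", "b"]);
console.log(JSON.stringify(merged)); // ["c","a","b","new"]
```

In the example the new track lands at the end of the user's display even though the server lists it in the middle, which is the kind of surprising placement described above.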