Cell Browser wrangling guided examples

From Genecats
Revision as of 21:35, 1 July 2022 by Mspeir (Adding completed cbImportScanpy section, will come back to cbImportSeurat)

This page walks you through the basics of wrangling two cell browsers using the two main tools: cbImportSeurat and cbImportScanpy. Each example is divided into three parts, which roughly correspond to the stages of wrangling a dataset for the Cell Browser. All command-line steps are run on hgwdev.

Using cbImportScanpy

This section teaches you the basics of using cbImportScanpy and how it fits into the wrangling process in general. To do so, we will import data for the liver subset of Tabula Sapiens from an h5ad file. This format is written out by the Python package AnnData and is (almost) always compatible with cbImportScanpy.

Part 1: Directory setup and data export

In this first part, we will set up a directory in which you will download and then import the data for a cell browser.

Ensure that you are in the proper conda environment:

conda activate scanpyenv

Change into a good working directory:

cd /hive/users/${hgwdev_username}/cb

Create a directory for this dataset:

mkdir -p tabula-sapiens-liver/orig/

This command also makes an ‘orig’ directory. In the Cell Browser, we use this to store the unchanged files obtained from the submitter or downloaded from GEO/etc.

Change into that directory:

cd tabula-sapiens-liver/orig/

Copy over the h5ad file we’ll be working with:

cp /hive/data/inside/cells/exampleDatasets/TS_Liver.h5ad .

Go up a directory and export the data from that file:

cd ../
cbImportScanpy -i orig/TS_Liver.h5ad -o . --clusterField=cell_ontology_class

This export should take no more than 3 or 4 minutes. After it completes, run ls; you should see files such as meta.tsv and markers.tsv. These and other files will be used as input to cbBuild in the next part of this guide. {image here?}
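As a quick sanity check before moving on, the export directory can be inspected with a few lines of Python. This is a minimal sketch, not part of the Cell Browser tools: the function name check_export and the list of expected files are assumptions made for illustration (the export writes more files than the ones named in this guide), and it assumes meta.tsv is tab-separated with a header row.

```python
import csv
from pathlib import Path

# Files this guide mentions as products of the export; the real export
# writes more than these, so treat the list as an assumption.
EXPECTED_FILES = ["meta.tsv", "markers.tsv", "cellbrowser.conf"]


def check_export(directory="."):
    """Return (missing_files, meta_columns) for an export directory."""
    directory = Path(directory)
    missing = [name for name in EXPECTED_FILES if not (directory / name).is_file()]

    meta_columns = []
    meta_path = directory / "meta.tsv"
    if meta_path.is_file():
        # Assumes a tab-separated file whose first row is the header.
        with open(meta_path, newline="") as handle:
            meta_columns = next(csv.reader(handle, delimiter="\t"), [])
    return missing, meta_columns


if __name__ == "__main__":
    missing, columns = check_export(".")
    print("Missing files:", missing or "none")
    print("Metadata columns:", columns)
```

Run from the dataset directory (tabula-sapiens-liver), this prints any expected files that are absent along with the metadata column names, which is a handy way to confirm that the field passed to --clusterField is present.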

Part 2: cellbrowser.conf and cbBuild

Now, we will go through the process of modifying the cellbrowser.conf and building a cell browser for this dataset into your public_html directory.

Open the cellbrowser.conf file using vim:

vim cellbrowser.conf

Edit the name and shortLabel fields of your cellbrowser.conf so that they match the following:

name='tabula-sapiens-liver'
shortLabel='Liver - Tabula Sapiens'
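Because cellbrowser.conf uses Python assignment syntax, as the two lines above show, the edited values can be read back with a short script. This is a sketch under that assumption: read_conf is a hypothetical helper, not part of the Cell Browser package, and the no-whitespace check on name is treated here as a cautious convention rather than a confirmed rule.

```python
def read_conf(text):
    """Evaluate conf-style Python assignments and return them as a dict."""
    settings = {}
    # Only run this on conf files you trust: exec executes arbitrary code.
    exec(text, {}, settings)
    return settings


example = """
name='tabula-sapiens-liver'
shortLabel='Liver - Tabula Sapiens'
"""

conf = read_conf(example)
# Assumed convention: the dataset name is a short identifier without spaces.
name_ok = not any(ch.isspace() for ch in conf["name"])
print(conf["name"], "|", conf["shortLabel"], "| name ok:", name_ok)
```

To check your real file instead of the inline example, pass its contents to the helper, for instance read_conf(open("cellbrowser.conf").read()).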

Build the cell browser into your public_html directory:

cbBuild -o ~/public_html/cb

Look at your cell browser! It should be at https://hgwdev.gi.ucsc.edu/~${hgwdev_username}/cb/ (e.g. )

When looking at the cell browser for this dataset, do you notice any changes that should be made to make it more user-friendly? Maybe ‘layouts’ that need to be removed because they're uninformative? Or sample text that needs to be changed? We’ll talk more about polishing up the dataset in the next part. {image here?}

Part 3: desc.conf and final polish

Finally, we’ll fill out a desc.conf with some basic information about this dataset and polish up any last visual details.

Open the desc.conf file using vim:

vim desc.conf

Edit the following lines in your desc.conf to read:

 
title = "Liver Subset - Tabula Sapiens"
abstract = """
Liver subset of the Tabula Sapiens dataset covering over 5000 cells.
"""
paper_url="https://www.science.org/doi/10.1126/science.abl4896 The Tabula Sapiens Consortium. Science. 2022."
other_url="https://tabula-sapiens-portal.ds.czbiohub.org/ Tabula Sapiens Website"
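In the lines above, each of the two URL settings holds a URL followed by a space and the text to show for the link. The sketch below illustrates that convention by splitting such a value at the first space. split_url_label is a hypothetical helper written for this guide, not a Cell Browser function, and falling back to the URL when no label is given is a choice made for this sketch.

```python
def split_url_label(value):
    """Split 'URL optional label' into (url, label); label falls back to the URL."""
    url, _, label = value.strip().partition(" ")
    return url, (label.strip() or url)


paper_url = (
    "https://www.science.org/doi/10.1126/science.abl4896 "
    "The Tabula Sapiens Consortium. Science. 2022."
)
url, label = split_url_label(paper_url)
print(url)
print(label)
```

Splitting at the first space works because URLs themselves cannot contain unescaped spaces, so everything after that space can safely be treated as link text.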

Annotate marker genes:

cbMarkerAnnotate markers.tsv markers.annotated.tsv

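Make these changes to the cellbrowser.conf: comment out the scvi_umap, scvi, and pca layout entries, and point markers at the annotated markers file: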

#    {
#        "file": "scvi_umap_coords.tsv",
#        "shortLabel": "scvi_umap"
#    },
#    {
#        "file": "scvi_coords.tsv",
#        "shortLabel": "scvi"
#    },
#    {
#        "file": "pca_coords.tsv",
#        "shortLabel": "pca"
#    }

markers = [{"file": "markers.annotated.tsv", "shortLabel":"Cluster Markers"}]
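Commenting out those entries has the same effect as removing them from the list of layouts. The sketch below shows that on a stand-in list. The variable names and the first entry (umap_coords.tsv) are hypothetical, since this guide does not show the full list from cellbrowser.conf.

```python
# Stand-in for the layout list in cellbrowser.conf. The first entry is
# hypothetical; the other three are the ones commented out above.
layouts = [
    {"file": "umap_coords.tsv", "shortLabel": "umap"},
    {"file": "scvi_umap_coords.tsv", "shortLabel": "scvi_umap"},
    {"file": "scvi_coords.tsv", "shortLabel": "scvi"},
    {"file": "pca_coords.tsv", "shortLabel": "pca"},
]

# Labels of the layouts judged uninformative for this dataset.
uninformative = {"scvi_umap", "scvi", "pca"}

# Keep only the layouts whose label is not in the uninformative set.
kept = [entry for entry in layouts if entry["shortLabel"] not in uninformative]
print(kept)
```

In practice, commenting the entries out in the file (rather than deleting them) keeps them easy to restore if you later decide a layout is worth showing.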

Rebuild the dataset:

cbBuild -o ~/public_html/cb

Check it out: https://hgwdev.gi.ucsc.edu/~${hgwdev_username}/cb/. Make other changes to the cellbrowser.conf and desc.conf files to see how they affect the display. (Don't forget to rebuild the dataset between those changes!) {image here?}

Using cbImportSeurat

This section follows the same steps as the previous section on cbImportScanpy, but uses cbImportSeurat instead. [Something about RDS files here?]

Part 1: Directory setup and data export

Part 2: cellbrowser.conf and cbBuild

Part 3: desc.conf and final polish