Cell Browser wrangling guided examples: Difference between revisions

Revision as of 20:42, 5 July 2022

This page walks you through the basics of wrangling two cell browsers using the two main import tools: cbImportSeurat and cbImportScanpy. Each example is divided into three parts, which roughly correspond to the stages of wrangling a dataset for the Cell Browser. All command-line steps are done on hgwdev.

Using cbImportScanpy

This section teaches you the basics of using cbImportScanpy and how it fits into the wrangling process in general. To do so, we will import data from the h5ad file for the liver segment of Tabula Sapiens. h5ad files are written out by the Python package AnnData and are (almost) always compatible with cbImportScanpy.

Part 1: Directory setup and data export

In this first section, we will go through the process of setting up a directory in which you will download and then import the data for a cell browser.

Ensure that you are in the proper conda environment:

conda activate scanpyenv

Change into a good working directory:

cd /hive/users/${hgwdev_username}/cb

Create a directory for this dataset:

mkdir -p tabula-sapiens-liver/orig/

This command also makes an ‘orig’ directory. In the Cell Browser, we use this to store the unchanged files obtained from the submitter or downloaded from GEO/etc.

Change into that directory:

cd tabula-sapiens-liver/orig/

Copy over the h5ad file we’ll be working with:

cp /hive/data/inside/cells/exampleDatasets/TS_Liver.h5ad .

Go up a directory and export the data from that file:

cd ../
cbImportScanpy -i orig/TS_Liver.h5ad -o . --clusterField=cell_ontology_class

This export should not take more than 3 or 4 minutes. After it completes, run ls and you should see files such as meta.tsv and markers.tsv. These and other files will be used as input to cbBuild in the next section of this guide. {image here?}
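If you would rather check the export output with a command than by eye, the following is a minimal sketch. It assumes the export wrote at least meta.tsv, markers.tsv and cellbrowser.conf into the output directory (the first two are named above, the third is edited in Part 2). check_export_outputs is a hypothetical helper name, not part of the Cell Browser tools.

```shell
# Sketch only: report which of the expected export files exist and are non-empty.
# The file list is an assumption based on the files this guide mentions.
check_export_outputs() {
    dir="${1:-.}"
    missing=0
    for f in meta.tsv markers.tsv cellbrowser.conf; do
        if [ -s "$dir/$f" ]; then
            echo "ok       $f"
        else
            echo "MISSING  $f"
            missing=1
        fi
    done
    # Exit status 0 if everything was found, 1 otherwise.
    return "$missing"
}

# Usage, from the dataset directory:
#   check_export_outputs .
```

A non-zero exit status means at least one file is missing or empty, which usually means the export did not finish.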

Part 2: cellbrowser.conf and cbBuild

Now, we will go through the process of modifying the cellbrowser.conf and building a cell browser for this dataset into your public_html directory.

Open the cellbrowser.conf file using vim:

vim cellbrowser.conf

Edit the name and shortLabel fields of your cellbrowser.conf so that it matches the following:

name='tabula-sapiens-liver'
shortLabel='Liver - Tabula Sapiens'
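If you prefer not to open an editor, the same two edits can be scripted. This is a minimal sketch, assuming each setting sits on its own line that starts with the key, as in the two lines above. set_conf_value is a hypothetical helper name, and the value must not contain the characters |, & or a backslash, since it is passed straight to sed.

```shell
# Sketch only: replace a whole "key = ..." line in a cellbrowser.conf.
# Writes to a temporary file first, then moves it into place.
set_conf_value() {
    conf="$1"; key="$2"; value="$3"
    tmp="$conf.tmp.$$"
    sed -E "s|^${key}[[:space:]]*=.*|${key}='${value}'|" "$conf" > "$tmp" \
        && mv "$tmp" "$conf"
}

# Usage, from the dataset directory:
#   set_conf_value cellbrowser.conf name 'tabula-sapiens-liver'
#   set_conf_value cellbrowser.conf shortLabel 'Liver - Tabula Sapiens'
```

Lines that do not start with the given key are left untouched.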

Build the cell browser into your public_html directory:

cbBuild -o ~/public_html/cb

Look at your cell browser! It should be at https://hgwdev.gi.ucsc.edu/~${hgwdev_username}/cb/ (e.g. )

When looking at the cell browser for this dataset, do you notice any changes that should be made to make it more user-friendly? Maybe ‘layouts’ that need to be removed because they're uninformative? Or sample text that needs to be changed? We’ll talk more about polishing up the dataset in the next part. {image here?}

Part 3: desc.conf and final polish

Finally, we’ll fill out a desc.conf with some basic information about this dataset and polish up any last visual details.

Open the desc.conf file using vim:

vim desc.conf

Edit the following lines in your desc.conf to read:

 
title = "Liver Subset - Tabula Sapiens"
abstract = """
Liver subset of the Tabula Sapiens dataset covering over 5000 cells.
"""
paper_url="https://www.science.org/doi/10.1126/science.abl4896 The Tabula Sapiens Consortium. Science. 2022."
other_url="https://tabula-sapiens-portal.ds.czbiohub.org/ Tabula Sapiens Website"

Annotate marker genes:

cbMarkerAnnotate markers.tsv markers.annotated.tsv

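Make these changes to the cellbrowser.conf: comment out the entries for the uninformative layouts, and point markers at the annotated marker file: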

#    {
#        "file": "scvi_umap_coords.tsv",
#        "shortLabel": "scvi_umap"
#    },
#    {
#        "file": "scvi_coords.tsv",
#        "shortLabel": "scvi"
#    },
#    {
#        "file": "pca_coords.tsv",
#        "shortLabel": "pca"
#    }

markers = [{"file": "markers.annotated.tsv", "shortLabel":"Cluster Markers"}]
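Before rebuilding, it can help to confirm that every file still referenced in cellbrowser.conf actually exists. This is a minimal sketch, assuming entries use the "file": "..." form shown above and that file names contain no spaces. list_conf_files and check_conf_files are hypothetical helper names, not part of the Cell Browser tools.

```shell
# Sketch only: list the files referenced by uncommented "file": "..." entries.
list_conf_files() {
    # Drop comment lines, then pull out the value of every "file" entry.
    grep -v '^[[:space:]]*#' "$1" \
        | grep -oE '"file"[[:space:]]*:[[:space:]]*"[^"]+"' \
        | sed -E 's/.*"([^"]+)"$/\1/'
}

# Report referenced files that are missing next to the conf file.
check_conf_files() {
    conf="$1"
    dir=$(dirname "$conf")
    status=0
    for f in $(list_conf_files "$conf"); do
        if [ ! -e "$dir/$f" ]; then
            echo "missing: $f"
            status=1
        fi
    done
    return "$status"
}

# Usage, from the dataset directory:
#   check_conf_files cellbrowser.conf
```

Commented-out layouts are skipped, so only the entries that will be built are checked.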

Rebuild the dataset:

cbBuild -o ~/public_html/cb

Check it out: https://hgwdev.gi.ucsc.edu/~${hgwdev_username}/cb/. Make other changes to the cellbrowser.conf and desc.conf files to see how they affect the display. (Don't forget to rebuild the dataset between those changes!) {image here?}

Using cbImportSeurat

In this section, we'll walk through how to create a cell browser starting from a Seurat RDS file. The process is quite similar to the one for cbImportScanpy.

Part 1: Directory setup and data export

In this section, we'll set up the required directory structure for this new dataset and export the data from the RDS file.

Ensure that you are in the proper conda environment:

conda activate seuratenv

Change into a good working directory:

cd /hive/users/${hgwdev_username}/cb

Create a directory for this dataset:

mkdir -p mouse-dev-neocortex/orig/

This command also makes an ‘orig’ directory which we use to store the unchanged files obtained from the submitter or downloaded from GEO/etc.

Change into that directory:

cd mouse-dev-neocortex/orig/

Copy over the RDS file we’ll be working with:

cp /hive/data/inside/cells/exampleDatasets/Li_et_al_2020_UCSC_seurat_object.rds .

Go up a directory so that you are now just in the mouse-dev-neocortex directory. Now export the data from the RDS file:

cd ../
cbImportSeurat -i orig/Li_et_al_2020_UCSC_seurat_object.rds -o . --clusterField=clusters

The options we've specified for cbImportSeurat are:

-i: the name of the input RDS file
-o: the output directory (with '.' indicating the current directory)
--clusterField: the name of the field to use for the default cluster labels (and to calculate markers for)

(You can run cbImportSeurat with no arguments to see the full usage message.)

This export may take up to 30 minutes. After it completes, run ls and you should see files such as meta.tsv and markers.tsv. These and other files will be used as input to cbBuild in the next section of this guide. {image here?}
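Since the two import tools take the same three options in this guide, the choice between them comes down to the input file type. The following is a minimal sketch of that decision, printing the command rather than running it. import_command is a hypothetical helper name, and matching on the .h5ad and .rds extensions is an assumption about how your input files are named.

```shell
# Sketch only: pick the import tool from the input file's extension
# and print the command line this guide would use for it.
import_command() {
    input="$1"; cluster_field="$2"
    case "$input" in
        *.h5ad)      tool=cbImportScanpy ;;
        *.rds|*.RDS) tool=cbImportSeurat ;;
        *) echo "unrecognized input type: $input" >&2; return 1 ;;
    esac
    echo "$tool -i $input -o . --clusterField=$cluster_field"
}

# Usage, from the dataset directory:
#   import_command orig/TS_Liver.h5ad cell_ontology_class
#   import_command orig/Li_et_al_2020_UCSC_seurat_object.rds clusters
```

Printing the command first lets you check the cluster field and paths before starting an export that may take a while.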
