Cell Browser scripts

From Genecats

h5adMetaInfo

This script prints a summary of the metadata in an h5ad file. For each metadata column, it prints the unique values and the total number of unique values.

To use this tool, you will need to be in your Scanpy environment:

h5adMetaInfo -f [number of fields to display] h5ad

rdsMetaInfo

This script is similar to h5adMetaInfo, but it prints a summary of the metadata in an rds file.

To use this tool, you will need to be in your Seurat environment:

rdsMetaInfo -f [number of fields to display] rds

Both scripts were developed to give a quick glance at the information stored in metadata files. They save the time otherwise spent loading a file into Python or R (depending on the file type) and extracting the data with a series of commands, and they spare you from retyping those same commands. They were inspired by Jim’s useful tool tabInfo, described below, which prints a summary of the metadata in a tsv file.

tabInfo

This script works like h5adMetaInfo and rdsMetaInfo, but for tab-separated files. With no extra options, the script will just tell you the number of columns and rows. Use the -vals= option to get output similar to the two scripts mentioned.

Example:

tabInfo -vals=5 meta.tsv
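The kind of summary these three scripts produce can be illustrated with a short Python sketch. This is not the code of tabInfo or the MetaInfo scripts; the function name summarize_tsv, the max_vals parameter, and the example table are hypothetical, and the real output format may differ.

```python
import csv
import io
from collections import Counter


def summarize_tsv(handle, max_vals=5):
    """Return (row_count, {column: (n_unique, up to max_vals most common values)})."""
    reader = csv.DictReader(handle, delimiter="\t")
    counters = {name: Counter() for name in (reader.fieldnames or [])}
    n_rows = 0
    for row in reader:
        n_rows += 1
        for name in counters:
            counters[name][row[name]] += 1
    summary = {
        name: (len(counts), [value for value, _ in counts.most_common(max_vals)])
        for name, counts in counters.items()
    }
    return n_rows, summary


# Hypothetical metadata table, made up for illustration.
example = "cellId\tcluster\tdonor\nc1\tT cell\td1\nc2\tB cell\td1\nc3\tT cell\td2\n"
n_rows, summary = summarize_tsv(io.StringIO(example), max_vals=2)
print(n_rows, "rows,", len(summary), "columns")
for name, (n_unique, values) in summary.items():
    print(name, n_unique, values)
```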

columnCorrection

Use this script to replace values in the specified field with values from the input file. If a value isn't in the input file, the line is returned unchanged. It has three required options: (1) metaFile, which can be any tsv or csv that you want to change values in, (2) replaceFile, which has the original value in column 1 and the replacement value in column 2, and (3) fieldToReplace, which is the name of the field in the metaFile header that you want values replaced in.

Example:

columnCorrection markers.tsv cluster_fixes.tsv cluster_name
columnCorrection meta.tsv ctype_fixes.tsv celltype
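The replacement logic described above can be sketched in Python as follows. This is an illustration only, not the script's source: the function name correct_column is hypothetical, and the sketch assumes tab-separated input and a replaceFile without a header line.

```python
import csv
import io


def correct_column(meta_handle, replace_handle, field, delimiter="\t"):
    """Replace values in one column of a table, leaving unmatched values as-is."""
    # Column 1 = original value, column 2 = replacement value.
    # Assumption: the replacement file has no header line.
    replacements = {}
    for row in csv.reader(replace_handle, delimiter=delimiter):
        if len(row) >= 2:
            replacements[row[0]] = row[1]

    reader = csv.DictReader(meta_handle, delimiter=delimiter)
    fieldnames = reader.fieldnames or []
    if field not in fieldnames:
        raise KeyError(f"field {field!r} not found in header")

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames,
                            delimiter=delimiter, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Values missing from the replacement table pass through unchanged.
        row[field] = replacements.get(row[field], row[field])
        writer.writerow(row)
    return out.getvalue()


# Hypothetical inputs, made up for illustration.
meta = "cellId\tcelltype\nc1\tTcell\nc2\tB cell\n"
fixes = "Tcell\tT cell\n"
corrected = correct_column(io.StringIO(meta), io.StringIO(fixes), "celltype")
print(corrected)
```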

addTags

This script adds new tags and their corresponding values, or updates existing tags with new values. The input is a tab-separated file with a header line. The first column is the dataset name (e.g. cortex-dev), and each subsequent column holds the values for the tag named in that column's header. If there are multiple values for a given tag, separate them with commas.

Example input:

dataset         diseases
adultPancreas   Healthy
gbm             Glioblastoma
quake-gbm       Glioblastoma, Healthy Control

Example command:

addTags dataset.disease.tsv
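To make the input format concrete, here is a small Python sketch that parses such a file into per-dataset tag values. It only illustrates the layout described above. The function name parse_tag_table is hypothetical, and how addTags itself reads the file or writes the values into the conf files is not shown here.

```python
import csv
import io


def parse_tag_table(handle):
    """Map dataset name -> {tag: [values]} from a tab-separated table with a header."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    tags = header[1:]  # the first column is the dataset name
    table = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        dataset, cells = row[0], row[1:]
        table[dataset] = {
            # Multiple values for one tag are separated by commas.
            tag: [v.strip() for v in cell.split(",") if v.strip()]
            for tag, cell in zip(tags, cells)
        }
    return table


# The example input from above, with real tab characters between columns.
example = (
    "dataset\tdiseases\n"
    "adultPancreas\tHealthy\n"
    "gbm\tGlioblastoma\n"
    "quake-gbm\tGlioblastoma, Healthy Control\n"
)
tag_table = parse_tag_table(io.StringIO(example))
print(tag_table["quake-gbm"])
```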

The page Managing_cellbrowser.conf_tag_values_for_multiple_datasets has more details about using this script in conjunction with getTagVals.

getTagVals

This script gets the values from cellbrowser.conf or desc.conf for a single tag or a set of tags. The input is a tab-separated file whose first column is a dataset name (e.g. cortex-dev). You must also specify one or more tags. Any other columns in the file are carried through to the output file unchanged.

Examples:

getTagVals my_datasets.tsv body_parts
getTagVals my_datasets.tsv "body_parts diseases"
getTagVals -d my_datasets.tsv "title abstract"

The page Managing_cellbrowser.conf_tag_values_for_multiple_datasets has more details about using this script in conjunction with addTags.

cbBuildMulti

Build multiple cell browsers at once. Input is a list of dataset names (e.g. cortex-dev). This can either be in a file with one dataset name per line (file can contain other columns as long as the dataset name is the first column) or a whitespace-separated list of datasets enclosed in quotes.

Examples:

cbBuildMulti my.datasets.txt
cbBuildMulti "adultPancreas aging-human-skin"
cbBuildMulti "adultPancreas"

colorConverter

Converts rgb colors to hex and hex colors to rgb. The input is a two-column csv or tsv file, where column 2 contains the color values you want to convert and column 1 is typically something like cluster labels. You will need to be in an environment where 'webcolors' is installed.

Example command:

colorConverter myCellTypeColors.hex.tsv > myCellTypeColors.rgb.tsv

It auto-detects if the colors in the input file are hex vs rgb and will convert to the one not in the file.
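The conversion and auto-detection can be illustrated with a standard-library Python sketch. The script itself relies on 'webcolors'; this sketch does not, and it is not the script's code. The function names are hypothetical, and the accepted rgb spellings (three integers such as "31,119,180" or "rgb(31, 119, 180)") are an assumption about the input format.

```python
import re

# A hex color is six hex digits with an optional leading '#'.
HEX_RE = re.compile(r"^#?[0-9a-fA-F]{6}$")


def hex_to_rgb(value):
    """'#1f77b4' -> (31, 119, 180)"""
    digits = value.lstrip("#")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb):
    """(31, 119, 180) -> '#1f77b4'"""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def convert_color(value):
    """Detect whether value is hex or rgb and return it in the other notation."""
    value = value.strip()
    if HEX_RE.match(value):
        return hex_to_rgb(value)
    # Assumption: an rgb value is written as three integers in the range 0-255.
    parts = [int(p) for p in re.findall(r"\d+", value)]
    if len(parts) != 3 or any(p > 255 for p in parts):
        raise ValueError(f"not a recognizable color: {value!r}")
    return rgb_to_hex(parts)


print(convert_color("#1f77b4"))
print(convert_color("rgb(31, 119, 180)"))
```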

colorExporter

Export colors from an h5ad file. Takes an input h5ad file and an output file name. Sometimes the names of the color arrays in uns don't match the corresponding metadata field name, e.g. celltype_label vs celltype_colors, and so the -c option allows you to specify a file containing the associations (column 1 is the metadata field and column 2 is the color array name).

To use the tool, you'll need to be in your scanpy environment.

Example:

colorExporter -i TS_Liver.h5ad -o colors.tsv

datasetDiffs

Running the script with the -r/--run option will show which datasets have differences between cells-test/cells-beta/cells. To show the number of datasets on each machine, use the -s/--stats option. To show the names of hidden datasets, use the -d/--hidden option. Running the script without any arguments shows the usage message.

Example:

datasetDiffs -r

updateNewsSec

This script updates the news section on the Cell Browser 'Overview' tab. It also updates rr.datasets.txt and sitemap.cells.txt. It runs automatically Monday through Friday at 5:10 am as a cron job under the 'otto' user, but you can also run it manually.

The only option for the script is -r/--run which tells the script to run. Running the script without any arguments shows the usage message.

Example:

updateNewsSec -r

makeCbHub

Make a set of trackDb stanzas for a hub from a directory of bigWig and/or bigBed files. See Making_a_hub_for_a_cell_browser for a few more details and examples.

generateColorHtml

Generate a list of html colors to put on your hub html page.

Example:

generateColorHtml colors.tsv

You can then take the resulting html and put it into the 'Display Conventions and Configuration' section of your track description page. See Making_a_hub_for_a_cell_browser for more details and examples.

rowsToCols

Use this tool to transpose a matrix. The Cell Browser wants genes as the rows and cells as the columns. If you get a matrix in the opposite orientation (cells as rows and genes as columns), you can use this tool to transpose it into the format the Cell Browser accepts. The -tab option indicates that the matrix is tab-separated; for a comma-separated matrix, use -fs=, instead.

Example:

rowsToCols -tab exprMatrix.tsv exprMatrix.transposed.tsv
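For a small matrix, the transposition can be sketched in a few lines of Python. This is an illustration of the operation, not how rowsToCols is implemented. The function name transpose_table is hypothetical, and the sketch reads the whole matrix into memory, so it is not suited to large expression matrices.

```python
import csv
import io


def transpose_table(handle, delimiter="\t"):
    """Swap the rows and columns of a delimited text table."""
    rows = list(csv.reader(handle, delimiter=delimiter))
    # zip() would silently truncate ragged rows, so check the widths first.
    if len({len(row) for row in rows}) > 1:
        raise ValueError("rows have differing numbers of columns")
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    writer.writerows(zip(*rows))
    return out.getvalue()


# Hypothetical matrix with cells as rows and genes as columns.
cells_by_genes = "cell\tGeneA\tGeneB\nc1\t0\t3\nc2\t5\t1\n"
genes_by_cells = transpose_table(io.StringIO(cells_by_genes))
print(genes_by_cells)
```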