Advanced Cell Browser Topics: Difference between revisions

==Generating coordinates using cbScanpy==
Content of this page has moved to other, individual pages:


Rarely, you will be wrangling a dataset where you need to generate the layout coordinates. This is most easily done with cbScanpy.
[[Generating_coordinates_using_cbScanpy]]


===Setting up your scanpy.conf ===
[[Setting_up_rclone_for_the_Cell_Browser]]
Create one with the default values by running:
cbScanpy --init


We recommend turning off most of the cell filtering steps as we assume that the authors/submitters have already done the appropriate filtering and the default settings for these filters can be overzealous (e.g. removing 75% or more of the cells in some cases). Make the following changes to the scanpy.conf:
[[Renaming_a_Cell_Browser_dataset]]
 
doTrimCells=False
doFilterMito=False
doFilterGenes=False
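If you would rather script these edits than make them by hand, the following is a minimal sketch. The helper name set_conf_flag is hypothetical, and it assumes each setting in scanpy.conf sits on its own line as "key = value" or "key=value"; check the file that cbScanpy generated before relying on it.

```shell
# Hypothetical helper: set KEY to VALUE in a scanpy.conf-style file.
# Assumption: one setting per line, written as "key = value" or "key=value".
# Any trailing comment on a replaced line is dropped.
set_conf_flag() {
    local conf="$1" key="$2" value="$3" tmp
    tmp="$(mktemp)"
    if grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$conf"; then
        # The setting exists: rewrite that line.
        sed -E "s|^[[:space:]]*${key}[[:space:]]*=.*|${key}=${value}|" "$conf" > "$tmp"
    else
        # The setting is absent: keep the file and append the assignment.
        cat "$conf" > "$tmp"
        printf '%s=%s\n' "$key" "$value" >> "$tmp"
    fi
    # Write back through cat so the original file keeps its permissions.
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}
```

For example, `set_conf_flag scanpy.conf doFilterMito False` switches off the mitochondrial filter; repeat the call for each setting listed above.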
 
====Context-dependent changes====
Some changes only make sense depending on the particulars of your dataset.
 
Are the values in your matrix already normalized/logged? If the values include decimals and the max value is low (e.g. 6.0-10.0), they probably are. In that case, set:
doExp=True
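A quick way to apply this rule of thumb is to scan the matrix for its largest value and for any non-integer entries. The sketch below assumes a tab-separated matrix with one header row and gene identifiers in the first column; the function name matrix_summary is made up for illustration, and values in scientific notation are not detected as decimals.

```shell
# Hypothetical helper: report the max value and whether decimals occur.
# Reads a tab-separated matrix on stdin (header row, then gene id + values).
matrix_summary() {
    awk -F'\t' '
        BEGIN { max = 0; decimals = 0 }
        NR > 1 {
            for (i = 2; i <= NF; i++) {
                v = $i + 0
                if (v > max) max = v
                # A nonzero digit after the decimal point means a non-integer value.
                if ($i ~ /\.[0-9]*[1-9]/) decimals = 1
            }
        }
        END { printf "max=%g decimals=%s\n", max, (decimals ? "yes" : "no") }
    '
}
```

For a gzipped matrix, something like `zcat orig/<expr_mat_file> | head -n 2000 | matrix_summary` gives a fast impression from the first rows without reading the whole file.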
 
Does your dataset have more than 20,000 cells? Only run UMAP:
doLayouts=["umap"]
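If you are unsure how many cells the dataset has, the header row of the expression matrix usually tells you. This sketch assumes a tab-separated matrix with genes on rows, cells on columns, and a leading gene-id column; count_cells is a hypothetical name.

```shell
# Hypothetical helper: count the cell columns in a matrix read from stdin.
# Assumption: the first line is a header with one leading gene-id field.
count_cells() {
    awk -F'\t' 'NR == 1 { print NF - 1; exit }'
}
```

Usage would look like `zcat orig/<expr_mat_file> | count_cells`; compare the result against the 20,000-cell threshold.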
 
===Running cbScanpy===
Once you have your scanpy.conf set up, it’s time to actually run cbScanpy.
 
cbScanpy -e orig/<expr_mat_file> -m orig/<meta_file> -o . -n <short_name> --skipMatrix --inCluster=<field_name>
 
If your scanpy.conf is not in the directory where you run cbScanpy, specify its location with the -c option.
 
After that completes, run cbBuild and check out the results in the Cell Browser. Hopefully things separate out into relatively distinct clusters. If not, you can try adjusting the settings in scanpy.conf and trying again or asking the submitters/authors for input.
 
==Setting up rclone==
===Installation===
You can install this using conda in a new environment or one of your existing ones (e.g. scanpyenv). Here we’ll set it up in a separate environment.
 
# "rclone" as the environment name is only an example
conda create -n rclone
conda activate rclone
conda install -c conda-forge rclone
 
Get it working with…
 
Google Drive
Box
Link to full list?
 
Downloading a file
Other gotchas for cb work?
    File has to be in your drive/box/whatever.
    Can't remember: is there a way to download a public file?
 
==Wrangling a bulk RNA dataset==
 
==Renaming a dataset==
Note: a dataset’s shortname should (almost) never be changed after being pushed to the main site. People bookmark things and URLs make their way into publications and we want to try our hardest not to break those.
 
These steps let you change a dataset’s shortname without going through the often lengthy process of rebuilding the dataset from scratch.
 
First, rename the directory in datasets:
 
cd /hive/data/inside/cells/datasets/
mv {old_name} {new_name}
 
Then, rename the directory in htdocs-cells:
 
cd /usr/local/apache/htdocs-cells/
mv {old_name} {new_name}
 
It is best to do this right after renaming the datasets directory, so that the mv command is still near the top of your shell history.
 
Finally, rebuild the dataset:
cd - # Note that this will take you back to the last directory you were in
cbBuild -o alpha
 
This is necessary because the old name is still present in various dataset.json files, so rebuilding will replace the old names with the new ones.
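The two mv steps can be wrapped in a small function that refuses to run if a source directory is missing or a target already exists. This is only a sketch: rename_dataset and the CB_DATASETS_DIR / CB_HTDOCS_DIR override variables are hypothetical names, and the defaults are the two paths used above. It leaves the cbBuild step to you.

```shell
# Hypothetical helper: rename a dataset directory in both locations.
# CB_DATASETS_DIR and CB_HTDOCS_DIR are made-up overrides, handy for testing.
rename_dataset() {
    local old="$1" new="$2" base
    local datasets="${CB_DATASETS_DIR:-/hive/data/inside/cells/datasets}"
    local htdocs="${CB_HTDOCS_DIR:-/usr/local/apache/htdocs-cells}"
    # Check both locations first so a failure never leaves a half-done rename.
    for base in "$datasets" "$htdocs"; do
        [ -d "$base/$old" ] || { echo "missing: $base/$old" >&2; return 1; }
        [ ! -e "$base/$new" ] || { echo "already exists: $base/$new" >&2; return 1; }
    done
    mv "$datasets/$old" "$datasets/$new"
    mv "$htdocs/$old" "$htdocs/$new"
    echo "Renamed $old to $new. Now rebuild with: cbBuild -o alpha"
}
```

Running `rename_dataset {old_name} {new_name}` performs both renames; the rebuild with cbBuild is still required afterwards so that the dataset.json files pick up the new name.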

Latest revision as of 18:22, 3 August 2022