Implements *CellAnnotator (aka *CAT/starCAT), annotating scRNA-Seq with predefined gene expression programs
starCAT 
Implements starCellAnnoTator (AKA starCAT), annotating scRNA-Seq with predefined gene expression programs
Citation
If you use starCAT, please cite our manuscript.
Installation
You can install starCAT and its dependencies via the Python Package Index.
pip install starcatpy
We tested it with scikit-learn 1.3.2, AnnData 0.9.2, and Python 3.8. To run the tutorials, you also need jupyter or jupyterlab, as well as scanpy and cnmf:
pip install jupyterlab scanpy cnmf
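To compare your environment against the tested versions listed above, a small check along these lines may help. It is only a sketch: `installed_versions` is a hypothetical helper (not part of starCAT), and it assumes the distributions are registered under their pip names `starcatpy`, `scikit-learn`, and `anndata`.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as used with pip (assumed to match the installed metadata).
PACKAGES = ["starcatpy", "scikit-learn", "anndata"]


def installed_versions(packages=PACKAGES):
    """Return {distribution name: version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Print one line per package so versions can be compared by eye.
for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```

A `None` entry simply means the package was not found in the current environment. Versions other than the tested ones may still work.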
Basic starCAT usage
Please see our tutorials in Python and R. A sample pipeline using a pre-built reference (TCAT.V1) is shown below.
# Import the starCAT class (import path assumed from the package's module name)
from starcat import starCAT
# Load default TCAT reference from the starCAT database
tcat = starCAT(reference='TCAT.V1')
# tcat.ref.iloc[:5, :5]
#                     A1BG       AARD     AARSD1      ABCA1     ABCB1
# CellCycle-G2M   2.032614  22.965553  17.423538   3.478179  2.297279
# Translation    35.445282   0.000000   9.245893   0.477994  0.000000
# HLA            18.192997  14.632670   2.686475   3.937182  0.000000
# ISG             0.436212   0.000000  18.078197  17.354506  0.000000
# Mito           10.293049   0.000000  52.669895  14.615502  3.341488
# Load cell x gene counts data (datafn is the path to your counts file)
adata = tcat.load_counts(datafn)
# Run starCAT
# expects the input data to be raw counts and to be stored in adata.X
# rather than adata.layers['counts']
usage, scores = tcat.fit_transform(adata)
usage.iloc[0:2, 0:4]
#                             CellCycle-G2M  Translation       HLA       ISG
# CATGCCTAGTCGATAA-1-gPlexA4       0.000039     0.001042  0.001223  0.000162
# AAGACCTGTAGCGTCC-1-gPlexC6       0.000246     0.100023  0.002991  0.042354
scores.iloc[0:2, :]
#                                  ASA  Proliferation  ASA_binary  \
# CATGCCTAGTCGATAA-1-gPlexA4  0.001556        0.00052       False
# AAGACCTGTAGCGTCC-1-gPlexC6  0.012503        0.01191       False
#
#                             Proliferation_binary Multinomial_Label
# CATGCCTAGTCGATAA-1-gPlexA4                 False         CD8_TEMRA
# AAGACCTGTAGCGTCC-1-gPlexC6                 False         CD4_Naive
starCAT can also be run from the command line.
starcat --reference "TCAT.V1" --counts {counts_fn} --output-dir {output_dir} --name {output_name}
- --reference - name of a default reference to download (e.g. TCAT.V1) OR filepath to a reference set of GEPs by genes (*.tsv/.csv/.txt); default is 'TCAT.V1'
- --counts - filepath to the input (cell x gene) counts matrix as a Matrix Market file (.mtx.gz), tab-delimited text file, or AnnData file (.h5ad)
- --scores - optional path to a YAML file for calculating score add-ons; not necessary for pre-built references
- --output-dir - the output directory; all output will be placed in {output-dir}/{name}... The default directory is '.'
- --name - the output analysis prefix name; default is 'starCAT'
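If you want to launch the command-line tool from a Python script (for example, to loop over several samples), the options above can be assembled into an argument list. This is a minimal sketch: `build_starcat_command` is a hypothetical helper, the file names are placeholders, and only the flags documented above are used.

```python
import shlex


def build_starcat_command(counts_fn, output_dir=".", name="starCAT",
                          reference="TCAT.V1", scores=None):
    """Assemble the starcat CLI call as a list of arguments.

    Defaults mirror the ones listed in the option descriptions above.
    """
    cmd = [
        "starcat",
        "--reference", reference,
        "--counts", counts_fn,
        "--output-dir", output_dir,
        "--name", name,
    ]
    # --scores is optional and not needed for pre-built references.
    if scores is not None:
        cmd += ["--scores", scores]
    return cmd


# Placeholder file and directory names, for illustration only.
cmd = build_starcat_command("counts.h5ad", output_dir="results", name="sample1")
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Passing the arguments as a list (rather than one formatted string) avoids shell-quoting problems when paths contain spaces.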
For code to reproduce figures and analyses from our manuscript, please refer to the TCAT analysis GitHub.
Alternate implementation
For small datasets (smaller than ~50,000 cells or 700 MB), try running starCAT without installing any packages on our website.
Creating your own reference
We provide example scripts for constructing custom starCAT references from a single cNMF run or multiple cNMF runs.
Please let us know if you are interested in making your reference publicly available for others to use, analogous to our TCAT.V1 reference. You can email me at dkotliar@broadinstitute.org.
File details
Details for the file starcatpy-1.0.10.tar.gz.
File metadata
- Download URL: starcatpy-1.0.10.tar.gz
- Upload date:
- Size: 18.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.19
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ff1b7e7a6d3e9432a7a8443bff810a44780bda188722388a6565ae09c03d4186 |
| MD5 | 88cb12d60a772af2533b4b1adcfb74b2 |
| BLAKE2b-256 | 268fb99f2a6e4d0d9596e3c2a9b0415ec564cfb9ac69e1ce1a32b2d9ca217045 |
File details
Details for the file starcatpy-1.0.10-py3-none-any.whl.
File metadata
- Download URL: starcatpy-1.0.10-py3-none-any.whl
- Upload date:
- Size: 17.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.19
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 311a11d152ad627d575f2fc87b9c154c65597c713fdeba721bdb3055b09ea182 |
| MD5 | fc2d7bffcafca7a5f63636d287d79147 |
| BLAKE2b-256 | 3644a3cee010560a79fbe94ffa33857c2463acabd947647d87abd5c29bae0f46 |