Impute Flow Cytometry values between overlapping panels with XGBoost regression.

pyInfinityFlow

pyInfinityFlow is a Python package that imputes hundreds of features from flow cytometry data using XGBoost regression [1]. It adapts the original R implementation [2], with the goal of making the analysis pipeline faster and more memory-efficient on large datasets.
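
To make the idea of imputing across overlapping panels concrete, here is a minimal sketch of the data flow. It is not the package's code and does not use XGBoost: a one-feature least-squares line stands in for the regressor so that the example runs with the standard library alone. The marker names, the single backbone value per cell, and the synthetic data are all assumptions made for illustration. Each well measures the shared backbone plus one "infinity" marker; one model is trained per well and then applied to the cells of every well.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept (stand-in for XGBoost)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def impute_all(wells):
    """wells maps an infinity-marker name to a list of (backbone, measured) pairs."""
    # One model per infinity marker, trained only on the well that measured it.
    models = {
        marker: fit_line([b for b, _ in cells], [m for _, m in cells])
        for marker, cells in wells.items()
    }
    # Apply every model to every cell, so each cell gets a value for all markers.
    imputed = []
    for cells in wells.values():
        for backbone, _ in cells:
            imputed.append({
                marker: slope * backbone + intercept
                for marker, (slope, intercept) in models.items()
            })
    return models, imputed


# Synthetic, noise-free data with hypothetical marker names.
random.seed(0)
wells = {
    "marker_A": [(b, 2.0 * b + 1.0) for b in (random.random() for _ in range(50))],
    "marker_B": [(b, -1.0 * b + 3.0) for b in (random.random() for _ in range(50))],
}
models, imputed = impute_all(wells)
print(len(imputed), "cells, each with values for", sorted(imputed[0]))
```

In a real analysis the backbone consists of many markers and the relationship to each infinity marker is non-linear, which is why a tree-based regressor such as XGBoost is used instead of a straight line.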

The package includes tools to read and write FCS files that follow the FCS3.1 file standard, loading the data into AnnData objects for easy downstream single-cell analysis with Scanpy [3] and UMAP [4].

Read more about the pyInfinityFlow package on its Read the Docs page!

Graphical Summary

[Figure: Graphical Summary]

Recommended Installation

It is recommended to set up a virtual environment to install the package.

Creating a new conda environment and installing pyInfinityFlow:

conda create -n pyInfinityFlow python=3.8
conda activate pyInfinityFlow

pip install pyInfinityFlow

This installs pyInfinityFlow into a conda environment named 'pyInfinityFlow'.
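
One way to confirm that the install landed in the active environment is to ask Python for the package's installed version. The sketch below uses only the standard library (`importlib.metadata`, available from Python 3.8); the helper name `installed_version` is hypothetical and not part of pyInfinityFlow.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("pyInfinityFlow")
if version is None:
    print("pyInfinityFlow is not installed in this environment")
else:
    print("pyInfinityFlow", version)
```

Run it with the conda environment activated; if it reports that the package is missing, the `pip install` step most likely ran in a different environment.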

Quickstart

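To run the pyInfinityFlow pipeline on the example dataset, use the following command, replacing the paths with the locations of the data, output directory, and annotation files on your machine: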

pyInfinityFlow --data_dir /home/kyle/Documents/GitHub/pyInfinityFlow/example_data/mouse_lung_dataset_subset/ \
    --out_dir /media/kyle_ssd1/example_outputs/ \
    --backbone_annotation /home/kyle/Documents/GitHub/pyInfinityFlow/example_data/mouse_lung_dataset_subset_backbone_anno.csv \
    --infinity_marker_annotation /home/kyle/Documents/GitHub/pyInfinityFlow/example_data/mouse_lung_dataset_subset_infinity_marker_anno.csv
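
The same command can be assembled from Python, which helps when the paths differ between machines. The sketch below uses only the four flags shown above. The helper name `build_command` and the relative `example_data` and `example_outputs` locations are assumptions for illustration, not part of the package.

```python
import shlex
from pathlib import Path


def build_command(data_dir, out_dir, backbone_csv, infinity_csv):
    """Return the pyInfinityFlow command line as an argument list."""
    return [
        "pyInfinityFlow",
        "--data_dir", str(data_dir),
        "--out_dir", str(out_dir),
        "--backbone_annotation", str(backbone_csv),
        "--infinity_marker_annotation", str(infinity_csv),
    ]


# Hypothetical location of the repository's example data; adjust as needed.
example = Path("example_data")
cmd = build_command(
    example / "mouse_lung_dataset_subset",
    Path("example_outputs"),
    example / "mouse_lung_dataset_subset_backbone_anno.csv",
    example / "mouse_lung_dataset_subset_infinity_marker_anno.csv",
)

# Print the shell-quoted command for inspection.
print(shlex.join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.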

Selected References

[1] Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 785–794). Association for Computing Machinery.

[2] Becht, E., Tolstrup, D., Dutertre, C. A., Morawski, P. A., Campbell, D. J., Ginhoux, F., ... & Headley, M. B. (2021). High-throughput single-cell quantification of hundreds of proteins using conventional flow cytometry and machine learning. Science Advances, 7(39), eabg0505.

[3] Wolf, F. A., Angerer, P., & Theis, F. J. (2018). SCANPY: Large-scale single-cell gene expression data analysis. Genome Biology, 19(1), 1–5.

[4] McInnes, L., Healy, J., & Melville, J. (2018). UMAP: Uniform manifold approximation and projection for dimension reduction. arXiv preprint arXiv:1802.03426.
