Machine learning library for single-cell data analysis

Project description

Cellarium ML: distributed single-cell data analysis.


Cellarium ML is a PyTorch Lightning-based library for distributed single-cell data analysis. It provides a set of tools for training deep learning models on large-scale single-cell datasets, including distributed data loading, model training, and evaluation. Cellarium ML is designed to be modular and extensible, allowing users to easily define custom models, data transformations, and training pipelines.

Code organization

The code is organized as follows:

  • cellarium/ml/callbacks: Contains custom PyTorch Lightning callbacks.

  • cellarium/ml/core: Includes essential Cellarium ML components:

      - CellariumModule: A PyTorch Lightning Module tasked with defining and configuring the model, training step, and optimizer.
      - CellariumAnnDataDataModule: A PyTorch Lightning DataModule designed for setting up a multi-GPU DataLoader for a collection of AnnData objects.
      - CellariumPipeline: A Module List that pipes the input data through a series of transforms and a model.

  • cellarium/ml/data: Contains Distributed AnnData Collection and multi-GPU Iterable Dataset implementations.

  • cellarium/ml/lr_schedulers: Contains custom learning rate schedulers.

  • cellarium/ml/models: Features Cellarium ML models:

      - Models must subclass CellariumModel and implement the .reset_parameters method.
      - The .forward method should return a dictionary containing the computed loss under the loss key.
      - Optionally, hooks such as .on_train_start, .on_epoch_end, and .on_batch_end can be implemented to be triggered by the CellariumModule during training phases.

  • cellarium/ml/preprocessing: Provides pre-processing functions.

  • cellarium/ml/transforms: Contains data transformation modules:

      - Each transform is a subclass of torch.nn.Module.
      - The .forward method should output a dictionary where the keys correspond to the input arguments of subsequent transforms and the model.

  • cellarium/ml/utilities: Contains utility functions for various submodules.

  • cellarium/ml/cli.py: Implements the cellarium-ml CLI. Models must be registered here to be accessible via the CLI.
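
The dictionary conventions listed above for transforms and models can be illustrated with a small, dependency-free sketch. It does not use Cellarium ML or PyTorch: Log1p, MeanModel, and run_pipeline are hypothetical stand-ins written in plain Python, and matching batch keys to .forward argument names via inspect.signature is an assumption about how such a pipeline could be wired, not a description of CellariumPipeline's actual implementation.

```python
import inspect
import math


class Log1p:
    """Hypothetical transform: applies log(1 + x) to every count."""

    def forward(self, x_ng):
        # The output key matches the argument name expected by later modules.
        return {"x_ng": [[math.log1p(v) for v in row] for row in x_ng]}


class MeanModel:
    """Hypothetical model: loss is the mean squared distance to a stored mean."""

    def __init__(self):
        self.reset_parameters()

    def reset_parameters(self):
        # Mirrors the .reset_parameters requirement described above.
        self.mean = 0.0

    def forward(self, x_ng):
        values = [v for row in x_ng for v in row]
        loss = sum((v - self.mean) ** 2 for v in values) / len(values)
        # The computed loss is returned under the "loss" key.
        return {"loss": loss}


def run_pipeline(modules, batch):
    """Pipe a batch dictionary through a list of transforms and a final model."""
    batch = dict(batch)  # shallow copy so the caller's dictionary is not modified
    output = {}
    for module in modules:
        params = inspect.signature(module.forward).parameters
        # Pass only the batch entries that this module's forward() asks for.
        kwargs = {name: batch[name] for name in params if name in batch}
        output = module.forward(**kwargs)
        batch.update(output)  # outputs become inputs for the modules that follow
    return output  # output of the last module, i.e. the model


batch = {"x_ng": [[0.0, 1.0], [3.0, 0.0]], "obs_names": ["cell_0", "cell_1"]}
result = run_pipeline([Log1p(), MeanModel()], batch)
print(result["loss"])
```

Entries that no module requests (obs_names here) are carried along untouched, which is why each transform only needs to name the inputs it actually uses.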

Installation

To install from PyPI using pip:

$ pip install cellarium-ml

To install the development version from source:

$ git clone https://github.com/cellarium-ai/cellarium-ml.git
$ cd cellarium-ml
$ make install               # runs pip install -e .[dev]
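
After either installation route, a quick check that the distribution is visible to the current Python environment can be useful. The sketch below uses only the standard library; installed_version is a hypothetical helper name, and the distribution name cellarium-ml is taken from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


found = installed_version("cellarium-ml")
print(f"cellarium-ml: {found or 'not installed'}")
```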

For developers

To run the tests:

$ make test                  # runs single-device tests
$ TEST_DEVICES=2 make test   # runs multi-device tests

To automatically format the code:

$ make format                # runs ruff formatter and fixes linter errors

To run the linters:

$ make lint                  # runs ruff linter and checks for formatter errors

To build the documentation:

$ make docs                  # builds the documentation at docs/build/html
