Machine learning library for single-cell data analysis
Cellarium ML: distributed single-cell data analysis.
Cellarium ML is a PyTorch Lightning-based library for distributed single-cell data analysis. It provides a set of tools for training deep learning models on large-scale single-cell datasets, including distributed data loading, model training, and evaluation. Cellarium ML is designed to be modular and extensible, allowing users to easily define custom models, data transformations, and training pipelines.
Code organization
The code is organized as follows:
cellarium/ml/callbacks: Contains custom PyTorch Lightning callbacks.
cellarium/ml/core: Includes essential Cellarium ML components:
  - CellariumModule: A PyTorch Lightning Module tasked with defining and configuring the model, training step, and optimizer.
  - CellariumAnnDataDataModule: A PyTorch Lightning DataModule designed for setting up a multi-GPU DataLoader for a collection of AnnData objects.
  - CellariumPipeline: A Module List that pipes the input data through a series of transforms and a model.
cellarium/ml/data: Contains Distributed AnnData Collection and multi-GPU Iterable Dataset implementations.
cellarium/ml/lr_schedulers: Contains custom learning rate schedulers.
cellarium/ml/models: Features Cellarium ML models:
  - Models must subclass CellariumModel and implement the .reset_parameters method.
  - The .forward method should return a dictionary containing the computed loss under the loss key.
  - Optionally, hooks such as .on_train_start, .on_epoch_end, and .on_batch_end can be implemented to be triggered by the CellariumModule during training phases.
cellarium/ml/preprocessing: Provides pre-processing functions.
cellarium/ml/transforms: Contains data transformation modules:
  - Each transform is a subclass of torch.nn.Module.
  - The .forward method should output a dictionary where the keys correspond to the input arguments of subsequent transforms and the model.
cellarium/ml/utilities: Contains utility functions for various submodules.
cellarium/ml/cli.py: Implements the cellarium-ml CLI. Models must be registered here to be accessible via the CLI.
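The model and transform contracts listed above can be illustrated with a minimal, framework-free sketch. It is not Cellarium ML's implementation: the classes ScaleToUnitSum and UniformityModel, the run_pipeline helper, and the batch keys x_ng and obs_names are hypothetical names chosen for illustration, and plain Python classes stand in for torch.nn.Module and CellariumModel subclasses. The sketch assumes that each step receives only the batch entries whose keys match its .forward arguments and that the returned dictionary is merged back into the batch.

```python
import inspect
from typing import Any, Dict, List

Batch = Dict[str, Any]


class ScaleToUnitSum:
    """Hypothetical transform: rescale each cell (row) so its counts sum to 1."""

    def forward(self, x_ng: List[List[float]]) -> Batch:
        # The output key "x_ng" matches the argument name of the next step.
        return {"x_ng": [[v / sum(row) for v in row] for row in x_ng]}


class UniformityModel:
    """Hypothetical model: loss is the mean squared deviation from one scalar."""

    def __init__(self, target: float = 0.5) -> None:
        self.initial_target = target
        self.reset_parameters()

    def reset_parameters(self) -> None:
        # Restore the single learnable parameter to its initial value.
        self.target = self.initial_target

    def forward(self, x_ng: List[List[float]]) -> Batch:
        n = sum(len(row) for row in x_ng)
        loss = sum((v - self.target) ** 2 for row in x_ng for v in row) / n
        return {"loss": loss}  # the loss is reported under the "loss" key


def run_pipeline(steps: List[Any], batch: Batch) -> Batch:
    """Pipe a batch through transforms and a model (assumed merge semantics)."""
    for step in steps:
        accepted = inspect.signature(step.forward).parameters
        kwargs = {key: batch[key] for key in accepted if key in batch}
        batch = {**batch, **step.forward(**kwargs)}  # outputs extend the batch
    return batch


batch = {"x_ng": [[1.0, 3.0], [2.0, 2.0]], "obs_names": ["cell_0", "cell_1"]}
out = run_pipeline([ScaleToUnitSum(), UniformityModel()], batch)
print(out["loss"])
```

Because steps are matched to batch entries by argument name, a transform can be inserted or removed without touching the model, as long as the keys it emits still line up with what later steps expect.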
Installation
To install from PyPI using pip:
$ pip install cellarium-ml
To install the development version from source:
$ git clone https://github.com/cellarium-ai/cellarium-ml.git
$ cd cellarium-ml
$ make install # runs pip install -e .[dev]
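After either installation route, one way to confirm that the package is visible to the active Python environment is to query its installed metadata. This is a small sketch using only the standard library; the helper name installed_version is hypothetical, and the only project-specific detail is the distribution name cellarium-ml used in the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints a version string if cellarium-ml is installed, otherwise None.
print(installed_version("cellarium-ml"))
```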
For developers
To run the tests:
$ make test # runs single-device tests
$ TEST_DEVICES=2 make test # runs multi-device tests
To automatically format the code:
$ make format # runs ruff formatter and fixes linter errors
To run the linters:
$ make lint # runs ruff linter and checks for formatter errors
To build the documentation:
$ make docs # builds the documentation at docs/build/html