A flexible framework for machine learning pipelines
# 🔬 LabChain

**The Modern ML Experimentation Framework**

Build, experiment, and deploy ML pipelines with confidence

Documentation • Quick Start • Examples • Contributing
## 🎯 What is LabChain?
LabChain is a production-ready ML experimentation framework that combines the flexibility of research with the rigor of production deployment. Stop fighting with boilerplate code and focus on what matters: your models.
## ✨ Why LabChain?

- 🧩 **Modular by Design**
- 🚀 **Production Ready**
- 🔁 **Reproducible**
- ⚡ **Experimental Features**
## 🚀 Quick Start

### Installation

```bash
pip install framework3
```

### Your First Pipeline (2 minutes)
```python
from labchain import Container, F3Pipeline
from labchain.plugins.filters import StandardScalerPlugin, KnnFilter
from labchain.plugins.metrics import F1, Precission, Recall
from labchain.base import XYData
from sklearn.datasets import load_iris

# Load data
iris = load_iris()
X = XYData.mock(iris.data)
y = XYData.mock(iris.target)

# Build pipeline
pipeline = F3Pipeline(
    filters=[
        StandardScalerPlugin(),
        KnnFilter(n_neighbors=5)
    ],
    metrics=[F1("weighted"), Precission("weighted"), Recall("weighted")]
)

# Train and evaluate
pipeline.fit(X, y)
predictions = pipeline.predict(X)
results = pipeline.evaluate(X, y, predictions)
print(results)
# {'F1': 0.95, 'Precision': 0.95, 'Recall': 0.95}
```
That's it! 🎉 You just built, trained, and evaluated an ML pipeline.
## 💡 Key Features

### 🏗️ Modular Architecture
```python
# Mix and match components like LEGO blocks
from labchain.plugins.filters import (
    PCAPlugin,
    StandardScalerPlugin,
    ClassifierSVMPlugin
)

pipeline = F3Pipeline(
    filters=[
        StandardScalerPlugin(),
        PCAPlugin(n_components=2),
        ClassifierSVMPlugin(kernel='rbf')
    ]
)
```
### 💾 Smart Caching
```python
from labchain.plugins.filters import Cached

# Cache expensive operations automatically
pipeline = F3Pipeline(
    filters=[
        Cached(
            filter=ExpensivePreprocessor(),
            cache_data=True,
            cache_filter=True
        ),
        MyModel()
    ]
)
```
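Conceptually, caching of this kind keys stored results by a hash of the input, so repeated calls with the same data skip the expensive computation. A minimal, framework-independent sketch of that idea (plain Python; `CachedFilter` here is illustrative, not LabChain's actual implementation):

```python
import hashlib
import pickle

class CachedFilter:
    """Wraps a callable and memoizes results keyed by a hash of the input."""

    def __init__(self, func):
        self.func = func
        self.store = {}   # in a real framework this could be disk or S3 storage
        self.calls = 0    # counts how often the wrapped function actually runs

    def _key(self, x):
        # Hash the pickled input so equal inputs map to the same cache entry
        return hashlib.sha256(pickle.dumps(x)).hexdigest()

    def __call__(self, x):
        key = self._key(x)
        if key not in self.store:
            self.calls += 1
            self.store[key] = self.func(x)
        return self.store[key]

expensive = CachedFilter(lambda xs: [v * 2 for v in xs])
first = expensive([1, 2, 3])   # computed
second = expensive([1, 2, 3])  # served from the cache, func not re-run
```

The same pattern extends naturally to persisting the cache to shared storage, which is what makes cached steps reusable across runs.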
### 🔍 Hyperparameter Optimization
```python
from labchain import WandbOptimizer

# Optimize with Weights & Biases
optimizer = WandbOptimizer(
    project="my-experiment",
    scorer=F1(),
    method="bayes",
    n_trials=50
)

# Define search space
pipeline = F3Pipeline(
    filters=[
        KnnFilter().grid({
            'n_neighbors': [3, 5, 7, 9]
        })
    ]
)

optimizer.optimize(pipeline)
optimizer.fit(X_train, y_train)
```
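Independent of the Weights & Biases backend, a `.grid(...)` search space like the one above boils down to trying each parameter combination and keeping the best score. A self-contained grid-search sketch (pure Python; the `toy_score` function is a stand-in, not a LabChain API):

```python
from itertools import product

def grid_search(param_grid, score_fn):
    """Try every combination in param_grid; return (best_params, best_score)."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy scorer: pretend 7 neighbors with uniform weights scores best
def toy_score(p):
    return -abs(p["n_neighbors"] - 7) + (0.5 if p["weights"] == "uniform" else 0.0)

best, score = grid_search(
    {"n_neighbors": [3, 5, 7, 9], "weights": ["uniform", "distance"]},
    toy_score,
)
```

Bayesian methods like the `method="bayes"` option replace this exhaustive loop with a model that proposes promising configurations, but the contract is the same: parameters in, score out.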
### ⚡ Remote Injection (Experimental)
Deploy pipelines without deploying code:
```python
# On your laptop
@Container.bind(persist=True)
class MyCustomFilter(BaseFilter):
    def predict(self, x):
        return x * 2

Container.storage = S3Storage(bucket="my-models")
Container.ppif.push_all()

# On production server (no source code needed!)
from labchain.base import BasePlugin

pipeline = BasePlugin.build_from_dump(config, Container.ppif)
predictions = pipeline.predict(data)  # Just works! ✨
```
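The core idea, shipping a component's definition through shared storage instead of a code deploy, can be pictured in plain Python: publish the class source under a name, then reconstruct it with `exec` on the consumer side. This is a conceptual illustration only (`registry`, `push`, and `build_from_dump` are made-up names here, and it glosses over LabChain's hashing, serialization, and security handling):

```python
# A shared "storage" standing in for S3: maps name -> source code
registry = {}

def push(name, source):
    registry[name] = source

def build_from_dump(name):
    """Rebuild a class from stored source, no local definition required."""
    namespace = {}
    exec(registry[name], namespace)   # only ever execute trusted sources!
    return namespace[name]

# "On your laptop": publish the filter's source
push("MyCustomFilter", (
    "class MyCustomFilter:\n"
    "    def predict(self, x):\n"
    "        return x * 2\n"
))

# "On the production server": rebuild and use it without the original module
MyCustomFilter = build_from_dump("MyCustomFilter")
result = MyCustomFilter().predict(21)
```

Executing remotely fetched code is powerful but dangerous, which is presumably part of why this feature is flagged experimental.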
### 🌐 Distributed Processing (Experimental)
```python
from labchain import HPCPipeline

# Automatic Spark distribution
pipeline = HPCPipeline(
    app_name="distributed-training",
    filters=[Filter1(), Filter2(), Filter3()]
)

pipeline.fit(large_dataset)
```
## 📚 Examples

### Classification with Cross-Validation
```python
from labchain import F3Pipeline, KFoldSplitter
from labchain.plugins.filters import StandardScalerPlugin, ClassifierSVMPlugin
from labchain.plugins.metrics import F1, Precission, Recall

pipeline = F3Pipeline(
    filters=[
        StandardScalerPlugin(),
        ClassifierSVMPlugin(kernel='rbf', C=1.0)
    ],
    metrics=[F1(), Precission(), Recall()]
).splitter(
    KFoldSplitter(n_splits=5, shuffle=True, random_state=42)
)

pipeline.fit(X_train, y_train)
results = pipeline.evaluate(X_test, y_test, pipeline.predict(X_test))
```
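For reference, the index bookkeeping a splitter like `KFoldSplitter(n_splits=5, shuffle=True)` performs can be written out by hand: each sample lands in exactly one test fold, and the rest form the training set. A minimal stdlib-only version (illustrative, not LabChain's implementation):

```python
import random

def kfold_indices(n_samples, n_splits, shuffle=True, random_state=None):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    idx = list(range(n_samples))
    if shuffle:
        random.Random(random_state).shuffle(idx)
    fold_size, remainder = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        # The first `remainder` folds absorb one extra sample each
        stop = start + fold_size + (1 if fold < remainder else 0)
        test = idx[start:stop]
        train = idx[:start] + idx[stop:]
        yield train, test
        start = stop

folds = list(kfold_indices(10, n_splits=5, shuffle=True, random_state=42))
```

Fixing `random_state` makes the shuffle deterministic, which is what makes cross-validation results reproducible across runs.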
### Parallel Processing
```python
from labchain import LocalThreadPipeline
from labchain.plugins.filters import Filter1, Filter2, Filter3

# Process filters in parallel
pipeline = LocalThreadPipeline(
    filters=[
        Filter1(),  # Runs in parallel
        Filter2(),  # Runs in parallel
        Filter3()   # Runs in parallel
    ]
)

# Results are concatenated automatically
predictions = pipeline.predict(X)
```
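Conceptually, a parallel pipeline like this applies every filter to the same input in its own thread and concatenates the outputs in filter order. A stdlib sketch with `concurrent.futures` (a conceptual model, not LabChain's actual implementation):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_predict(filters, x):
    """Apply every filter to the same input concurrently, then concatenate."""
    with ThreadPoolExecutor() as pool:
        # pool.map preserves filter order, so concatenation is deterministic
        results = list(pool.map(lambda f: f(x), filters))
    combined = []
    for r in results:
        combined.extend(r)
    return combined

# Stand-in "filters": independent transforms of the same input
double = lambda xs: [v * 2 for v in xs]
square = lambda xs: [v * v for v in xs]
negate = lambda xs: [-v for v in xs]

out = parallel_predict([double, square, negate], [1, 2, 3])
```

Because each filter sees the same input and writes only its own output, no locking is needed; ordering is recovered at the concatenation step.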
### Custom Components
```python
from labchain import Container, F3Pipeline
from labchain.base import BaseFilter, XYData

@Container.bind()
class MyCustomFilter(BaseFilter):
    def __init__(self, threshold: float = 0.5):
        super().__init__(threshold=threshold)

    def fit(self, x: XYData, y: XYData = None):
        # Your training logic
        pass

    def predict(self, x: XYData) -> XYData:
        # Your prediction logic
        return XYData.mock(x.value > self.threshold)

# Use it like any other filter
pipeline = F3Pipeline(filters=[MyCustomFilter(threshold=0.7)])
```
### Version Control & Rollback
```python
# Version 1
@Container.bind(persist=True)
class MyModel(BaseFilter):
    def predict(self, x):
        return x * 1

Container.ppif.push_all()
hash_v1 = Container.pcm.get_class_hash(MyModel)

# Version 2
@Container.bind(persist=True)
class MyModel(BaseFilter):
    def predict(self, x):
        return x * 2

Container.ppif.push_all()
hash_v2 = Container.pcm.get_class_hash(MyModel)

# Rollback to V1
ModelV1 = Container.ppif.get_version("MyModel", hash_v1)
```
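The version hashes above can be pictured as content hashes of each definition's source: identical source yields the same hash, any edit yields a new one, and old versions remain retrievable by hash. A toy registry along those lines (pure Python; `VersionStore` is a made-up name, and `Container.pcm`/`Container.ppif` internals are LabChain's own):

```python
import hashlib

class VersionStore:
    """Keeps every pushed version of a definition, addressable by content hash."""

    def __init__(self):
        self.versions = {}   # (name, hash) -> source

    def push(self, name, source):
        digest = hashlib.sha256(source.encode()).hexdigest()
        self.versions[(name, digest)] = source
        return digest

    def get_version(self, name, digest):
        return self.versions[(name, digest)]

store = VersionStore()
hash_v1 = store.push("MyModel", "def predict(x): return x * 1")
hash_v2 = store.push("MyModel", "def predict(x): return x * 2")

# Rollback: the old source is still addressable by its hash
v1_source = store.get_version("MyModel", hash_v1)
```

Content addressing is what makes rollback safe: a hash pins an exact version, so "MyModel" can evolve without overwriting anything you might need to restore.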
## 📖 Documentation

| Resource | Description |
|---|---|
| 🚀 Quick Start Guide | Get up and running in 5 minutes |
| 📚 Tutorials | Step-by-step guides and examples |
| 📖 API Reference | Complete API documentation |
| ⚡ Remote Injection | Deploy without code (experimental) |
| 🏗️ Architecture | Deep dive into design principles |
| 💡 Best Practices | Production-ready patterns |
## 🛠️ Supported Components

- Filters
- Pipelines
- Optimizers
- Storage
## 🚦 Roadmap

- [x] Core pipeline functionality
- [x] Automatic caching system
- [x] Hyperparameter optimization
- [x] Distributed processing (Spark)
- [x] Remote injection (experimental)
- [ ] Multi-cloud storage backends (GCS, Azure)
- [ ] Real-time inference API
- [ ] AutoML capabilities
- [ ] Model registry integration
- [ ] Kubernetes deployment templates
## 🤝 Contributing

We ❤️ contributions! Here's how you can help:

### Ways to Contribute

- 🐛 Report bugs by opening an issue
- 💡 Suggest features in discussions
- 📝 Improve documentation
- 🔧 Submit pull requests
- ⭐ Star the repo to show support
### Development Setup
```bash
# Clone the repository
git clone https://github.com/manucouto1/LabChain.git
cd LabChain

# Install dependencies
pip install -r requirements.txt

# Run tests
pytest tests/

# Build documentation
cd docs && mkdocs serve
```
### Guidelines
- Follow PEP 8 style guide
- Add tests for new features
- Update documentation
- Keep commits atomic and well-described
## 🌍 Community & Support

- 🐛 Issue Tracker - Report bugs and request features
- 📧 Email - Contact the maintainers
- 📖 Documentation - Comprehensive guides
## 📄 License
This project is licensed under the AGPL-3.0 License - see the LICENSE file for details.
What this means:
- ✅ Use LabChain for free in your projects
- ✅ Modify and distribute the code
- ⚠️ If you modify and distribute LabChain, you must release your changes under AGPL-3.0
- ⚠️ If you use LabChain in a network service, you must make the source available

Made with ❤️ and Python
## File details

Details for the file `framework3-1.2.11.tar.gz`.

### File metadata

- Download URL: framework3-1.2.11.tar.gz
- Upload date:
- Size: 104.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.3.2 CPython/3.11.14 Linux/6.11.0-1018-azure

### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `418c25927eb3c5451a2801cae5ce7ba4f6ffcaed8b5fcbbddb011f89015586e3` |
| MD5 | `4b2722dab08c4ceba86fd15e6425f5bc` |
| BLAKE2b-256 | `4db6bf16da31a9eb4edc1efab07ba6f8fadd5488e5bc396d9538a9474a19c59a` |
## File details

Details for the file `framework3-1.2.11-py3-none-any.whl`.

### File metadata

- Download URL: framework3-1.2.11-py3-none-any.whl
- Upload date:
- Size: 148.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.3.2 CPython/3.11.14 Linux/6.11.0-1018-azure

### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | `e62fea96ad7a111091b94cf0404070fb56209873a95886166c0b8c41ca210c79` |
| MD5 | `fae90ad176f635e0d3962bbfe0b594f9` |
| BLAKE2b-256 | `9e2830bd4a0da6c18d9697578bd9b07412215f3c3c57f2620ccd231b7009c6ae` |