Minimal framework for ML modeling. Supports advanced dataset operations and streamlined training.
Project description
Overview
Minimal framework for ML modeling, supporting advanced dataset operations and streamlined training workflows.
Install
The trainlib package can be installed from PyPI:
pip install trainlib
Development
- Initialize/synchronize the project with `uv sync`, creating a virtual environment with base package dependencies.
- Depending on needs, install the development dependencies with `uv sync --extra dev`.
Testing
- To run the unit tests, first install the test dependencies with `uv sync --extra test`, then run `make test`.
- For notebook testing, run `make install-kernel` to make the environment available as a Jupyter kernel (to be selected when running notebooks).
Documentation
- Install the documentation dependencies with `uv sync --extra doc`.
- Run `make docs-build` (optionally preceded by `make docs-clean`), and serve locally with `make docs-serve`.
Development remarks
- Across `Trainer`/`Estimator`/`Dataset`, I've considered a `ParamSpec`-based typing scheme to better orchestrate alignment in the `Trainer.train()` loop, e.g., so we can statically check whether a dataset appears to be fulfilling the argument requirements for the estimator's `loss()`/`metrics()` methods. Something like

      class Estimator[**P](nn.Module):
          def loss(
              self,
              input: Tensor,
              *args: P.args,
              **kwargs: P.kwargs,
          ) -> Generator: ...

      class Trainer[**P]:
          def __init__(
              self,
              estimator: Estimator[P],
              ...
          ): ...

  might be how we begin threading signatures. But ensuring dataset items can match `P` is challenging. You can consider a "packed" object where we obfuscate passing data through `P`-signatures:

      class PackedItem[**P]:
          def __init__(self, *args: P.args, **kwargs: P.kwargs) -> None:
              self._args = args
              self._kwargs = kwargs

          def apply[R](self, func: Callable[P, R]) -> R:
              return func(*self._args, **self._kwargs)

      class BatchedDataset[U, R, I, **P](Dataset):
          @abstractmethod
          def _process_item_data(
              self,
              item_data: I,
              item_index: int,
          ) -> PackedItem[P]: ...

          def __iter__(self) -> Iterator[PackedItem[P]]: ...
  Meaningfully shaping those signatures is what remains, but typical type expressions aren't flexible enough to really do this. For instance, if I'm trying to appropriately type my base `TupleDataset`:

      class SequenceDataset[I, **P](HomogenousDataset[int, I, I, P]): ...

      class TupleDataset[I](SequenceDataset[tuple[I, ...], "?"]): ...

  Here there's no way for me to shape a `ParamSpec` to indicate arbitrarily many arguments of a fixed type (`I` in this case), which is what I'd need to unpack my item tuples into an appropriate `PackedItem`.

  Until this (among other issues) becomes clearer, I'm setting up around a simpler `TypedDict` type variable. We won't have particularly strong static checks for item alignment inside `Trainer`, but this seems about as good as I can get around the current infrastructure.
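To make the `TypedDict` fallback above a little more concrete, here is a minimal sketch of how a single type variable might tie a dataset's items to what an estimator consumes. Everything in it is an assumption for illustration: `RegressionItem`, `MeanSquaredError`, `ListDataset`, and the simplified `Estimator`/`Trainer` classes are hypothetical stand-ins, not trainlib's actual API. It also returns plain floats instead of tensors to stay dependency-free, and uses the older `TypeVar`/`Generic` spelling rather than the `class Foo[K]` syntax so that it also runs on interpreters older than Python 3.12.

```python
from collections.abc import Iterable, Iterator, Mapping
from typing import Generic, TypedDict, TypeVar


class RegressionItem(TypedDict):
    """Hypothetical item schema: the fields a loss function expects."""

    inputs: list[float]
    targets: list[float]


# Assumption: a TypedDict is assignable to Mapping[str, object], so that
# type serves as the upper bound for the shared item type variable.
K = TypeVar("K", bound=Mapping[str, object])


class Estimator(Generic[K]):
    """Hypothetical base class: consumes one item of type K."""

    def loss(self, item: K) -> float:
        raise NotImplementedError


class MeanSquaredError(Estimator[RegressionItem]):
    def loss(self, item: RegressionItem) -> float:
        # Mean of squared differences between paired inputs and targets.
        errors = [(x - y) ** 2 for x, y in zip(item["inputs"], item["targets"])]
        return sum(errors) / len(errors)


class ListDataset(Generic[K]):
    """Hypothetical in-memory dataset yielding items of type K."""

    def __init__(self, items: list[K]) -> None:
        self._items = items

    def __iter__(self) -> Iterator[K]:
        return iter(self._items)


class Trainer(Generic[K]):
    def __init__(self, estimator: Estimator[K]) -> None:
        self.estimator = estimator

    def train(self, dataset: Iterable[K]) -> list[float]:
        # The shared K lets a type checker reject a dataset whose item
        # type differs from the one the estimator was declared with.
        return [self.estimator.loss(item) for item in dataset]


dataset = ListDataset(
    [
        RegressionItem(inputs=[1.0, 2.0], targets=[1.0, 4.0]),
        RegressionItem(inputs=[0.0], targets=[0.0]),
    ]
)
losses = Trainer(MeanSquaredError()).train(dataset)
```

The check here is coarser than the `ParamSpec` scheme: a type checker only compares the item type as a whole, and nothing verifies at runtime that the dictionaries carry the declared keys, which matches the weaker alignment guarantees described above.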
Project details
File details
Details for the file trainlib-0.3.1.tar.gz.
File metadata
- Download URL: trainlib-0.3.1.tar.gz
- Upload date:
- Size: 46.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.2 {"installer":{"name":"uv","version":"0.11.2","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Arch Linux","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fc1744cf442f6e85b5e820e44e9962cff1a724e27adec19492973c2b91b2e203 |
| MD5 | 363017c253bcb1ce2c04c5701db7a9a8 |
| BLAKE2b-256 | 77cca6c60a96c7f12e3230acac58e876ab57ddc7ae0c198bd1f27ec1a4b3c913 |
File details
Details for the file trainlib-0.3.1-py3-none-any.whl.
File metadata
- Download URL: trainlib-0.3.1-py3-none-any.whl
- Upload date:
- Size: 52.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.11.2 {"installer":{"name":"uv","version":"0.11.2","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Arch Linux","version":null,"id":null,"libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3790757d411c81d3113757930f8b6b212669aabdf35bb0137e1d579b77d0b7ef |
| MD5 | 2fab0f93d4416b985b04ee37eb660360 |
| BLAKE2b-256 | bdb6170f53211b619d9c2992c5feacadc4dd0a7c563812555932423f86e9824a |