High-performance quantitative finance feature engineering library
ml4t-engineer
Feature engineering for financial machine learning: technical indicators, labeling methods, and alternative bar sampling.
Part of the ML4T Library Ecosystem
This library is one of six interconnected libraries supporting the machine learning for trading workflow described in Machine Learning for Trading.
Together they cover data infrastructure, feature engineering, modeling, signal evaluation, strategy backtesting, and live deployment.
What This Library Does
Transforming raw price data into predictive features is a core task in quantitative research. ml4t-engineer provides:
- 120 technical indicators across 11 categories (momentum, volatility, trend, microstructure, etc.)
- Triple-barrier labeling and other target construction methods from Advances in Financial Machine Learning
- Alternative bar sampling (volume bars, dollar bars, tick imbalance bars)
- A feature registry for discovery and configuration
The library is built on Polars with Numba JIT compilation for numerical operations. 60 indicators are validated against TA-Lib at 1e-6 tolerance.
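To make the idea of validating at a 1e-6 tolerance concrete, the sketch below compares two independent implementations of a simple moving average element by element. Both functions and the `matches_within` helper are hypothetical stand-ins written for this example; they are not the library's test harness or TA-Lib, and the use of an absolute (rather than relative) tolerance is an assumption.

```python
import math


def sma_naive(values, period):
    """Reference SMA: recompute the window mean at every step."""
    out = [math.nan] * len(values)
    for i in range(period - 1, len(values)):
        window = values[i - period + 1 : i + 1]
        out[i] = sum(window) / period
    return out


def sma_rolling(values, period):
    """Candidate SMA: maintain a running sum instead of re-summing the window."""
    out = [math.nan] * len(values)
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= period:
            running -= values[i - period]  # drop the value that left the window
        if i >= period - 1:
            out[i] = running / period
    return out


def matches_within(a, b, tol=1e-6):
    """True if both series agree within an absolute tolerance.

    NaN warm-up values must appear at the same positions in both series.
    """
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if math.isnan(x) or math.isnan(y):
            if not (math.isnan(x) and math.isnan(y)):
                return False
        elif abs(x - y) > tol:
            return False
    return True


prices = [100.0, 101.5, 99.8, 102.3, 103.1, 101.9, 104.2]
agree = matches_within(sma_naive(prices, 3), sma_rolling(prices, 3))
print("implementations agree within 1e-6:", agree)
```

The same pattern extends to any indicator: compute it with a trusted reference, compute it with the optimized implementation, and require agreement everywhere outside the warm-up period.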
Installation
pip install ml4t-engineer
Optional dependencies:
pip install "ml4t-engineer[ta]"         # TA-Lib backend
pip install "ml4t-engineer[viz]"        # Visualization
pip install "ml4t-engineer[calendars]"  # Trading calendars
Quick Start
import polars as pl
from ml4t.engineer import compute_features
df = pl.read_parquet("ohlcv.parquet")
# Compute features with default parameters
result = compute_features(df, ["rsi", "macd", "atr", "obv"])
# Or with custom parameters
result = compute_features(df, [
    {"name": "rsi", "params": {"period": 20}},
    {"name": "bollinger_bands", "params": {"period": 20, "std_dev": 2.0}},
])
Feature Registry
from ml4t.engineer.core.registry import get_registry
registry = get_registry()
print(registry.list_all()) # All 120 features
print(registry.list_by_category("momentum")) # 31 momentum indicators
print(registry.list_ta_lib_compatible()) # 60 TA-Lib validated
print(registry.list_normalized()) # 37 bounded (0-100, -1 to 1)
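For readers unfamiliar with the registry pattern, the following is a minimal sketch of how feature metadata can support this kind of discovery. `FeatureMeta` and `ToyRegistry` are hypothetical names invented for this illustration, and the flags assigned to each feature are example values; none of this is the implementation behind `get_registry()`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureMeta:
    """Metadata for one feature (hypothetical schema for this sketch)."""
    name: str
    category: str
    ta_lib_compatible: bool = False
    normalized: bool = False  # True if the output is bounded, e.g. 0-100


class ToyRegistry:
    """Toy registry keyed by feature name; not the library's registry."""

    def __init__(self):
        self._features = {}

    def register(self, meta):
        if meta.name in self._features:
            raise ValueError(f"duplicate feature: {meta.name}")
        self._features[meta.name] = meta

    def list_all(self):
        return sorted(self._features)

    def list_by_category(self, category):
        return sorted(n for n, m in self._features.items() if m.category == category)

    def list_normalized(self):
        return sorted(n for n, m in self._features.items() if m.normalized)


toy = ToyRegistry()
toy.register(FeatureMeta("rsi", "momentum", ta_lib_compatible=True, normalized=True))
toy.register(FeatureMeta("atr", "volatility", ta_lib_compatible=True))
toy.register(FeatureMeta("obv", "volume", ta_lib_compatible=True))

print(toy.list_by_category("momentum"))
print(toy.list_normalized())
```

Keeping metadata separate from the computation is what allows filters such as "bounded outputs only" to run without executing any indicator.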
Feature Categories
| Category | Count | Examples |
|---|---|---|
| Momentum | 31 | RSI, MACD, Stochastic, CCI, ADX, MFI |
| Microstructure | 15 | Kyle Lambda, VPIN, Amihud, Roll spread |
| Volatility | 15 | ATR, Bollinger, Yang-Zhang, Parkinson |
| Statistics | 14 | Variance, Linear Regression, Correlation |
| ML | 14 | Fractional Diff, Entropy, Lag features |
| Trend | 10 | SMA, EMA, WMA, DEMA, TEMA, KAMA |
| Risk | 6 | Max Drawdown, Sortino, CVaR |
| Price Transform | 5 | Typical Price, Weighted Close |
| Regime | 4 | Hurst Exponent, Choppiness Index |
| Volume | 3 | OBV, AD, ADOSC |
| Math | 3 | MAX, MIN, SUM |
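As an illustration of what a momentum indicator in this table computes, here is a dependency-free sketch of RSI with Wilder smoothing. The function name `rsi_wilder`, the use of `None` for warm-up values, and the choice to return 50 for a completely flat series are assumptions of this example; the library's own RSI may differ in warm-up handling and edge cases.

```python
def rsi_wilder(closes, period=14):
    """Relative Strength Index with Wilder smoothing.

    Returns a list aligned with `closes`; the first `period` entries are None
    because RSI needs `period` price changes before it is defined.
    """
    if period < 1:
        raise ValueError("period must be >= 1")
    out = [None] * len(closes)
    if len(closes) <= period:
        return out

    # Seed the averages with a simple mean of the first `period` changes.
    gain_sum = loss_sum = 0.0
    for i in range(1, period + 1):
        change = closes[i] - closes[i - 1]
        gain_sum += max(change, 0.0)
        loss_sum += max(-change, 0.0)
    avg_gain = gain_sum / period
    avg_loss = loss_sum / period
    out[period] = _rsi_value(avg_gain, avg_loss)

    # Wilder smoothing: an exponential average with alpha = 1 / period.
    for i in range(period + 1, len(closes)):
        change = closes[i] - closes[i - 1]
        avg_gain = (avg_gain * (period - 1) + max(change, 0.0)) / period
        avg_loss = (avg_loss * (period - 1) + max(-change, 0.0)) / period
        out[i] = _rsi_value(avg_gain, avg_loss)
    return out


def _rsi_value(avg_gain, avg_loss):
    """100 * gain / (gain + loss), algebraically equal to 100 - 100 / (1 + RS)."""
    total = avg_gain + avg_loss
    if total == 0.0:
        return 50.0  # flat series: neutral value chosen for this sketch
    return 100.0 * avg_gain / total


print(rsi_wilder([10.0, 11.0, 10.0, 11.0], period=2))
```

Because the result is always between 0 and 100, RSI is an example of a bounded feature that can be passed to a model without further scaling.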
Triple-Barrier Labeling
from ml4t.engineer.config import LabelingConfig
from ml4t.engineer.labeling import triple_barrier_labels, atr_triple_barrier_labels
# Fixed barriers
tb_config = LabelingConfig.triple_barrier(
    upper_barrier=0.02,       # 2% profit target
    lower_barrier=0.01,       # 1% stop loss
    max_holding_period=20,    # 20 bars
)
labels = triple_barrier_labels(
    df,
    config=tb_config,
)
# ATR-based dynamic barriers
atr_config = LabelingConfig.atr_barrier(
    atr_tp_multiple=2.0,
    atr_sl_multiple=1.0,
    atr_period=14,
    max_holding_period=20,
)
labels = atr_triple_barrier_labels(
    df,
    config=atr_config,
)
# Time-based horizons
tb_time_config = LabelingConfig.triple_barrier(
    upper_barrier=0.02,
    lower_barrier=0.01,
    max_holding_period="4h",  # 4 hours
)
labels = triple_barrier_labels(
    df,
    config=tb_time_config,
)
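The logic behind a fixed triple barrier can be sketched in a few lines of plain Python. This is an illustration of the technique, not the library's implementation: the function name `triple_barrier_label` is hypothetical, barriers are checked against closing prices only, and a timeout is labeled 0 (other conventions label timeouts by the sign of the final return).

```python
def triple_barrier_label(closes, start, upper, lower, max_holding):
    """Label a single entry taken at index `start`.

    upper / lower are fractional distances from the entry price
    (0.02 means 2%). Returns (label, exit_index), where label is
    +1 if the profit target is touched first, -1 if the stop loss is
    touched first, and 0 if the vertical (time) barrier expires.
    """
    entry = closes[start]
    last = min(start + max_holding, len(closes) - 1)
    for i in range(start + 1, last + 1):
        ret = closes[i] / entry - 1.0
        if ret >= upper:
            return 1, i
        if ret <= -lower:
            return -1, i
    return 0, last


closes = [100.0, 100.5, 101.0, 102.5, 99.0]
label, exit_index = triple_barrier_label(
    closes, start=0, upper=0.02, lower=0.01, max_holding=20
)
print(label, exit_index)
```

With a 2% target and a 1% stop, the barriers are asymmetric, so the label distribution depends on both the drift and the volatility of the series over the holding period.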
Alternative Bars
from ml4t.engineer.bars import VolumeBarSampler, DollarBarSampler, TickImbalanceBarSampler
# Volume bars (equal volume per bar)
vbars = VolumeBarSampler(volume_threshold=1000).sample(tick_data)
# Dollar bars (equal dollar volume per bar)
dbars = DollarBarSampler(dollar_threshold=1_000_000).sample(tick_data)
# Tick imbalance bars (information-driven)
ibars = TickImbalanceBarSampler(expected_imbalance=100).sample(tick_data)
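To show what sampling by dollar volume means, the sketch below aggregates a stream of `(price, size)` ticks into OHLCV bars, closing a bar each time the accumulated traded value reaches a threshold. The `dollar_bars` function and the tuple-based tick format are assumptions of this example, not the interface of `DollarBarSampler`; this sketch also drops any incomplete trailing bar.

```python
def dollar_bars(ticks, dollar_threshold):
    """Aggregate (price, size) ticks into bars of roughly equal dollar volume.

    A bar closes on the tick that pushes cumulative price * size to or past
    `dollar_threshold`. A trailing partial bar is discarded.
    """
    bars = []
    current = None
    dollars = 0.0
    for price, size in ticks:
        if current is None:
            # First tick of a new bar sets the open and initial extremes.
            current = {"open": price, "high": price, "low": price,
                       "close": price, "volume": 0.0}
        current["high"] = max(current["high"], price)
        current["low"] = min(current["low"], price)
        current["close"] = price
        current["volume"] += size
        dollars += price * size
        if dollars >= dollar_threshold:
            bars.append(current)
            current = None
            dollars = 0.0
    return bars


ticks = [(10.0, 50), (11.0, 50), (9.0, 100), (10.0, 20), (10.0, 10)]
for bar in dollar_bars(ticks, dollar_threshold=1000):
    print(bar)
```

Volume bars follow the same pattern with `size` accumulated instead of `price * size`; bars therefore arrive faster during active periods and slower during quiet ones.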
Documentation
- Features - 120 technical indicators across 11 categories
- Labeling - 7 labeling methods for supervised learning
- Alternative Bars - Information-driven bar sampling
- Feature Discovery - Registry, catalog, and search
- Fractional Differencing - Memory-preserving stationarity
- ML-Readiness - Normalized features and preprocessing
- Preprocessing - Scalers and leakage prevention
- Dataset Builder - Leakage-safe train/test preparation
Technical Characteristics
- Polars-native: All computations use Polars expressions
- Numba-accelerated: JIT compilation for numerical kernels
- TA-Lib validated: 60 indicators validated at 1e-6 tolerance
- AFML-compliant: Labeling methods verified against Advances in Financial Machine Learning
- ML-ready outputs: 37 features produce bounded outputs (0-100, -1 to 1) for direct model input; remaining features work with standard preprocessing (returns, z-scores, robust scaling)
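For the unbounded features, the preprocessing mentioned in the last bullet can be sketched as follows. This example shows robust scaling (median and interquartile range) fitted on the training split only and then applied to later data, which avoids leaking test-period statistics into training. The function names are hypothetical and written for this illustration; they are not the library's preprocessing API.

```python
import statistics


def fit_robust_scaler(train):
    """Estimate (median, iqr) from the training split only."""
    q1, _, q3 = statistics.quantiles(train, n=4, method="inclusive")
    iqr = q3 - q1
    # Guard against a constant feature, which would give a zero IQR.
    return statistics.median(train), iqr if iqr > 0 else 1.0


def apply_robust_scaler(values, params):
    """Scale values with parameters that were fitted elsewhere."""
    median, iqr = params
    return [(v - median) / iqr for v in values]


train = [1.0, 2.0, 3.0, 4.0, 5.0]
test = [7.0, 3.0]

params = fit_robust_scaler(train)           # fitted on train only
scaled_test = apply_robust_scaler(test, params)
print(params, scaled_test)
```

Unlike a z-score, the median and IQR are barely affected by a handful of extreme observations, which suits heavy-tailed financial features.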
Related Libraries
- ml4t-specs: Shared feed and artifact schema definitions across the ML4T stack
- ml4t-data: Market data acquisition and storage
- ml4t-diagnostic: Signal evaluation and statistical validation
- ml4t-backtest: Event-driven backtesting
- ml4t-live: Live trading with broker integration
Development
git clone https://github.com/applied-ai/ml4t-engineer.git
cd ml4t-engineer
uv sync
uv run pytest tests/ -q
uv run ty check
References
- Lopez de Prado, M. (2018). Advances in Financial Machine Learning. Wiley.
- Lopez de Prado, M. (2020). Machine Learning for Asset Managers. Cambridge University Press.
License
MIT License - see LICENSE for details.