High-performance feature engineering library for quantitative investment

Maintainers

YoungNa

Project description

QFeatureLib

QFeatureLib is a high-performance, production-grade feature engineering library for quantitative investment. It focuses on financial time series processing, with three priorities: strict avoidance of look-ahead bias (the "future function" problem, where a feature at time t uses data from t or later), computational efficiency, and rigorous sample splitting.

Key Features

Zero Future Function: All time-series operations use shift=1 by default to prevent data leakage. The library raises FutureFunctionError if you accidentally try to use future information.
High Performance: Pure NumPy implementation with vectorized operations, roughly 10-100x faster than pandas depending on the operation (see Performance Benchmarks).
Memory Efficient: Uses views instead of copies, supports in-place operations for large-scale panel data.
Quantitative Finance Focused: Specialized for financial scenarios - suspended stock handling, industry neutralization, market cap neutralization, etc.
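
To make the shift=1 convention concrete, here is a minimal NumPy sketch of a rolling Z-score whose window ends one step before the current observation. It is not QFeatureLib's implementation: the function name rolling_zscore_sketch, the choice of sample standard deviation (ddof=1), and the exact window alignment are assumptions made for illustration. The check at the end bumps one observation and confirms that no earlier output changes.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_zscore_sketch(x: np.ndarray, window: int, shift: int = 1) -> np.ndarray:
    """Z-score x[t] against the `window` observations ending at t - shift.

    Hypothetical helper for illustration; not the library's code.
    """
    if shift < 1:
        raise ValueError("shift must be >= 1 so the window excludes x[t]")
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)
    first = window + shift - 1  # first index that has a full past window
    if x.shape[0] <= first:
        return out
    # Row i of the view is x[i : i + window]; index t uses row t - first.
    past = sliding_window_view(x, window)[: x.shape[0] - first]
    mean = past.mean(axis=-1)
    std = past.std(axis=-1, ddof=1)
    out[first:] = (x[first:] - mean) / std
    return out


rng = np.random.default_rng(0)
x = rng.standard_normal(60)
base = rolling_zscore_sketch(x, window=20)

# Leakage check: changing x[40] must not alter any output before index 40.
bumped = x.copy()
bumped[40] += 100.0
after = rolling_zscore_sketch(bumped, window=20)
no_leak = np.allclose(base[:40], after[:40], equal_nan=True)
```

With shift=0 the window would include x[t] itself, which is the kind of configuration the sketch rejects outright.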

Installation

pip install qfeaturelib

For development:

pip install "qfeaturelib[dev]"  # quotes keep shells such as zsh from expanding the brackets

Quick Start

import numpy as np
from qfeaturelib import PanelData
from qfeaturelib.standardization import rolling_zscore, cs_zscore
from qfeaturelib.splitting import RollingWindowSplitter

# Create panel data (T=100 days, N=50 stocks, F=5 features)
values = np.random.randn(100, 50, 5)
dates = np.arange(100)
tickers = [f'STOCK_{i:02d}' for i in range(50)]

panel = PanelData(values, dates, tickers)

# Time-series standardization (rolling Z-score with shift=1 to prevent leakage)
zscore_values = rolling_zscore(
    panel.values[..., 0],  # First feature
    window=20,
    shift=1,  # Use past 20 days only, excluding current moment
)

# Cross-sectional standardization (Z-score across all stocks each day)
cs_values = cs_zscore(panel.values[..., 0])

# Sample splitting for backtesting
splitter = RollingWindowSplitter(
    n_samples=100,
    train_ratio=0.6,
    val_ratio=0.2,
    test_ratio=0.2,
)

for split in splitter.split():
    train_data = zscore_values[split.train]
    val_data = zscore_values[split.val]
    test_data = zscore_values[split.test]
    # Train your model...

Core Modules

1. Time-Series Standardization

Operations along the time dimension with rolling windows:

from qfeaturelib.standardization import (
    rolling_zscore,      # Rolling Z-Score
    rolling_robust_zscore,  # Robust Z-Score using Median/MAD
    rolling_minmax,      # Rolling Min-Max scaling
)

# Parameters explained
result = rolling_zscore(
    data,
    window=20,      # Rolling window size
    shift=1,        # Window end offset (shift=1 excludes current moment)
    outlier_method="squash",  # Outlier handling: 'truncate' or 'squash'
    outlier_bounds=(0.01, 0.99),  # Quantile bounds for outliers
)
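
The robust variant listed above replaces mean and standard deviation with median and MAD (median absolute deviation). The sketch below shows that idea in plain NumPy under stated assumptions: the function name, the 1.4826 scaling constant (which makes MAD comparable to a normal standard deviation), and the window alignment are illustrative choices, not confirmed details of rolling_robust_zscore.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

MAD_TO_SIGMA = 1.4826  # scales MAD to the std of a normal distribution


def rolling_robust_zscore_sketch(x: np.ndarray, window: int, shift: int = 1) -> np.ndarray:
    """Robust Z-score of x[t] against the `window` values ending at t - shift.

    Hypothetical helper for illustration; not the library's code.
    """
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)
    first = window + shift - 1
    if x.shape[0] <= first:
        return out
    past = sliding_window_view(x, window)[: x.shape[0] - first]
    median = np.median(past, axis=-1)
    mad = np.median(np.abs(past - median[:, None]), axis=-1)
    out[first:] = (x[first:] - median) / (MAD_TO_SIGMA * mad)
    return out


# One extreme value (100.0) sits inside the window used for the last point.
series = np.array([1.0, 2.0, 3.0, 4.0, 100.0, 6.0])
robust = rolling_robust_zscore_sketch(series, window=5)

# Ordinary mean/std over the same window, for comparison.
plain = (series[5] - series[:5].mean()) / series[:5].std(ddof=1)
```

In this toy series the outlier drags the ordinary mean far above the typical level, while the median and MAD stay close to the bulk of the data, so the robust score of the last point remains interpretable.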

2. Cross-Sectional Standardization

Operations across all assets at each time point:

from qfeaturelib.standardization import (
    cs_zscore,           # Cross-sectional Z-Score
    cs_robust_zscore,    # Cross-sectional robust Z-Score
    cs_minmax,           # Cross-sectional Min-Max
    cs_rank,             # Cross-sectional rank (percentile)
)

# Support for group-wise operations
result = cs_zscore(data, groups=industry_labels)
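
As a rough picture of what a group-wise cross-sectional Z-score computes, the sketch below standardizes each row (one date) separately within each group of columns (assets). The function name, the (T, N) layout, the NaN-aware statistics, and the use of ddof=1 are assumptions for illustration and may differ from how cs_zscore treats groups.

```python
import numpy as np


def cs_zscore_by_group_sketch(data: np.ndarray, groups: np.ndarray) -> np.ndarray:
    """Z-score each row of `data` (T, N) within the column groups in `groups` (N,).

    Hypothetical helper for illustration; not the library's code.
    """
    data = np.asarray(data, dtype=float)
    groups = np.asarray(groups)
    out = np.full(data.shape, np.nan)
    for g in np.unique(groups):
        cols = groups == g
        block = data[:, cols]
        # Statistics are taken across assets (axis=1), separately for each date.
        mean = np.nanmean(block, axis=1, keepdims=True)
        std = np.nanstd(block, axis=1, ddof=1, keepdims=True)
        out[:, cols] = (block - mean) / std
    return out


rng = np.random.default_rng(1)
demo_data = rng.standard_normal((4, 6)) * 10 + 50   # 4 dates, 6 assets
demo_groups = np.array([0, 0, 0, 1, 1, 1])          # two industries of 3 assets
scored = cs_zscore_by_group_sketch(demo_data, demo_groups)
```

A group with a single asset has no cross-sectional dispersion, so this sketch would return NaN for it; a production implementation needs an explicit policy for that case.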

3. Sample Splitting Engine

Time-series aware train/validation/test splitting:

from qfeaturelib.splitting import RollingWindowSplitter, ExpandingWindowSplitter

# Rolling window (fixed training size)
rolling_splitter = RollingWindowSplitter(
    n_samples=1000,
    train_ratio=0.6,
    val_ratio=0.2,
    test_ratio=0.2,
    step=100,  # Roll forward 100 samples each iteration
    gap=0,     # Gap between train/val/test to prevent leakage
)

# Expanding window (growing training size)
expanding_splitter = ExpandingWindowSplitter(
    n_samples=1000,
    train_ratio=0.6,
    val_ratio=0.2,
    test_ratio=0.2,
    step=50,   # Expand by 50 samples each iteration
)

# Use split.apply() to split multiple arrays consistently
for split in rolling_splitter.split():
    (X_train, X_val, X_test), (y_train, y_val, y_test) = split.apply([X, y])
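
The index arithmetic behind a rolling split with a gap can be sketched in a few lines of standard-library Python. This is an illustration only: it takes explicit segment sizes rather than ratios, and the names SplitSketch and rolling_splits_sketch are hypothetical, since the source does not spell out how RollingWindowSplitter turns ratios and step into windows.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class SplitSketch:
    train: slice
    val: slice
    test: slice


def rolling_splits_sketch(
    n_samples: int,
    train_size: int,
    val_size: int,
    test_size: int,
    step: int,
    gap: int = 0,
) -> Iterator[SplitSketch]:
    """Yield fixed-size train/val/test slices that roll forward by `step`.

    `gap` samples are skipped between consecutive segments so that features
    built from overlapping windows cannot straddle a boundary.
    """
    if step < 1:
        raise ValueError("step must be a positive integer")
    span = train_size + gap + val_size + gap + test_size
    start = 0
    while start + span <= n_samples:
        train_end = start + train_size
        val_start = train_end + gap
        val_end = val_start + val_size
        test_start = val_end + gap
        yield SplitSketch(
            train=slice(start, train_end),
            val=slice(val_start, val_end),
            test=slice(test_start, test_start + test_size),
        )
        start += step


splits = list(rolling_splits_sketch(1000, 300, 100, 100, step=100, gap=5))
```

Because the segments are slices, indexing a NumPy array with them returns views, and the same split can be applied to features and labels without index drift.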

4. Missing Value Imputation

from qfeaturelib.imputation import (
    ffill,          # Forward fill
    ffill_limit,    # Forward fill with limit (prevents stale data filling)
    cs_median_fill, # Cross-sectional median fill
    cs_mean_fill,   # Cross-sectional mean fill
)

# Forward fill with maximum 5 consecutive fills
result = ffill_limit(data, limit=5)
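
A limited forward fill can be vectorized by tracking, for every cell, the row index of the most recent valid observation. The sketch below assumes time runs along axis 0 and that `limit` counts consecutive filled rows; the function name and these conventions are illustrative assumptions, not a description of ffill_limit's internals.

```python
import numpy as np


def ffill_limit_sketch(data: np.ndarray, limit: int) -> np.ndarray:
    """Forward fill NaNs along axis 0, at most `limit` consecutive rows.

    Hypothetical helper for illustration; not the library's code.
    """
    data = np.asarray(data, dtype=float)
    n_rows = data.shape[0]
    # Row index of each cell, shaped to broadcast against `data`.
    rows = np.arange(n_rows).reshape((n_rows,) + (1,) * (data.ndim - 1))
    valid = ~np.isnan(data)
    # Index of the latest valid row at or before each cell (-1 if none yet).
    last_valid = np.maximum.accumulate(np.where(valid, rows, -1), axis=0)
    age = rows - last_valid  # how many rows old the carried value is
    filled = np.take_along_axis(data, np.clip(last_valid, 0, None), axis=0)
    filled[(last_valid < 0) | (age > limit)] = np.nan
    return filled


nan = np.nan
quotes = np.array([
    [1.0, nan],
    [nan, 2.0],
    [nan, nan],
    [nan, nan],
    [5.0, nan],
])
filled_quotes = ffill_limit_sketch(quotes, limit=2)
```

In the first column the value 1.0 is carried for two rows and then dropped, which is the behaviour that keeps a long-suspended stock from being filled with stale prices.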

5. Feature Neutralization

Remove effects of control factors via regression residuals:

from qfeaturelib.neutralization import (
    neutralize,
    industry_neutralize,
    size_neutralize,
)

# Industry neutralization
neutralized = industry_neutralize(feature, industry_labels)

# Size (market cap) neutralization
neutralized = size_neutralize(feature, log_market_cap)

# Custom control factors
neutralized = neutralize(feature, control_factors, method="ols")
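
Neutralization by regression residuals amounts to fitting the feature on the control factors and keeping what the fit cannot explain. The sketch below does this for a single cross-section with np.linalg.lstsq. The function name, the added intercept, and the row-wise NaN handling are assumptions for illustration; the library's own conventions may differ.

```python
import numpy as np


def neutralize_ols_sketch(feature: np.ndarray, controls: np.ndarray) -> np.ndarray:
    """Residuals of an OLS fit of `feature` (N,) on an intercept plus `controls` (N, K).

    Hypothetical helper for illustration; not the library's code.
    """
    y = np.asarray(feature, dtype=float)
    X = np.asarray(controls, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    # Fit only on assets where the feature and all controls are observed.
    ok = ~np.isnan(y) & ~np.isnan(X).any(axis=1)
    design = np.column_stack([np.ones(ok.sum()), X[ok]])
    beta, *_ = np.linalg.lstsq(design, y[ok], rcond=None)
    resid = np.full(y.shape, np.nan)
    resid[ok] = y[ok] - design @ beta
    return resid


rng = np.random.default_rng(2)
log_cap = rng.standard_normal(200)
raw = 2.0 * log_cap + 0.5 + 0.1 * rng.standard_normal(200)
size_neutral = neutralize_ols_sketch(raw, log_cap)

# Industry neutralization as a special case: regress on one-hot industry dummies.
labels = np.repeat([0, 1, 2], 10)
dummies = (labels[:, None] == np.unique(labels)).astype(float)
raw_ind = rng.standard_normal(30) + labels
ind_neutral = neutralize_ols_sketch(raw_ind, dummies)
```

With one-hot dummies as controls, the residuals equal the feature demeaned within each industry, which is why industry neutralization is often implemented as a group-wise demean.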

6. Macro Indicators

Special handling for macro-economic indicators without asset dimension:

from qfeaturelib import (
    macro_rolling_zscore,
    adapt_macro_to_panel,
)

# Direct standardization of 1D macro data
gdp_zscore = macro_rolling_zscore(gdp_growth, window=12, shift=1)

# Broadcast to panel format for combination with asset features
gdp_panel = adapt_macro_to_panel(gdp_growth, n_assets=50)  # (T,) -> (T, N)
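
The (T,) -> (T, N) step can be done without copying data. The sketch below uses np.broadcast_to, which returns a read-only view; whether adapt_macro_to_panel works this way is not stated in the source, so the helper name and behaviour here are assumptions.

```python
import numpy as np


def macro_to_panel_sketch(series: np.ndarray, n_assets: int) -> np.ndarray:
    """Repeat a (T,) macro series across `n_assets` columns as a read-only view.

    Hypothetical helper for illustration; not the library's code.
    """
    series = np.asarray(series, dtype=float)
    return np.broadcast_to(series[:, None], (series.shape[0], n_assets))


gdp_demo = np.linspace(0.01, 0.03, 12)          # 12 periods of made-up growth rates
gdp_demo_panel = macro_to_panel_sketch(gdp_demo, n_assets=50)

is_view = np.shares_memory(gdp_demo, gdp_demo_panel)
is_read_only = not gdp_demo_panel.flags.writeable
```

Because the result is read-only, call .copy() on it before any in-place modification.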

Performance Benchmarks

On standard test data (T=5000, N=1000, F=50):

Operation	Pandas	QFeatureLib	Speedup
Rolling Z-Score	~5s	~0.1s	50x
Cross-sectional Z-Score	~2s	~0.02s	100x
Rolling Rank	~10s	~0.5s	20x

Design Principles

Safety First: The default shift=1 prevents accidental use of future information (look-ahead bias)
Vectorization: All core computations use NumPy vectorized operations
Memory Efficiency: Returns views instead of copies and supports in-place operations
Type Safety: Full type annotations, passes mypy strict mode
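
The view-versus-copy distinction behind the memory-efficiency principle can be checked directly in NumPy. The snippet below is a generic illustration on a small (T, N, F) array and makes no claim about which QFeatureLib functions return views.

```python
import numpy as np

panel_demo = np.arange(24, dtype=float).reshape(4, 3, 2)  # (T, N, F)

feature_view = panel_demo[..., 0]   # basic slicing returns a view
subset_copy = panel_demo[[0, 2]]    # integer-list (fancy) indexing returns a copy

shares_view = np.shares_memory(panel_demo, feature_view)
shares_copy = np.shares_memory(panel_demo, subset_copy)

# In-place cross-sectional demeaning: the result is written through the view
# into panel_demo, so no second (T, N) array is kept for the output.
np.subtract(
    feature_view,
    feature_view.mean(axis=1, keepdims=True),
    out=feature_view,
)
```

The trade-off is that in-place operations overwrite the caller's data, so they suit pipelines where the raw panel is no longer needed or is cheap to reload.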

Related Projects

AssetPanelForest - Supervised clustering for panel data
MASFactorMiner - Factor mining and analysis
GeneralBacktest - Backtesting framework

License

MIT License - see LICENSE file for details.

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

Changelog

See CHANGELOG.md for version history and changes.

Support

GitHub Issues: https://github.com/ElenYoung/QFeatureLib/issues
Documentation: https://github.com/ElenYoung/QFeatureLib#readme

Note: This library is part of a quantitative finance ecosystem. When implementing features, consider compatibility with downstream projects.

Project details

Release history

This version: 0.1.0 (Apr 14, 2026)

Download files

Source Distribution

qfeaturelib-0.1.0.tar.gz (32.9 kB)

Uploaded Apr 14, 2026 Source

Built Distribution

qfeaturelib-0.1.0-py3-none-any.whl (38.6 kB)

Uploaded Apr 14, 2026 Python 3

File details

Details for the file qfeaturelib-0.1.0.tar.gz.

File metadata

Download URL: qfeaturelib-0.1.0.tar.gz
Upload date: Apr 14, 2026
Size: 32.9 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for qfeaturelib-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`da2b748f7d9433b1f56aa09cc21a2684e72a93e659f9f1b3228a86766407cfa4`
MD5	`affb6db814055df7a47d665017a8c900`
BLAKE2b-256	`7dfd65de5412cbe0d7c473d633e40c0cd6ae44e50619ac3d96382cefc9d95636`

Provenance

The following attestation bundles were made for qfeaturelib-0.1.0.tar.gz:

Publisher: publish.yml on ElenYoung/QFeatureLib

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: qfeaturelib-0.1.0.tar.gz
- Subject digest: da2b748f7d9433b1f56aa09cc21a2684e72a93e659f9f1b3228a86766407cfa4
- Sigstore transparency entry: 1296342341
- Sigstore integration time: Apr 14, 2026
Source repository:
- Permalink: ElenYoung/QFeatureLib@73bdc07db1be803f9ec0e0199e5bbd3abb278b57
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/ElenYoung
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@73bdc07db1be803f9ec0e0199e5bbd3abb278b57
- Trigger Event: push

File details

Details for the file qfeaturelib-0.1.0-py3-none-any.whl.

File metadata

Download URL: qfeaturelib-0.1.0-py3-none-any.whl
Upload date: Apr 14, 2026
Size: 38.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for qfeaturelib-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`07e3be6c798adc84cc29be31d8c5a595f9d08c2b000d66f79ba939efc8f4b556`
MD5	`27b7e4b8079b5798ee4de722a5065603`
BLAKE2b-256	`8db97585125f7ac12f0cbbac20caf77e5c6f45c647084610ae3a2c1ec81bba2c`

Provenance

The following attestation bundles were made for qfeaturelib-0.1.0-py3-none-any.whl:

Publisher: publish.yml on ElenYoung/QFeatureLib

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: qfeaturelib-0.1.0-py3-none-any.whl
- Subject digest: 07e3be6c798adc84cc29be31d8c5a595f9d08c2b000d66f79ba939efc8f4b556
- Sigstore transparency entry: 1296342875
- Sigstore integration time: Apr 14, 2026
Source repository:
- Permalink: ElenYoung/QFeatureLib@73bdc07db1be803f9ec0e0199e5bbd3abb278b57
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/ElenYoung
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@73bdc07db1be803f9ec0e0199e5bbd3abb278b57
- Trigger Event: push
