Financial data processing, feature engineering and AI agent toolkit for Python
From raw market data to ML-ready features in five lines of code.
Documentation: finasys Docs
finasys is a toolkit for financial data processing in ML pipelines and AI agents, built to replace manual data wrangling. It lets you go from raw market data to production-ready features in a few lines of code, whether you're building trading models, running portfolio analysis, or powering financial AI agents.
finasys is Polars-first: every indicator and feature runs as a native Polars expression, making it 10-100x faster than pandas-based alternatives, with zero C dependencies (no ta-lib build headaches). It supports 37+ international markets, crypto, forex, commodities, and macro indicators out of the box. Learn more in our official documentation, or start contributing via the GitHub repo.
Quick Start
import finasys as fs
# Load stock data (auto-cached with DuckDB)
df = fs.load("AAPL", start="2024-01-01")
# Add technical indicators + returns in one call
df = fs.features.add_all(df)
# Generate an LLM-ready summary
print(fs.agents.summarize(df))
Install
pip install finasys
Optional extras:
pip install finasys[langchain] # LangChain tool integration
pip install finasys[pandas] # Pandas interop
pip install finasys[all] # Everything
Features
Data Sources (fs.load())
- Single fs.load() entry point for Yahoo Finance, CSV, and Parquet files
- Standardized OHLCV column names across all sources
- DuckDB-backed local caching (second call is instant)
- Multi-symbol fetching with automatic alignment
df = fs.load("AAPL", start="2024-01-01")
df = fs.load(["AAPL", "GOOGL", "MSFT"], start="2024-01-01")
df = fs.load("./data/prices.csv")
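To make "standardized OHLCV column names" concrete, here is a minimal standard-library sketch of the idea. It does not show finasys's internal mapping: the `ALIASES` table and the `standardize_columns` helper are hypothetical names chosen for illustration.

```python
# Hypothetical alias table (an assumption; finasys's real mapping is not shown in this README).
ALIASES = {
    "date": "date", "datetime": "date", "timestamp": "date",
    "open": "open", "high": "high", "low": "low", "close": "close",
    "adj close": "adj_close", "adj_close": "adj_close",
    "volume": "volume", "vol": "volume",
}

def standardize_columns(names):
    """Map source-specific column names onto one lowercase OHLCV schema."""
    out = []
    for name in names:
        key = name.strip().lower().replace("-", " ")
        # Unknown columns are kept, lowercased, with spaces turned into underscores.
        out.append(ALIASES.get(key, key.replace(" ", "_")))
    return out

yahoo_style = ["Date", "Open", "High", "Low", "Close", "Adj Close", "Volume"]
csv_style = ["timestamp", "open", "high", "low", "close", "vol"]
print(standardize_columns(yahoo_style))
print(standardize_columns(csv_style))
```

With one schema in place, every downstream feature function can rely on the same column names regardless of where the data came from.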
Feature Engineering (fs.features)
- 15+ technical indicators, including RSI, MACD, Bollinger Bands, ATR, VWAP, OBV, Stochastic, ADX, CCI, Williams %R, MFI, ROC, and Momentum
- Returns: simple, log, cumulative, drawdown
- Rolling statistics: mean, std, min, max, skew, z-score
- Lag features with built-in look-ahead bias protection
- Calendar features: day of week, month, quarter
- Cross-sectional: rank, percentile, z-score across symbols
All implemented in pure Polars expressions -- no ta-lib C dependency, 10-100x faster than pandas-ta.
df = fs.features.rsi(df, period=14)
df = fs.features.macd(df)
df = fs.features.returns(df, periods=[1, 5, 21])
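For readers who want the definitions behind the returns features, the following plain-Python sketch computes simple returns, log returns, and drawdown on a list of prices. It is independent of finasys and does not use Polars; the function names are illustrative assumptions, and prices are assumed to be positive.

```python
import math

def simple_returns(prices, period=1):
    """r_t = p_t / p_(t-period) - 1; the first `period` entries are None."""
    n = len(prices)
    return [None] * min(period, n) + [
        prices[i] / prices[i - period] - 1 for i in range(period, n)
    ]

def log_returns(prices, period=1):
    """ln(p_t / p_(t-period)); additive across time, unlike simple returns."""
    n = len(prices)
    return [None] * min(period, n) + [
        math.log(prices[i] / prices[i - period]) for i in range(period, n)
    ]

def drawdown(prices):
    """Fractional decline from the running peak (0.0 at a new high)."""
    out = []
    peak = prices[0]
    for p in prices:
        peak = max(peak, p)
        out.append(p / peak - 1)
    return out

prices = [100.0, 110.0, 99.0, 104.5, 121.0]
print(simple_returns(prices))
print(log_returns(prices))
print(drawdown(prices))
```

Each output only looks backward in time, so none of these columns leaks future information into a training row.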
Target / Label Engineering (fs.features)
- Forward returns for regression targets
- Ternary classification labels (up/flat/down) with configurable thresholds
- Triple-barrier labeling (Lopez de Prado method) -- the gold standard for financial ML
- Volatility-adjusted labels that adapt to the current regime
# Forward returns for regression
df = fs.features.forward_returns(df, periods=[1, 5])
# Classification labels
df = fs.features.classify_returns(df, period=5, thresholds=(-0.01, 0.01))
# Triple-barrier method
df = fs.features.triple_barrier_labels(df, profit_take=0.02, stop_loss=0.02, max_holding=10)
# Volatility-adjusted labels (adapts to regime)
df = fs.features.volatility_adjusted_labels(df, period=5, vol_multiplier=1.0)
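The triple-barrier idea can be sketched in a few lines of plain Python. This is an illustration of the concept under simplifying assumptions, not finasys's implementation: it uses fixed percentage barriers mirroring the `profit_take`, `stop_loss`, and `max_holding` parameters shown above (barriers are often scaled by a volatility estimate instead), and it returns `None` for bars too close to the end of the series to be resolved.

```python
def triple_barrier_labels(prices, profit_take=0.02, stop_loss=0.02, max_holding=10):
    """Label each bar by whichever barrier is touched first.

    +1   the return since entry reaches +profit_take within max_holding bars
    -1   the return since entry reaches -stop_loss within max_holding bars
     0   neither is hit before the time barrier at max_holding bars
    None not enough future bars remain and no barrier was hit
    """
    n = len(prices)
    labels = []
    for i in range(n):
        label = None
        horizon = min(i + max_holding, n - 1)
        for j in range(i + 1, horizon + 1):
            ret = prices[j] / prices[i] - 1
            if ret >= profit_take:
                label = 1
                break
            if ret <= -stop_loss:
                label = -1
                break
        else:
            # No horizontal barrier was touched; label 0 only if the full
            # holding period was actually observed.
            if i + max_holding <= n - 1:
                label = 0
        labels.append(label)
    return labels

prices = [100, 101, 103, 99, 97, 98, 98.5, 98, 97.5]
print(triple_barrier_labels(prices, profit_take=0.02, stop_loss=0.02, max_holding=3))
# [1, -1, -1, -1, 0, 0, None, None, None]
```

Because each label depends on future prices, these columns are targets only and should never be fed back in as features.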
Distribution Features (fs.features)
- Rolling kurtosis, skewness, tail ratio -- capture fat-tail dynamics
- Rolling Jarque-Bera normality test
- Z-score of returns vs rolling distribution
df = fs.features.rolling_kurtosis(df, window=30)
df = fs.features.rolling_skewness(df, window=30)
df = fs.features.tail_ratio(df, window=30)
df = fs.features.zscore_returns(df, window=30)
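As a reference for what the distribution features measure, the sketch below computes population skewness, excess kurtosis, and the Jarque-Bera statistic JB = n/6 * (S^2 + K^2/4) over trailing windows. It is a standard-library illustration; the helper names are assumptions, and finasys may use different estimators (for example, bias-corrected moments or non-excess kurtosis).

```python
import math

def skew_and_excess_kurtosis(xs):
    """Population skewness and excess kurtosis (both 0.0 for a normal distribution)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    if m2 == 0:
        return math.nan, math.nan  # undefined for a constant window
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

def jarque_bera(xs):
    """JB statistic: large values suggest the sample is not normally distributed."""
    skew, kurt = skew_and_excess_kurtosis(xs)
    return len(xs) / 6.0 * (skew ** 2 + kurt ** 2 / 4.0)

def rolling(xs, window, fn):
    """Apply fn to each trailing window; None until the window is full."""
    return [
        None if i + 1 < window else fn(xs[i + 1 - window : i + 1])
        for i in range(len(xs))
    ]

returns = [-0.02, -0.01, 0.0, 0.01, 0.02, 0.08]
print(rolling(returns, 5, jarque_bera))
```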
Risk & Performance Metrics (fs.stats)
- Sharpe, Sortino, Calmar ratios
- Value at Risk (historical, parametric, Cornish-Fisher)
- Conditional VaR (Expected Shortfall)
- CAPM alpha/beta, information ratio
- Max drawdown duration tracking
- Dual mode: scalar for reporting, rolling columns for ML features
# Scalar metrics (whole-series)
sharpe = fs.stats.sharpe_ratio(df) # => 1.47
var = fs.stats.value_at_risk(df, confidence=0.95) # => -0.0216
cvar = fs.stats.cvar(df, confidence=0.95) # => -0.0285
# Rolling metrics (ML features)
df = fs.stats.sharpe_ratio(df, window=63) # adds sharpe_63
df = fs.stats.value_at_risk(df, window=63) # adds var_63
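The dual scalar/rolling mode and the historical VaR and CVaR definitions can be sketched with the standard library alone. This is an illustrative sketch under stated assumptions, not finasys's code: the risk-free rate is taken as zero, the tail is the worst `round((1 - confidence) * n)` observations, and the function names and the `periods_per_year` parameter are hypothetical.

```python
import math
import statistics

def _sharpe(returns, periods_per_year):
    sd = statistics.stdev(returns)  # sample standard deviation; needs >= 2 points
    if sd == 0:
        return math.nan
    return statistics.mean(returns) / sd * math.sqrt(periods_per_year)

def sharpe_ratio(returns, window=None, periods_per_year=252):
    """Scalar when window is None; otherwise one value per row (None until full)."""
    if window is None:
        return _sharpe(returns, periods_per_year)
    return [
        None if i + 1 < window
        else _sharpe(returns[i + 1 - window : i + 1], periods_per_year)
        for i in range(len(returns))
    ]

def historical_var_cvar(returns, confidence=0.95):
    """Return (VaR, CVaR) as returns, so losses are negative numbers.

    VaR is the least severe return in the worst tail; CVaR (expected
    shortfall) is the mean of that tail, so CVaR <= VaR.
    """
    ordered = sorted(returns)
    tail_size = max(1, round((1 - confidence) * len(ordered)))
    tail = ordered[:tail_size]
    return tail[-1], sum(tail) / len(tail)

daily = [0.01, -0.01, 0.02, 0.0, -0.03, 0.015, 0.005, -0.02, 0.01, 0.0]
print(sharpe_ratio(daily))               # one number for reporting
print(sharpe_ratio(daily, window=5))     # one value per row for ML features
print(historical_var_cvar(daily, confidence=0.9))
```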
Smart Profiler (fs.profiler)
- One-call data quality assessment for financial time series
- Detects: missing dates, price outliers, suspected stock splits, zero-volume days
- Distribution analysis: skewness, kurtosis, Jarque-Bera normality test, tail ratio
- LLM-ready text summaries and JSON-serializable structured reports
# Text summary (great for LLM system prompts)
print(fs.profiler.profile_summary(df))
# DATA PROFILE | 252 rows x 7 columns
# Quality issues: 9 missing dates; 11 price outliers
# Returns distribution: skew=0.501, kurtosis=3.647, non-normal (JB p=0.0000)
# Full structured report
report = fs.profiler.profile(df)
report.quality.missing_dates # ['2024-01-15', '2024-02-19', ...]
report.distribution.is_normal # False
report.to_dict() # JSON-serializable
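Two of the quality checks listed above, missing dates and suspected splits, are simple enough to sketch directly. The code below is a rough standard-library illustration, not the profiler's actual logic: it treats every weekday as a trading day (so exchange holidays are flagged as missing), and the split ratios, tolerance, and function names are assumptions.

```python
from datetime import date, timedelta

def missing_weekdays(dates):
    """Weekdays between the first and last observation that have no row."""
    seen = set(dates)
    day, last = min(dates), max(dates)
    gaps = []
    while day <= last:
        if day.weekday() < 5 and day not in seen:  # Monday=0 ... Friday=4
            gaps.append(day)
        day += timedelta(days=1)
    return gaps

def suspected_splits(closes, ratios=(2, 3, 4, 5, 10), tol=0.05):
    """Flag (index, N) where the close falls to about 1/N of the prior close.

    Assumes strictly positive prices. A drop that clean is more often an
    unadjusted N-for-1 split than a genuine market move.
    """
    flagged = []
    for i in range(1, len(closes)):
        drop = closes[i - 1] / closes[i]
        for n in ratios:
            if abs(drop - n) / n <= tol:
                flagged.append((i, n))
                break
    return flagged

dates = [date(2024, 1, 12), date(2024, 1, 16), date(2024, 1, 17)]
print(missing_weekdays(dates))                       # [datetime.date(2024, 1, 15)]
print(suspected_splits([200.0, 202.0, 101.5, 102.0]))  # [(2, 2)]
```

A production check would likely consult an exchange calendar so that holidays are not reported as gaps.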
AI Agent Tools (fs.agents)
- LLM-ready summaries of financial DataFrames
- Tool definitions in OpenAI function-calling format
- Context extraction for RAG-style usage
- Schema descriptions for system prompts
- LangChain integration (optional)
summary = fs.agents.summarize(df)
tools = fs.agents.tools(symbols=["AAPL", "GOOGL"])
from finasys.agents.langchain import get_tools
lc_tools = get_tools(symbols=["AAPL"])
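For context on the OpenAI function-calling format mentioned above, a tool definition is a JSON object with a `type` of `function` and a JSON Schema describing the parameters. The sketch below builds one by hand; the tool name `load_prices`, its parameters, and the helper functions are hypothetical and are not the definitions that `fs.agents.tools()` returns.

```python
import json

def make_tool(name, description, properties, required):
    """Build one tool definition in the OpenAI function-calling shape."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }

def load_prices_tool(symbols):
    # Hypothetical tool: restricting `symbol` with an enum keeps the model
    # from requesting tickers the application has not loaded.
    return make_tool(
        name="load_prices",
        description="Load daily OHLCV prices for one symbol.",
        properties={
            "symbol": {"type": "string", "enum": list(symbols)},
            "start": {"type": "string", "description": "ISO date, e.g. 2024-01-01"},
        },
        required=["symbol"],
    )

tool = load_prices_tool(["AAPL", "GOOGL"])
print(json.dumps(tool, indent=2))
```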
Composable Pipelines (fs.FeatureSet)
Serializable, reproducible feature pipelines with 17 built-in step classes.
pipeline = fs.FeatureSet([
    fs.features.RSI(period=14),
    fs.features.Returns(periods=[1, 5, 21]),
    fs.features.RollingStats(windows=[5, 21]),
    fs.features.RollingKurtosis(window=30),
    fs.features.ForwardReturns(periods=[1, 5]),
    fs.features.TripleBarrier(profit_take=0.02, stop_loss=0.02),
])
df = pipeline.transform(df)
pipeline.save("pipeline.json") # version control your feature engineering
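One way a serializable pipeline of step classes can work is sketched below with dataclasses and JSON. Everything here is an assumption made for illustration: the `Pipeline` class, the `STEP_REGISTRY`, the `RollingMean` step, and the JSON layout are hypothetical and do not describe the format of the file that `pipeline.save()` writes.

```python
import json
from dataclasses import asdict, dataclass, field

@dataclass
class Returns:
    periods: list = field(default_factory=lambda: [1])

    def transform(self, cols):
        close = cols["close"]
        for p in self.periods:
            cols[f"return_{p}"] = [None] * min(p, len(close)) + [
                close[i] / close[i - p] - 1 for i in range(p, len(close))
            ]
        return cols

@dataclass
class RollingMean:
    window: int = 3

    def transform(self, cols):
        close, w = cols["close"], self.window
        cols[f"mean_{w}"] = [
            None if i + 1 < w else sum(close[i + 1 - w : i + 1]) / w
            for i in range(len(close))
        ]
        return cols

# Maps the step name stored in JSON back to the class that rebuilds it.
STEP_REGISTRY = {cls.__name__: cls for cls in (Returns, RollingMean)}

class Pipeline:
    def __init__(self, steps):
        self.steps = list(steps)

    def transform(self, cols):
        cols = dict(cols)  # shallow copy so the caller's mapping gains no new keys
        for step in self.steps:
            cols = step.transform(cols)
        return cols

    def to_json(self):
        return json.dumps(
            [{"step": type(s).__name__, "params": asdict(s)} for s in self.steps]
        )

    @classmethod
    def from_json(cls, text):
        return cls(STEP_REGISTRY[d["step"]](**d["params"]) for d in json.loads(text))

pipeline = Pipeline([Returns(periods=[1, 2]), RollingMean(window=2)])
restored = Pipeline.from_json(pipeline.to_json())
print(restored.transform({"close": [10.0, 11.0, 12.0]}))
```

Storing only step names and parameters keeps the saved file small and diff-friendly, which is what makes it practical to put under version control.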
Why finasys?
| | finasys | pandas-ta | ta-lib |
|---|---|---|---|
| Engine | Polars (fast) | pandas (slow) | C library |
| Install | pip install finasys | pip install pandas-ta | Requires C build tools |
| ML Targets | Triple-barrier, vol-adjusted labels | None | None |
| Risk Metrics | Sharpe, VaR, CVaR, alpha/beta | None | None |
| Data Profiling | Financial-specific quality checks | None | None |
| AI Agent support | Built-in | None | None |
| Caching | DuckDB auto-cache | None | None |
| Look-ahead protection | Built-in | None | None |
License
Apache-2.0
File details
Details for the file finasys-0.1.3.tar.gz.
File metadata
- Download URL: finasys-0.1.3.tar.gz
- Upload date:
- Size: 44.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7a8f4f3444506f1c0b5b3d3b39752ef1ae3a0a89b4aab688ceb903326ec2ccdc |
| MD5 | c31badbaa6ebad7f23fd9fca62361ea8 |
| BLAKE2b-256 | a9fd875856213f1fb4ebb7eebf77b2c65df675616c9387e7ffbbca3d1ac546d4 |
File details
Details for the file finasys-0.1.3-py3-none-any.whl.
File metadata
- Download URL: finasys-0.1.3-py3-none-any.whl
- Upload date:
- Size: 53.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 79c4869401032b5429369deff52c752ecb32669a9c0b1388532e0ae7180cf9e6 |
| MD5 | a2ef715a9e7a933b77d6a0beddb04a7c |
| BLAKE2b-256 | 23b6de649f5a2ee4e682dd3c384af96a6effef299294f91d8f17ed1fe0de1818 |