
Financial data processing, feature engineering and AI agent toolkit for Python


finasys

From raw market data to ML-ready features in five lines of code.


Documentation: finasys Docs


finasys is a toolkit that replaces manual wrangling of financial data with ready-made processing for ML pipelines and AI agents. It takes you from raw market data to production-ready features in a few lines of code, whether you're building trading models, running portfolio analysis, or powering financial AI agents.

finasys is Polars-first: every indicator and feature runs as a native Polars expression, making it 10-100x faster than pandas-based alternatives, with zero C dependencies (no ta-lib build headaches). It supports 37+ international markets, crypto, forex, commodities, and macro indicators out of the box. Learn more in the official documentation, or start contributing via the GitHub repo.

Quick Start

import finasys as fs

# Load stock data (auto-cached with DuckDB)
df = fs.load("AAPL", start="2024-01-01")

# Add technical indicators + returns in one call
df = fs.features.add_all(df)

# Generate an LLM-ready summary
print(fs.agents.summarize(df))

Install

pip install finasys

Optional extras:

pip install "finasys[langchain]"   # LangChain tool integration
pip install "finasys[pandas]"      # Pandas interop
pip install "finasys[all]"         # Everything

Features

Data Sources (fs.load())

  • Single fs.load() entry point for Yahoo Finance, CSV, and Parquet files
  • Standardized OHLCV column names across all sources
  • DuckDB-backed local caching (second call is instant)
  • Multi-symbol fetching with automatic alignment

df = fs.load("AAPL", start="2024-01-01")
df = fs.load(["AAPL", "GOOGL", "MSFT"], start="2024-01-01")
df = fs.load("./data/prices.csv")

Feature Engineering (fs.features)

  • 15+ technical indicators, including RSI, MACD, Bollinger Bands, ATR, VWAP, OBV, Stochastic, ADX, CCI, Williams %R, MFI, ROC, and Momentum
  • Returns: simple, log, cumulative, drawdown
  • Rolling statistics: mean, std, min, max, skew, z-score
  • Lag features with built-in look-ahead bias protection
  • Calendar features: day of week, month, quarter
  • Cross-sectional: rank, percentile, z-score across symbols

All are implemented as pure Polars expressions: no ta-lib C dependency, and 10-100x faster than pandas-ta.

df = fs.features.rsi(df, period=14)
df = fs.features.macd(df)
df = fs.features.returns(df, periods=[1, 5, 21])
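
To make the return definitions listed above concrete, here is a minimal plain-Python sketch of simple returns, log returns, and drawdown. It is illustrative only and is not how finasys computes them (the library uses Polars expressions); the function names `simple_returns`, `log_returns`, and `drawdown` are assumptions chosen for this example.

```python
import math

def simple_returns(prices, period=1):
    """Fractional change over `period` bars; the first `period` entries are None."""
    out = [None] * len(prices)
    for i in range(period, len(prices)):
        out[i] = prices[i] / prices[i - period] - 1.0
    return out

def log_returns(prices, period=1):
    """Natural log of the price ratio over `period` bars."""
    out = [None] * len(prices)
    for i in range(period, len(prices)):
        out[i] = math.log(prices[i] / prices[i - period])
    return out

def drawdown(prices):
    """Fractional decline from the running peak (0.0 at a new high)."""
    peak = float("-inf")
    out = []
    for p in prices:
        peak = max(peak, p)
        out.append(p / peak - 1.0)
    return out

prices = [100.0, 110.0, 99.0, 120.0]
print(simple_returns(prices))
print(log_returns(prices))
print(drawdown(prices))
```

Each value at position i depends only on prices at or before i, which is the property that keeps such features free of look-ahead bias.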

Target / Label Engineering (fs.features)

  • Forward returns for regression targets
  • Ternary classification labels (up/flat/down) with configurable thresholds
  • Triple-barrier labeling (López de Prado method), a widely used labeling scheme in financial ML
  • Volatility-adjusted labels that adapt to the current regime

# Forward returns for regression
df = fs.features.forward_returns(df, periods=[1, 5])

# Classification labels
df = fs.features.classify_returns(df, period=5, thresholds=(-0.01, 0.01))

# Triple-barrier method
df = fs.features.triple_barrier_labels(df, profit_take=0.02, stop_loss=0.02, max_holding=10)

# Volatility-adjusted labels (adapts to regime)
df = fs.features.volatility_adjusted_labels(df, period=5, vol_multiplier=1.0)
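
The triple-barrier idea can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not finasys's implementation: it checks close prices only (no intrabar highs or lows), uses fixed percentage barriers, and leaves a bar unlabeled (`None`) when fewer than `max_holding` future bars remain and no barrier was hit. The parameter names mirror the call shown above; the function body is an assumption.

```python
def triple_barrier_labels(closes, profit_take=0.02, stop_loss=0.02, max_holding=10):
    """Label each bar by the first barrier its forward path touches.

    +1   -> return from entry reaches +profit_take first
    -1   -> return from entry reaches -stop_loss first
     0   -> neither is hit within max_holding bars (vertical barrier)
    None -> window runs past the end of the data without a hit
    """
    n = len(closes)
    labels = []
    for i in range(n):
        entry = closes[i]
        label = None
        horizon = min(i + max_holding, n - 1)
        for j in range(i + 1, horizon + 1):
            ret = closes[j] / entry - 1.0
            if ret >= profit_take:
                label = 1
                break
            if ret <= -stop_loss:
                label = -1
                break
        else:
            # No horizontal barrier was hit; label 0 only if the full window was observed.
            if i + max_holding <= n - 1:
                label = 0
        labels.append(label)
    return labels

closes = [100.0, 100.5, 100.2, 100.1, 103.0, 99.0, 99.5, 99.2]
print(triple_barrier_labels(closes, profit_take=0.02, stop_loss=0.02, max_holding=3))
```

Because every label looks forward in time, the label columns are targets only and should never be fed back in as model inputs.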

Distribution Features (fs.features)

  • Rolling kurtosis, skewness, tail ratio: capture fat-tail dynamics
  • Rolling Jarque-Bera normality test
  • Z-score of returns vs rolling distribution

df = fs.features.rolling_kurtosis(df, window=30)
df = fs.features.rolling_skewness(df, window=30)
df = fs.features.tail_ratio(df, window=30)
df = fs.features.zscore_returns(df, window=30)
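
For readers who want to see what these statistics measure, the sketch below computes skewness, excess kurtosis, and the Jarque-Bera statistic JB = n/6 * (S^2 + K^2/4) over a trailing window in plain Python. It uses population (biased) moments and is an assumption-laden illustration, not the finasys implementation; `moments`, `jarque_bera`, and `rolling` are hypothetical helper names.

```python
def moments(xs):
    """Return (skewness, excess kurtosis) from population (biased) central moments."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0
    return skew, excess_kurt

def jarque_bera(xs):
    """Jarque-Bera statistic; larger values indicate stronger departure from normality."""
    skew, excess_kurt = moments(xs)
    return len(xs) / 6.0 * (skew ** 2 + excess_kurt ** 2 / 4.0)

def rolling(fn, xs, window):
    """Apply fn to each trailing window; positions without a full window are None."""
    out = [None] * len(xs)
    for i in range(window - 1, len(xs)):
        out[i] = fn(xs[i - window + 1 : i + 1])
    return out

sample = [-2.0, -1.0, 0.0, 1.0, 2.0, 4.0, -3.0]
print(moments(sample[:5]))
print(rolling(jarque_bera, sample, window=5))
```

A window with zero variance (all values equal) would divide by zero here, so a production version would need to guard that case.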

Risk & Performance Metrics (fs.stats)

  • Sharpe, Sortino, Calmar ratios
  • Value at Risk (historical, parametric, Cornish-Fisher)
  • Conditional VaR (Expected Shortfall)
  • CAPM alpha/beta, information ratio
  • Max drawdown duration tracking
  • Dual mode: scalar for reporting, rolling columns for ML features

# Scalar metrics (whole-series)
sharpe = fs.stats.sharpe_ratio(df)                         # => 1.47
var = fs.stats.value_at_risk(df, confidence=0.95)           # => -0.0216
cvar = fs.stats.cvar(df, confidence=0.95)                   # => -0.0285

# Rolling metrics (ML features)
df = fs.stats.sharpe_ratio(df, window=63)                   # adds sharpe_63
df = fs.stats.value_at_risk(df, window=63)                  # adds var_63
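
As a rough guide to what the scalar metrics mean, here is a plain-Python sketch of an annualized Sharpe ratio, historical VaR, and historical CVaR. The conventions are assumptions made for this example (252 periods per year, VaR reported as a negative return quantile, a simple order-statistic quantile) and may differ from what finasys uses.

```python
import math
import statistics

def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized mean excess return divided by its sample standard deviation."""
    excess = [r - risk_free / periods_per_year for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(periods_per_year)

def historical_var(returns, confidence=0.95):
    """Empirical (1 - confidence) quantile of returns; negative for a loss."""
    ordered = sorted(returns)
    # round() guards against float error such as (1 - 0.9) * 10 == 0.9999999999999998
    idx = int(round((1.0 - confidence) * len(ordered), 9))
    return ordered[min(idx, len(ordered) - 1)]

def historical_cvar(returns, confidence=0.95):
    """Mean of the returns at or below the VaR threshold (expected shortfall)."""
    var = historical_var(returns, confidence)
    tail = [r for r in returns if r <= var]
    return sum(tail) / len(tail)

returns = [i / 100 for i in range(-10, 10)]  # toy data: -0.10, -0.09, ..., 0.09
print(sharpe_ratio(returns))
print(historical_var(returns, confidence=0.95))
print(historical_cvar(returns, confidence=0.95))
```

CVaR is never greater than VaR under this convention, since it averages only the returns in the tail beyond the VaR cut-off.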

Smart Profiler (fs.profiler)

  • One-call data quality assessment for financial time series
  • Detects: missing dates, price outliers, suspected stock splits, zero-volume days
  • Distribution analysis: skewness, kurtosis, Jarque-Bera normality test, tail ratio
  • LLM-ready text summaries and JSON-serializable structured reports

# Text summary (great for LLM system prompts)
print(fs.profiler.profile_summary(df))
# DATA PROFILE | 252 rows x 7 columns
# Quality issues: 9 missing dates; 11 price outliers
# Returns distribution: skew=0.501, kurtosis=3.647, non-normal (JB p=0.0000)

# Full structured report
report = fs.profiler.profile(df)
report.quality.missing_dates      # ['2024-01-15', '2024-02-19', ...]
report.distribution.is_normal     # False
report.to_dict()                  # JSON-serializable
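
One of the checks listed above, missing dates, can be approximated with the standard library alone. The sketch below flags weekdays absent from a date series; it is a hypothetical heuristic, not the profiler's actual logic, and `missing_weekdays` is an assumed name. A weekday-only calendar also flags exchange holidays: the two dates in the sample report above, 2024-01-15 and 2024-02-19, are US market holidays.

```python
from datetime import date, timedelta

def missing_weekdays(dates):
    """Return weekdays between the first and last date that are absent from `dates`."""
    present = set(dates)
    day, end = min(present), max(present)
    missing = []
    while day <= end:
        # weekday() is 0-4 for Monday-Friday, 5-6 for the weekend
        if day.weekday() < 5 and day not in present:
            missing.append(day)
        day += timedelta(days=1)
    return missing

observed = [date(2024, 1, 11), date(2024, 1, 12), date(2024, 1, 16), date(2024, 1, 17)]
print(missing_weekdays(observed))
```

Supplying an exchange holiday calendar and subtracting it from the result would separate genuine data gaps from scheduled closures.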

AI Agent Tools (fs.agents)

  • LLM-ready summaries of financial DataFrames
  • Tool definitions in OpenAI function-calling format
  • Context extraction for RAG-style usage
  • Schema descriptions for system prompts
  • LangChain integration (optional)

summary = fs.agents.summarize(df)
tools = fs.agents.tools(symbols=["AAPL", "GOOGL"])

# Optional LangChain integration (requires the finasys[langchain] extra)
from finasys.agents.langchain import get_tools
lc_tools = get_tools(symbols=["AAPL"])
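
For orientation, a tool definition in the OpenAI function-calling format is a JSON-serializable dict with a name, a description, and a JSON Schema for the parameters. The sketch below builds one by hand; the tool name `get_price_summary`, its parameters, and the helper `make_price_tool` are hypothetical and are not taken from what `fs.agents.tools()` actually returns.

```python
import json

def make_price_tool(symbols):
    """Build one tool definition in the OpenAI function-calling layout."""
    return {
        "type": "function",
        "function": {
            "name": "get_price_summary",  # hypothetical tool name
            "description": "Summarize recent price action for one symbol.",
            "parameters": {
                "type": "object",
                "properties": {
                    # Restricting to an enum keeps the model from inventing tickers.
                    "symbol": {"type": "string", "enum": list(symbols)},
                    "start": {"type": "string", "description": "ISO date, e.g. 2024-01-01"},
                },
                "required": ["symbol"],
            },
        },
    }

tool = make_price_tool(["AAPL", "GOOGL"])
print(json.dumps(tool, indent=2))
```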

Composable Pipelines (fs.FeatureSet)

Serializable, reproducible feature pipelines with 17 built-in step classes.

pipeline = fs.FeatureSet([
    fs.features.RSI(period=14),
    fs.features.Returns(periods=[1, 5, 21]),
    fs.features.RollingStats(windows=[5, 21]),
    fs.features.RollingKurtosis(window=30),
    fs.features.ForwardReturns(periods=[1, 5]),
    fs.features.TripleBarrier(profit_take=0.02, stop_loss=0.02),
])
df = pipeline.transform(df)
pipeline.save("pipeline.json")  # version control your feature engineering
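
The general pattern behind a serializable pipeline is to store each step as a class name plus its parameters, then rebuild the steps from a registry on load. The sketch below shows that pattern on a dict of lists; it is an assumption about the design, not finasys's code, and `Lag`, `PctChange`, `Pipeline`, and `REGISTRY` are hypothetical names.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class Lag:
    column: str
    periods: int = 1

    def transform(self, table):
        values = table[self.column]
        # Shift values forward so each row only sees past data.
        shifted = ([None] * self.periods + values)[: len(values)]
        return {**table, f"{self.column}_lag_{self.periods}": shifted}

@dataclass
class PctChange:
    column: str

    def transform(self, table):
        v = table[self.column]
        out = [None] + [v[i] / v[i - 1] - 1.0 for i in range(1, len(v))]
        return {**table, f"{self.column}_ret": out}

REGISTRY = {cls.__name__: cls for cls in (Lag, PctChange)}

class Pipeline:
    def __init__(self, steps):
        self.steps = list(steps)

    def transform(self, table):
        for step in self.steps:
            table = step.transform(table)
        return table

    def to_json(self):
        return json.dumps([{"step": type(s).__name__, "params": asdict(s)} for s in self.steps])

    @classmethod
    def from_json(cls, text):
        return cls(REGISTRY[d["step"]](**d["params"]) for d in json.loads(text))

table = {"close": [100.0, 110.0, 121.0]}
pipeline = Pipeline([Lag("close", periods=1), PctChange("close")])
restored = Pipeline.from_json(pipeline.to_json())
print(restored.transform(table))
```

Because the saved form contains only step names and parameters, it diffs cleanly under version control and reproduces the same features when reloaded.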

Why finasys?

Feature               | finasys                             | pandas-ta             | ta-lib
Engine                | Polars (fast)                       | pandas (slow)         | C library
Install               | pip install finasys                 | pip install pandas-ta | Requires C build tools
ML Targets            | Triple-barrier, vol-adjusted labels | None                  | None
Risk Metrics          | Sharpe, VaR, CVaR, alpha/beta       | None                  | None
Data Profiling        | Financial-specific quality checks   | None                  | None
AI Agent support      | Built-in                            | None                  | None
Caching               | DuckDB auto-cache                   | None                  | None
Look-ahead protection | Built-in                            | None                  | None

License

Apache-2.0
