High-performance quantitative factor analysis and purification toolkit

AlphaPurify: Factor analytics for quants

AlphaPurify is a Python library for financial data aggregation, factor construction, IC testing, factor return attribution, full-pipeline backtesting, and large-scale experimentation. It helps quants validate ideas rapidly.



AlphaPurify comprises four main modules:

  1. alphapurify.FactorAnalyzer — for IC testing and quantile portfolio analysis to evaluate factor predictive ability.
  2. alphapurify.AlphaPurifier — for factor preprocessing, including 40+ Winsorization, Neutralization, and Standardization methods.
  3. alphapurify.Database — for reading, writing, and aggregating financial and factor datasets.
  4. alphapurify.Exposures — for factor correlation analysis and factor-based return attribution.
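To make the term "IC testing" concrete, here is a minimal pandas sketch of a per-date rank IC (the Spearman correlation between factor values and forward returns across symbols). It is not AlphaPurify's implementation: the function name `rank_ic_by_date` and the `fwd_ret` column are assumptions made for illustration.

```python
import pandas as pd


def rank_ic_by_date(df, date_col="datetime", factor_col="factor", ret_col="fwd_ret"):
    """Rank IC per date: Pearson correlation of cross-sectional ranks.

    Hypothetical helper for illustration; not part of AlphaPurify.
    """
    ics = {
        date: g[factor_col].rank().corr(g[ret_col].rank())
        for date, g in df.groupby(date_col)
    }
    return pd.Series(ics, name="rank_ic")


# Toy data: the factor orders returns perfectly on day 1 and inversely on day 2.
toy = pd.DataFrame({
    "datetime": ["2024-01-01"] * 4 + ["2024-01-02"] * 4,
    "symbol": ["A", "B", "C", "D"] * 2,
    "factor": [0.1, 0.2, 0.3, 0.4] * 2,
    "fwd_ret": [0.01, 0.02, 0.03, 0.04, 0.04, 0.03, 0.02, 0.01],
})
ic = rank_ic_by_date(toy)
print(ic)
```

Averaging this series over time gives the mean IC that factor reports usually quote.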

Why AlphaPurify?

Unlike traditional factor research tools, AlphaPurify needs only a DataFrame.

• Optimized for single-machine research

Many independent researchers work on a single laptop where memory overflow and slow computation are common issues.
AlphaPurify is designed with optimized caching, vectorized computation, and multiprocessing wherever possible.

For example, a full factor evaluation of a 15-year daily dataset for the CSI 300 universe (long-only, long-short, and short portfolios plus IC analysis) completes in around 30 seconds on a typical laptop.

• Adaptive to arbitrary bar frequency

AlphaPurify works with any bar frequency (daily, hourly, minute-level, etc.).
Return aggregation automatically adapts to the data frequency, while allowing users to explicitly specify the horizon if needed.

The framework is carefully designed to strictly prevent look-ahead bias.
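As an illustration of what a frequency-agnostic, look-ahead-free return label can look like, the sketch below computes a forward return over `horizon` bars per symbol. This is a common convention, not a description of AlphaPurify's internals; the helper name `add_forward_return` and the `fwd_ret` column are assumptions.

```python
import pandas as pd


def add_forward_return(df, horizon=1, date_col="datetime",
                       symbol_col="symbol", price_col="close"):
    """Attach the return from bar t to bar t + horizon, computed per symbol.

    Hypothetical helper for illustration; not part of AlphaPurify.
    """
    out = df.sort_values([symbol_col, date_col]).reset_index(drop=True)
    # shift(-horizon) within each symbol pulls the future price onto the
    # current row; bars without a future price get NaN instead of borrowing
    # a price from another symbol.
    future_price = out.groupby(symbol_col)[price_col].shift(-horizon)
    out["fwd_ret"] = future_price / out[price_col] - 1.0
    return out


bars = pd.DataFrame({
    "datetime": pd.to_datetime(
        ["2024-01-01 09:30", "2024-01-01 09:31", "2024-01-01 09:32"] * 2
    ),
    "symbol": ["AAPL"] * 3 + ["MSFT"] * 3,
    "close": [100.0, 110.0, 121.0, 50.0, 55.0, 44.0],
})
labeled = add_forward_return(bars, horizon=1)
print(labeled)
```

Because the horizon is counted in bars, the same code works for daily, hourly, or minute data. The factor observed at bar t is only ever paired with a return that starts at bar t, which is the alignment that avoids look-ahead bias.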

• Professional factor preprocessing toolkit

AlphaPurify provides 40+ built-in preprocessing methods for factor research, including common operations such as:

  • winsorization
  • neutralization
  • standardization

This allows researchers to rapidly experiment with different factor cleaning pipelines.
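For readers unfamiliar with these three operations, here is a minimal pandas/NumPy sketch of one variant of each, applied to a single cross-section. The function names, the unscaled 3-MAD clipping rule, and the `size` exposure are assumptions chosen for illustration; they are not AlphaPurify's own methods.

```python
import numpy as np
import pandas as pd


def mad_winsorize(x, n_mad=3.0):
    """Clip values to median +/- n_mad * MAD (unscaled MAD, an assumption)."""
    med = x.median()
    mad = (x - med).abs().median()
    return x.clip(lower=med - n_mad * mad, upper=med + n_mad * mad)


def zscore(x):
    """Standardize to zero mean and unit (population) standard deviation."""
    return (x - x.mean()) / x.std(ddof=0)


def neutralize(y, exposures):
    """Return OLS residuals of y on an intercept plus the exposure columns."""
    A = np.column_stack([np.ones(len(y)), exposures.to_numpy(dtype=float)])
    beta, *_ = np.linalg.lstsq(A, y.to_numpy(dtype=float), rcond=None)
    return pd.Series(y.to_numpy(dtype=float) - A @ beta, index=y.index)


raw = pd.Series([1.0, 2.0, 3.0, 4.0, 100.0])
clipped = mad_winsorize(raw)   # median 3, MAD 1 -> bounds [0, 6], so 100 becomes 6
scaled = zscore(clipped)
size = pd.DataFrame({"size": [0.5, 1.0, 1.5, 2.0, 2.5]})  # hypothetical exposure
residual = neutralize(scaled, size)
print(clipped.tolist())
```

In a panel setting, each function would typically be applied per date, for example through `groupby(date_col)[factor_col].transform(...)`.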

• Lightweight high-performance data backend

AlphaPurify integrates a fast Parquet + DuckDB data layer for factor storage and aggregation.

This avoids the need for configuring complex database systems while still providing high-performance querying and fast factor construction workflows.


Quick Start

1. Install with pip

Install AlphaPurify with pip:

pip install alphapurify

Note: pip installs the latest stable release of AlphaPurify. The main branch is under active development; to test the latest scripts or functions from the main branch, install AlphaPurify from a clone of the repository.


2. Load your DataFrame

datetime          symbol  close  volume  factor  momentum_12_1  vol_60  beta_252
2024-01-01 09:30  AAPL    189.9  120034    0.42           0.15    0.21      1.08
2024-01-01 09:31  AAPL    190.0   98321    0.38           0.16    0.22      1.07
2024-01-01 09:32  AAPL    190.4  101245    0.41           0.17    0.23      1.06
2024-01-01 09:30  MSFT    378.5   84211   -0.15          -0.05    0.18      0.95
2024-01-01 09:31  MSFT    378.9   90122   -0.12          -0.04    0.19      0.96
2024-01-01 09:32  MSFT    379.1   95433   -0.08          -0.03    0.20      0.97
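One possible way to get such a long-format DataFrame into memory is sketched below with pandas, using the sample rows above as inline CSV. In practice the data would come from your own files or database. Note that the sample table names the factor column `factor`, while the code in step 3 refers to `alpha_003`; whatever name you pass should match a column in your DataFrame.

```python
import io

import pandas as pd

# Sample rows from the table above, written as CSV for illustration.
raw = """datetime,symbol,close,volume,factor,momentum_12_1,vol_60,beta_252
2024-01-01 09:30,AAPL,189.9,120034,0.42,0.15,0.21,1.08
2024-01-01 09:31,AAPL,190.0,98321,0.38,0.16,0.22,1.07
2024-01-01 09:32,AAPL,190.4,101245,0.41,0.17,0.23,1.06
2024-01-01 09:30,MSFT,378.5,84211,-0.15,-0.05,0.18,0.95
2024-01-01 09:31,MSFT,378.9,90122,-0.12,-0.04,0.19,0.96
2024-01-01 09:32,MSFT,379.1,95433,-0.08,-0.03,0.20,0.97
"""

# One row per (datetime, symbol) pair; parse the timestamp column up front.
df = pd.read_csv(io.StringIO(raw), parse_dates=["datetime"])
print(df.dtypes)
```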

3. Create reports

from alphapurify import AlphaPurifier, FactorAnalyzer, Pure_Exposures

# preprocess
df = (
    AlphaPurifier(df, factor_col="alpha_003")
    .winsorize(method="mad")
    .standardize(method="zscore")
    .to_result()
)

# backtest
FA = FactorAnalyzer(base_df=df,
                    trade_date_col='datetime',
                    symbol_col='symbol',
                    price_col='close',
                    factor_name='alpha_003')
FA.run()
FA.create_long_return_sheet()
FA.create_long_short_return_sheet()
FA.create_short_return_sheet()
FA.create_single_fac_ic_sheet()

# contributions of other factors
Ex = Pure_Exposures(
    base_df=df,
    trade_date_col='datetime',
    symbol_col='symbol',
    price_col='close',
    factor_name='alpha_003',
    exposure_cols=['momentum_12_1', 'vol_60', 'beta_252'],
)

Ex.run()
Ex.plot_pure_exposures()
Ex.plot_pure_returns()
Ex.plot_pure_exposures_and_returns()
Ex.plot_correlations()

Examples of Outputs

Portfolio for long positions only:

[figure omitted]

Contributions of other factors:

[three figures omitted]


P.S.

More detailed documentation and examples will be released soon.

Suggestions and improvements are welcome. Feel free to open an issue, submit a pull request, or contact me via email.


Elias Wu
