
High-performance quantitative factor analysis and purification toolkit


AlphaPurify: Factor analytics for quants

AlphaPurify is a Python library for financial data aggregation, factor construction, IC testing, factor return attribution, full-pipeline backtesting, and large-scale experimentation. It helps quants validate ideas rapidly.




AlphaPurify consists of four main modules:

  1. alphapurify.FactorAnalyzer — for IC testing and quantile portfolio analysis to evaluate factor predictive ability.
  2. alphapurify.AlphaPurifier — for factor preprocessing, including 40+ Winsorization, Neutralization, and Standardization methods.
  3. alphapurify.Database — for reading, writing, and aggregating financial and factor datasets.
  4. alphapurify.Exposures — for factor correlation analysis and factor-based return attribution.
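As background for the IC testing mentioned above, here is a minimal sketch of a per-date rank IC (Spearman correlation between factor values and forward returns), written in plain pandas. It does not use AlphaPurify's API; the function name `rank_ic_by_date`, the column `fwd_ret`, and the toy values are assumptions made up for illustration.

```python
import pandas as pd


def rank_ic_by_date(df, factor_col, fwd_ret_col, date_col="datetime"):
    """Rank IC per date: Spearman correlation of factor vs. forward return."""

    def _ic(group):
        # Spearman correlation equals the Pearson correlation of the ranks.
        return group[factor_col].rank().corr(group[fwd_ret_col].rank())

    return df.groupby(date_col)[[factor_col, fwd_ret_col]].apply(_ic)


# Toy panel (made-up numbers): two dates, four symbols each.
panel = pd.DataFrame(
    {
        "datetime": ["d1"] * 4 + ["d2"] * 4,
        "symbol": ["A", "B", "C", "D"] * 2,
        "factor": [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0],
        # On d1 returns rise with the factor; on d2 they fall with it.
        "fwd_ret": [0.01, 0.02, 0.03, 0.04, 0.04, 0.03, 0.02, 0.01],
    }
)

ic = rank_ic_by_date(panel, "factor", "fwd_ret")
mean_ic = ic.mean()
```

An IC series that stays consistently positive (or negative) across dates suggests predictive ability; here the two dates cancel out, so the mean IC is zero.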

Why AlphaPurify?

Unlike traditional factor research tools, AlphaPurify needs only a DataFrame.

• Optimized for single-machine research

Many independent researchers work on a single laptop where memory overflow and slow computation are common issues.
AlphaPurify is designed with optimized caching, vectorized computation, and multiprocessing wherever possible.

For example, a 15-year daily dataset of the CSI 300 universe can complete full factor evaluation — including long-only, long-short, short portfolios and IC analysis — in around 30 seconds on a typical laptop.

• Adaptive to arbitrary bar frequency

AlphaPurify works with any bar frequency (daily, hourly, minute-level, etc.).
Return aggregation automatically adapts to the data frequency, while allowing users to explicitly specify the horizon if needed.

The framework is carefully designed to strictly prevent look-ahead bias.
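To make the look-ahead point concrete, the sketch below labels each bar with a forward return over an explicit horizon, using only pandas. It is an illustration of the general idea, not AlphaPurify's internal implementation; `add_forward_return` and the `fwd_ret_1` column are hypothetical names.

```python
import pandas as pd


def add_forward_return(df, horizon=1, date_col="datetime",
                       symbol_col="symbol", price_col="close"):
    """Attach the return from bar t to bar t + horizon, per symbol.

    The factor observed at bar t is paired only with prices that come
    after t, so the label never leaks into the signal.
    """
    out = df.sort_values([symbol_col, date_col]).copy()
    # shift(-horizon) within each symbol pulls the future price back to bar t.
    future_price = out.groupby(symbol_col)[price_col].shift(-horizon)
    out[f"fwd_ret_{horizon}"] = future_price / out[price_col] - 1.0
    return out


bars = pd.DataFrame(
    {
        "datetime": pd.to_datetime(
            ["2024-01-01 09:30", "2024-01-01 09:31", "2024-01-01 09:32"] * 2
        ),
        "symbol": ["AAPL"] * 3 + ["MSFT"] * 3,
        "close": [189.9, 190.0, 190.4, 378.5, 378.9, 379.1],
    }
)

labeled = add_forward_return(bars, horizon=1)
```

The last `horizon` bars of each symbol have no future price, so their forward return is NaN and should be dropped rather than filled.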

• Professional factor preprocessing toolkit

AlphaPurify provides 40+ built-in preprocessing methods for factor research, including common operations such as:

  • winsorization
  • neutralization
  • standardization

This allows researchers to rapidly experiment with different factor cleaning pipelines.
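For readers new to these three operations, here is a minimal sketch of one common variant of each, applied to a single cross-section with pandas and NumPy. These are generic textbook versions, not AlphaPurify's own methods; the function names, the `n_mad` threshold, and the `size` exposure are assumptions for illustration.

```python
import numpy as np
import pandas as pd


def winsorize_mad(x, n_mad=3.0):
    """Clip values farther than n_mad median absolute deviations from the median."""
    med = x.median()
    mad = (x - med).abs().median()  # raw MAD, no 1.4826 scaling constant
    return x.clip(lower=med - n_mad * mad, upper=med + n_mad * mad)


def zscore(x):
    """Standardize to zero mean and unit (population) standard deviation."""
    return (x - x.mean()) / x.std(ddof=0)


def neutralize(y, exposures):
    """Return residuals of an OLS regression of y on the exposures plus an intercept."""
    X = np.column_stack([np.ones(len(y)), exposures.to_numpy(dtype=float)])
    beta, *_ = np.linalg.lstsq(X, y.to_numpy(dtype=float), rcond=None)
    return pd.Series(y.to_numpy(dtype=float) - X @ beta, index=y.index)


# One cross-section with an obvious outlier (made-up values).
raw = pd.Series([0.1, 0.2, 0.3, 0.4, 10.0])
exposures = pd.DataFrame({"size": [1.0, 2.0, 3.0, 4.0, 5.0]})

clipped = winsorize_mad(raw)            # outlier pulled in to median + 3 * MAD
scaled = zscore(clipped)                # mean 0, std 1
pure = neutralize(scaled, exposures)    # uncorrelated with the size exposure
```

In a panel, each function would typically be applied per date, for example through `groupby(...).transform(...)`, so that cleaning only uses information from the same cross-section.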

• Lightweight high-performance data backend

AlphaPurify integrates a fast Parquet + DuckDB data layer for factor storage and aggregation.

This avoids the need for configuring complex database systems while still providing high-performance querying and fast factor construction workflows.


Quick Start

1. Install with pip

Install AlphaPurify with pip:

pip install alphapurify

Note: pip installs the latest stable release of AlphaPurify. The main branch is under active development; to test the latest scripts or functions on main, clone the repository and install AlphaPurify from source.


2. Load your DataFrame

datetime          symbol  close  volume  factor  momentum_12_1  vol_60  beta_252
2024-01-01 09:30  AAPL    189.9  120034    0.42           0.15    0.21      1.08
2024-01-01 09:31  AAPL    190.0   98321    0.38           0.16    0.22      1.07
2024-01-01 09:32  AAPL    190.4  101245    0.41           0.17    0.23      1.06
2024-01-01 09:30  MSFT    378.5   84211   -0.15          -0.05    0.18      0.95
2024-01-01 09:31  MSFT    378.9   90122   -0.12          -0.04    0.19      0.96
2024-01-01 09:32  MSFT    379.1   95433   -0.08          -0.03    0.20      0.97
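For experimentation, the sample rows above can be built directly as a pandas DataFrame. This is only one way to get a frame in the expected long format (one row per timestamp and symbol); in practice the data would more likely come from a CSV or Parquet file.

```python
import pandas as pd

# Long-format panel: one row per (datetime, symbol) pair.
df = pd.DataFrame(
    {
        "datetime": pd.to_datetime(
            ["2024-01-01 09:30", "2024-01-01 09:31", "2024-01-01 09:32"] * 2
        ),
        "symbol": ["AAPL"] * 3 + ["MSFT"] * 3,
        "close": [189.9, 190.0, 190.4, 378.5, 378.9, 379.1],
        "volume": [120034, 98321, 101245, 84211, 90122, 95433],
        "factor": [0.42, 0.38, 0.41, -0.15, -0.12, -0.08],
        "momentum_12_1": [0.15, 0.16, 0.17, -0.05, -0.04, -0.03],
        "vol_60": [0.21, 0.22, 0.23, 0.18, 0.19, 0.20],
        "beta_252": [1.08, 1.07, 1.06, 0.95, 0.96, 0.97],
    }
)
```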

3. Create reports

from alphapurify import AlphaPurifier, FactorAnalyzer, Pure_Exposures

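# Preprocess: winsorize, then standardize the factor column.
# factor_col must name a column in df; replace "alpha_003" with your own
# factor column (for example, "factor" in the sample table above).
df = (
    AlphaPurifier(df, factor_col="alpha_003")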
    .winsorize(method="mad")
    .standardize(method="zscore")
    .to_result()
)

# Backtest: run the analysis, then create the return and IC sheets.
FA = FactorAnalyzer(base_df=df,
                    trade_date_col='datetime',
                    symbol_col='symbol',
                    price_col='close',
                    factor_name='alpha_003')
FA.run()
FA.create_long_return_sheet()
FA.create_long_short_return_sheet()
FA.create_short_return_sheet()
FA.create_single_fac_ic_sheet()

# Contributions of other factors
Ex = Pure_Exposures(
    base_df=df,
    trade_date_col='datetime',
    symbol_col='symbol',
    price_col='close',
    factor_name='alpha_003',
    exposure_cols=['momentum_12_1', 'vol_60', 'beta_252'],
)

Ex.run()
Ex.plot_pure_exposures()
Ex.plot_pure_returns()
Ex.plot_pure_exposures_and_returns()
Ex.plot_correlations()
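As a rough idea of what a factor correlation analysis measures, the sketch below computes the average cross-sectional correlation between a factor and each exposure column with plain pandas. It is not the computation performed by `Pure_Exposures`; the function name `mean_cross_sectional_corr` and the toy values are assumptions for illustration.

```python
import pandas as pd


def mean_cross_sectional_corr(df, factor_col, exposure_cols, date_col="datetime"):
    """Average, over dates, of the per-date correlation between the factor and each exposure."""
    cols = [factor_col] + list(exposure_cols)
    per_date = df.groupby(date_col)[cols].apply(
        lambda g: g[exposure_cols].corrwith(g[factor_col])
    )
    return per_date.mean()


# Toy panel (made-up numbers): two dates, three symbols each.
toy = pd.DataFrame(
    {
        "datetime": ["d1"] * 3 + ["d2"] * 3,
        "symbol": ["A", "B", "C"] * 2,
        "alpha_003": [1.0, 2.0, 3.0, 1.0, 2.0, 3.0],
        "momentum_12_1": [2.0, 4.0, 6.0, 2.0, 4.0, 6.0],  # moves with the factor
        "vol_60": [3.0, 2.0, 1.0, 3.0, 2.0, 1.0],         # moves against it
    }
)

avg_corr = mean_cross_sectional_corr(toy, "alpha_003", ["momentum_12_1", "vol_60"])
```

A factor that is highly correlated with a known exposure may owe much of its return to that exposure, which is the motivation for attributing returns to the other factors.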

Examples of Outputs

Portfolio for long positions only:

[Figure: long-only portfolio report (image not shown)]

Contributions of other factors:

[Figures: three plots of contributions of other factors (images not shown)]


P.S.

More detailed documentation and examples will be released soon.

Suggestions and improvements are welcome. Feel free to open an issue, submit a pull request, or contact me via email.


Elias Wu


Download files


Source Distribution

alphapurify-0.1.6.tar.gz (60.1 kB)

Uploaded Source

Built Distribution


alphapurify-0.1.6-py3-none-any.whl (61.1 kB)

Uploaded Python 3

File details

Details for the file alphapurify-0.1.6.tar.gz.

File metadata

  • Download URL: alphapurify-0.1.6.tar.gz
  • Upload date:
  • Size: 60.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for alphapurify-0.1.6.tar.gz
Algorithm    Hash digest
SHA256       f722e1f7c8297f459a160cceaf72188f41bcde55f6a059da17878a542495e65a
MD5          b83ceb698dff2cd1078d595090ef8228
BLAKE2b-256  dabd1f08c870156212fc4eba46855f19a4cb46a28e998e542f2994ea4824343e


File details

Details for the file alphapurify-0.1.6-py3-none-any.whl.

File metadata

  • Download URL: alphapurify-0.1.6-py3-none-any.whl
  • Upload date:
  • Size: 61.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.10

File hashes

Hashes for alphapurify-0.1.6-py3-none-any.whl
Algorithm    Hash digest
SHA256       069f8290d3e64466dc9b5825f803cf9ae00d65be2f608ae08b5f6c2f599b2161
MD5          72347c31a2c7dedba54330ff29cf8ebd
BLAKE2b-256  0a64d5192447b70beb8071a8c69b980c032d364f8ec6a25f9cdb0c4a3af85859

