
High-performance Pythonic backtesting engine with Apache Parquet storage


Zipline Refresh

A high-performance Pythonic backtesting engine for algorithmic trading strategies



Zipline is a Pythonic event-driven system for backtesting, originally developed by Quantopian. This Refresh fork modernizes the storage layer, eliminates legacy dependencies, and delivers significant performance improvements.



What's New in Refresh

Storage: bcolz → Apache Parquet

The legacy bcolz storage layer has been fully replaced with Apache Parquet via PyArrow:

| Aspect | bcolz (legacy) | Parquet (new) |
|---|---|---|
| Format | Custom binary + Cython | Standard columnar, zstd compressed |
| Daily bars | One ctable per field | Single .parquet file per bundle |
| Minute bars | Fixed-stride padding for early closes | Actual trading minutes only |
| Dependencies | bcolz (unmaintained, build issues) | pyarrow (actively maintained) |
| Data types | uint32 (lossy for prices) | float64 (full precision) |
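The "lossy" entry in the last row can be illustrated with a small sketch. It assumes the legacy layout stored each price as an unsigned 32-bit integer after multiplying by a fixed scale factor of 1000. The factor and the helper names below are illustrative assumptions, not taken from this README.

```python
import struct

# Assumption: the legacy layout scaled prices by 1000 before casting to uint32.
SCALE = 1000
UINT32_MAX = 2**32 - 1


def encode_uint32(price: float) -> int:
    """Scale a price and round it to an integer that must fit in uint32."""
    scaled = round(price * SCALE)
    if not 0 <= scaled <= UINT32_MAX:
        raise OverflowError(f"{price} does not fit in uint32 at scale {SCALE}")
    return scaled


def decode_uint32(stored: int) -> float:
    """Recover the price from its scaled integer form."""
    return stored / SCALE


def roundtrip_float64(price: float) -> float:
    """Pack and unpack as an IEEE 754 double, as a float64 column would."""
    return struct.unpack("<d", struct.pack("<d", price))[0]


price = 0.12345  # a sub-penny quote with more than three decimals
lossy = decode_uint32(encode_uint32(price))  # digits past the third are dropped
exact = roundtrip_float64(price)             # identical to the input
```

Under these assumptions, any price with more than three decimal places is rounded on write, and any price above roughly 4.29 million cannot be stored at all. A float64 column has neither restriction.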

Performance Optimizations

Profiling-driven optimizations on the backtest hot path (50 assets, 780 bars/session):

| Optimization | Speedup | Detail |
|---|---|---|
| Lazy per-field loading | 3.2x single field | Load only requested OHLCV fields instead of all 5 |
| NumPy int64 searchsorted | 40x per lookup | Replace DatetimeIndex.get_loc() with np.searchsorted on int64 arrays |
| Vectorized last-traded | 17x | np.flatnonzero instead of Python loop for get_last_traded_dt |
| Batch resample aggregation | 5x lifetimes | Vectorized _lifetimes_map and batch load_raw_arrays in DailyHistoryAggregator |

Net result: pandas DatetimeIndex overhead dropped from 46% to 6.5% of hot-path time, and total hot-path time in the data layer fell from 0.44s to 0.24s (about 1.8x faster).
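The two NumPy-based optimizations in the table can be sketched in isolation. This is a minimal illustration with hypothetical helper names (`minute_position`, `last_traded_position`), not the project's reader code: timestamps are kept as a sorted int64 array of nanoseconds so that a lookup is a plain binary search, and the last traded bar is found with `np.flatnonzero` rather than a Python loop.

```python
import numpy as np

# Ten consecutive trading minutes, stored as sorted int64 nanoseconds.
minutes = np.arange(
    np.datetime64("2024-01-02T14:31"),
    np.datetime64("2024-01-02T14:41"),
    np.timedelta64(1, "m"),
)
minutes_ns = minutes.astype("datetime64[ns]").astype(np.int64)


def minute_position(minutes_ns: np.ndarray, dt: np.datetime64) -> int:
    """Return the index of dt in a sorted int64 array, or raise KeyError."""
    key = dt.astype("datetime64[ns]").astype(np.int64)
    pos = int(np.searchsorted(minutes_ns, key))
    if pos == len(minutes_ns) or minutes_ns[pos] != key:
        raise KeyError(dt)
    return pos


def last_traded_position(volumes: np.ndarray, upto: int) -> int:
    """Index of the latest bar at or before upto with nonzero volume, or -1."""
    traded = np.flatnonzero(volumes[: upto + 1])
    return int(traded[-1]) if traded.size else -1


volumes = np.array([0, 120, 0, 0, 75, 0, 0, 0, 0, 0])
pos = minute_position(minutes_ns, np.datetime64("2024-01-02T14:37"))
last = last_traded_position(volumes, pos)
```

Here `pos` is the position of the 14:37 bar and `last` is the most recent bar up to that point with nonzero volume.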

Benchmark: Time breakdown (50 assets x 780 bars)
Before optimization:
  pandas DatetimeIndex             46.0%  ██████████████████████████████████████████████
  get_value (reader)               13.0%  █████████████
  memoize/lazyval                  10.0%  ██████████

After optimization:
  pandas DatetimeIndex              6.5%  ██████
  get_value (reader)               26.7%  ██████████████████████████
  memoize/lazyval                  12.9%  ████████████
  numpy operations                 12.2%  ████████████

Total hot-path time: 0.44s → 0.24s (1.8x faster)
Per-bar latency: 0.6ms → 0.3ms

Features

  • Event-Driven Architecture — Realistic simulation with proper order lifecycle, slippage, and commission models
  • Pipeline API — Factor-based screening with 20+ built-in technical factors (RSI, MACD, Bollinger, Ichimoku, etc.) and easy CustomFactor extensibility
  • Factor Composition — rank(), zscore(), demean(), winsorize(), top(N) with groupby for sector-neutral strategies
  • PyData Integration — pandas DataFrames in/out, compatible with matplotlib, scipy, statsmodels, scikit-learn
  • Multi-Country Support — 42 country domains with proper trading calendars via exchange_calendars
  • Minute & Daily Resolution — Full minute-level backtesting with proper market open/close handling
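As an illustration of what a grouped transform means for a sector-neutral strategy, the sketch below demeans a factor within each sector using plain NumPy. The function name and the toy data are hypothetical. In a pipeline the same idea would be expressed through the factor methods listed above.

```python
import numpy as np


def demean_by_group(values: np.ndarray, groups: np.ndarray) -> np.ndarray:
    """Subtract each group's mean from the values that belong to that group."""
    out = np.empty(len(values), dtype=float)
    for group in np.unique(groups):
        mask = groups == group
        out[mask] = values[mask] - values[mask].mean()
    return out


factor = np.array([1.0, 3.0, 10.0, 14.0])
sectors = np.array(["tech", "tech", "energy", "energy"])

# Each sector now has mean zero, so sector-wide moves cancel out.
neutral = demean_by_group(factor, sectors)
```

After demeaning, a high score means "high relative to sector peers", so a long/short book built from it carries no net sector bet.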

Installation

Zipline supports Python >= 3.10 and is compatible with current versions of NumFOCUS libraries.

Using pip

pip install zipline-refresh

From source

git clone https://github.com/teleclaws/zipline-refresh.git
cd zipline-refresh
pip install -e .

See the documentation for detailed instructions.

Quickstart

Example 1: RSI Long/Short Pipeline Strategy

Use the Pipeline API to rank stocks by RSI and build a long/short portfolio — rebalanced daily:

from zipline.api import attach_pipeline, order_target_percent, pipeline_output, schedule_function
from zipline.finance import commission, slippage
from zipline.pipeline import Pipeline
from zipline.pipeline.factors import RSI


def make_pipeline():
    rsi = RSI()
    return Pipeline(
        columns={"longs": rsi.top(3), "shorts": rsi.bottom(3)},
    )


def initialize(context):
    attach_pipeline(make_pipeline(), "my_pipeline")
    schedule_function(rebalance)
    context.set_commission(commission.PerShare(cost=0.001, min_trade_cost=1.0))
    context.set_slippage(slippage.VolumeShareSlippage())


def before_trading_start(context, data):
    context.pipeline_data = pipeline_output("my_pipeline")


def rebalance(context, data):
    pipeline_data = context.pipeline_data
    # The boolean pipeline columns select the assets flagged by each filter.
    longs = pipeline_data.index[pipeline_data.longs]
    shorts = pipeline_data.index[pipeline_data.shorts]

    # Equal-weight each leg: three names at one third of portfolio value each.
    for asset in longs:
        order_target_percent(asset, 1.0 / 3.0)
    for asset in shorts:
        order_target_percent(asset, -1.0 / 3.0)

    # Close any open position that is no longer in either leg.
    for asset in context.portfolio.positions:
        if asset not in longs and asset not in shorts and data.can_trade(asset):
            order_target_percent(asset, 0)
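To make the ranking step concrete, here is a standalone sketch of an RSI computed from simple averages of gains and losses over one window, followed by a top/bottom selection. It is an illustration only: the built-in RSI factor may use a different window length or smoothing, and the asset names and prices are made up.

```python
import numpy as np


def simple_rsi(closes: np.ndarray) -> float:
    """RSI over one window of closes, using plain averages of gains and losses."""
    diffs = np.diff(closes)
    avg_gain = np.clip(diffs, 0, None).mean()
    avg_loss = -np.clip(diffs, None, 0).mean()
    if avg_loss == 0:
        # No losing days: 100 if there were gains, neutral 50 if prices were flat.
        return 100.0 if avg_gain > 0 else 50.0
    return 100.0 - 100.0 / (1.0 + avg_gain / avg_loss)


# Hypothetical closing prices for three assets.
windows = {
    "AAA": np.array([10.0, 11.0, 10.0, 12.0, 11.0]),
    "BBB": np.array([10.0, 9.0, 9.5, 9.0, 8.5]),
    "CCC": np.array([10.0, 11.0, 12.0, 13.0, 14.0]),
}

rsi_by_asset = {name: simple_rsi(closes) for name, closes in windows.items()}

# Sort ascending by RSI, then take the lowest as shorts and the highest as longs.
ranked = sorted(rsi_by_asset, key=rsi_by_asset.get)
shorts, longs = ranked[:1], ranked[-1:]
```

Note that the strategy above goes long the highest-RSI names, which is a momentum-style reading of the indicator rather than the more common overbought/oversold interpretation.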

Example 2: Multi-Factor Ranking

Combine multiple factors with ranking and normalization:

from zipline.pipeline import Pipeline
from zipline.pipeline.factors import AverageDollarVolume, Returns, RSI


def make_pipeline():
    # Factor definitions
    momentum = Returns(window_length=20).rank()
    mean_reversion = -Returns(window_length=5).rank()
    rsi_signal = RSI().rank()

    # Composite score (equal-weighted)
    composite = (momentum + mean_reversion + rsi_signal).rank()

    # Liquidity filter
    liquid = AverageDollarVolume(window_length=30).top(100)

    return Pipeline(
        columns={
            "score": composite,
            "longs": composite.top(10, mask=liquid),
            "shorts": composite.bottom(10, mask=liquid),
        },
        screen=liquid,
    )
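The composite in this example is a sum of ranks, re-ranked. The NumPy sketch below reproduces that arithmetic on four hypothetical assets so the sign convention is easy to check. It assumes ordinal ranking (1 for the smallest value), and the input numbers are invented.

```python
import numpy as np


def ordinal_rank(values: np.ndarray) -> np.ndarray:
    """Rank values from 1 (smallest) to N (largest); ties broken by position."""
    order = np.argsort(values, kind="stable")
    ranks = np.empty(len(values), dtype=np.int64)
    ranks[order] = np.arange(1, len(values) + 1)
    return ranks


# Hypothetical per-asset inputs for four assets.
ret_20d = np.array([0.10, -0.02, 0.05, 0.01])
ret_5d = np.array([0.03, -0.01, 0.00, 0.02])
rsi = np.array([45.0, 35.0, 55.0, 70.0])

momentum = ordinal_rank(ret_20d)        # higher 20-day return scores higher
mean_reversion = -ordinal_rank(ret_5d)  # higher 5-day return scores lower
rsi_signal = ordinal_rank(rsi)

# Equal-weighted sum of the three signals, ranked again.
composite = ordinal_rank(momentum + mean_reversion + rsi_signal)
best = int(np.argmax(composite))
worst = int(np.argmin(composite))
```

Because each input is ranked before summing, no single factor dominates the composite through its scale alone.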

Data Ingestion

Zipline supports CSV-based data bundles for any market:

# In ~/.zipline/extension.py
from zipline.data.bundles import register
from zipline.data.bundles.csvdir import csvdir_equities

register(
    "my-data",
    csvdir_equities(["daily"], "/path/to/csv/dir"),
    calendar_name="XNYS",
)

# Ingest and run (from a shell)
zipline ingest -b my-data
zipline run -f strategy.py --start 2020-1-1 --end 2024-1-1 -o results.pickle --no-benchmark -b my-data
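The sketch below writes one input file for the bundle registered above. It assumes the csvdir layout carried over from upstream Zipline: one CSV per symbol inside a `daily` subdirectory, with the columns `date, open, high, low, close, volume, dividend, split`. Check the documentation if this fork changes that layout. The symbol and prices are placeholders.

```python
import csv
import os
import tempfile

# Assumed csvdir column layout (from upstream Zipline, not confirmed by this README).
COLUMNS = ["date", "open", "high", "low", "close", "volume", "dividend", "split"]


def write_daily_csv(root: str, symbol: str, rows: list[dict]) -> str:
    """Write rows to <root>/daily/<symbol>.csv and return the file path."""
    daily_dir = os.path.join(root, "daily")
    os.makedirs(daily_dir, exist_ok=True)
    path = os.path.join(daily_dir, f"{symbol}.csv")
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
    return path


rows = [
    {"date": "2020-01-02", "open": 10.0, "high": 10.6, "low": 9.9,
     "close": 10.5, "volume": 1000, "dividend": 0.0, "split": 1.0},
    {"date": "2020-01-03", "open": 10.5, "high": 10.9, "low": 10.2,
     "close": 10.7, "volume": 1200, "dividend": 0.0, "split": 1.0},
]

csv_root = tempfile.mkdtemp()  # stands in for /path/to/csv/dir
csv_path = write_daily_csv(csv_root, "ABC", rows)
```

The directory passed to `csvdir_equities` is the root (`csv_root` here), not the `daily` subdirectory.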

More examples in the examples directory.

Compatibility Notes

Release 3.05 — Compatible with NumPy 2.0 (requires pandas >= 2.2.2)

Release 3.0 — Updated to pandas >= 2.0 and SQLAlchemy > 2.0

Release 2.4 — Updated to exchange_calendars >= 4.2

Contributing

This project is sponsored by Kavout.

Found a bug or have a suggestion? Open an issue.

License

Apache 2.0. See LICENSE for details.
