
Zipline Refresh

A high-performance Pythonic backtesting engine for algorithmic trading strategies



Zipline is a Pythonic event-driven system for backtesting, originally developed by Quantopian. This Refresh fork modernizes the storage layer, eliminates legacy dependencies, and delivers significant performance improvements.

Documentation  ·  Website  ·  Report Bug


What's New in Refresh

Storage: bcolz → Apache Parquet

The legacy bcolz storage layer has been fully replaced with Apache Parquet via PyArrow:

|              | bcolz (legacy)                        | Parquet (new)                      |
|--------------|---------------------------------------|------------------------------------|
| Format       | Custom binary + Cython                | Standard columnar, zstd-compressed |
| Daily bars   | One ctable per field                  | Single .parquet file per bundle    |
| Minute bars  | Fixed-stride padding for early closes | Actual trading minutes only        |
| Dependencies | bcolz (unmaintained, build issues)    | pyarrow (actively maintained)      |
| Data types   | uint32 (lossy for prices)             | float64 (full precision)           |
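The precision gain is easy to see. Legacy uint32 storage required scaling prices to integers (a 1000x scale factor is used below as an illustration of the convention), so anything beyond three decimal places was rounded away; Parquet stores the float64 value directly. A minimal NumPy sketch:

```python
import numpy as np

price = 123.456789  # a raw price with sub-millidollar precision

# Legacy-style uint32 storage: scale to thousandths and round to an integer
# (the 1000x scale factor here is illustrative of the old convention).
stored_u32 = np.uint32(round(price * 1000))
recovered = stored_u32 / 1000.0

# Parquet-style storage: the float64 value round-trips unchanged.
stored_f64 = np.float64(price)

print(recovered)                    # 123.457 -- digits past 3 decimals are gone
print(float(stored_f64) == price)   # True
```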

Performance Optimizations

Profiling-driven optimizations on the backtest hot path (50 assets, 780 bars/session):

| Optimization               | Speedup           | Detail                                                                       |
|----------------------------|-------------------|------------------------------------------------------------------------------|
| Lazy per-field loading     | 3.2x (single field) | Load only requested OHLCV fields instead of all 5                          |
| NumPy int64 searchsorted   | 40x per lookup    | Replace DatetimeIndex.get_loc() with np.searchsorted on int64 arrays         |
| Vectorized last-traded     | 17x               | np.flatnonzero instead of a Python loop for get_last_traded_dt               |
| Batch resample aggregation | 5x (lifetimes)    | Vectorized _lifetimes_map and batch load_raw_arrays in DailyHistoryAggregator |

Net result: pandas DatetimeIndex overhead drops from 46% to 6.5% of hot-path time, and the backtest data layer is roughly 2x faster overall.
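The searchsorted technique is easy to reproduce outside zipline: instead of asking a pandas DatetimeIndex for a label's position with get_loc(), keep the index's int64 nanosecond values in a plain NumPy array and binary-search it. This is a generic sketch of the idea, not zipline's actual reader code:

```python
import numpy as np
import pandas as pd

# One trading session of minute bars as a pandas DatetimeIndex.
minutes = pd.date_range("2018-01-02 09:31", periods=390, freq="min")

# One-time conversion: the index's raw int64 nanosecond epoch values, sorted.
minute_ns = minutes.values.view("i8")

target = minutes[200]

# Slow path: label-based lookup on the DatetimeIndex.
loc_pandas = minutes.get_loc(target)

# Fast path: binary search on the raw int64 array.
loc_numpy = int(np.searchsorted(minute_ns, target.value))

assert loc_pandas == loc_numpy == 200
```

The one-time int64 conversion is amortized across every subsequent lookup, which is why the per-lookup win is so large on the hot path.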

Benchmark: Time breakdown (50 assets x 780 bars)
Before optimization:
  pandas DatetimeIndex             46.0%  ██████████████████████████████████████████████
  get_value (reader)               13.0%  █████████████
  memoize/lazyval                  10.0%  ██████████

After optimization:
  pandas DatetimeIndex              6.5%  ██████
  get_value (reader)               26.7%  ██████████████████████████
  memoize/lazyval                  12.9%  ████████████
  numpy operations                 12.2%  ████████████

Total hot-path time: 0.44s → 0.24s (1.8x faster)
Per-bar latency: 0.6ms → 0.3ms
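The per-bar figures follow directly from the totals, given 780 bars per session:

```python
# Derive per-bar latency from total hot-path time over one session.
bars = 780
before_ms = 0.44 / bars * 1000  # ~0.56 ms, reported as ~0.6 ms
after_ms = 0.24 / bars * 1000   # ~0.31 ms, reported as ~0.3 ms
speedup = 0.44 / 0.24           # ~1.8x

print(f"{before_ms:.2f} ms -> {after_ms:.2f} ms per bar ({speedup:.1f}x)")
```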

Features

  • Event-Driven Architecture — Realistic simulation with proper order lifecycle, slippage, and commission models
  • Pipeline API — Factor-based screening with 20+ built-in technical factors (RSI, MACD, Bollinger, Ichimoku, etc.) and easy CustomFactor extensibility
  • Factor Composition — rank(), zscore(), demean(), winsorize(), top(N) with groupby for sector-neutral strategies
  • PyData Integration — pandas DataFrames in/out, compatible with matplotlib, scipy, statsmodels, scikit-learn
  • Multi-Country Support — 42 country domains with proper trading calendars via exchange_calendars
  • Minute & Daily Resolution — Full minute-level backtesting with proper market open/close handling
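The composition methods have simple cross-sectional definitions. The sketch below reimplements zscore, demean, and winsorize in plain NumPy to show what the Pipeline versions compute across a universe of assets on a given day; it illustrates the semantics only, not zipline's implementation:

```python
import numpy as np

def zscore(x):
    # Standardize the cross-section: subtract the mean, divide by the std.
    return (x - x.mean()) / x.std()

def demean(x):
    # Center values around zero -- the basis of market-neutral weighting.
    return x - x.mean()

def winsorize(x, min_pct, max_pct):
    # Clip extreme values to the given percentile bounds.
    lo, hi = np.percentile(x, [min_pct * 100, max_pct * 100])
    return np.clip(x, lo, hi)

factor = np.array([1.0, 2.0, 3.0, 4.0, 100.0])  # one outlier
print(demean(factor))               # centered values, summing to 0
print(winsorize(factor, 0.0, 0.8))  # the outlier is clipped down
```

In Pipeline these are methods on a Factor, so the same operations chain fluently, e.g. a winsorized factor can then be z-scored and ranked within sector groups.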

Installation

Zipline supports Python >= 3.10 and is compatible with current versions of NumFOCUS libraries.

Using pip

pip install zipline-refresh

From source

git clone https://github.com/teleclaws/zipline-refresh.git
cd zipline-refresh
pip install -e .

See the documentation for detailed instructions.

Quickstart

The following implements a simple dual moving average crossover strategy:

from zipline.api import order_target, record, symbol


def initialize(context):
    context.i = 0
    context.asset = symbol('AAPL')


def handle_data(context, data):
    context.i += 1
    if context.i < 300:
        return

    short_mavg = data.history(context.asset, 'price', bar_count=100, frequency="1d").mean()
    long_mavg = data.history(context.asset, 'price', bar_count=300, frequency="1d").mean()

    if short_mavg > long_mavg:
        order_target(context.asset, 100)
    elif short_mavg < long_mavg:
        order_target(context.asset, 0)

    record(AAPL=data.current(context.asset, 'price'),
           short_mavg=short_mavg,
           long_mavg=long_mavg)

Run it

# Ingest data from NASDAQ (requires free API key from https://data.nasdaq.com)
export QUANDL_API_KEY="your_key_here"
zipline ingest -b quandl

# Run backtest
zipline run -f dual_moving_average.py --start 2014-1-1 --end 2018-1-1 -o dma.pickle --no-benchmark

More examples in the zipline/examples directory.

Compatibility Notes

Release 3.05 — Compatible with NumPy 2.0 (requires pandas >= 2.2.2)

Release 3.0 — Updated to pandas >= 2.0 and SQLAlchemy > 2.0

Release 2.4 — Updated to exchange_calendars >= 4.2

Contributing

This project is sponsored by Kavout.

Found a bug or have a suggestion? Open an issue.

License

Apache 2.0. See LICENSE for details.
