PFeed: Data Pipeline for Algo-Trading, Getting and Storing Real-Time and Historical Data Made Easy.

Problem

Starting algo-trading requires reliable, clean data. However, the time-consuming and mundane tasks of data cleaning and storage often discourage traders from embarking on their algo-trading journey.

Solution

By leveraging modern data engineering tools, pfeed handles the tedious data work and outputs backtesting-ready data, so traders can reach the strategy development phase sooner.


PFeed (/piː fiːd/) is a data pipeline for algorithmic trading. It bridges raw data sources and traders by automating data collection, cleaning, transformation, and storage, and loads the clean data into a local data lake for quantitative analysis.

Core Features

  • Unified approach for interacting with various data sources and obtaining historical and live data
  • ETL data pipeline for transforming raw data into clean data and storing it in MinIO (optional)
  • Fast data downloading, utilizing Ray for parallelization
  • Supports multiple data tools (e.g. Pandas, Polars, Dask, Spark, DuckDB, Daft)
  • Integrates with Prefect to control data flows
  • Listens to PFund's trade engine and adds trade history to a local TimescaleDB database (optional)

PFeed is designed to be used alongside PFund, a complete algo-trading framework for machine learning that is TradFi, CeFi and DeFi ready.
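The parallel downloading listed among the core features can be pictured as splitting a request into independent (product, day) tasks and running them concurrently. The sketch below is only an illustration of that idea: it uses the standard library's ThreadPoolExecutor as a stand-in for Ray, and `daily_tasks`, `fetch_one` and `fetch_all` are hypothetical names, not part of pfeed.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import date, timedelta


def daily_tasks(products, start, end):
    """Expand products x [start, end] (both inclusive) into (product, day) pairs."""
    n_days = (end - start).days + 1
    days = [start + timedelta(days=i) for i in range(n_days)]
    return [(product, day) for product in products for day in days]


def fetch_one(task):
    """Hypothetical stand-in for downloading one product-day of data."""
    product, day = task
    return f"{product}:{day.isoformat()}"


def fetch_all(products, start, end, max_workers=4):
    """Run the independent tasks concurrently; pool.map keeps task order."""
    tasks = daily_tasks(products, start, end)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_one, tasks))


results = fetch_all(
    ["BTC_USDT_PERP", "ETH_USDT_PERP"],
    date(2024, 3, 1),
    date(2024, 3, 8),
)
print(len(results), "tasks completed")
```

Because each product-day is independent, the same task list could be handed to any executor; how pfeed actually schedules its work with Ray is not shown here.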


Installation

Using Poetry (Recommended)

# [RECOMMENDED]: Download data (e.g. Bybit and Yahoo Finance) + Data tools (e.g. pandas, polars) + Data storage (e.g. MinIO) + Boosted performance (e.g. Ray)
poetry add "pfeed[all]"

# [Download data + Data tools + Data storage]
poetry add "pfeed[df,data]"

# [Download data + Data tools]
poetry add "pfeed[df]"

# [Download data only]:
poetry add pfeed

# update to the latest version:
poetry update pfeed

Using Pip

# same extras as above: choose "pfeed[all]", "pfeed[df,data]", "pfeed[df]" or "pfeed"
pip install "pfeed[all]"

# upgrade to the latest version:
pip install -U pfeed

Checking your installation

$ pfeed --version
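
The same check can be done from Python, which is handy inside a notebook or a setup script. This is a minimal sketch using only the standard library's `importlib.metadata`; the helper name `installed_version` is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


pfeed_version = installed_version("pfeed")
if pfeed_version is None:
    print("pfeed is not installed in this environment")
else:
    print(f"pfeed {pfeed_version}")
```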

Quick Start

1. Get Historical Data in Dataframe (No storage)

Get Bybit data as a dataframe, e.g. 1-minute data (the data is downloaded on the fly if it is not stored locally):

import pfeed as pe

feed = pe.BybitFeed(data_tool='polars')

df = feed.get_historical_data(
    'BTC_USDT_PERP',
    resolution='1minute',  # 'raw' or '1tick'/'1t' or '2second'/'2s' etc.
    start_date='2024-03-01',
    end_date='2024-03-01',
)

Printing the first few rows of df:

   ts                   product        resolution  open     high     low      close    volume
0  2024-03-01 00:00:00  BTC_USDT_PERP  1m          61184.1  61244.5  61175.8  61244.5  159.142
1  2024-03-01 00:01:00  BTC_USDT_PERP  1m          61245.3  61276.5  61200.7  61232.2  227.242
2  2024-03-01 00:02:00  BTC_USDT_PERP  1m          61232.2  61249    61180    61184.2  91.446

With pfeed, you are just a few lines of code away from a standardized dataframe.
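
To make the OHLCV columns concrete, the sketch below combines the three 1-minute rows printed above into a single 3-minute bar using the usual convention (first open, highest high, lowest low, last close, summed volume). It is plain Python for illustration only; `aggregate_bars` is a hypothetical helper and says nothing about how pfeed resamples data internally.

```python
# The three 1-minute bars shown above: (open, high, low, close, volume)
bars = [
    (61184.1, 61244.5, 61175.8, 61244.5, 159.142),
    (61245.3, 61276.5, 61200.7, 61232.2, 227.242),
    (61232.2, 61249.0, 61180.0, 61184.2, 91.446),
]


def aggregate_bars(bars):
    """Combine consecutive OHLCV bars into one coarser bar."""
    return {
        "open": bars[0][0],                  # open of the first bar
        "high": max(b[1] for b in bars),     # highest high in the window
        "low": min(b[2] for b in bars),      # lowest low in the window
        "close": bars[-1][3],                # close of the last bar
        "volume": sum(b[4] for b in bars),   # total traded volume
    }


bar_3m = aggregate_bars(bars)
print(bar_3m)
```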

2. Download Historical Data on the Command Line Interface (CLI)

# download data, default data type (dtype) is 'raw' data
pfeed download -d BYBIT -p BTC_USDT_PERP --start-date 2024-03-01 --end-date 2024-03-08

# download minute data for multiple products (BTC_USDT_PERP and ETH_USDT_PERP)
pfeed download -d BYBIT -p BTC_USDT_PERP -p ETH_USDT_PERP --dtypes minute

# download all perpetuals data from bybit
pfeed download -d BYBIT --ptypes PERP

# download all the data from bybit (CAUTION: your local machine probably won't have enough space for this!)
pfeed download -d BYBIT

# store data into MinIO (need to start MinIO by running `pfeed docker-compose up -d` first)
pfeed download -d BYBIT -p BTC_USDT_PERP --use-minio

# enable debug mode and turn off using Ray
pfeed download -d BYBIT -p BTC_USDT_PERP --debug --no-ray
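
When the CLI is driven from a script, it can help to assemble the argument list programmatically. The sketch below is one possible way to do that, using only the flags shown above (`-d`, `-p`, `--start-date`, `--end-date`); `build_download_command` is a hypothetical helper, and the command is only printed, not executed.

```python
import shlex


def build_download_command(data_source, products, start_date=None, end_date=None):
    """Build the argument list for `pfeed download` from the flags shown above."""
    cmd = ["pfeed", "download", "-d", data_source]
    for product in products:
        cmd += ["-p", product]  # one -p flag per product
    if start_date:
        cmd += ["--start-date", start_date]
    if end_date:
        cmd += ["--end-date", end_date]
    return cmd


cmd = build_download_command(
    "BYBIT",
    ["BTC_USDT_PERP", "ETH_USDT_PERP"],
    start_date="2024-03-01",
    end_date="2024-03-08",
)
print(shlex.join(cmd))
# To actually run it (requires pfeed to be installed):
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing the list form to `subprocess.run` avoids shell quoting issues.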

3. Download Historical Data in Python

import pfeed as pe

# compared to the CLI approach, this approach is more convenient for downloading multiple products
pe.download(
    data_source='bybit',
    pdts=[
        'BTC_USDT_PERP',
        'ETH_USDT_PERP',
        'BCH_USDT_PERP',
    ],
    dtypes=['raw'],  # data types, e.g. 'raw', 'tick', 'second', 'minute' etc.
    start_date='2024-03-01',
    end_date='2024-03-08',
    use_minio=False,
)
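
Since a download can take a while, it may be worth validating the arguments before starting it. The following is a small sketch under that assumption: `download_kwargs` is a hypothetical helper that checks the dates and returns keyword arguments with the same names as in the call above; the actual `pe.download` call is left commented out so the snippet runs without pfeed installed.

```python
from datetime import date


def download_kwargs(products, start_date, end_date, dtypes=("raw",), use_minio=False):
    """Validate the date range and return keyword arguments for pe.download."""
    start = date.fromisoformat(start_date)  # raises ValueError on a malformed date
    end = date.fromisoformat(end_date)
    if start > end:
        raise ValueError(f"start_date {start_date} is after end_date {end_date}")
    return {
        "data_source": "bybit",
        "pdts": list(products),
        "dtypes": list(dtypes),
        "start_date": start_date,
        "end_date": end_date,
        "use_minio": use_minio,
    }


kwargs = download_kwargs(
    ["BTC_USDT_PERP", "ETH_USDT_PERP", "BCH_USDT_PERP"],
    start_date="2024-03-01",
    end_date="2024-03-08",
)
print(kwargs)
# With pfeed installed:
#   import pfeed as pe
#   pe.download(**kwargs)
```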

List Current Config

# list the current config:
pfeed config --list

# change the data storage location to your local project's 'data' folder:
pfeed config --data-path ./data

# for more commands:
pfeed --help

Run PFeed's docker-compose.yml

# same as 'docker-compose', except that it points to pfeed's docker-compose.yml file
pfeed docker-compose [COMMAND]

# e.g. start services
pfeed docker-compose up -d

# e.g. stop services
pfeed docker-compose down

Supported Data Sources

Data Source Get Historical Data Download Historical Data Get Live/Paper Data Stream Live/Paper Data
Yahoo Finance 🟢
Bybit 🟢 🟢 🟡 🔴
*Interactive Brokers (IB) 🔴 🔴 🔴
*FirstRate Data 🔴 🔴
Databento 🔴 🔴 🔴 🔴
Polygon 🔴 🔴 🔴 🔴
Binance 🔴 🔴 🔴 🔴
OKX 🔴 🔴 🔴 🔴

🟢 = finished
🟡 = in progress
🔴 = todo
⚪ = not applicable
* = paid data

Supported Data Tools

Data Tools Supported
Pandas 🟢
Polars 🟢
Dask 🔴
Spark 🔴
DuckDB 🔴
Daft 🔴

Related Projects

  • PFund — A Complete Algo-Trading Framework for Machine Learning, TradFi, CeFi and DeFi ready. Supports Vectorized and Event-Driven Backtesting, Paper and Live Trading
  • PyTrade.org - A curated list of Python libraries and resources for algorithmic trading.

Disclaimer

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES, OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF, OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

This framework is intended for educational and research purposes only. It should not be used for real trading without understanding the risks involved. Trading in financial markets involves significant risk, and there is always the potential for loss. Your trading results may vary. No representation is being made that any account will or is likely to achieve profits or losses similar to those discussed on this platform.

The developers of this framework are not responsible for any financial losses incurred from using this software. This includes, but is not limited to, losses resulting from inaccuracies in any financial data output by PFeed. Users should conduct their own due diligence, verify the accuracy of any data produced by PFeed, and consult with a professional financial advisor before engaging in real trading activities.
