Cryptoasset data library

CryptoDataPy

Better data beats advanced algorithms


CryptoDataPy is a Python library that makes it easy to build high-quality data pipelines for the analysis of digital assets. By providing easy access to over 100,000 time series for thousands of assets, it facilitates the pre-processing of a wide range of data from different sources.

Cryptoassets generate a huge amount of market, on-chain and off-chain data. But unlike legacy financial markets, this data is often fragmented, unstructured and dirty. By extracting data from various sources, pre-processing it into a user-friendly (tidy) format, detecting and repairing 'bad' data, and allowing for easy storage and retrieval, CryptoDataPy allows you to spend less time gathering and cleaning data, and more time analyzing it.

Our data includes:

  • Market: market prices of varying granularity (e.g. tick, trade and bar data, aka OHLC), for spot, futures and options markets, as well as funding rates for the analysis of cryptoasset returns.
  • On-chain: network health and usage data, circulating supply, asset holder positions and cost-basis, for the analysis of underlying crypto network fundamentals.
  • Off-chain: news, social media, developer activity, web traffic and search for project interest and sentiment, as well as traditional financial market and macroeconomic data for broader financial and economic conditions.

The library's intuitive interface facilitates each step of the ETL (extract-transform-load) process:

  • Extract: Extracting data from a wide range of data sources and file formats.
  • Transform:
    • Wrangling data into a pandas DataFrame in a structured and user-friendly format, a.k.a. tidy data.
    • Detecting, scrubbing and repairing 'bad' data (e.g. outliers, missing values, 0s, etc.) to improve the accuracy and reliability of machine learning/predictive models.
  • Load: Storing clean and ready-for-analysis data and metadata for easy access.
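
The Transform step described above can be illustrated with a minimal sketch in plain Python. This is not CryptoDataPy's implementation: the helper names `to_tidy` and `repair_series`, the sample prices and the median-based outlier rule are all assumptions made for illustration. The sketch reshapes a wide mapping into one record per (date, ticker), then treats zeros, missing values and extreme values as 'bad' data and forward-fills them.

```python
from statistics import median


def to_tidy(wide):
    """Reshape {ticker: {date: value}} into one record per (date, ticker)."""
    rows = [
        {"date": date, "ticker": ticker, "close": value}
        for ticker, series in wide.items()
        for date, value in series.items()
    ]
    return sorted(rows, key=lambda r: (r["date"], r["ticker"]))


def repair_series(values, max_ratio=5.0):
    """Forward-fill zeros, missing values and outliers in a price series.

    A value counts as an outlier when it is more than `max_ratio` times
    above or below the median of the valid (non-missing, positive) values.
    Bad values at the start of the series stay None, since there is no
    earlier good value to carry forward.
    """
    valid = [v for v in values if v is not None and v > 0]
    if not valid:
        return list(values)
    mid = median(valid)
    repaired, last_good = [], None
    for v in values:
        bad = v is None or v <= 0 or v > mid * max_ratio or v < mid / max_ratio
        if not bad:
            last_good = v
        repaired.append(last_good)
    return repaired


# Hypothetical sample data, not real market prices.
wide = {
    "btc": {"2024-01-01": 100.0, "2024-01-02": 0.0},
    "eth": {"2024-01-01": 10.0},
}
tidy = to_tidy(wide)
clean = repair_series([100.0, 0.0, 102.0, None, 5000.0, 101.0])
```

Forward-filling is only one possible repair rule; interpolation or dropping the observation may suit other use cases better.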

Installation

$ pip install cryptodatapy
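
To confirm that the install succeeded, one option is to query the installed version from Python. This is a small sketch using only the standard library (`importlib.metadata`, available from Python 3.8); the helper name `installed_version` is an assumption for illustration and is not part of CryptoDataPy.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if cryptodatapy is installed, otherwise None.
print(installed_version("cryptodatapy"))
```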

Usage

CryptoDataPy allows you to pull ready-to-analyze data from a variety of sources with only a few lines of code.

First specify which data you want with a DataRequest:

# import DataRequest
from cryptodatapy.extract.datarequest import DataRequest
# specify parameters for data request: tickers, fields, start date, end date, etc.
data_req = DataRequest(
    source='glassnode',  # name of data source
    tickers=['btc', 'eth'],  # list of asset tickers, in CryptoDataPy format, defaults to 'btc'
    fields=['close', 'add_act', 'hashrate'],  # list of fields, in CryptoDataPy format, defaults to 'close'
    freq=None,  # data frequency, defaults to daily
    quote_ccy=None,  # defaults to USD/USDT
    exch=None,  # defaults to exchange weighted average or Binance
    mkt_type='spot',  # defaults to spot
    start_date=None,  # defaults to start date for longest series
    end_date=None,  # defaults to most recent
    tz=None,  # defaults to UTC time
    cat=None,  # optional, should be specified when asset class is not crypto, e.g. 'fx', 'rates', 'macro', etc.
)

Then get the data:

# import GetData
from cryptodatapy.extract.getdata import GetData
# get data
GetData(data_req).get_series()

With the same data request parameters, you can retrieve the same data from a different source:

# modify data source parameter
data_req = DataRequest(
    source='coinmetrics',
    tickers=['btc', 'eth'],
    fields=['close', 'add_act', 'hashrate'],
    freq='d',
    start_date='2016-01-01',
)
# get data
GetData(data_req).get_series()
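
Because only the `source` parameter changes between the two requests, the shared parameters can be written once and reused. The sketch below is an illustration under stated assumptions: `build_requests` is a hypothetical helper, not part of CryptoDataPy, and it only builds keyword-argument dictionaries, so it runs without the library or network access.

```python
def build_requests(sources, **params):
    """Return one keyword-argument dict per source, sharing all other parameters."""
    return [{"source": source, **params} for source in sources]


request_kwargs = build_requests(
    ["glassnode", "coinmetrics"],
    tickers=["btc", "eth"],
    fields=["close", "add_act", "hashrate"],
    freq="d",
    start_date="2016-01-01",
)

# With cryptodatapy installed (and any credentials a source may require),
# each dict could then be passed on as in the examples above:
#     data = {
#         kw["source"]: GetData(DataRequest(**kw)).get_series()
#         for kw in request_kwargs
#     }
```

Keeping the per-source results in a dictionary keyed by source name makes it easy to compare the same fields across providers.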

For more detailed code examples and interactive tutorials see here.

Supported Data Sources

Contributing

Interested in contributing? Check out the contributing guidelines and contact us at info@systamental.com. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.

License

cryptodatapy was created by Systamental. It is licensed under the terms of the Apache License 2.0.
