A tool to download historical trading data from Bybit public API.

Bybit Historical Data Downloader

This script downloads historical market data (trades, klines, etc.) for specified cryptocurrency pairs from the Bybit public data repository (https://public.bybit.com/). Downloads can be filtered by date range, coin pair, and data type, and the extracted CSV files are organized into a structured directory layout.

Features

  • Command-line Interface: Configure downloads using CLI arguments.
  • Flexible Filtering:
    • Specify start and end dates (--start-date, --end-date).
    • Select specific coin pairs (e.g., BTCUSDT,ETHUSDT) or download all (--coins ALL).
    • Choose data types (e.g., trading,spot) or download all (--data-types ALL).
  • Optimized Coin Access: When specific coins are requested, the script goes directly to their directories instead of scanning every directory.
  • Intelligent Name Detection: Extracts the coin name from the file name, falling back to the directory path when the file name does not contain it.
  • Organized Output: Data is saved to a specified output directory (--output-dir, default ./data), with subdirectories for each data type (e.g., ./data/trading/, ./data/spot/).
  • Recursive Directory Traversal: Handles different directory structures found in the Bybit repository (e.g., flat file lists, coin-based subdirectories, year-based subdirectories).
  • Automatic Extraction: Downloads .csv.gz archives, extracts them to .csv, and removes the archives.
  • Skip Existing: Avoids re-downloading and extracting files if the .csv file already exists.
  • Basic Logging: Provides informative output about the download process.
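
The "Automatic Extraction" and "Skip Existing" behaviour described above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual code: the function name extract_if_needed and the temporary ".part" suffix are assumptions. Note that the real script checks for an existing .csv before downloading; the sketch performs the check at extraction time so that it stays self-contained.

```python
import gzip
import shutil
from pathlib import Path


def extract_if_needed(gz_path: Path) -> bool:
    """Extract `name.csv.gz` to `name.csv` beside it, then delete the archive.

    Returns True if extraction ran, False if the .csv already existed.
    (Hypothetical helper; the name is not taken from the project.)
    """
    csv_path = gz_path.with_suffix("")  # "x.csv.gz" -> "x.csv"
    if csv_path.exists():
        return False  # skip existing: nothing to do

    # Write to a temporary name first so that a failed extraction never
    # leaves behind a partial .csv that would later be treated as complete.
    tmp_path = csv_path.with_name(csv_path.name + ".part")
    try:
        with gzip.open(gz_path, "rb") as src, open(tmp_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
        tmp_path.replace(csv_path)
    except (OSError, EOFError):
        tmp_path.unlink(missing_ok=True)  # clean up the partial file
        raise

    gz_path.unlink()  # remove the archive after a successful extraction
    return True
```

Writing to a temporary file and renaming it matters for the skip check: an interrupted run would otherwise leave a truncated .csv that is never downloaded again.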

Requirements

  • Python 3.8+
  • Libraries: requests, beautifulsoup4

Installation (using Poetry)

  1. Install Poetry: Follow the instructions at https://python-poetry.org/docs/#installation
  2. Clone the repository:
    git clone https://github.com/suenot/bybit-history
    cd bybit-history
    
  3. Install dependencies:
    poetry install
    

Usage

Using Poetry Script (Recommended)

poetry run start --start-date <YYYY-MM-DD> --coins <COINS> [OPTIONS]

Alternative Method

poetry run python bybit_data_downloader.py --start-date <YYYY-MM-DD> --coins <COINS> [OPTIONS]

Required Arguments:

  • --start-date <YYYY-MM-DD>: The earliest date for data to download.
  • --coins <COINS>: Comma-separated list of coin pairs (e.g., BTCUSDT,ETHUSDT) or ALL to attempt downloading all found pairs.

Optional Arguments:

  • --end-date <YYYY-MM-DD>: The latest date for data to download. If omitted, downloads up to the most recent available data.
  • --data-types <TYPES>: Comma-separated list of data types (e.g., trading,spot) or ALL. Defaults to trading. Known types: trading, spot, kline_for_metatrader4, premium_index, spot_index.
  • --output-dir <PATH>: Directory to save the data. Defaults to ./data.
  • --base-url <URL>: Base URL for the Bybit public data. Defaults to https://public.bybit.com/.
  • --version: Show script version and exit.
  • --help: Show help message and exit.
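
For readers who want to see how the arguments listed above fit together, they could be declared with argparse roughly as follows. This is a sketch under assumptions, not the project's actual parser: the helper names build_parser, parse_date and csv_list are hypothetical, and the version string is a placeholder.

```python
import argparse
from datetime import date, datetime
from typing import List


def parse_date(value: str) -> date:
    """Convert a YYYY-MM-DD string into a date, with a readable CLI error."""
    try:
        return datetime.strptime(value, "%Y-%m-%d").date()
    except ValueError:
        raise argparse.ArgumentTypeError(f"expected YYYY-MM-DD, got {value!r}")


def csv_list(value: str) -> List[str]:
    """Split 'BTCUSDT,ETHUSDT' into ['BTCUSDT', 'ETHUSDT'], dropping blanks."""
    return [item.strip() for item in value.split(",") if item.strip()]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="start",
        description="Download historical data from the Bybit public repository.",
    )
    # Required arguments
    parser.add_argument("--start-date", required=True, type=parse_date)
    parser.add_argument("--coins", required=True, type=csv_list)
    # Optional arguments, with the defaults documented above
    parser.add_argument("--end-date", type=parse_date, default=None)
    parser.add_argument("--data-types", type=csv_list, default=["trading"])
    parser.add_argument("--output-dir", default="./data")
    parser.add_argument("--base-url", default="https://public.bybit.com/")
    # Placeholder version string; the real value comes from the package.
    parser.add_argument("--version", action="version", version="%(prog)s 0.0.0")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

With this layout, ALL arrives as the single-element list ["ALL"], which the caller can check for before applying any coin or data-type filter.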

Example:

poetry run start --start-date 2023-01-01 --end-date 2023-01-31 --coins BTCUSDT,ETHUSDT --data-types trading,spot --output-dir ./bybit_data
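
To make concrete what a date range and coin list like the ones in this command select, the sketch below filters a list of archive names. It rests on assumptions: the file-name pattern (coin symbol, optional underscore, then YYYY-MM-DD.csv.gz), the inclusive end date, and the names parse_name and wanted are all illustrative and not taken from the project. Unlike the script, which falls back to the directory path when a name carries no coin, this simplified version just rejects names it cannot parse.

```python
import re
from datetime import date
from typing import Optional, Set, Tuple

# Assumed naming scheme, e.g. "BTCUSDT2023-01-01.csv.gz" or
# "BTCUSDT_2023-01-01.csv.gz". Real files may differ per data type.
FILE_RE = re.compile(
    r"^(?P<coin>[A-Z0-9]+?)_?(?P<date>\d{4}-\d{2}-\d{2})\.csv\.gz$"
)


def parse_name(filename: str) -> Optional[Tuple[str, date]]:
    """Return (coin, date) parsed from a file name, or None if it does not match."""
    match = FILE_RE.match(filename)
    if match is None:
        return None
    try:
        day = date.fromisoformat(match.group("date"))
    except ValueError:  # e.g. month 13
        return None
    return match.group("coin"), day


def wanted(
    filename: str,
    start: date,
    end: Optional[date],
    coins: Optional[Set[str]],
) -> bool:
    """Apply the date filter, then the coin filter.

    `end=None` means "up to the most recent data"; `coins=None` means ALL.
    """
    parsed = parse_name(filename)
    if parsed is None:
        return False
    coin, day = parsed
    if day < start:
        return False
    if end is not None and day > end:
        return False
    return coins is None or coin in coins


if __name__ == "__main__":
    names = [
        "BTCUSDT2022-12-31.csv.gz",
        "BTCUSDT2023-01-01.csv.gz",
        "ETHUSDT2023-01-31.csv.gz",
        "XRPUSDT2023-01-15.csv.gz",
    ]
    selected = [
        n for n in names
        if wanted(n, date(2023, 1, 1), date(2023, 1, 31), {"BTCUSDT", "ETHUSDT"})
    ]
    print(selected)
```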

Algorithm Overview

flowchart TD
    A[Start] --> B{Parse CLI Args};
    B --> C[Create Output Directory];
    C --> D{Fetch Base URL HTML};
    D --> E{Extract Data Type Links};
    E --> F{Loop Through Requested Data Types};
    F -- For Each Type --> G[Create Data Type Directory];
    G --> H{Coins = ALL?};
    H -- No --> DirectAccess[Direct Access to Specific Coins];
    DirectAccess --> I1{Loop Through Requested Coins};
    I1 -- For Each Coin --> J1[Construct Coin URL];
    J1 --> K1{Coin Exists on Server?};
    K1 -- Yes --> L1(Call process_directory);
    L1 --> I1;
    K1 -- No --> M1[Log Warning & Skip];
    M1 --> I1;
    I1 -- Loop Finished --> F;
    H -- Yes --> N[Construct Type URL & Path];
    N --> O(Call process_directory);
    O --> F;
    F -- Loop Finished --> Z[End];

    subgraph process_directory [process_directory]
        L1 & O --> P{Fetch Directory HTML};
        P --> Q{Find .csv.gz Links};
        Q --> R{Loop Through Files};
        R -- For Each File --> T{Extract Date};
        T --> U{Filter by Date?};
        U -- Yes --> V{Extract Coin from Name?};
        V -- Yes --> W{Filter by Coin?};
        W -- Yes --> X{Build Paths};
        X --> Y{File Exists?};
        Y -- No --> AA(Call download_and_extract);
        Y -- Yes --> R;
        AA --> R;
        W -- No --> R;
        V -- No --> BB{Extract Coin from Path?};
        BB -- Yes --> W;
        BB -- No --> X;
        U -- No --> R;
        R -- Loop Finished --> CC{Find Subdirectory Links};
        CC --> DD{Loop Through Subdirs};
        DD -- For Each Subdir --> EE{Build Next URL & Path};
        EE --> FF{Filter by Coin?};
        FF -- Yes --> GG(Recursive Call process_directory);
        GG --> DD;
        FF -- No --> DD;
        DD -- Loop Finished --> HH[Return Counts];
    end

    subgraph download_and_extract [download_and_extract]
        AA --> AA1{Download .gz?};
        AA1 -- Success --> AA2{Extract .csv?};
        AA2 -- Success --> AA3{Remove .gz};
        AA3 --> AA4[Return Success];
        AA1 -- Fail --> AA5[Log Error & Cleanup];
        AA2 -- Fail --> AA5;
        AA5 --> AA6[Return Failure];
    end
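
The recursive traversal in the process_directory subgraph can be sketched as follows. This is an illustration under assumptions rather than the project's implementation: it swaps in the standard library's html.parser for beautifulsoup4, takes the page-fetching function as a parameter instead of calling requests, and assumes each directory page is a plain HTML listing of relative links. The names LinkCollector, split_listing and walk are hypothetical.

```python
from html.parser import HTMLParser
from typing import Callable, Iterator, List, Optional, Set, Tuple
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def split_listing(page_url: str, html: str) -> Tuple[List[str], List[str]]:
    """Return (archive URLs, subdirectory URLs) found on one listing page."""
    collector = LinkCollector()
    collector.feed(html)
    files: List[str] = []
    subdirs: List[str] = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)
        if href.endswith(".csv.gz"):
            files.append(absolute)
        elif (
            href.endswith("/")
            and absolute.startswith(page_url)  # ignore "../" and outside links
            and absolute != page_url
        ):
            subdirs.append(absolute)
    return files, subdirs


def walk(
    url: str,
    fetch: Callable[[str], str],
    seen: Optional[Set[str]] = None,
) -> Iterator[str]:
    """Yield every .csv.gz URL under `url`: files first, then subdirectories."""
    seen = set() if seen is None else seen
    if url in seen:  # guard against listing loops
        return
    seen.add(url)
    files, subdirs = split_listing(url, fetch(url))
    yield from files
    for sub in subdirs:
        yield from walk(sub, fetch, seen)
```

Passing fetch in as a callable keeps the traversal testable without network access; in practice it might wrap an HTTP GET that returns the response text. Date and coin filtering would be applied to the yielded URLs, and to subdirectory names before recursing, as the flowchart indicates.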

License

This project is licensed under the MIT License - see the LICENSE file for details (if available).
