A tool to download historical trading data from the Bybit public data repository.

Bybit Historical Data Downloader

This script downloads historical market data (trades, klines, etc.) for specified cryptocurrency pairs from the Bybit public data repository (https://public.bybit.com/). It allows filtering by date range, coin pairs, and data types, organizing the downloaded and extracted CSV files into a structured directory format.

Features

  • Command-line Interface: Configure downloads using CLI arguments.
  • Flexible Filtering:
    • Specify start and end dates (--start-date, --end-date).
    • Select specific coin pairs (e.g., BTCUSDT,ETHUSDT) or download all (--coins ALL).
    • Choose data types (e.g., trading,spot) or download all (--data-types ALL).
  • Optimized Coin Access: When specific coins are requested, the script goes directly to their directories instead of scanning every directory.
  • Intelligent Name Detection: Extracts the coin name from the file name, falling back to the directory path.
  • Organized Output: Data is saved to a specified output directory (--output-dir, default ./data), with subdirectories for each data type (e.g., ./data/trading/, ./data/spot/).
  • Recursive Directory Traversal: Handles different directory structures found in the Bybit repository (e.g., flat file lists, coin-based subdirectories, year-based subdirectories).
  • Automatic Extraction: Downloads .csv.gz archives, extracts them to .csv, and removes the archives.
  • Skip Existing: Avoids re-downloading and extracting files if the .csv file already exists.
  • Basic Logging: Provides informative output about the download process.
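
The "Automatic Extraction" and "Skip Existing" behaviours above can be sketched with the standard library alone. This is a minimal illustration, not the script's actual code: the helper names `needs_download` and `extract_and_remove` are hypothetical, and writing to a temporary `.part` file first is an added assumption to avoid leaving a half-written CSV behind.

```python
import gzip
import shutil
from pathlib import Path


def needs_download(gz_path: Path) -> bool:
    """Return False when the extracted .csv already exists (skip existing)."""
    csv_path = gz_path.with_suffix("")  # drops only the trailing ".gz"
    return not csv_path.exists()


def extract_and_remove(gz_path: Path) -> Path:
    """Extract name.csv.gz to name.csv in the same folder, then delete the archive."""
    csv_path = gz_path.with_suffix("")
    # Hypothetical safeguard: write to a temporary name, rename once complete.
    part_path = csv_path.with_name(csv_path.name + ".part")
    try:
        with gzip.open(gz_path, "rb") as src, open(part_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
    except (OSError, EOFError):
        # Clean up the partial output if the archive is corrupt or truncated.
        part_path.unlink(missing_ok=True)
        raise
    part_path.replace(csv_path)
    gz_path.unlink()
    return csv_path
```

Checking `needs_download` before fetching anything is what makes repeated runs cheap: only files without an extracted CSV are requested again.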

Requirements

  • Python 3.8+
  • Libraries: requests, beautifulsoup4

Installation (using Poetry)

  1. Install Poetry: Follow the instructions at https://python-poetry.org/docs/#installation
  2. Clone the repository:
    git clone https://github.com/suenot/bybit-history
    cd bybit-history
    
  3. Install dependencies:
    poetry install
    

Usage

Using Poetry Script (Recommended)

poetry run start --start-date <YYYY-MM-DD> --coins <COINS> [OPTIONS]

Alternative Method

poetry run python bybit_data_downloader.py --start-date <YYYY-MM-DD> --coins <COINS> [OPTIONS]

Required Arguments:

  • --start-date <YYYY-MM-DD>: The earliest date for data to download.
  • --coins <COINS>: Comma-separated list of coin pairs (e.g., BTCUSDT,ETHUSDT) or ALL to attempt downloading all found pairs.

Optional Arguments:

  • --end-date <YYYY-MM-DD>: The latest date for data to download. If omitted, downloads up to the most recent available data.
  • --data-types <TYPES>: Comma-separated list of data types (e.g., trading,spot) or ALL. Defaults to trading. Known types: trading, spot, kline_for_metatrader4, premium_index, spot_index.
  • --output-dir <PATH>: Directory to save the data. Defaults to ./data.
  • --base-url <URL>: Base URL for the Bybit public data. Defaults to https://public.bybit.com/.
  • --version: Show script version and exit.
  • --help: Show help message and exit.
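
For readers who want to see how these arguments fit together, here is a rough `argparse` sketch that mirrors the documented flags and defaults. It is an approximation written from the list above, not the script's real parser: the helper names `parse_date`, `comma_list`, and `build_parser` are hypothetical, and `--version` is left out because the version string is not stated here (`--help` is provided by `argparse` automatically).

```python
import argparse
from datetime import date, datetime
from typing import List


def parse_date(value: str) -> date:
    """Convert a YYYY-MM-DD string to a date, with a readable CLI error."""
    try:
        return datetime.strptime(value, "%Y-%m-%d").date()
    except ValueError:
        raise argparse.ArgumentTypeError(f"expected YYYY-MM-DD, got {value!r}")


def comma_list(value: str) -> List[str]:
    """Split 'BTCUSDT,ETHUSDT' into ['BTCUSDT', 'ETHUSDT'], ignoring blanks."""
    return [item.strip() for item in value.split(",") if item.strip()]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Download Bybit historical data")
    parser.add_argument("--start-date", type=parse_date, required=True)
    parser.add_argument("--end-date", type=parse_date, default=None)
    parser.add_argument("--coins", type=comma_list, required=True)
    parser.add_argument("--data-types", type=comma_list, default=["trading"])
    parser.add_argument("--output-dir", default="./data")
    parser.add_argument("--base-url", default="https://public.bybit.com/")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["--start-date", "2023-01-01", "--coins", "BTCUSDT,ETHUSDT"]
)
```

With only the two required arguments given, `args.data_types` falls back to `["trading"]` and `args.end_date` stays `None`, matching the defaults described above.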

Example:

poetry run start --start-date 2023-01-01 --end-date 2023-01-31 --coins BTCUSDT,ETHUSDT --data-types trading,spot --output-dir ./bybit_data

Algorithm Overview

flowchart TD
    A[Start] --> B{Parse CLI Args};
    B --> C[Create Output Directory];
    C --> D{Fetch Base URL HTML};
    D --> E{Extract Data Type Links};
    E --> F{Loop Through Requested Data Types};
    F -- For Each Type --> G[Create Data Type Directory];
    G --> H{Coins = ALL?};
    H -- No --> DirectAccess[Direct Access to Specific Coins];
    DirectAccess --> I1{Loop Through Requested Coins};
    I1 -- For Each Coin --> J1[Construct Coin URL];
    J1 --> K1{Coin Exists on Server?};
    K1 -- Yes --> L1(Call process_directory);
    L1 --> I1;
    K1 -- No --> M1[Log Warning & Skip];
    M1 --> I1;
    I1 -- Loop Finished --> F;
    H -- Yes --> N[Construct Type URL & Path];
    N --> O(Call process_directory);
    O --> F;
    F -- Loop Finished --> Z[End];

    subgraph process_directory [process_directory]
        L1 & O --> P{Fetch Directory HTML};
        P --> Q{Find .csv.gz Links};
        Q --> R{Loop Through Files};
        R -- For Each File --> T{Extract Date};
        T --> U{Filter by Date?};
        U -- Yes --> V{Extract Coin from Name?};
        V -- Yes --> W{Filter by Coin?};
        W -- Yes --> X{Build Paths};
        X --> Y{File Exists?};
        Y -- No --> AA(Call download_and_extract);
        Y -- Yes --> R;
        AA --> R;
        W -- No --> R;
        V -- No --> BB{Extract Coin from Path?};
        BB -- Yes --> W;
        BB -- No --> X;
        U -- No --> R;
        R -- Loop Finished --> CC{Find Subdirectory Links};
        CC --> DD{Loop Through Subdirs};
        DD -- For Each Subdir --> EE{Build Next URL & Path};
        EE --> FF{Filter by Coin?};
        FF -- Yes --> GG(Recursive Call process_directory);
        GG --> DD;
        FF -- No --> DD;
        DD -- Loop Finished --> HH[Return Counts];
    end

    subgraph download_and_extract [download_and_extract]
        AA --> AA1{Download .gz?};
        AA1 -- Success --> AA2{Extract .csv?};
        AA2 -- Success --> AA3{Remove .gz};
        AA3 --> AA4[Return Success];
        AA1 -- Fail --> AA5[Log Error & Cleanup];
        AA2 -- Fail --> AA5;
        AA5 --> AA6[Return Failure];
    end
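
The per-file decision inside process_directory (extract the date, filter by date, find the coin in the file name or the path, filter by coin) might look roughly like the sketch below. Several things here are assumptions rather than facts from this README: file names are taken to look like `BTCUSDT2023-01-01.csv.gz` or `BTCUSDT_2023-01-01.csv.gz`, coin directories are taken to be upper-case path segments, files without a recognisable date are skipped, and a file whose coin is not in the requested set is skipped. All function names are hypothetical.

```python
import re
from datetime import date, datetime
from typing import Optional, Set

DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")
# Assumed naming: coin symbol, optional underscore, then the date.
COIN_IN_NAME_RE = re.compile(r"^([A-Z0-9]+?)_?\d{4}-\d{2}-\d{2}")
# A path segment counts as a coin if it is upper-case and has at least one letter.
COIN_SEGMENT_RE = re.compile(r"[A-Z0-9]*[A-Z][A-Z0-9]*")


def date_from_name(file_name: str) -> Optional[date]:
    match = DATE_RE.search(file_name)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(0), "%Y-%m-%d").date()
    except ValueError:
        return None


def coin_from_name(file_name: str) -> Optional[str]:
    match = COIN_IN_NAME_RE.match(file_name)
    return match.group(1) if match else None


def coin_from_path(dir_url: str) -> Optional[str]:
    # Walk the path from the deepest segment upward, e.g. .../spot/BTCUSDT/2023/
    for segment in reversed(dir_url.rstrip("/").split("/")):
        if COIN_SEGMENT_RE.fullmatch(segment):
            return segment
    return None


def should_download(file_name: str, dir_url: str, start: date,
                    end: Optional[date], coins: Optional[Set[str]]) -> bool:
    """coins=None stands for --coins ALL; end=None means no upper date bound."""
    file_date = date_from_name(file_name)
    if file_date is None:
        return False  # assumption: undated files are skipped
    if file_date < start or (end is not None and file_date > end):
        return False
    coin = coin_from_name(file_name) or coin_from_path(dir_url)
    if coins is None or coin is None:
        return True  # nothing to filter on, so keep the file
    return coin in coins
```

Trying the file name first and the directory path second follows the order shown in the flowchart, which lets the same function handle flat listings, coin-based subdirectories, and year-based subdirectories.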

License

This project is licensed under the MIT License - see the LICENSE file for details (if available).

Documentation

Detailed documentation on usage and implementation is available:

  • Main documentation: This README describes the main features and usage examples
  • Source code: Available for viewing in the GitHub repository
  • API examples: See the example_of_api.md file
  • Algorithm diagram: Presented in the "Algorithm Overview" section above
