Turns a collection of historical Betfair data into a queryable SQL database.

betfair-database

Installation

Install the package from PyPI:

pip install betfairdatabase

On some platforms, you may also need to install tzdata, which provides the IANA time zone database:

pip install tzdata

Usage

Getting started

  1. Index the folder holding historical Betfair data to turn it into a database.
  2. Use SQL queries to select data.

import betfairdatabase as bfdb

path_to_data = "./my_betfair_data"
bfdb.index(path_to_data)  # Create an index to convert the folder into a database

# Select all greyhound races in Sheffield
dataset = bfdb.select(
    path_to_data, where="eventTypeId='4339' AND eventVenue='Sheffield'"
)
for market in dataset:
    print(
        market["marketDataFilePath"],  # Path to the stream data file
        market["marketMetadataFilePath"],  # Path to the market metadata file
    )

Both self-recorded and official Betfair data files are supported. The historical data can be organised into any subfolder hierarchy, but it must follow this convention:

  1. Market metadata (market catalogue or market definition) is stored in a JSON file named <market id>.json.
  2. Market data file (containing stream data) is stored in the same folder as the market metadata file. It shares the same basename <market id> and ends with .zip, .gz or .bz2, or it has no extension (uncompressed data).

A sample database structure is shown below:

my_betfair_data/
└── arbitrary_folder/
    ├── 1.22334455.json  # Market metadata file
    ├── 1.22334455  # Uncompressed market data file
    ├── 1.55667788.json  # Market metadata file
    └── 1.55667788.zip  # Compressed market data file
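
The convention can be checked with a short script before indexing. The sketch below is not part of betfairdatabase: find_market_files is a hypothetical helper that walks a folder and pairs each <market id>.json file with a data file of the same basename, assuming the extensions listed above are the only ones in use.

```python
from pathlib import Path

# Extensions named in the convention; "" stands for uncompressed data.
DATA_SUFFIXES = (".zip", ".gz", ".bz2", "")


def find_market_files(root):
    """Pair each metadata file with its data file (hypothetical helper).

    Returns a dict mapping market id -> (metadata path, data path or None).
    """
    pairs = {}
    for metadata in sorted(Path(root).rglob("*.json")):
        market_id = metadata.stem  # "1.22334455.json" -> "1.22334455"
        data_file = None
        for suffix in DATA_SUFFIXES:
            candidate = metadata.with_name(market_id + suffix)
            if candidate.is_file():
                data_file = candidate
                break
        pairs[market_id] = (metadata, data_file)
    return pairs
```

Entries whose data path is None point at metadata files that have no matching market data file in the same folder.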

If a market metadata file is missing, it will be created from the most recent market definition found in the market data file. If no market definition is present in the data file, it will not be possible to index the file.
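
To illustrate what "the most recent market definition" means, here is a minimal sketch that scans stream data for it. It assumes the data file is newline-delimited JSON in which each message may carry an "mc" list whose entries optionally hold a "marketDefinition" object. The helper latest_market_definition is hypothetical and is not the library's own implementation.

```python
import json


def latest_market_definition(lines):
    """Return the last marketDefinition found in an iterable of JSON lines.

    Hypothetical helper: assumes one stream message per line, with market
    changes listed under "mc". Returns None if no definition is present.
    """
    latest = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        for market_change in message.get("mc", []):
            definition = market_change.get("marketDefinition")
            if definition is not None:
                latest = definition  # later messages override earlier ones
    return latest
```

An open file object for an uncompressed data file can be passed directly, since iterating over it yields one line at a time. A result of None corresponds to the case where the file cannot be indexed.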

Retrieving data

The select() method accepts the following arguments:

  • database_dir: Main directory of the database initialised with index().
  • columns: A list of columns (field names) to retrieve. If omitted, all columns are returned. View a list of available columns by calling betfairdatabase.columns().
  • where: SQL condition used to filter the results, written as the body of a WHERE clause (for example, "eventTypeId='7'").
  • limit: Maximum number of results to return. If omitted, all results are returned.
  • return_dict: If True (default), results are returned as a dictionary where keys are column names and values are data. If False, results are returned as tuples containing only data. The second option is faster but makes data harder to work with.
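
To make the roles of columns, where, limit and return_dict concrete, the sketch below builds a comparable query against an in-memory SQLite table. It is an illustration only: the table name markets, the helper select_sketch and the use of SQLite are assumptions, not a description of how betfairdatabase works internally.

```python
import sqlite3


def select_sketch(conn, columns=None, where=None, limit=None, return_dict=True):
    """Build and run a SELECT from the same kinds of arguments (hypothetical)."""
    column_sql = ", ".join(columns) if columns else "*"
    query = f"SELECT {column_sql} FROM markets"
    if where:
        query += f" WHERE {where}"  # the condition is inserted as raw SQL
    if limit is not None:
        query += f" LIMIT {int(limit)}"
    cursor = conn.execute(query)
    if not return_dict:
        return cursor.fetchall()  # plain tuples, no column names
    names = [description[0] for description in cursor.description]
    return [dict(zip(names, row)) for row in cursor]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE markets (marketId TEXT, eventTypeId TEXT, marketType TEXT)")
conn.executemany(
    "INSERT INTO markets VALUES (?, ?, ?)",
    [("1.1", "7", "WIN"), ("1.2", "7", "PLACE"), ("1.3", "4339", "WIN")],
)

# One dictionary per row, keyed by column name
rows = select_sketch(conn, columns=["marketId"], where="eventTypeId='7'", limit=1)
# One tuple per row, values only
tuples = select_sketch(conn, where="marketType='WIN'", return_dict=False)
```

Because the condition is inserted into the query as-is in this sketch, it should only ever come from trusted code.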

Below are several examples of selecting and filtering data:

import betfairdatabase as bfdb

path_to_data = "./my_betfair_data"

# Return all market ids and paths to data files in the database
bfdb.select(path_to_data, columns=["marketId", "marketDataFilePath"])

# Return full market metadata for horse racing win markets
bfdb.select(path_to_data, where="eventTypeId='7' AND marketType='WIN'")

# Return full market metadata for a maximum of 100 BSP markets
bfdb.select(path_to_data, where="bspMarket=true", limit=100)

# Return a maximum of 250 data file paths for horse and greyhound racing
bfdb.select(
    path_to_data,
    columns=["marketDataFilePath"],
    where="eventTypeId IN ('7', '4339') AND marketType='WIN'",
    limit=250,
)

Inserting data

The database can be updated with new files using the insert() method. This is much faster than reindexing the whole database on each update. Files are moved by default, but they are copied instead if the copy=True argument is provided.

import betfairdatabase as bfdb

bfdb.insert("./my_betfair_data", "./my_capture_dir")

Exporting data

The database index can be exported to a CSV file with the export() method. This is useful for debugging, visualising the data and post-processing it with external tools.

import betfairdatabase as bfdb

csv_file = bfdb.export("./my_betfair_data", "./my_data_dump")
print(csv_file)  # Prints: ./my_data_dump/my_betfair_data.csv
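
As one example of post-processing, the exported CSV can be summarised with the standard library. The sketch assumes the CSV has a header row containing eventTypeId, one of the column names used in the queries above. The helper count_by_event_type is hypothetical.

```python
import csv
from collections import Counter


def count_by_event_type(csv_path):
    """Count exported markets per eventTypeId (hypothetical helper)."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)  # uses the header row for field names
        return Counter(row["eventTypeId"] for row in reader)
```

Calling count_by_event_type(csv_file) on the path returned by export() would then give the number of indexed markets per event type, provided the column assumption holds.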

Removing missing data

Over the lifetime of a database, indexed files may be removed. The clean() method checks whether each indexed market data file still exists and removes the entries for missing files from the index, which avoids reindexing the whole database after every file removal. However, reindexing may be the faster option when a large number of files have been removed.

import betfairdatabase as bfdb

bfdb.clean("./my_betfair_data")
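
The idea behind this kind of cleanup can be sketched with SQLite: look up every indexed path, test whether the file still exists, and delete the rows that point nowhere. The table name markets and the helper remove_missing are assumptions for illustration. Only the marketDataFilePath column name comes from this document, and the library's actual implementation may differ.

```python
import os
import sqlite3


def remove_missing(conn):
    """Delete index rows whose data file is gone; return how many were removed."""
    paths = [row[0] for row in conn.execute("SELECT marketDataFilePath FROM markets")]
    missing = [(path,) for path in paths if not os.path.isfile(path)]
    with conn:  # commit the deletions as one transaction
        conn.executemany("DELETE FROM markets WHERE marketDataFilePath = ?", missing)
    return len(missing)
```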

Checking database size

To quickly check the number of indexed markets in the database, run:

import betfairdatabase as bfdb

bfdb.size("./my_betfair_data")

Object-oriented interface

All of the above methods are also available through an object-oriented interface via the BetfairDatabase class. This is useful when performing multiple operations on the same database, as the database directory needs to be provided only once.

from betfairdatabase import BetfairDatabase

db = BetfairDatabase("./my_betfair_data")
db.index()
db.select()
db.insert("./my_capture_dir")
db.export()
db.clean()
db.size()

Command line interface

The package also installs a bfdb command line app, which provides access to the following methods:

bfdb index "./my_database_dir"  # Index a database
bfdb export "./my_database_dir" "./my_db_dump.csv"  # Export a database
bfdb insert "./my_database_dir" "./my_captured_data"  # Update the database
bfdb clean "./my_database_dir"  # Clean the database
bfdb size "./my_database_dir"  # Check database size

The amount of displayed information is controlled with the following options:

  • -v/--verbose: Increases the amount of displayed messages. Useful for debugging.
  • --no-progress-bar: Hides progress bars. Useful when logging output to a file.
  • -q/--quiet: Suppresses printing to the terminal, including error messages. Also hides progress bars.

For more information about the command line interface, run:

bfdb --help
