keibascraper is a simple scraping library for netkeiba.com

Project description

Keiba Scraper

keibascraper is a Python library for parsing data from netkeiba.com, a prominent Japanese horse racing website. It lets you programmatically extract detailed information about races, entries, results, odds, and horses. Note that heavy or high-frequency use can place a significant load on netkeiba.com.

Table of Contents

Features
Installation
Dependencies
Usage
API Reference
- load Function
- race_list Function
Contributing
License

Features

Flexible Data Loading: Supports loading of various data types such as race entries, results, odds, and horse information.
Configurable Parsing: Utilizes JSON configuration files to define parsing rules, making it easy to adapt to changes in the source website.
Error Handling: Reports network issues and data inconsistencies through exceptions (ValueError for an unsupported data type, RuntimeError when loading or parsing fails).
Caching: Implements caching mechanisms to improve performance and reduce redundant network requests.

Installation

keibascraper is available on PyPI and can be installed using pip:

$ python -m pip install keibascraper

Supported Python Versions: keibascraper officially supports Python 3.8 and above.

Dependencies

requests: For handling HTTP requests.
BeautifulSoup4: For parsing HTML content.
jq: For parsing JSON content using jq expressions.

Usage

To use keibascraper, import the library and use the load function to fetch and parse data from netkeiba.com. The load function requires two parameters: the data type and the entity ID.

Loading Entry Data (出走データ)

import keibascraper

# Load entry data for a specific race ID
race_id = "202206050811"  # Example race ID
race_info = keibascraper.load("entry", race_id)

# Access race information
print(race_info)
# Output: {'race_id': '202206050811', 'race_name': 'Example Race', ... 'entry': [{'horse_number': 1, 'horse_name': 'Horse A', ...}, {...}, ...]}
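
As a small illustration of working with the returned dictionary, the sketch below flattens the per-horse records into rows. It assumes the shape shown in the example output above: a top-level dict with an 'entry' list whose items carry 'horse_number' and 'horse_name'. The helper name entry_rows and the sample data are hypothetical, and real records contain more fields.

```python
def entry_rows(race_info):
    """Return (race_id, horse_number, horse_name) tuples from an entry-style dict."""
    race_id = race_info.get("race_id")
    rows = []
    for horse in race_info.get("entry", []):
        # .get() keeps the sketch tolerant of fields missing from a record.
        rows.append((race_id, horse.get("horse_number"), horse.get("horse_name")))
    return rows


# Hand-written sample in the shape of the example output (not real data).
sample = {
    "race_id": "202206050811",
    "race_name": "Example Race",
    "entry": [
        {"horse_number": 1, "horse_name": "Horse A"},
        {"horse_number": 2, "horse_name": "Horse B"},
    ],
}

for row in entry_rows(sample):
    print(row)

# With the real library (needs network access):
#   import keibascraper
#   rows = entry_rows(keibascraper.load("entry", "202206050811"))
```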

Loading Result Data (結果データ)

import keibascraper

# Load result data for a specific race ID
race_id = "202206050811"  # Example race ID
race_info = keibascraper.load("result", race_id)

# Access race information
print(race_info)
# Output: {'race_id': '202206050811', 'race_name': 'Example Race', ... 'entry': [{'rank': 1, 'horse_name': 'Horse A', 'rap_time': 120.5, ...}, {...}, ...]}
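
One common follow-up is picking out the first few finishers. The sketch below does that under the assumption that each record in the 'entry' list has an integer 'rank' and a 'horse_name', as in the example output above. The function name top_finishers is hypothetical, and the handling of records without a rank is a guess about how non-finishers might appear.

```python
def top_finishers(race_info, n=3):
    """Return the names of the first n finishers, ordered by 'rank'."""
    # Keep only records with an integer rank; horses that did not finish
    # may lack one (an assumption about the data, not documented behavior).
    ranked = [h for h in race_info.get("entry", []) if isinstance(h.get("rank"), int)]
    ranked.sort(key=lambda h: h["rank"])
    return [h.get("horse_name") for h in ranked[:n]]


# Hand-written sample in the shape of the example output (not real data).
sample = {
    "race_id": "202206050811",
    "entry": [
        {"rank": 2, "horse_name": "Horse B"},
        {"rank": 1, "horse_name": "Horse A"},
        {"rank": None, "horse_name": "Horse C"},  # no finishing position
        {"rank": 3, "horse_name": "Horse D"},
    ],
}

print(top_finishers(sample))
```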

Loading Odds Data (オッズデータ)

import keibascraper

# Load odds data for a specific race ID
race_id = "202206050811"  # Example race ID
odds_data = keibascraper.load("odds", race_id)

# Access odds information
print(odds_data)
# Output: [{'horse_number': 1, 'win': 3.5, 'show_min': 1.2, 'show_max': 1.5, ...}, {...}, ...]
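
Because the odds come back as a flat list of dicts, ranking horses by win odds is a short sort. This is a minimal sketch that assumes the 'horse_number' and 'win' keys from the example output above; the function name shortest_win_odds is hypothetical, and skipping entries without a numeric 'win' value is a defensive assumption.

```python
def shortest_win_odds(odds_data, n=3):
    """Return up to n horse numbers ordered from lowest to highest 'win' odds."""
    # Ignore entries whose win odds are missing or not numeric.
    priced = [o for o in odds_data if isinstance(o.get("win"), (int, float))]
    priced.sort(key=lambda o: o["win"])
    return [o.get("horse_number") for o in priced[:n]]


# Hand-written sample in the shape of the example output (not real data).
sample = [
    {"horse_number": 1, "win": 3.5, "show_min": 1.2, "show_max": 1.5},
    {"horse_number": 2, "win": 12.0, "show_min": 2.8, "show_max": 3.9},
    {"horse_number": 3, "win": 2.1, "show_min": 1.1, "show_max": 1.3},
    {"horse_number": 4, "win": None},
]

print(shortest_win_odds(sample, n=2))
```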

Loading Horse Data (血統データ/出走履歴データ)

import keibascraper

# Load horse data for a specific horse ID
horse_id = "2010101234"  # Example horse ID
horse_info = keibascraper.load("horse", horse_id)

# Access horse information
print(horse_info)
# Output: {'horse_id': '2010101234', 'horse_name': 'Horse A', 'father_name': 'Sire A', ... 'entry': [{'race_date': '2022-06-05', 'race_name': 'Example Race', 'rank': 1, ...}, {...}, ...]}
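
The race history nested under 'entry' lends itself to simple summaries. The sketch below counts wins and starts, assuming each history record has an integer 'rank' as in the example output above. The function name win_record and the sample values are hypothetical.

```python
def win_record(horse_info):
    """Return (wins, starts) computed from the 'entry' history list."""
    history = horse_info.get("entry", [])
    wins = sum(1 for race in history if race.get("rank") == 1)
    return wins, len(history)


# Hand-written sample in the shape of the example output (not real data).
sample = {
    "horse_id": "2010101234",
    "horse_name": "Horse A",
    "entry": [
        {"race_date": "2022-06-05", "race_name": "Example Race", "rank": 1},
        {"race_date": "2022-04-10", "race_name": "Another Race", "rank": 4},
        {"race_date": "2022-02-20", "race_name": "Third Race", "rank": 1},
    ],
}

wins, starts = win_record(sample)
print(f"{sample['horse_name']}: {wins} wins from {starts} starts")
```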

Bulk Data Loading

To load multiple races in bulk, you can use the race_list function to retrieve a list of race IDs for a specific year and month.

import keibascraper

# Get list of race IDs for July 2022
race_ids = keibascraper.race_list(2022, 7)

# Loop through race IDs and load entry data
for race_id in race_ids:
    race_info = keibascraper.load("entry", race_id)  # dict; per-horse records are under 'entry'
    # Process the data as needed
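
Since bulk loading is where the load on netkeiba.com adds up, it may be worth pausing between requests. The following is a sketch of one way to do that, not something the library provides: load_many, its parameters, and the one-second default are all assumptions. The loader is passed in as an argument so the example runs without network access; in real use it would be keibascraper.load.

```python
import time


def load_many(loader, data_type, entity_ids, delay_seconds=1.0, sleep=time.sleep):
    """Call loader(data_type, entity_id) for each ID, pausing between calls."""
    results = {}
    for index, entity_id in enumerate(entity_ids):
        if index > 0:
            # Pause between requests, but not before the first one.
            sleep(delay_seconds)
        results[entity_id] = loader(data_type, entity_id)
    return results


def stub_loader(data_type, entity_id):
    """Stand-in for keibascraper.load so the sketch runs offline."""
    return {"race_id": entity_id, "entry": []}


# delay_seconds=0 keeps the offline demo instant.
loaded = load_many(stub_loader, "entry", ["202206050811", "202206050812"], delay_seconds=0)
print(sorted(loaded))

# With the real library:
#   import keibascraper
#   race_ids = keibascraper.race_list(2022, 7)
#   loaded = load_many(keibascraper.load, "entry", race_ids, delay_seconds=1.0)
```

Injecting sleep as a parameter is only there to make the pause easy to test; the default behaves like a plain time.sleep call.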

API Reference

`load` Function

keibascraper.load(data_type, entity_id)

Description: Loads data from netkeiba.com based on the specified data type and entity ID.
Parameters:
- data_type (str): Type of data to load. Supported types are 'entry', 'result', 'odds', and 'horse'.
- entity_id (str): Identifier for the data entity (e.g., race ID, horse ID).
Returns:
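- For 'entry' and 'result': Returns a dict of race information; the per-horse records are a list under its 'entry' key.
- For 'odds': Returns a list of per-horse odds dicts.
- For 'horse': Returns a dict of horse information; the race history is a list under its 'entry' key.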
Raises:
- ValueError: If an unsupported data type is provided.
- RuntimeError: If data loading or parsing fails.
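
The two documented exceptions can be handled separately, as in the sketch below. The wrapper try_load and its (data, error) return convention are hypothetical, and the stub loader only imitates the documented behavior so the example runs offline; in real use the loader argument would be keibascraper.load.

```python
def try_load(loader, data_type, entity_id):
    """Return (data, None) on success or (None, message) on a documented error."""
    try:
        return loader(data_type, entity_id), None
    except ValueError as exc:
        # Documented for unsupported data types: usually a typo in the caller.
        return None, f"bad data type {data_type!r}: {exc}"
    except RuntimeError as exc:
        # Documented for loading or parsing failures: may be worth retrying later.
        return None, f"could not load {data_type} {entity_id}: {exc}"


def stub_loader(data_type, entity_id):
    """Stand-in that raises the same exception types as documented for load."""
    if data_type not in {"entry", "result", "odds", "horse"}:
        raise ValueError(f"unsupported data type: {data_type}")
    if entity_id == "missing":
        raise RuntimeError("page could not be parsed")
    return {"id": entity_id}


for args in [("entry", "202206050811"), ("jockey", "1"), ("entry", "missing")]:
    data, error = try_load(stub_loader, *args)
    print(args, "->", data if error is None else error)
```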

`race_list` Function

keibascraper.race_list(year, month)

Description: Retrieves a list of race IDs for the specified year and month.
Parameters:
- year (int): The target year.
- month (int): The target month.
Returns:
- A list of race IDs (list).
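
Because race_list works one month at a time, covering a longer period means calling it repeatedly. The sketch below collects IDs across several months and drops duplicates while keeping order. The helper race_ids_for_year is hypothetical, and the stub returns made-up IDs so the example runs offline; in real use the first argument would be keibascraper.race_list.

```python
def race_ids_for_year(race_list_fn, year, months=range(1, 13)):
    """Collect race IDs for the given months of a year, without duplicates."""
    seen = set()
    ordered = []
    for month in months:
        for race_id in race_list_fn(year, month):
            if race_id not in seen:
                seen.add(race_id)
                ordered.append(race_id)
    return ordered


def stub_race_list(year, month):
    """Stand-in for keibascraper.race_list; these IDs are invented placeholders."""
    return [f"{year}{month:02d}-A", f"{year}{month:02d}-B"]


print(race_ids_for_year(stub_race_list, 2022, months=[6, 7]))

# With the real library (each month triggers network requests, so consider pausing):
#   import keibascraper
#   race_ids = race_ids_for_year(keibascraper.race_list, 2022)
```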

Contributing

Contributions are welcome! If you have suggestions or find bugs, please open an issue or submit a pull request on the GitHub repository.

When contributing, please follow these guidelines:

Coding Standards: Follow PEP 8 style guidelines.
Testing: Ensure that your code passes existing tests and add new tests for your changes.
Documentation: Update documentation and docstrings as needed.

License

This project is licensed under the terms of the Apache-2.0 license. See the LICENSE file for details.

Disclaimer: This library is intended for personal use and educational purposes. Scraping data from websites may violate their terms of service. Please ensure that you comply with netkeiba.com's terms and conditions when using this library.

Project details

Release history

- 3.1.4: Dec 13, 2025
- 3.1.3: Dec 14, 2024
- 3.1.2: Dec 7, 2024
- 3.1.1: Dec 7, 2024
- 3.1.0: Dec 6, 2024
- 2.1.1 (this version): Nov 30, 2024
- 2.1.0: Nov 30, 2024
- 2.0.0: Nov 30, 2024

Download files

Source Distribution

keibascraper-2.1.1.tar.gz (22.8 kB view details)

Uploaded Nov 30, 2024 Source

Built Distribution

keibascraper-2.1.1-py3-none-any.whl (28.9 kB view details)

Uploaded Nov 30, 2024 Python 3

File details

Details for the file keibascraper-2.1.1.tar.gz.

File metadata

Download URL: keibascraper-2.1.1.tar.gz
Upload date: Nov 30, 2024
Size: 22.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for keibascraper-2.1.1.tar.gz
Algorithm	Hash digest
SHA256	`548c01ce0ff1670c89e375fb9064f213e56d48c96dc82f57aa272adaeb6968c4`
MD5	`6c56984291c2d7652f69752161b73a3c`
BLAKE2b-256	`ebdb577d22a7436b7cd82a939d83a2ac07d62e33517e37d8cf89b18762a085b4`

File details

Details for the file keibascraper-2.1.1-py3-none-any.whl.

File metadata

Download URL: keibascraper-2.1.1-py3-none-any.whl
Upload date: Nov 30, 2024
Size: 28.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for keibascraper-2.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1ebe5813a58a0c5492b941ee248a82de00d9250fb57065a856ee97752eece69c`
MD5	`983fdc2748020608920324dc0941b77b`
BLAKE2b-256	`be83348239a155f5debf690782df5c5501e2c4030b9ab9d98a69adaac746fc11`
