
A Python application to scrape and manage odds data from the OddsPortal website.


OddsHarvester

Scrape sports betting odds from OddsPortal.com with ease

Extract upcoming & historical odds across 10 sports, 100+ leagues, and dozens of betting markets.
Powered by Playwright browser automation. Output to JSON, CSV, or S3.




Quick Start

# Install
pip install oddsharvester

# Or clone & setup with uv
git clone https://github.com/jordantete/OddsHarvester.git && cd OddsHarvester
pip install uv && uv sync

# Scrape upcoming football matches
oddsharvester upcoming -s football -d 20250301 -m 1x2 --headless

# Scrape historical Premier League odds
oddsharvester historic -s football -l england-premier-league --season 2024-2025 -m 1x2 --headless

Features

|  | Feature | Description |
| --- | --- | --- |
| Upcoming | Scrape upcoming matches | Fetch odds and event details for upcoming sports matches by date or league |
| Historic | Scrape historical odds | Retrieve past odds and match results for any season |
| Multi-market | Advanced parsing | Structured data: dates, teams, scores, venues, and per-bookmaker odds |
| Storage | Flexible output | JSON, CSV (local), or direct upload to AWS S3 |
| Docker | Container-ready | Run seamlessly in Docker with environment variable configuration |
| Proxy | Proxy support | Route through SOCKS/HTTP proxies for geolocation and anti-blocking |

Supported Sports & Markets

| Sport | Markets |
| --- | --- |
| ⚽ Football | 1x2, btts, double_chance, draw_no_bet, over/under, european_handicap, asian_handicap |
| 🎾 Tennis | match_winner, total_sets_over/under, total_games_over/under, asian_handicap, correct_score |
| 🏀 Basketball | 1x2, moneyline, asian_handicap, over/under |
| 🏉 Rugby League | 1x2, home_away, double_chance, draw_no_bet, over/under, handicap |
| 🏉 Rugby Union | 1x2, home_away, double_chance, draw_no_bet, over/under, handicap |
| 🏒 Ice Hockey | 1x2, home_away, double_chance, draw_no_bet, btts, over/under |
| ⚾ Baseball | moneyline, over/under |
| 🏈 American Football | 1x2, moneyline, over/under, asian_handicap |
| 🤾 Handball | 1x2, home_away, double_chance, draw_no_bet, over/under, handicap |
| 🏐 Volleyball | home_away, total_sets_over/under, total_points_over/under, asian_handicap, correct_score |

100+ leagues supported across all sports: Premier League, La Liga, Serie A, NBA, NFL, MLB, NHL, ATP/WTA Grand Slams, and many more.


CLI Usage

OddsHarvester has two main commands: upcoming and historic. They share most options, with a few command-specific ones.

oddsharvester upcoming

Scrape odds for upcoming matches — by date, by league, or by specific match URL.

# By date
oddsharvester upcoming -s football -d 20250301 -m 1x2 --headless

# By league (scrapes all upcoming matches for that league)
oddsharvester upcoming -s football -l england-premier-league -m 1x2,btts --headless

# Multiple leagues
oddsharvester upcoming -s football -l england-premier-league,spain-laliga -m 1x2 --headless

# Specific match URLs
oddsharvester upcoming -s football --match-link "https://www.oddsportal.com/football/..." -m 1x2

# Preview mode (faster — average odds only, no individual bookmakers)
oddsharvester upcoming -s football -d 20250301 -m over_under --preview-only --headless

oddsharvester historic

Scrape historical odds and results for past seasons.

# Single league & season
oddsharvester historic -s football -l england-premier-league --season 2022-2023 -m 1x2 --headless

# Current season
oddsharvester historic -s football -l england-premier-league --season current -m 1x2 --headless

# Limit pagination
oddsharvester historic -s football -l england-premier-league --season 2022-2023 -m 1x2 --max-pages 3 --headless

# Output as CSV
oddsharvester historic -s football -l england-premier-league --season 2024-2025 -m 1x2 -f csv -o premier_league_odds --headless
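
To collect several seasons in one go, the historic command can be driven from a short script. The sketch below only assembles argument lists from the flags shown above. The helper name build_historic_args, the season list, and the output names are illustrative and not part of OddsHarvester. Running the commands assumes oddsharvester is installed and on your PATH.

```python
import shlex


def build_historic_args(sport, league, season, markets, fmt="json", output=None):
    """Assemble an `oddsharvester historic` argument list (hypothetical helper)."""
    args = [
        "oddsharvester", "historic",
        "-s", sport,
        "-l", league,
        "--season", season,
        "-m", ",".join(markets),  # markets are passed comma-separated
        "-f", fmt,
        "--headless",
    ]
    if output:
        args += ["-o", output]
    return args


if __name__ == "__main__":
    # Illustrative season list; adjust to the seasons you need.
    for season in ["2022-2023", "2023-2024", "2024-2025"]:
        args = build_historic_args(
            "football", "england-premier-league", season, ["1x2"],
            fmt="csv", output=f"premier_league_{season}",
        )
        print(shlex.join(args))
        # To actually run the scraper: subprocess.run(args, check=True)
```

Printing the commands first makes it easy to check them before launching a long scrape.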

CLI Options Reference

Core Options

| Option | Short | Description | Default |
| --- | --- | --- | --- |
| --sport | -s | Sport to scrape (football, tennis, basketball, etc.) | required |
| --date | -d | Target date in YYYYMMDD format |  |
| --league | -l | Comma-separated league slugs (e.g. england-premier-league) |  |
| --market | -m | Comma-separated markets (e.g. 1x2,btts) |  |
| --match-link |  | Specific match URL (repeatable). Overrides --sport, --date, --league |  |

upcoming only: --date is required unless --league or --match-link is provided. --date and --league can be combined to filter the league's upcoming matches down to a specific calendar day. When combining both, the reference timezone for resolving the date is --timezone if provided, otherwise UTC.
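
The rule above can be sketched in Python to make it concrete. This is an illustration only, not OddsHarvester's actual implementation: check_upcoming_args and day_window_utc are hypothetical names, and the day-window calculation is one plausible reading of "the reference timezone for resolving the date".

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def check_upcoming_args(date=None, leagues=None, match_links=None):
    """Raise ValueError unless the documented --date rule is satisfied."""
    if not (date or leagues or match_links):
        raise ValueError("--date is required unless --league or --match-link is provided")
    if date is not None:
        if len(date) != 8 or not date.isdigit():
            raise ValueError("--date must use the YYYYMMDD format")
        datetime.strptime(date, "%Y%m%d")  # rejects impossible dates such as 20250230


def day_window_utc(date, tz_name=None):
    """Return the UTC start/end of the calendar day `date` in `tz_name` (UTC if omitted)."""
    tz = ZoneInfo(tz_name) if tz_name else timezone.utc
    start = datetime.strptime(date, "%Y%m%d").replace(tzinfo=tz)
    end = start + timedelta(days=1)
    return start.astimezone(timezone.utc), end.astimezone(timezone.utc)
```

For example, day_window_utc("20250301", "Europe/Brussels") covers 2025-02-28 23:00 UTC to 2025-03-01 23:00 UTC, because Brussels is at UTC+1 on that date. A match kicking off at 23:30 UTC on 28 February would therefore fall on 1 March in that timezone, but not in UTC.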

historic only:

| Option | Description | Default |
| --- | --- | --- |
| --season | Season: YYYY, YYYY-YYYY, or current | required |
| --max-pages | Max number of result pages to scrape | unlimited |
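
When season values come from a config file or user input, a quick format check before launching the scraper can save a wasted run. The sketch below covers only the three forms listed for --season. The function name is hypothetical, and the CLI's own validation may be stricter or more lenient.

```python
import re

# Accepts the three documented forms: YYYY, YYYY-YYYY, or the literal "current".
_SEASON_RE = re.compile(r"[0-9]{4}(?:-[0-9]{4})?|current")


def is_valid_season(value):
    """Return True if `value` matches one of the documented --season forms."""
    return _SEASON_RE.fullmatch(value) is not None
```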

Output Options

| Option | Short | Description | Default |
| --- | --- | --- | --- |
| --storage |  | local or remote (S3) | local |
| --format | -f | json or csv | json |
| --output | -o | Output file path | scraped_data |
| --append |  | Append to the output file instead of overwriting it (--no-append to opt out explicitly) | --no-append |
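
After a run with -f csv, a quick look at the file confirms what was written. The sketch below makes no assumptions about column names, since the output schema is not described here; it reports whatever header the file contains. The helper name is hypothetical, and the exact path to pass (including whether a .csv extension is appended to the -o value) is something to check on disk.

```python
import csv


def summarize_csv(path):
    """Return (column names, row count) for a CSV file without assuming its schema."""
    with open(path, newline="", encoding="utf-8") as fh:
        reader = csv.DictReader(fh)
        row_count = sum(1 for _ in reader)  # consumes the data rows
        return list(reader.fieldnames or []), row_count
```

Running the same check before and after a scrape with --append is a simple way to confirm that rows were added rather than overwritten.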

Browser & Scraping Options

| Option | Short | Description | Default |
| --- | --- | --- | --- |
| --headless |  | Run browser in headless mode | False |
| --concurrency | -c | Concurrent scraping tasks | 3 |
| --request-delay |  | Delay (sec) between match requests | 1.0 |
| --user-agent |  | Custom browser user agent |  |
| --locale |  | Browser locale (e.g. fr-BE) |  |
| --timezone |  | Browser timezone (e.g. Europe/Brussels) |  |
| --base-url |  | Scrape a regional OddsPortal mirror instead of www.oddsportal.com (e.g. https://www.centroquote.it). Page structure is identical; only the domain changes. Regional mirrors may expose a different/larger set of bookmakers. Recommended: pair with --locale/--timezone matching the region. Env var: OH_BASE_URL. |  |

Proxy Options

| Option | Description |
| --- | --- |
| --proxy-url | Proxy URL (http://... or socks5://...) |
| --proxy-user | Proxy username |
| --proxy-pass | Proxy password |

Tip: For best results, match --locale and --timezone to your proxy's region.

Advanced Options

| Option | Description | Default |
| --- | --- | --- |
| --target-bookmaker | Filter odds for a specific bookmaker |  |
| --odds-history | Include historical odds movement per match | False |
| --odds-format | Odds display format | Decimal Odds |
| --preview-only | Fast mode: average odds only, no bookmaker details | False |
| --bookies-filter | Bookmaker filter: all, classic, or crypto | all |
| --period | Match period (sport-specific: full-time, halves, etc.) | sport default |

Preview Mode vs Full Mode

| Aspect | Full Mode | Preview Mode |
| --- | --- | --- |
| Speed | Slower (interactive) | Faster (passive) |
| Data | All submarkets + bookmakers | Visible submarkets + avg odds |
| Bookmakers | Individual bookmaker odds | Average odds only |
| Odds History | Available | Not available |
| Structure | By bookmaker | By submarket (avg odds) |

Preview mode (--preview-only) is useful for quick exploration, testing data format, or light monitoring with reduced resource usage.


Environment Variables

All CLI options can be set via environment variables — useful for Docker or CI/CD.

| Variable | CLI Option | Description |
| --- | --- | --- |
| OH_SPORT | --sport | Sport to scrape |
| OH_LEAGUES | --league | Comma-separated leagues |
| OH_MARKETS | --market | Comma-separated markets |
| OH_STORAGE | --storage | Storage type (local/remote) |
| OH_FORMAT | --format | Output format (json/csv) |
| OH_FILE_PATH | --output | Output file path |
| OH_APPEND | --append | Append to the output file instead of overwriting |
| OH_HEADLESS | --headless | Run in headless mode |
| OH_CONCURRENCY | --concurrency | Number of concurrent tasks |
| OH_REQUEST_DELAY | --request-delay | Delay between requests (sec) |
| OH_PROXY_URL | --proxy-url | Proxy server URL |
| OH_PROXY_USER | --proxy-user | Proxy username |
| OH_PROXY_PASS | --proxy-pass | Proxy password |
| OH_USER_AGENT | --user-agent | Custom browser user agent |
| OH_LOCALE | --locale | Browser locale |
| OH_TIMEZONE | --timezone | Browser timezone ID |
| OH_BASE_URL | --base-url | Regional OddsPortal mirror base URL |

export OH_SPORT=football
export OH_HEADLESS=true
export OH_PROXY_URL=http://proxy.example.com:8080

oddsharvester upcoming -d 20250301 -m 1x2
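
The same configuration can be prepared from Python, for example in a scheduled job. The sketch below builds an environment mapping using variable names from the table above. build_env is a hypothetical helper, and writing booleans as lowercase "true"/"false" follows the OH_HEADLESS=true example; whether other spellings (including "false") are accepted is an assumption.

```python
import os


def build_env(settings, base=None):
    """Merge OH_* settings into a copy of the environment, stringifying values."""
    env = dict(os.environ if base is None else base)
    for name, value in settings.items():
        if isinstance(value, bool):
            # Booleans as lowercase strings, matching OH_HEADLESS=true above.
            value = "true" if value else "false"
        env[name] = str(value)
    return env


if __name__ == "__main__":
    env = build_env({"OH_SPORT": "football", "OH_HEADLESS": True})
    print(env["OH_SPORT"], env["OH_HEADLESS"])
    # To launch the scraper with this environment:
    # subprocess.run(["oddsharvester", "upcoming", "-d", "20250301", "-m", "1x2"],
    #                env=env, check=True)
```

Copying the current environment rather than starting from an empty one keeps PATH and similar variables available to the child process.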

Installation

With pip (from PyPI)

pip install oddsharvester

From source (with uv)

git clone https://github.com/jordantete/OddsHarvester.git
cd OddsHarvester
pip install uv
uv sync

Manual setup (venv + pip or poetry)

python3 -m venv .venv
source .venv/bin/activate    # Unix/macOS
# .venv\Scripts\activate     # Windows

pip install . --use-pep517
# or: poetry install

Verify installation:

oddsharvester --help

Docker

# Build
docker build -t odds-harvester:local .

# Run (CLI args are appended to the ENTRYPOINT `python3 -m oddsharvester`)
docker run --rm odds-harvester:local upcoming -s football -d 20250301 -m 1x2 --headless

# Run and keep the JSON output on the host (mount a volume + use -o)
# On macOS+colima, prefer a path under $HOME (e.g. $PWD); /tmp is not shared by default.
docker run --rm -v "$PWD/_docker_out:/out" odds-harvester:local \
  upcoming -s football -d 20250301 -m 1x2 --headless -o /out/result.json

# Or with environment variables
docker run --rm \
  -e OH_SPORT=football \
  -e OH_HEADLESS=true \
  odds-harvester:local upcoming -d 20250301 -m 1x2

Contributing

Contributions are welcome! Submit an issue or pull request. Please follow the project's coding standards and include clear descriptions for any changes.

License

MIT License

Disclaimer

This package is intended for educational purposes only. The author is not affiliated with or endorsed by oddsportal.com. Use responsibly and ensure compliance with their terms of service and applicable laws.
