Python wrapper for MLB Stats API
Project description
PyMLB StatsAPI
A clean, Pythonic wrapper for MLB Stats API endpoints with automatic schema-driven parameter validation.
โจ Features
- ๐ฏ Clean API: Parameters are intelligently routed to path or query params based on the schema configuration
- ๐ชถ Lean: Only requires
requests- no heavy dependencies - ๐ Schema-driven: All endpoints and methods generated from JSON schemas
- Sourced from https://beta-statsapi.mlb.com/docs/ (now offline)
- โ Type-safe: Automatic parameter validation from API schemas
- ๐ Dynamic: Zero hardcoded models - updates via schema changes only
- ๐งช Well-tested: Comprehensive unit tests with pytest and BDD test suite with stub capture/replay
- ๐ Self-documenting: Auto-generated docstrings from API schemas
- ๐ Fast: Stub-based testing runs in <1 second
๐ Quick Start
Installation
# With pip
pip install pymlb-statsapi
# With uv (recommended)
uv add pymlb-statsapi
Basic Usage
from pymlb_statsapi import api
# Get today's game schedule
response = api.Schedule.schedule(sportId=1, date="2024-10-27")
data = response.json()
# Get latest live game data (no timecode = most recent)
response = api.Game.liveGameV1(game_pk=747175)
data = response.json()
# Get game data at specific time
response = api.Game.liveGameV1(game_pk=747175, timecode="20241027_233000")
data = response.json()
# Get team information
response = api.Team.team(teamId=147, season=2024)
team_data = response.json()
# Save response to gzipped file with metadata
result = response.gzip(prefix="mlb-data")
print(f"Saved to: {result['path']}")
print(f"Captured at: {result['timestamp']}")
๐ง Smart Parameter Validation
Parameters accept both integers and strings - the library handles type conversion automatically:
# These are equivalent - use whichever is more convenient
api.Game.liveGameV1(game_pk=747175) # Integer (Pythonic)
api.Game.liveGameV1(game_pk="747175") # String (API format)
api.Team.team(teamId=147) # Integer
api.Team.team(teamId="147") # String
# The MLB API sometimes returns IDs as strings in responses
# You can pass them directly without conversion:
games = api.Schedule.schedule(sportId=1, date="2024-10-27").json()
for game in games['dates'][0]['games']:
# game['gamePk'] is an integer from the API
live_data = api.Game.liveGameV1(game_pk=game['gamePk'])
# Or if you have a string ID from elsewhere:
game_id = "747175" # From database, user input, etc.
live_data = api.Game.liveGameV1(game_pk=game_id) # Works!
Why this matters: The MLB Stats API returns some fields as integers and others as strings. This flexible parameter handling means you never need to worry about type conversion - just pass what you have!
๐ Documentation
๐ Start Here: Schema Reference
The Schema Reference is the heart of this library - browse all 21 MLB Stats API endpoints with detailed parameter docs and working examples for every method.
Additional Resources
- Full Documentation - Complete guide on ReadTheDocs
- API Reference - Implementation documentation
- Examples - Check the
examples/directory for working code samples - Testing Guide - See the Testing documentation
๐๏ธ Architecture
Config-Driven Design
All MLB API endpoints are defined as JSON schemas rather than hardcoded. These schemas were sourced from the MLB Stats API Beta documentation site (https://beta-statsapi.mlb.com/docs/), which is no longer publicly available:
pymlb_statsapi/resources/schemas/statsapi/stats_api_1_0/
โโโ schedule.json โ Schedule endpoint with methods
โโโ game.json โ Game endpoint (live feed, boxscore, etc.)
โโโ team.json โ Team endpoint (roster, stats, etc.)
โโโ person.json โ Person endpoint (player data)
โโโ ...
Each schema defines which parameters are path parameters vs query parameters. Method paths are mapped in endpoint-model.json:
{
"schedule": {
"schedule": {
"path": "/v1/schedule",
"name": "schedule"
},
"tieGames": {
"path": "/v1/schedule/games/tied",
"name": "tieGames"
}
}
}
Clean API: Intelligent Parameter Routing
The library automatically routes parameters to path or query parameters based on schema configuration:
# Parameters are routed correctly based on the schema
response = api.Game.liveGameV1(game_pk=747175, timecode="20241027_233000")
# Resolves to: /api/v1/game/747175/feed/live?timecode=20241027_233000
# game_pk โ path parameter, timecode โ query parameter
response = api.Schedule.schedule(sportId=1, date="2024-10-27")
# Resolves to: /api/v1/schedule?sportId=1&date=2024-10-27
# Both are query parameters
# Latest game data (omit optional timecode)
response = api.Game.liveGameV1(game_pk=747175)
# Resolves to: /api/v1/game/747175/feed/live
Key Components
Dynamic Factory (factory.py):
- Generates endpoint classes and methods from schemas at runtime
- Creates clean function signatures with proper parameter handling
- Handles method overloading (e.g.,
seasons()with/withoutseasonId)
Registry (registry.py):
- Central
apisingleton that loads all endpoints - Provides discovery API for exploring available methods
API Response (factory.py: APIResponse):
- Wraps
requests.Responsewith metadata - Provides
.json(),.save_json(),.get_path(),.get_uri()methods - Generates consistent resource paths for file storage
๐ Examples
Working with Different Endpoints
from pymlb_statsapi import api
# Schedule queries
response = api.Schedule.schedule(
sportId=1,
date="2024-10-27",
teamId=147
)
# Get all teams
response = api.Team.teams(sportId=1, season=2024)
# Get player information
response = api.Person.people(personId=660271)
# Get season information (overloaded method)
response = api.Season.seasons(sportId=1) # All seasons
response = api.Season.seasons(seasonId=2024) # Specific season
# Get game stats
response = api.Stats.stats(
group="hitting",
stats="season",
season=2024,
sportId=1
)
File Storage
# Auto-generate file path
result = response.save_json(prefix="mlb-data")
print(f"Saved to: {result['path']}")
print(f"Bytes written: {result['bytes_written']}")
# Explicit file path
response.save_json("/path/to/file.json")
# Gzipped JSON
response.gzip(prefix="mlb-data")
URI Generation for Different Protocols
# File protocol (default)
uri = response.get_uri(protocol="file", prefix="mlb-data")
# Result: file:///path/to/.var/local/mlb_statsapi/mlb-data/schedule/schedule/date=2025-06-01.json
# S3 protocol (requires PYMLB_STATSAPI__S3_BUCKET env var)
uri = response.get_uri(protocol="s3", prefix="raw-data", gzip=True)
# Result: s3://my-bucket/raw-data/schedule/schedule/date=2025-06-01.json.gz
# Redis protocol
uri = response.get_uri(protocol="redis", prefix="mlb")
# Result: redis://localhost:6379/0/mlb/schedule/schedule/date=2025-06-01
API Discovery
# List all available endpoints
print(api.get_endpoint_names())
# ['schedule', 'game', 'team', 'person', 'season', ...]
# List methods for an endpoint
endpoint = api.get_endpoint("schedule")
print(endpoint.get_method_names())
# ['schedule', 'tieGames', 'postseason', ...]
# Get detailed method information
info = api.get_method_info("schedule", "schedule")
print(info["path"]) # /v1/schedule
print(info["summary"]) # View schedule info
print(info["path_params"]) # []
print(info["query_params"]) # [{"name": "sportId", ...}, ...]
๐งช Testing
Unit Tests
# Run unit tests
pytest
# With coverage
pytest --cov=pymlb_statsapi --cov-report=html
# Specific test file
pytest tests/unit/pymlb_statsapi/model/test_factory.py
BDD Tests with Stubs (Fast)
# Run all BDD tests with stubs (completes in <1 second)
behave
# Or explicitly
STUB_MODE=replay behave
Capture Fresh Stubs
# Capture stubs by making real API calls
STUB_MODE=capture behave
# Capture stubs for specific endpoint
STUB_MODE=capture behave features/schedule.feature
Run Specific BDD Tests
# Test specific feature
behave features/game.feature
# Verbose output
behave -v features/season.feature
# Test with specific tag
behave --tags=@game
๐ ๏ธ Development
Setup
# Clone repository
git clone https://github.com/power-edge/pymlb_statsapi.git
cd pymlb_statsapi
# Install dependencies
uv sync
# Install pre-commit hooks
pre-commit install
Code Quality
# Linting
ruff check .
# Auto-fix linting issues
ruff check --fix .
# Formatting
ruff format .
# Run all pre-commit hooks
pre-commit run --all-files
# Security scan
bandit -r pymlb_statsapi/
Building
# Build package
hatch build
# Or with uv
uv build
# Version is auto-generated from git tags via hatch-vcs
Publishing
For local publishing (requires .env configuration):
# Set up credentials
cp .env.example .env
# Edit .env with your PyPI tokens
# Publish to TestPyPI (for testing)
./scripts/publish.sh testpypi
# Publish to PyPI (production)
./scripts/publish.sh pypi
For automated publishing, use GitHub Actions:
- Push a version tag (e.g.,
v1.2.0) to trigger automatic PyPI publishing - Configure
PYPI_TOKENsecret in GitHub repository settings
๐ Test Coverage
- 30 unit tests with pytest covering core functionality
- 39 BDD scenarios covering all major endpoints
- 277 test steps with path/query parameter variations
- Stub-based testing for fast, deterministic CI/CD
- Real-world data using completed games (October 2024 World Series)
๐ Support This Project
If you find this library useful, consider supporting its development:
๐ License
This project is licensed under the MIT License - see the LICENSE file for details.
๐ค Contributing
Contributions are welcome! Please feel free to fork the repository and submit a Pull Request.
We have a git helper for common operations, see scripts/git.sh
================================
Git Workflow Helper
================================
1) Format & commit changes
2) Create release (bump version & tag)
3) Push to remote
4) Full release (format, commit, bump, tag, push, build)
5) Status
6) Exit
๐ Links
๐ Acknowledgments
- Built on the excellent MLB Stats API
- Inspired by various MLB data projects in the community
- Thanks to all contributors!
Made with โค๏ธ by the PyMLB StatsAPI Team
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file pymlb_statsapi-1.4.3.tar.gz.
File metadata
- Download URL: pymlb_statsapi-1.4.3.tar.gz
- Upload date:
- Size: 1.3 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.11
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
64b20f49b61dda9a930613d9249d4c901d6f07ad4bd18deba4df5cefe8fc9063
|
|
| MD5 |
41bae403d5803d0d6c3693ff937ca368
|
|
| BLAKE2b-256 |
ea9193da724ae254394eb84c9403dec38a6673f978b567eae29d78a7d36e66a0
|
File details
Details for the file pymlb_statsapi-1.4.3-py3-none-any.whl.
File metadata
- Download URL: pymlb_statsapi-1.4.3-py3-none-any.whl
- Upload date:
- Size: 141.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.11
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
e52ecc263f8be01c0d5bde1cd7177e350917c635ec192a9b0f60c2b2c8bdd607
|
|
| MD5 |
30930abf490793cfc62c28f8a9d237ef
|
|
| BLAKE2b-256 |
5dd6f2f204079a168f9bb3119b45ea529153cf8ce734abf7ce96da8d1b3a2e6e
|