USA Cycling Results Parser (usac_velodata)
A Python package that scrapes and parses USA Cycling event results from the legacy USA Cycling results page and API. The package extracts event details, race results, categories, rankings, and historical data, returning structured data in multiple formats.
🚀 Features
- Event Data: Fetch event lists by state and year from the USA Cycling API
- Result Parsing: Extract race results from the legacy USA Cycling results page
- Comprehensive Data: Get event details, race categories, and rider information
- Historical Data: Support for fetching data across multiple years
- Flyer Fetching: Download event flyers in various formats (PDF, HTML, DOC)
- Flexible Output: Export data in multiple formats (Pydantic models, JSON, CSV)
- Resilient Fetching: Built-in retry mechanism and rate limiting
- Efficient Caching: Local storage of results to minimize requests
- Type Safety: Fully type-annotated API with Pydantic validation
📦 Installation
Standard installation
Using uv
uv pip install usac_velodata
Using pip
pip install usac_velodata
For development
git clone https://github.com/vincentdavis/pyusacycling.git
cd pyusacycling
uv build
🔍 Usage Examples
Using the Python API
from usac_velodata import USACyclingClient
# Initialize client
client = USACyclingClient()
# Get events for a state and year
events = client.get_events(state="CO", year=2023)
# Get details for an event by permit number
event_details = client.get_event_details(permit="2020-26")
# Get complete event data, including results, for a permit
all_result = client.get_complete_event_data(permit="2020-26", include_results=True)
# Get race results for a specific race ID
race_results = client.get_race_results(race_id="1337864")
# Fetch a flyer for an event
flyer = client.fetch_flyer(permit="2023-123", storage_dir="./flyers")
# Fetch multiple flyers in batch - Download for all states
flyers_stats = client.fetch_flyers_batch(
    start_year=2020,
    end_year=2020,
    limit=100,
    storage_dir="./flyers",
)
# List stored flyers
flyer_list = client.list_flyers(storage_dir="./flyers")
# Export to JSON
json_data = race_results.json()
# Export to CSV
with open("results.csv", "w") as f:
    f.write(race_results.to_csv())
# Configure caching
client = USACyclingClient(
    cache_enabled=True,
    cache_dir="./data/cache",
    cache_expiry=86400,  # 24 hours
)
# Using rate limiting
client = USACyclingClient(
    rate_limit=10,  # Max 10 requests per minute
    backoff_factor=2.0,  # Exponential backoff factor for retries
)
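As a rough illustration of what the two rate-limiting settings above mean in practice, the sketch below turns a per-minute limit into a minimum spacing between requests and an exponential backoff factor into a series of retry delays. The helper names `min_interval` and `backoff_delays`, and the formula `backoff_factor * 2**attempt`, are assumptions for illustration; the package's internal calculation may differ.

```python
def min_interval(rate_limit: int) -> float:
    """Seconds to wait between requests to stay under rate_limit per minute."""
    if rate_limit <= 0:
        raise ValueError("rate_limit must be positive")
    return 60.0 / rate_limit


def backoff_delays(backoff_factor: float, retries: int) -> list[float]:
    """Delay before each retry, assuming delay = backoff_factor * 2**attempt."""
    return [backoff_factor * (2 ** attempt) for attempt in range(retries)]


spacing = min_interval(10)        # rate_limit=10 requests per minute
delays = backoff_delays(2.0, 3)   # backoff_factor=2.0, three retries
print(spacing, delays)
```

Under these assumptions, a lower `rate_limit` widens the gap between requests, and a larger `backoff_factor` stretches every retry delay proportionally.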
Using the Command-Line Interface
The package includes a command-line interface for quick access to data:
# Get events for a state
usac_velodata events --state CA --year 2023 --output json
# Get race results for a permit
usac_velodata results --permit 2023-123 --output csv
# Fetch a flyer for an event
usac_velodata fetch-flyer --permit 2023-123 --storage-dir ./flyers
# Fetch multiple flyers in batch
usac_velodata fetch-flyers --start-year 2022 --end-year 2023 --limit 100 --storage-dir ./flyers
# List stored flyers
usac_velodata list-flyers --storage-dir ./flyers --output json --pretty
Detailed CLI Usage
The usac_velodata package can be used directly from the command line:
python -m usac_velodata [command] [options]
Available commands:
- events: Fetch events by state and year
- details: Get detailed information about a specific event
- disciplines: List available disciplines
- categories: List available race categories
- results: Get race results for a specific event
- complete: Get complete event information including results
- fetch-flyer: Fetch a flyer for a specific event
- fetch-flyers: Fetch multiple flyers in batch
- list-flyers: List stored flyers
Global Options
- --version: Show the version and exit
- --cache-dir PATH: Directory to store cached data
- --no-cache: Disable caching of results
- --log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}: Set logging level
Events Command
python -m usac_velodata events --state CA [--year 2023] [--output {json,csv}] [--pretty]
- --state: Two-letter state code (required)
- --year: Year to search (defaults to current year)
- --output: Output format (json or csv)
- --pretty: Pretty-print JSON output
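To make the option semantics concrete, here is a minimal `argparse` sketch that mirrors the documented `events` flags. It is a reconstruction from the option list above, not the package's actual parser; the function name `build_events_parser` and the `json` default for `--output` are assumptions.

```python
import argparse
from datetime import date


def build_events_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented `events` options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="usac_velodata events")
    parser.add_argument("--state", required=True, help="Two-letter state code")
    # Documented behavior: the year defaults to the current year.
    parser.add_argument("--year", type=int, default=date.today().year)
    # Assumption: json is the default output format.
    parser.add_argument("--output", choices=["json", "csv"], default="json")
    parser.add_argument("--pretty", action="store_true")
    return parser


args = build_events_parser().parse_args(["--state", "CA", "--year", "2023"])
print(args.state, args.year, args.output, args.pretty)
```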
Results Command
# Using race ID (detailed results for a specific race)
python -m usac_velodata results --race-id 1337864 [--output {json,csv}] [--pretty]
# Using permit (returns event details)
python -m usac_velodata results --permit 2023-123 [--output {json,csv}] [--pretty]
Flyer Commands
# Fetch a flyer for a specific event
python -m usac_velodata fetch-flyer --permit 2023-123 --storage-dir ./flyers [--use-s3] [--s3-bucket my-bucket]
# Fetch multiple flyers in batch
python -m usac_velodata fetch-flyers --start-year 2022 --end-year 2023 [--limit 100] [--delay 5] --storage-dir ./flyers
# List stored flyers
python -m usac_velodata list-flyers --storage-dir ./flyers [--output {json,csv}] [--pretty]
Complete Command
python -m usac_velodata complete --permit 2023-123 [--no-results] [--output {json,csv}] [--pretty]
- --permit: Event permit number (required)
- --no-results: Don't include race results
- --output: Output format (json or csv)
- --pretty: Pretty-print JSON output
Note: If complete data cannot be fetched (due to network or parsing issues), the command will automatically fall back to returning basic event details.
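The fallback described in the note can be pictured roughly as follows. This is a sketch of the general pattern only: `fetch_with_fallback` and the two stand-in fetchers are hypothetical names, and the real command may catch narrower exception types.

```python
from typing import Callable


def fetch_with_fallback(
    permit: str,
    fetch_complete: Callable[[str], dict],
    fetch_details: Callable[[str], dict],
) -> dict:
    """Try the complete fetch first; fall back to basic details on failure."""
    try:
        return fetch_complete(permit)
    except Exception:
        # Network or parsing failure: return the basic event details instead.
        return fetch_details(permit)


def failing_complete(permit: str) -> dict:
    """Stand-in that simulates a parsing failure."""
    raise RuntimeError("simulated parsing failure")


def basic_details(permit: str) -> dict:
    """Stand-in that returns basic details without results."""
    return {"permit": permit, "results": None}


result = fetch_with_fallback("2023-123", failing_complete, basic_details)
print(result)
```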
Examples
# Get events in California for 2023 in CSV format
python -m usac_velodata events --state CA --year 2023 --output csv
# Get detailed information about an event with pretty-printed JSON
python -m usac_velodata details --permit 2023-123 --pretty
# Get race results with caching disabled
python -m usac_velodata results --permit 2023-123 --output json --no-cache
# Get complete event information with debug logging
python -m usac_velodata complete --permit 2023-123 --log-level DEBUG
# Fetch a flyer and store it in S3
python -m usac_velodata fetch-flyer --permit 2023-123 --use-s3 --s3-bucket my-bucket
# Fetch multiple flyers with a 5-second delay between requests
python -m usac_velodata fetch-flyers --start-year 2023 --end-year 2023 --delay 5 --storage-dir ./flyers
# List stored flyers in JSON format
python -m usac_velodata list-flyers --storage-dir ./flyers --output json --pretty
📘 API Reference
Client
USACyclingClient(
    cache_enabled: bool = True,
    cache_dir: Optional[str] = None,
    cache_expiry: int = 86400,
    rate_limit: int = 10,
    backoff_factor: float = 1.0,
    logger: Optional[logging.Logger] = None
)
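To show how a setting like `cache_expiry` (in seconds) could govern a local file cache, here is a small standard-library sketch. The functions `read_cache` and `write_cache`, the JSON-on-disk layout, and the use of file modification time are all assumptions for illustration; the package's cache may be implemented differently.

```python
import json
import tempfile
import time
from pathlib import Path
from typing import Optional


def write_cache(path: Path, data: dict) -> None:
    """Store data as JSON, creating the cache directory if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data), encoding="utf-8")


def read_cache(path: Path, cache_expiry: int, now: Optional[float] = None) -> Optional[dict]:
    """Return cached JSON if the file is younger than cache_expiry seconds."""
    if not path.exists():
        return None
    now = time.time() if now is None else now
    if now - path.stat().st_mtime > cache_expiry:
        return None  # Entry is stale; the caller should refetch.
    return json.loads(path.read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as tmp:
    entry = Path(tmp) / "cache" / "events_CO_2023.json"
    write_cache(entry, {"state": "CO", "year": 2023})
    fresh = read_cache(entry, cache_expiry=86400)
    # Pretend 25 hours have passed to exercise the expiry path.
    stale = read_cache(entry, cache_expiry=86400, now=time.time() + 90000)

print(fresh, stale)
```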
Methods
| Method | Description |
|---|---|
| get_events(state, year) | Get events for a state and year |
| get_event_details(permit) | Get details for an event by permit number |
| get_race_results(permit) | Get race results for a permit |
| fetch_flyer(permit, storage_dir, use_s3) | Fetch a flyer for an event |
| fetch_flyers_batch(start_year, end_year, limit, storage_dir) | Fetch multiple flyers in batch |
| list_flyers(storage_dir) | List stored flyers |
Models
- Event: Represents a cycling event
- RaceCategory: Represents a race category
- Rider: Represents a race participant
- RaceResult: Represents race results
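For readers who want to post-process exported data, the sketch below shows one way flat result rows could be serialized to JSON and CSV with the standard library. `RiderRow` and its fields are hypothetical stand-ins built with `dataclasses`, not the package's Pydantic models, and the rider names are made-up placeholders.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass, fields


@dataclass
class RiderRow:
    """Hypothetical flat result row; the real models may differ."""
    place: int
    name: str
    team: str


def to_json(rows: list[RiderRow]) -> str:
    """Serialize rows as a JSON array of objects."""
    return json.dumps([asdict(row) for row in rows])


def to_csv(rows: list[RiderRow]) -> str:
    """Serialize rows as CSV with a header taken from the dataclass fields."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[f.name for f in fields(RiderRow)],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(asdict(row) for row in rows)
    return buffer.getvalue()


rows = [RiderRow(1, "A. Rider", "Team One"), RiderRow(2, "B. Rider", "Team Two")]
print(to_json(rows))
print(to_csv(rows))
```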
🏗️ Architecture
The package is structured around these main components:
- Client: Main interface for users, coordinates the workflow
- Parsers: Extract structured data from HTML and JSON responses
- Models: Pydantic models for type-validated data structures
- Utils: Helper functions for caching, logging, and rate limiting
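The parser-to-model flow listed above can be sketched with the standard library alone. Everything here is illustrative: the HTML snippet, the `CellCollector` and `parse_results` names, and the `ResultRow` dataclass (standing in for a Pydantic model) are assumptions, and the real results page has a different and more complex structure.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


@dataclass
class ResultRow:
    """Stand-in for a validated model (hypothetical fields)."""
    place: int
    name: str


class CellCollector(HTMLParser):
    """Collect the text of each <td> cell, grouped by <tr> row."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: list[list[str]] = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td" and self.rows:
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data


def parse_results(html: str) -> list[ResultRow]:
    """Parser step: turn raw HTML into typed rows (the model step)."""
    collector = CellCollector()
    collector.feed(html)
    return [
        ResultRow(int(cells[0]), cells[1].strip())
        for cells in collector.rows
        if len(cells) >= 2
    ]


sample = (
    "<table><tr><td>1</td><td>A. Rider</td></tr>"
    "<tr><td>2</td><td>B. Rider</td></tr></table>"
)
parsed = parse_results(sample)
print(parsed)
```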
🛠️ Development
Setup Development Environment
pip install -e ".[dev]"
Running Tests
pytest
Code Style
This project uses Black and isort for formatting, flake8 for linting, and mypy for type checking:
# Format code
black usac_velodata tests
isort usac_velodata tests
# Check code style
flake8 usac_velodata tests
mypy usac_velodata
❓ Troubleshooting
Common Issues
- Rate Limiting: If you encounter "429 Too Many Requests" errors, reduce your rate_limit setting
- Parsing Errors: HTML structure may change; check for updates or submit an issue
- Missing Results: Some events may not have results published yet
- S3 Storage: Make sure boto3 is installed and AWS credentials are configured if using S3 storage
Logging
Enable detailed logging for troubleshooting:
import logging
from usac_velodata import USACyclingClient
logging.basicConfig(level=logging.DEBUG)
client = USACyclingClient()
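If global DEBUG output is too noisy, a dedicated logger is an alternative, since the documented client signature accepts a `logger` argument. The sketch below only builds and exercises such a logger; the logger name `usac_debug` and the message format are assumptions, and the buffer handler is used purely so the output can be inspected (a `logging.FileHandler` would be the usual choice).

```python
import io
import logging

# Capture log output in memory for demonstration purposes.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))

logger = logging.getLogger("usac_debug")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)
logger.propagate = False  # keep these messages out of the root logger

logger.debug("fetching permit %s", "2023-123")
captured = stream.getvalue()
print(captured)

# With the package installed, the logger could then be passed in:
# client = USACyclingClient(logger=logger)
```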
👥 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
python -m unittest discover -v