BachtrackAPI

A Python web scraper and REST API for extracting classical music opera events from Bachtrack.com.

Overview

BachtrackAPI provides two complementary ways to access opera event data:

  1. Scraper Module - Direct web scraping of Bachtrack opera listings
  2. FastAPI Backend - RESTful API endpoints for searching and filtering events

Both accept either a numeric work ID (e.g., 12285 for Gianni Schicchi) or a freetext query (e.g., "La Traviata").
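Because the same input can be either a work ID or freetext, a caller may want to normalize it before searching. The sketch below shows one way to do that. `classify_query` is a hypothetical helper, not part of BachtrackAPI, and the rule that an all-digit string denotes a work ID is an assumption; how the library itself tells the two apart is not documented here.

```python
from typing import Union


def classify_query(query: Union[int, str]) -> tuple[str, Union[int, str]]:
    """Return ("work_id", int) for numeric input, else ("freetext", str).

    Hypothetical helper: assumes an all-digit ASCII string denotes a work ID.
    """
    if isinstance(query, int):
        return ("work_id", query)
    text = query.strip()
    if text.isascii() and text.isdigit():
        return ("work_id", int(text))
    return ("freetext", text)


print(classify_query(12285))
print(classify_query("12285"))
print(classify_query("La Traviata"))
```

Keeping the classification in one small function means the rule can be changed in a single place if it turns out not to match the library's behavior.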

Installation

From a checkout of the source tree, install the dependencies:

pip install -r requirements.txt

Quick Start

1. Using the Scraper Directly

from scraper.scraper import BachtrackScraper

scraper = BachtrackScraper()

# Search by work ID and print each matching event
events = scraper.search_operas(12285)  # Gianni Schicchi
print(f"Found {len(events)} events")
for event in events:
    print(f"{event['title']} - {event['city']} @ {event['venue']}")

# Search by freetext (returns the same kind of list)
events = scraper.search_operas("La Traviata")

Output:

Found 28 events
Gianni Schicchi - Berlin @ Deutsche Oper
Gianni Schicchi - Winterthur @ Stadttheater Winterthur
...
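Since each event is a dictionary with `title`, `city`, and `venue` keys, the results can be post-processed with ordinary Python. The following is a minimal sketch that groups venues by city; the sample list stands in for real scraper output, and `group_by_city` is a hypothetical helper rather than part of the package.

```python
from collections import defaultdict

# Sample data shaped like the scraper output above (title, city, venue keys).
events = [
    {"title": "Gianni Schicchi", "city": "Berlin", "venue": "Deutsche Oper"},
    {"title": "Gianni Schicchi", "city": "Winterthur", "venue": "Stadttheater Winterthur"},
]


def group_by_city(events):
    """Map each city to the list of venues hosting an event there."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event["city"]].append(event["venue"])
    return dict(grouped)


venues_by_city = group_by_city(events)
for city in sorted(venues_by_city):
    print(f"{city}: {', '.join(venues_by_city[city])}")
```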

2. Using the FastAPI Backend

Start the server:

uvicorn api.main:app --reload

Example API Requests:

# Freetext search
curl "http://localhost:8000/api/v1/events/get_operas?q=gianni%20schicchi"

# Work ID search
curl "http://localhost:8000/api/v1/events/get_operas?q=12285"

# POST search
curl -X POST "http://localhost:8000/api/v1/events/search" \
  -H "Content-Type: application/json" \
  -d '{"work_id": 12285}'

Example response (results truncated to one entry):

{
  "query": "12285",
  "total_results": 28,
  "results": [
    {
      "title": "Gianni Schicchi",
      "city": "Berlin",
      "date": "2026-04-05T00:00:00",
      "venue": "Deutsche Oper",
      "detail_url": "https://bachtrack.com/opera-event/..."
    }
  ]
}
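A client will typically want to turn the `date` strings into `datetime` objects before filtering or sorting. The sketch below does this with the standard library only, using the example response as input (the elided `detail_url` is kept as-is). `events_on_or_after` is a hypothetical helper, and it assumes every `date` value is an ISO 8601 string like the one shown.

```python
import json
from datetime import datetime

# The example response above, copied verbatim.
raw = """
{
  "query": "12285",
  "total_results": 28,
  "results": [
    {
      "title": "Gianni Schicchi",
      "city": "Berlin",
      "date": "2026-04-05T00:00:00",
      "venue": "Deutsche Oper",
      "detail_url": "https://bachtrack.com/opera-event/..."
    }
  ]
}
"""


def events_on_or_after(payload, cutoff):
    """Return results dated on or after cutoff, earliest first."""
    dated = [(datetime.fromisoformat(item["date"]), item) for item in payload["results"]]
    dated.sort(key=lambda pair: pair[0])
    return [item for when, item in dated if when >= cutoff]


payload = json.loads(raw)
for item in events_on_or_after(payload, datetime(2026, 1, 1)):
    print(f"{item['date'][:10]}: {item['title']} at {item['venue']}, {item['city']}")
```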

Available Endpoints

  • GET /api/v1/events/get_operas?q=<search> - Raw scraper output
  • GET /api/v1/events/search?work_id=<id> - Search by work ID
  • GET /api/v1/events/search?q=<term> - Freetext search
  • POST /api/v1/events/search - JSON body search
  • GET /docs - Interactive API documentation
  • GET /health - Health check
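The two GET forms of the search endpoint differ only in their query parameter, so request URLs can be built with the standard library. This is a sketch under assumptions: `search_url` is a hypothetical helper, the base URL is the local address used in the curl examples, and the rule that exactly one of `work_id` or `q` is passed is a guess, since the list above does not say whether the endpoint accepts both at once.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8000"  # local address used in the curl examples


def search_url(work_id=None, q=None):
    """Build a GET /api/v1/events/search URL from a work ID or a freetext term."""
    # Assumption: the endpoint expects exactly one of the two parameters.
    if (work_id is None) == (q is None):
        raise ValueError("pass exactly one of work_id or q")
    params = {"work_id": work_id} if work_id is not None else {"q": q}
    # quote_via=quote encodes spaces as %20, matching the curl examples.
    return f"{BASE_URL}/api/v1/events/search?{urlencode(params, quote_via=quote)}"


print(search_url(work_id=12285))
print(search_url(q="La Traviata"))
```

The resulting strings can be passed to curl or to any HTTP client once the server is running.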

Testing

# Run scraper tests
python tests/test_scraper.py

# Run full API integration tests
pytest tests/test_api.py -v -s

Project Structure

scraper/scraper.py               # Core scraping logic
api/
  ├── main.py                    # FastAPI app factory
  ├── routes/events.py           # API endpoints
  ├── models/event.py            # Pydantic models
  └── services/opera_service.py  # Service layer
tests/                           # Unit and integration tests
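The contents of `api/models/event.py` are not shown in this document. As a rough illustration of the record shape suggested by the example response, here is a standard-library dataclass stand-in. It is not the project's Pydantic model; the class name `OperaEvent` and the `from_dict` constructor are hypothetical, and the field list is inferred from the example response only.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class OperaEvent:
    """Hypothetical stand-in mirroring the fields of the example response."""

    title: str
    city: str
    date: datetime
    venue: str
    detail_url: str

    @classmethod
    def from_dict(cls, data: dict) -> "OperaEvent":
        """Build an event from one entry of the response's results list."""
        return cls(
            title=data["title"],
            city=data["city"],
            date=datetime.fromisoformat(data["date"]),
            venue=data["venue"],
            detail_url=data["detail_url"],
        )


event = OperaEvent.from_dict(
    {
        "title": "Gianni Schicchi",
        "city": "Berlin",
        "date": "2026-04-05T00:00:00",
        "venue": "Deutsche Oper",
        "detail_url": "https://bachtrack.com/opera-event/...",  # elided in the source
    }
)
print(event)
```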

License

MIT License - see LICENSE file for details
