A Python web scraper and REST API for extracting classical music opera events from Bachtrack.com

Project description

BachtrackAPI

Overview

BachtrackAPI provides two complementary ways to access opera event data:

  1. Scraper Module - Direct web scraping of Bachtrack opera listings
  2. FastAPI Backend - RESTful API endpoints for searching and filtering events

Both interfaces accept either a numeric work ID (e.g., 12285 for Gianni Schicchi) or a free-text query (e.g., "La Traviata").

Installation

pip install -r requirements.txt

Quick Start

1. Using the Scraper Directly

from scraper.scraper import BachtrackScraper

scraper = BachtrackScraper()

# Search by work ID and list the matching events
events = scraper.search_operas(12285)  # Gianni Schicchi
print(f"Found {len(events)} events")
for event in events:
    print(f"{event['title']} - {event['city']} @ {event['venue']}")

# Search by free text; the result has the same shape
events = scraper.search_operas("La Traviata")

Output:

Found 28 events
Gianni Schicchi - Berlin @ Deutsche Oper
Gianni Schicchi - Winterthur @ Stadttheater Winterthur
...
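Each event is a dict with at least `title`, `city`, and `venue` keys, so results can be post-processed with plain Python. The following is a minimal sketch that uses hand-written sample data in place of a live `search_operas` call; the helper name `group_by_city` is hypothetical and not part of the package.

```python
from collections import defaultdict

# Sample events shaped like the scraper output shown above (not live data).
sample_events = [
    {"title": "Gianni Schicchi", "city": "Berlin", "venue": "Deutsche Oper"},
    {"title": "Gianni Schicchi", "city": "Winterthur", "venue": "Stadttheater Winterthur"},
    {"title": "Gianni Schicchi", "city": "Berlin", "venue": "Deutsche Oper"},
]


def group_by_city(events):
    """Map each city to the list of venues hosting an event there."""
    by_city = defaultdict(list)
    for event in events:
        by_city[event["city"]].append(event["venue"])
    return dict(by_city)


for city, venues in sorted(group_by_city(sample_events).items()):
    print(f"{city}: {len(venues)} event(s)")
```

To run it against real data, replace `sample_events` with the list returned by `scraper.search_operas(...)`.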

2. Using the FastAPI Backend

Start the server:

uvicorn api.main:app --reload

Example API Requests:

# Freetext search
curl "http://localhost:8000/api/v1/events/get_operas?q=gianni%20schicchi"

# Work ID search
curl "http://localhost:8000/api/v1/events/get_operas?q=12285"

# POST search
curl -X POST "http://localhost:8000/api/v1/events/search" \
  -H "Content-Type: application/json" \
  -d '{"work_id": 12285}'
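The POST request can also be issued from Python using only the standard library. This is a sketch, assuming the server runs at the default `localhost:8000` address used above; the function names `make_search_request` and `post_search` are illustrative and not part of the package.

```python
import json
from urllib.request import Request, urlopen

SEARCH_URL = "http://localhost:8000/api/v1/events/search"


def make_search_request(work_id: int) -> Request:
    """Build the same POST request as the curl example, without sending it."""
    body = json.dumps({"work_id": work_id}).encode("utf-8")
    return Request(
        SEARCH_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_search(work_id: int) -> dict:
    """Send the request; the uvicorn server must already be running."""
    with urlopen(make_search_request(work_id), timeout=10) as response:
        return json.load(response)


request = make_search_request(12285)
print(request.get_method(), request.full_url)
```

Calling `post_search(12285)` performs the actual network call, so it only succeeds while the server is up.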

Example response (work ID search):

{
  "query": "12285",
  "total_results": 28,
  "results": [
    {
      "title": "Gianni Schicchi",
      "city": "Berlin",
      "date": "2026-04-05T00:00:00",
      "venue": "Deutsche Oper",
      "detail_url": "https://bachtrack.com/opera-event/..."
    }
  ]
}
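On the client side, the `date` field is an ISO 8601 string and can be converted to a `datetime` for sorting or filtering. The sketch below parses the sample response shown above; the `OperaEvent` dataclass is a hypothetical client-side stand-in and is not the Pydantic model defined in `api/models/event.py`.

```python
import json
from dataclasses import dataclass
from datetime import datetime


@dataclass
class OperaEvent:
    """Hypothetical client-side mirror of one entry in "results"."""
    title: str
    city: str
    date: datetime
    venue: str
    detail_url: str


def parse_response(body: str) -> list[OperaEvent]:
    """Turn a JSON response body into a list of typed events."""
    payload = json.loads(body)
    return [
        OperaEvent(
            title=item["title"],
            city=item["city"],
            date=datetime.fromisoformat(item["date"]),
            venue=item["venue"],
            detail_url=item["detail_url"],
        )
        for item in payload["results"]
    ]


# The sample response from above, copied verbatim.
sample_body = """
{
  "query": "12285",
  "total_results": 28,
  "results": [
    {
      "title": "Gianni Schicchi",
      "city": "Berlin",
      "date": "2026-04-05T00:00:00",
      "venue": "Deutsche Oper",
      "detail_url": "https://bachtrack.com/opera-event/..."
    }
  ]
}
"""

events = parse_response(sample_body)
print(events[0].date.date(), events[0].city)
```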

Available Endpoints

  • GET /api/v1/events/get_operas?q=<search> - Raw scraper output
  • GET /api/v1/events/search?work_id=<id> - Search by work ID
  • GET /api/v1/events/search?q=<term> - Freetext search
  • POST /api/v1/events/search - JSON body search
  • GET /docs - Interactive API documentation
  • GET /health - Health check
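The two GET variants of `/api/v1/events/search` differ only in their query parameter. A small helper can build either URL; this is a sketch, assuming the default `localhost:8000` address from the Quick Start, and `build_search_url` is a hypothetical name rather than part of the package.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:8000"  # default uvicorn address from the Quick Start


def build_search_url(work_id=None, q=None, base_url=BASE_URL):
    """Build a GET /api/v1/events/search URL from a work ID or a free-text term."""
    if (work_id is None) == (q is None):
        raise ValueError("pass exactly one of work_id or q")
    params = {"work_id": work_id} if work_id is not None else {"q": q}
    # quote_via=quote encodes spaces as %20, matching the curl examples.
    query = urlencode(params, quote_via=quote)
    return f"{base_url}/api/v1/events/search?{query}"


print(build_search_url(work_id=12285))
print(build_search_url(q="La Traviata"))
```

Requiring exactly one of the two arguments keeps the helper from silently sending an ambiguous request; how the server treats a request carrying both parameters is not documented here.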

Testing

# Run scraper tests
python tests/test_scraper.py

# Run full API integration tests
pytest tests/test_api.py -v -s

Project Structure

scraper/scraper.py               # Core scraping logic
api/
  ├── main.py                    # FastAPI app factory
  ├── routes/events.py           # API endpoints
  ├── models/event.py            # Pydantic models
  └── services/opera_service.py  # Service layer
tests/                           # Unit and integration tests

License

MIT License - see LICENSE file for details

Source Distribution

bachtrackapi-0.1.0.tar.gz (10.1 kB view details)

Uploaded Source

Built Distribution

bachtrackapi-0.1.0-py3-none-any.whl (7.5 kB view details)

Uploaded Python 3

File details

Details for the file bachtrackapi-0.1.0.tar.gz.

File metadata

  • Download URL: bachtrackapi-0.1.0.tar.gz
  • Upload date:
  • Size: 10.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.9

File hashes

Hashes for bachtrackapi-0.1.0.tar.gz
Algorithm Hash digest
SHA256 b0f92c2a7957479e4f92e77ac14b49a9385edde05d9052384cc939b597f8a076
MD5 63c61eab7eb1bf8c5c3f4cd013884ec1
BLAKE2b-256 a02bded35ddd5d507b5cfebf5e089fc8716a9a197d2bde5ee7cc032d94f52b98

File details

Details for the file bachtrackapi-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: bachtrackapi-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 7.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.9

File hashes

Hashes for bachtrackapi-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 4db931bc180ee932379de7825dcdfcdaa72ca049949958811f26c3ee397262a8
MD5 604fedbcc9349071f428f0e038471a15
BLAKE2b-256 dee59f515da62d4aa80a854a3b2efef887b1b48d46112c1d209e882c5bee8e56
