Audiobook scraper — search and stream from Librivox, LoyalBooks, and more

audiobooker

Search and stream free audiobooks from multiple web sources. Parallel search, fuzzy scoring, a unified AudioBook dataclass, local cache, SQLite index, and a mediavocab Release converter — one API regardless of where the book comes from.

Install

pip install audiobooker

# Optional extras
pip install audiobooker[youtube]   # YouTube channel/playlist sources (tutubo)
pip install audiobooker[stealth]   # curl_cffi TLS-fingerprint transport
pip install audiobooker[test]      # pytest + vcrpy (dev only)

Quick start

from audiobooker import search

for book in search("Lovecraft", max_per_source=5, timeout=30):
    print(f"[{book.score:.2f}] [{book.source}] {book.title}")
    print(f"  authors={[f'{a.first_name} {a.last_name}'.strip() for a in book.authors]}")
    print(f"  streams={len(book.streams)}")

Supported sources

Source	Site	Catalogue	Native search
`Librivox`	librivox.org	~18 000 books	REST API (title, author, narrator, tag)
`LoyalBooks`	loyalbooks.com	~3 500 books	sitemap + genre pages
`GoldenAudioBooks`	goldenaudiobook.co	~6 500 books	linear scan
`StephenKingAudioBooks`	stephenkingaudiobooks.com	~113 books	native site search
`AudioAnarchy`	audioanarchy.org	~11 books	linear scan
`DarkerProjects`	darkerprojects.com	~244 episodes	linear scan
`HPTalesAudioBooks`	hpaudiotales.com	~20 books	linear scan

YouTube (pip install audiobooker[youtube]):

Source	Channel	Content
`TheCybrarian`	@TheCybrarian	Robert E. Howard fiction
`HorrorBabble`	@HorrorBabble	Horror short fiction

Python API

from audiobooker import (
    search, search_by_title, search_by_author, search_by_tag, search_by_narrator,
    audiobook_to_release,
    BookIndex, IndexedSource,
    AudioBook, BookAuthor, AudiobookNarrator, AudioBookChapter,
)

# Targeted searches — all run in parallel across all sources
for book in search_by_author("Dickens", max_per_source=5):
    print(book.title)

for book in search_by_tag("horror", max_per_source=5):
    print(book.title)
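
The parallel fan-out and fuzzy scoring mentioned above can be pictured with a minimal sketch. This is not audiobooker's implementation: `fuzzy_score`, `parallel_search`, and the stub sources are hypothetical names, and `difflib.SequenceMatcher` stands in for whatever scorer the library actually uses.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from difflib import SequenceMatcher


def fuzzy_score(query, title):
    """Similarity in [0, 1]; difflib is a stand-in, not the library's scorer."""
    return SequenceMatcher(None, query.lower(), title.lower()).ratio()


def parallel_search(query, sources, max_per_source=5, timeout=30):
    """Query every source in its own thread and merge the results by score.

    `sources` maps a source name to a callable returning an iterable of
    titles (hypothetical stand-ins for real scrapers).
    """
    def run(name, fn):
        titles = list(fn(query))[:max_per_source]
        return [(fuzzy_score(query, t), name, t) for t in titles]

    results = []
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        futures = [pool.submit(run, name, fn) for name, fn in sources.items()]
        for future in as_completed(futures, timeout=timeout):
            try:
                results.extend(future.result())
            except Exception:
                # One failing source should not sink the whole search.
                continue
    return sorted(results, key=lambda r: r[0], reverse=True)


# Stub sources so the sketch runs without network access.
SOURCES = {
    "stub_a": lambda q: ["Dracula", "The Call of Cthulhu"],
    "stub_b": lambda q: ["Dracula's Guest"],
}
hits = parallel_search("Dracula", SOURCES)
for score, source, title in hits:
    print(f"[{score:.2f}] [{source}] {title}")
```

Sorting the merged list by score is what lets results from different sites interleave instead of arriving grouped by source.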

Per-source

from audiobooker.scrappers.librivox import Librivox

lv = Librivox()
for book in lv.search_by_title("Dracula"):
    print(book.title, book.runtime)

for book in lv.iterate_all():   # full catalogue
    print(book.title)

All scrapers share the same interface: search(), search_by_title(), search_by_author(), search_by_tag(), search_by_narrator(), iterate_all(), iterate_popular(), iterate_by_author(), iterate_by_tag().
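
Because the interface is shared, helper code can be written once against any scraper. The sketch below assumes only what the text states, that `search_by_title()` yields objects with a `title` attribute. `first_titles`, `StubBook`, and `StubScraper` are hypothetical names used so the example runs without network access.

```python
from itertools import islice


def first_titles(scraper, query, limit=3):
    """Return up to `limit` titles from any object exposing search_by_title()."""
    return [book.title for book in islice(scraper.search_by_title(query), limit)]


class StubBook:
    """Hypothetical minimal book: only the attribute the helper reads."""

    def __init__(self, title):
        self.title = title


class StubScraper:
    """Hypothetical stand-in that mimics the shared scraper interface."""

    CATALOGUE = (
        "Dracula",
        "Dracula's Guest",
        "The Jewel of Seven Stars",
        "The Lair of the White Worm",
    )

    def search_by_title(self, query):
        for title in self.CATALOGUE:
            if query.lower() in title.lower():
                yield StubBook(title)


titles = first_titles(StubScraper(), "dracula")
print(titles)
```

With the real package, the same helper should accept a `Librivox()` instance in place of the stub.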

mediavocab integration

mediavocab is a required dependency. audiobook_to_release() projects an AudioBook into the typed mediavocab.Release schema — Work, credits, chapters, external IDs, codec, license.

from audiobooker import search, audiobook_to_release

for book in search("Lovecraft", max_per_source=3):
    release = audiobook_to_release(book)
    lic = release.license
    if lic and lic.is_open():
        print(release.work.title, lic.identifier)

See docs/converters.md for the full field mapping.

HTTP transport

By default every scraper uses a requests.Session with a randomised User-Agent. Two ways to override:

Environment variable — set before any import:

AUDIOBOOKER_TRANSPORT=curl_cffi python myscript.py

Falls back to plain requests if curl_cffi is not installed. Install with pip install audiobooker[stealth].
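
The fallback rule can be sketched as a small decision function. This is an illustration of the behaviour described above, not the code in audiobooker/transport.py: `pick_transport` and its `has_module` parameter are hypothetical, and only the variable name `AUDIOBOOKER_TRANSPORT` and the value `curl_cffi` come from this README.

```python
import importlib.util
import os


def pick_transport(environ=None, has_module=None):
    """Return "curl_cffi" only when it is requested and importable, else "requests"."""
    environ = os.environ if environ is None else environ

    if has_module is None:
        def has_module(name):
            # find_spec returns None when the package is not installed.
            return importlib.util.find_spec(name) is not None

    wanted = environ.get("AUDIOBOOKER_TRANSPORT", "requests")
    if wanted == "curl_cffi" and has_module("curl_cffi"):
        return "curl_cffi"
    return "requests"


print(pick_transport())
```

Passing `environ` and `has_module` explicitly keeps the decision testable without touching the real environment.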

Per-instance injection — pass any requests-compatible session:

from curl_cffi import requests as cffi_requests
from audiobooker.scrappers.librivox import Librivox

session = cffi_requests.Session(impersonate="chrome")
lv = Librivox(session=session)

default_session() from audiobooker.transport respects AUDIOBOOKER_TRANSPORT and returns the appropriate session type (see audiobooker/transport.py).

Local index

Build once, search without network access:

from audiobooker.index import BookIndex

idx = BookIndex()   # ~/.audiobooker/index.db
idx.build()         # iterate_all() on all 7 web sources

for book in idx.search_by_title("Sherlock Holmes", max_results=5):
    print(f"[{book.score:.2f}] {book.title}")
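
The build-once, search-offline idea can be illustrated with the standard-library `sqlite3` module. The schema and the functions `build_index` and `search_titles` are assumptions made for this sketch; the real BookIndex schema and scoring are described in the index docs and may differ.

```python
import sqlite3


def build_index(conn, books):
    """Create a one-table index and load (title, source) rows into it."""
    conn.execute("CREATE TABLE IF NOT EXISTS books (title TEXT, source TEXT)")
    conn.executemany("INSERT INTO books (title, source) VALUES (?, ?)", books)
    conn.commit()


def search_titles(conn, query, max_results=5):
    """Substring match served entirely from the local database."""
    rows = conn.execute(
        "SELECT title, source FROM books WHERE title LIKE ? ORDER BY title LIMIT ?",
        (f"%{query}%", max_results),
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")  # a real index would live in a file on disk
build_index(conn, [
    ("The Adventures of Sherlock Holmes", "stub"),
    ("Dracula", "stub"),
    ("The Return of Sherlock Holmes", "stub"),
])
found = search_titles(conn, "sherlock holmes")
for title, source in found:
    print(f"[{source}] {title}")
```

SQLite's `LIKE` is case-insensitive for ASCII by default, which is why the lowercase query matches here.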

CLI reference

audiobooker search <query>
    --method  search|search_by_title|search_by_author|search_by_tag|search_by_narrator
    -n        max results (default 10)
    --source  limit to one source
    --timeout seconds (default 30)
    -v        verbose (tags, narrator, stream URLs)

audiobooker index build [--sources librivox loyalbooks ...]
audiobooker index update
audiobooker index search <query> [--method ...] [-n N]
audiobooker index stats
audiobooker index follow <url> [--kind channel|playlist] [--tags ...] [--blacklist ...]
audiobooker index unfollow <url>
audiobooker index list

audiobooker cache download <query> [--stream INDEX]
audiobooker cache play     <query> [--stream INDEX]
audiobooker cache list
audiobooker cache clear    [<query>]
audiobooker cache info     <query>

All index and cache commands accept --db PATH and --cache-dir PATH to override default locations (~/.audiobooker/index.db and ~/.cache/audiobooker).

Docs

Full documentation is in /docs/:

Getting started
Sources — per-scraper details and quirks
Search orchestrator
Scoring
Index — SQLite index, offline search, YouTube follow
Cache — download + play
Converters — mediavocab Release shape
Transport — HTTP session, stealth backend
API reference

Runnable examples are in /examples/ — numbered 01 → 10 from quickstart to advanced index usage.

Error handling

Network failures and malformed pages are swallowed per-item — a bad page never aborts an iterate_all() run. If a source site is down or has restructured its HTML, that scraper silently yields nothing.
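
The per-item swallowing described above follows a common generator pattern, sketched here under stated assumptions. `iterate_safely` and `parse_title` are hypothetical names, and the debug log line is an addition for illustration; this README does not say whether the library logs skipped items.

```python
import logging

log = logging.getLogger("audiobooker_sketch")


def iterate_safely(pages, parse):
    """Yield parse(page) for each page, skipping any page that raises."""
    for page in pages:
        try:
            yield parse(page)
        except Exception as exc:
            # Swallow per item, but leave a trace so silent gaps can be diagnosed.
            log.debug("skipping page %r: %s", page, exc)


def parse_title(page):
    """Hypothetical parser: raises KeyError on a page without a title."""
    return page["title"]


pages = [{"title": "Dracula"}, {}, {"title": "Carmilla"}]
titles = list(iterate_safely(pages, parse_title))
print(titles)
```

Since a source that is down simply yields nothing, callers who need to detect outages may want to treat an empty result from a normally populated source as a warning sign.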

License

MIT

Release history

Version	Released
0.9.0a2 pre-release (this version)	Jun 23, 2026
0.9.0a1 pre-release	May 29, 2026
0.8.0a1 pre-release	May 7, 2026
0.7.0	Apr 28, 2026
0.7.0a1 pre-release	Apr 28, 2026
0.6.0a1 pre-release	Apr 28, 2026
0.5.1a1 pre-release	Apr 28, 2026
0.4.0	Oct 15, 2021
0.4.0a1 pre-release	Oct 15, 2021
0.3.1	Oct 15, 2021
0.3.0	Jun 18, 2020
0.2.7	Mar 29, 2020
0.2.6	Jan 7, 2019
0.2.4	Jan 4, 2019
0.2.1	Jan 3, 2019
0.1.1	Jan 3, 2019
0.1	Jan 3, 2019

Download files

Source Distribution

audiobooker-0.9.0a2.tar.gz (65.4 kB)

Uploaded Jun 23, 2026 Source

Built Distribution

audiobooker-0.9.0a2-py3-none-any.whl (48.5 kB)

Uploaded Jun 23, 2026 Python 3

File details

Details for the file audiobooker-0.9.0a2.tar.gz.

File metadata

Download URL: audiobooker-0.9.0a2.tar.gz
Upload date: Jun 23, 2026
Size: 65.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for audiobooker-0.9.0a2.tar.gz
Algorithm	Hash digest
SHA256	`1da93c33d6239656b3046a2aefc0ba582b048bbe74d7e7f5092ddb54b43ed2f2`
MD5	`0ab365468644a3051134ebbf6d251c89`
BLAKE2b-256	`e54759deb3b7d882a269b3851556e4f8cb30d6aa004963766694a799c41a6140`

File details

Details for the file audiobooker-0.9.0a2-py3-none-any.whl.

File metadata

Download URL: audiobooker-0.9.0a2-py3-none-any.whl
Upload date: Jun 23, 2026
Size: 48.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for audiobooker-0.9.0a2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d7869aa829da1fbdf734c6135430e2522ab89d42f3f58e7d1cbad7e02974a9d3`
MD5	`3744a45cf70c0ca2ba255583316ec02a`
BLAKE2b-256	`6177cbb62b033bc1f1c5917117196d8c0d140c54f027f342df48cd9b43148a5d`
