A library to deal with Daisy 2.02 digital talking books
A package to deal with Daisy 2.02 digital talking books
Introduction
DAISY (Digital Accessible Information SYstem) is a technical standard for digital audio books, periodicals and computerized text.
DAISY is designed to be a complete audio substitute for print material and is specifically designed for use by people with "print disabilities", including blindness, impaired vision, and dyslexia. Based on the MP3 and XML formats, the DAISY format has advanced features in addition to those of a traditional audio book. Users can search, place bookmarks, precisely navigate line by line, and regulate the speaking speed without distortion. DAISY also provides aurally accessible tables, references and additional information.
As a result, DAISY allows visually impaired listeners to navigate something as complex as an encyclopedia or textbook, otherwise impossible using conventional audio recordings.
DAISY multimedia can be a book, magazine, newspaper, journal, computerized text or a synchronized presentation of text and audio. It provides up to six embedded "navigation levels" for content, including embedded objects such as images, graphics, and MathML. In the DAISY standard, navigation is enabled within a sequential and hierarchical structure consisting of (marked-up) text synchronized with audio.
DAISY 2 was based on XHTML and SMIL. DAISY 3 is a newer technology, also based on XML, and is standardized as ANSI/NISO Z39.86-2005.
The DAISY Consortium was founded in 1996 and consists of international organizations committed to developing equitable access to information for people who have a print disability. The consortium was selected by the National Information Standards Organization (NISO) as the official maintenance agency for the DAISY/NISO Standard.
Source : https://encyclopedia.pub/entry/33638
Warning
This package (also published to PyPI) is still under active development (alpha stage).
Do NOT use it in production!
Dependencies
You will find all the information in the pyproject.toml file.
In production, we use the following package:
- loguru: https://github.com/Delgan/loguru
For development, the following additional packages are used:
- pytest: https://docs.pytest.org/en/stable/
- pylint: https://www.pylint.org/
- pytest-cov: https://pypi.org/project/pytest-cov/
- getkey: https://github.com/kcsaff/getkey
- pygame: https://www.pygame.org
Thanks to all these guys helping us to develop nice software and letting us have fun doing it !
Installation
You can install daisy-dtb with any common Python dependency manager.
- With pip:
pip install daisy-dtb
- With uv:
uv add daisy-dtb
Code organization
src
├── models # Models (classes) used to represent Daisy 2.02 data structures
│ ├── toc_entry.py # Class TocEntry (extracted from the NCC.html file)
│ ├── metadata.py # Class MetaData (extracted from the NCC.html file <meta/> tags)
│ ├── reference.py # Representation of a resource#fragment (href or src attribute)
│ ├── section.py # Section
│ ├── smil.py # Representation of a SMIL file
│ ├── text.py # Representation of a text fragment
│ └── audio.py # Representation of an audio clip
│
├── navigators # Classes to implement Book navigation features
│ ├── base_navigator.py # The base class
│ ├── toc_navigator.py # Navigation in the table of contents (TOC)
│ ├── section_navigator.py # Navigation in the TOC sections
│ ├── clip_navigator.py # Navigation in the section audio clips
│ └── book_navigator.py # Navigation in the book (TOC, sections, clips)
│
├── sources # Classes to represent the datasources
│ ├── source.py # Base class
│ ├── folder_source.py # Folder based source (Web or filesystem)
│ └── zip_source.py # Zip based source (Web or filesystem)
│
├── cache # Data cache
│ ├── cache.py # Data cache classes
│ └── cachestats.py # Cache statistics
│
├── utilities # Utilities
│ ├── domlib.py # Classes to encapsulate and simplify the usage of the xml.dom.minidom library
│ ├── fetcher.py # Data fetcher to get resources (Web or filesystem)
│ └── logconfig.py # Logging configuration, log level setting
│
├── daisybook.py # Representation of the Daisy 2.02 DTB (classes DaisyBook and DaisyBookError)
└── develop.py # The programmer's sandbox
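To make the roles of toc_entry.py and reference.py more concrete, here is a minimal sketch of how TOC entries could be read from an NCC document with xml.dom.minidom. It is not the package's implementation: the TocItem tuple, the parse_toc function and the sample markup are hypothetical. It only assumes the Daisy 2.02 convention that navigation points are h1 to h6 headings whose links point to a SMIL resource#fragment.

```python
from typing import NamedTuple
from xml.dom.minidom import parseString

# Hypothetical, heavily simplified NCC content (a real ncc.html has more markup).
NCC_SAMPLE = """<html><body>
<h1 id="e1"><a href="s0001.smil#t1">Chapter 1</a></h1>
<h2 id="e2"><a href="s0002.smil#t2">Section 1.1</a></h2>
<h1 id="e3"><a href="s0003.smil#t3">Chapter 2</a></h1>
</body></html>"""

# Map heading tag names to their navigation level (1 to 6).
HEADINGS = {f"h{n}": n for n in range(1, 7)}


class TocItem(NamedTuple):
    """Illustrative stand-in for a TOC entry (not the package's TocEntry)."""
    level: int
    text: str
    resource: str   # part of the href before '#'
    fragment: str   # part of the href after '#'


def parse_toc(ncc_xml: str) -> list[TocItem]:
    """Collect the headings found directly under <body>, in document order."""
    document = parseString(ncc_xml)
    items = []
    for body in document.getElementsByTagName("body"):
        for node in body.childNodes:
            if node.nodeType != node.ELEMENT_NODE or node.tagName not in HEADINGS:
                continue
            anchors = node.getElementsByTagName("a")
            if not anchors:
                continue
            anchor = anchors[0]
            text = "".join(
                child.data for child in anchor.childNodes
                if child.nodeType == child.TEXT_NODE
            )
            # Split 'resource#fragment' into its two parts.
            resource, _, fragment = anchor.getAttribute("href").partition("#")
            items.append(TocItem(HEADINGS[node.tagName], text.strip(), resource, fragment))
    return items


toc = parse_toc(NCC_SAMPLE)
for item in toc:
    print("  " * (item.level - 1) + item.text, "->", item.resource, item.fragment)
```

The heading level gives the nesting depth, and the resource/fragment pair tells a navigator which SMIL file and element to open next.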
Logging
For logging, we use the loguru package (https://github.com/Delgan/loguru). We used logging a lot to develop this piece of software.
To completely turn off logging, do the following in your code:
from loguru import logger
from utilities.logconfig import LogLevel
# ...
# Set logging level
LogLevel.set(LogLevel.NONE)
# ...
Alternatively, you can turn logging on at the debug level:
from loguru import logger
from utilities.logconfig import LogLevel
# ...
# Set logging level
LogLevel.set(LogLevel.DEBUG)
# ...
DTB data sources
A Daisy 2.02 digital talking book (DTB) can come in multiple forms :
- in a file system folder or in a web location as individual files
- in a ZIP archive located in a file system folder or on a website
The base class representing this is DtbResource (an abstract class inheriting from ABC).
A data source must contain a Daisy 2.02 book.
Two kinds of DtbResource classes have been implemented :
- FolderDtbResource: the resource base is a filesystem folder or a web location containing the DTB files
- ZipDtbResource: the resource base is a zip file (either on a filesystem or in a web location) containing the DTB files
Both classes implement the get(resource_name: str) -> bytes | str | None method, which retrieves a specific resource (e.g. the ncc.html file).
The method first tries to convert the result to a str; if that fails, it returns bytes. In case of an error, it returns None.
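The str-then-bytes fallback described above can be sketched as follows. This is an illustration under assumptions, not the package's code: the helper names to_text_or_bytes and get_from_folder are hypothetical, and UTF-8 is assumed as the text encoding.

```python
import tempfile
from pathlib import Path
from typing import Optional, Union


def to_text_or_bytes(raw: bytes) -> Union[str, bytes]:
    """Return the data as str if it decodes as UTF-8 (assumed encoding), else as bytes."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw


def get_from_folder(folder: str, resource_name: str) -> Optional[Union[str, bytes]]:
    """Hypothetical folder-based lookup: str for text, bytes for binary, None on error."""
    try:
        raw = (Path(folder) / resource_name).read_bytes()
    except OSError:
        # Missing or unreadable file: signal the error with None.
        return None
    return to_text_or_bytes(raw)


# Small demonstration with a temporary folder standing in for a DTB folder.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "ncc.html").write_text("<html/>", encoding="utf-8")
    Path(folder, "clip.mp3").write_bytes(b"\xff\xfb\x90\x00")  # not valid UTF-8
    text_result = get_from_folder(folder, "ncc.html")
    bytes_result = get_from_folder(folder, "clip.mp3")
    missing_result = get_from_folder(folder, "missing.smil")
```

Callers therefore need to check for None first and then, if it matters, test whether they received a str or bytes value.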
The implementation can be found in the dtbsource.py file.
These classes are used to specify the source of a DaisyDTB, the class representing the Daisy 2.02 book.
Note 1: If ZipDtbResource is instantiated from a web location, the data is stored internally to avoid repeated web accesses.
Note 2: When a FolderDtbResource is instantiated, a buffer_size can be set. Resources are then stored internally to reduce network traffic.
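The idea behind buffer_size can be illustrated with a small bounded buffer. The text does not say which eviction policy the package uses, so the least-recently-used policy below is an assumption, and the ResourceBuffer class and fake_fetch function are hypothetical names introduced only for this sketch.

```python
from collections import OrderedDict
from typing import Callable, List, Optional


class ResourceBuffer:
    """Keep at most `buffer_size` fetched resources in memory (LRU eviction, assumed)."""

    def __init__(self, fetch: Callable[[str], Optional[bytes]], buffer_size: int) -> None:
        self._fetch = fetch
        self._buffer_size = buffer_size
        self._items: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, name: str) -> Optional[bytes]:
        if name in self._items:
            # Buffer hit: mark as most recently used, no fetch needed.
            self._items.move_to_end(name)
            return self._items[name]
        data = self._fetch(name)
        if data is not None and self._buffer_size > 0:
            self._items[name] = data
            if len(self._items) > self._buffer_size:
                # Drop the least recently used entry.
                self._items.popitem(last=False)
        return data


# Record every real fetch to show how the buffer saves accesses.
fetch_log: List[str] = []


def fake_fetch(name: str) -> Optional[bytes]:
    """Stand-in for a network or disk access."""
    fetch_log.append(name)
    return name.encode("utf-8")


buffer = ResourceBuffer(fake_fetch, buffer_size=2)
for resource in ["a.smil", "b.smil", "a.smil", "c.smil", "b.smil"]:
    buffer.get(resource)
```

In this run the second request for a.smil is served from memory, while b.smil has to be fetched again because it was evicted when c.smil arrived.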
Setting up a datasource
For Daisy books stored in a filesystem :
try:
    # Create a `FolderDtbResource` with a `buffer_size` of 10 items
    source = FolderDtbResource(resource_base='/path/to/a/dtb/folder/', buffer_size=10)
except FileNotFoundError:
    # Handle error
    ...
data = source.get('ncc.html')
if data is not None:
    # Process the data
    ...
For Daisy books stored in a web location :
try:
    # Create a `FolderDtbResource` with a `buffer_size` of 20 items
    source = FolderDtbResource(resource_base='https://www.site.com/daisy/book/', buffer_size=20)
except FileNotFoundError:
    # Handle error
    ...
data = source.get('ncc.html')
if data is not None:
    # Process the data
    ...
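Since the same class accepts both a filesystem folder and a web location, a source has to tell the two apart and build the full resource address accordingly. How the package does this is not documented here; the sketch below shows one possible approach with the standard library, using the hypothetical helpers is_web_location and resolve_resource.

```python
from pathlib import Path
from urllib.parse import urljoin, urlparse


def is_web_location(resource_base: str) -> bool:
    """Treat bases with an http or https scheme as web locations (assumption)."""
    return urlparse(resource_base).scheme in ("http", "https")


def resolve_resource(resource_base: str, resource_name: str) -> str:
    """Build the full URL or filesystem path of a resource inside the base."""
    if is_web_location(resource_base):
        # urljoin replaces the last path segment unless the base ends with '/'.
        base = resource_base if resource_base.endswith("/") else resource_base + "/"
        return urljoin(base, resource_name)
    return str(Path(resource_base) / resource_name)


web_address = resolve_resource("https://www.site.com/daisy/book", "ncc.html")
local_address = resolve_resource("/path/to/a/dtb/folder/", "ncc.html")
```

The trailing-slash handling matters for web locations: without it, urljoin would resolve ncc.html next to the book folder instead of inside it.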
For Daisy books stored in a local ZIP file :
try:
    source = ZipDtbResource(resource_base='/path/to/a/dtb/zipfile.zip')
except FileNotFoundError:
    # Handle error
    ...
data = source.get('ncc.html')
if data is not None:
    # Process the data
    ...
For Daisy books stored in a ZIP file on a web site :
try:
    source = ZipDtbResource(resource_base='https://www.site.com/daisy/book/zipfile.zip')
except FileNotFoundError:
    # Handle error
    ...
data = source.get('ncc.html')
if data is not None:
    # Process the data
    ...
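For ZIP based sources, keeping the archive bytes in memory means a web-hosted archive only has to be downloaded once, after which every resource can be read locally. The following sketch shows that idea with the standard zipfile module. It is not the ZipDtbResource implementation: build_sample_zip and read_from_zip are hypothetical helpers, and the archive is created in memory instead of being downloaded.

```python
import io
import zipfile
from typing import Optional


def build_sample_zip() -> bytes:
    """Create a tiny in-memory archive standing in for a downloaded DTB zip file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("ncc.html", "<html><body><h1>Title</h1></body></html>")
    return buffer.getvalue()


def read_from_zip(zip_bytes: bytes, resource_name: str) -> Optional[bytes]:
    """Read one member from archive bytes held in memory, or None if it is absent."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        try:
            return archive.read(resource_name)
        except KeyError:
            # zipfile raises KeyError when the member does not exist.
            return None


zip_bytes = build_sample_zip()          # would be fetched once from the web
ncc = read_from_zip(zip_bytes, "ncc.html")
missing = read_from_zip(zip_bytes, "missing.smil")
```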
Project files
- dtbsource.py: implementation of the DtbResource, FolderDtbResource and ZipDtbResource classes
- domlib.py: classes to encapsulate and simplify the usage of the xml.dom.minidom library
Dependencies
We use the loguru package for logging.
See file pyproject.toml.