A Python library for extracting data and metadata from FB2 (FictionBook 2) format files
fb2reader
Description
fb2reader is a lightweight and easy-to-use Python library designed for working with FB2 format files. It provides convenient methods for extracting book metadata (title, authors, description, ISBN, etc.) and content, making it ideal for building e-book management systems, cataloging tools, or reading applications.
Features
- Extract book metadata:
- Title, authors, translators
- Book series information
- Language, genres/tags
- ISBN, book identifier
- Description/annotation
- Extract and save cover images (JPEG/PNG)
- Extract book body content
- Save book content as HTML
- Full error handling and validation
- Type hints support
- Comprehensive test coverage
- Support for Python 3.8+
Installation
Install using pip:
pip install fb2reader
Requirements
- Python 3.8 or higher
- BeautifulSoup4
- lxml
Quick Start
from fb2reader import fb2book
# Open an FB2 file
book = fb2book('path/to/your/book.fb2')
# Get book metadata
title = book.get_title()
authors = book.get_authors()
description = book.get_description()
print(f"Title: {title}")
print(f"Authors: {authors}")
print(f"Description: {description}")
Usage Examples
Getting Book Metadata
from fb2reader import fb2book
book = fb2book('example.fb2')
# Get basic information
title = book.get_title()
isbn = book.get_isbn()
lang = book.get_lang()
identifier = book.get_identifier()
# Get authors (returns list of dicts)
authors = book.get_authors()
for author in authors:
    print(f"Author: {author['full_name']}")
    print(f" First name: {author['first_name']}")
    print(f" Last name: {author['last_name']}")
# Get translators
translators = book.get_translators()
for translator in translators:
    print(f"Translator: {translator['full_name']}")
# Get series information
series = book.get_series()
if series:
    print(f"Part of series: {series}")
# Get genres/tags
tags = book.get_tags()
print(f"Genres: {', '.join(tags)}")
# Get description
description = book.get_description()
print(f"Description: {description}")
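The author and translator dictionaries above expose full_name, first_name, and last_name keys, and the API reference types their values as Optional[str], so any of them may be None. A minimal sketch of a display helper that tolerates missing values follows. The name format_people is hypothetical (it is not part of fb2reader), and the fallback order is an assumption.

```python
from typing import Dict, List, Optional


def format_people(people: List[Dict[str, Optional[str]]]) -> str:
    """Join author/translator dicts into one display string.

    Hypothetical helper, not part of fb2reader. Assumes each dict may carry
    'full_name', 'first_name' and 'last_name', any of which can be None.
    """
    names = []
    for person in people:
        # Prefer the ready-made full name when it is present and non-empty.
        name = (person.get("full_name") or "").strip()
        if not name:
            # Otherwise rebuild it from whichever parts are available.
            parts = [person.get("first_name"), person.get("last_name")]
            name = " ".join(p.strip() for p in parts if p and p.strip())
        if name:
            names.append(name)
    return ", ".join(names)


# Sample data shaped like the dicts in the example above.
sample = [
    {"full_name": "Jane Doe", "first_name": "Jane", "last_name": "Doe"},
    {"full_name": None, "first_name": None, "last_name": "Smith"},
]
print(format_people(sample))  # prints: Jane Doe, Smith
```

With the library installed, the same helper could be called as format_people(book.get_authors()) or format_people(book.get_translators()).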
Working with Cover Images
from fb2reader import fb2book
book = fb2book('example.fb2')
# Check if book has a cover
cover = book.get_cover_image()
if cover:
    # Save cover image
    result = book.save_cover_image(output_dir='covers')
    if result:
        image_name, image_type = result
        print(f"Cover saved: {image_name}.{image_type}")
else:
    print("No cover image found")
# You can also specify the image type explicitly
book.save_cover_image(
    cover_image=cover,
    cover_image_type='jpeg',
    output_dir='my_covers'
)
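For background, FB2 files embed images as base64 text inside binary elements, which is why cover data has to be base64-decoded before it is written to disk. The sketch below illustrates that idea using only the standard library. It is not fb2reader's internal implementation, and the names detect_image_type and decode_binary are hypothetical.

```python
import base64
import xml.etree.ElementTree as ET
from typing import Optional, Tuple


def detect_image_type(data: bytes) -> Optional[str]:
    """Guess JPEG or PNG from the leading magic bytes."""
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    return None


def decode_binary(fb2_xml: str, binary_id: str) -> Optional[Tuple[bytes, Optional[str]]]:
    """Return (raw bytes, detected type) for the <binary> with the given id."""
    root = ET.fromstring(fb2_xml)
    for element in root.iter():
        # Compare local names so the sketch works with or without a namespace.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "binary" and element.get("id") == binary_id:
            data = base64.b64decode(element.text or "")
            return data, detect_image_type(data)
    return None


# A tiny stand-in document: PNG signature plus padding, not a real image.
png_header = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8
document = (
    '<FictionBook xmlns="http://www.gribuser.ru/xml/fictionbook/2.0">'
    '<binary id="cover.png" content-type="image/png">'
    + base64.b64encode(png_header).decode("ascii")
    + "</binary></FictionBook>"
)
data, kind = decode_binary(document, "cover.png")
print(kind, len(data))  # prints: png 16
```

Detecting the type from the bytes rather than from the file name is a design choice for this sketch. In normal use, save_cover_image handles decoding for you.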
Extracting Book Content
from fb2reader import fb2book
book = fb2book('example.fb2')
# Get book body as string
body = book.get_body()
if body:
    print(f"Body length: {len(body)} characters")
# Save book body as HTML file
try:
    output_path = book.save_body_as_html(
        output_dir='output',
        output_file_name='book_content.html'
    )
    print(f"Content saved to: {output_path}")
except Exception as e:
    print(f"Error saving content: {e}")
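When exporting several books, it can help to derive each HTML file name from the source file so that outputs do not overwrite each other. The following is a small sketch under that assumption. html_name_for and export_body are hypothetical helpers, and export_body only relies on the save_body_as_html parameters shown above.

```python
from pathlib import Path


def html_name_for(source_path: str) -> str:
    """Return 'name.html' for 'some/dir/name.fb2' (hypothetical helper)."""
    return Path(source_path).with_suffix(".html").name


def export_body(book, source_path: str, output_dir: str = "output") -> str:
    """Save one book's body under a name derived from its source file.

    `book` is any object offering save_body_as_html(output_dir=...,
    output_file_name=...), such as an fb2book instance.
    """
    return book.save_body_as_html(
        output_dir=output_dir,
        output_file_name=html_name_for(source_path),
    )


print(html_name_for("library/war_and_peace.fb2"))  # prints: war_and_peace.html
```

With the library installed, a call might look like export_body(fb2book('library/war_and_peace.fb2'), 'library/war_and_peace.fb2').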
Using the Helper Function
from fb2reader import get_fb2
# Alternative way to create fb2book instance
book = get_fb2('example.fb2')
if book:
    title = book.get_title()
    print(f"Title: {title}")
else:
    print("Invalid or non-FB2 file")
Error Handling
from fb2reader import fb2book
from fb2reader.fb2reader import InvalidFB2Error, FB2ReaderError
try:
    book = fb2book('example.fb2')
    title = book.get_title()
    print(f"Title: {title}")
except FileNotFoundError:
    print("File not found")
except InvalidFB2Error as e:
    print(f"Invalid FB2 file: {e}")
except FB2ReaderError as e:
    print(f"FB2 Reader error: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")
API Reference
Class: fb2book
Main class for working with FB2 files.
Constructor
fb2book(file: str)
- file (str): Path to the FB2 file
Raises:
- FileNotFoundError: If the file doesn't exist
- InvalidFB2Error: If the file is not in a valid FB2 format
- IOError: If there's an error reading the file
Methods
Metadata Methods
- get_identifier() -> Optional[str]: Get book identifier
- get_title() -> Optional[str]: Get book title
- get_authors() -> List[Dict[str, Optional[str]]]: Get list of authors
- get_translators() -> List[Dict[str, Optional[str]]]: Get list of translators
- get_series() -> Optional[str]: Get series name
- get_lang() -> Optional[str]: Get language code
- get_description() -> Optional[str]: Get book description/annotation
- get_tags() -> List[str]: Get list of genres/tags
- get_isbn() -> Optional[str]: Get ISBN
Content Methods
- get_body() -> Optional[str]: Get book body content
- get_cover_image() -> Optional[BeautifulSoup]: Get cover image element
- save_cover_image(cover_image=None, cover_image_type=None, output_dir='output') -> Optional[tuple]: Save cover image
- save_body_as_html(output_dir='output', output_file_name='body.html') -> str: Save body as HTML
Function: get_fb2
get_fb2(file: str) -> Optional[fb2book]
Helper function to create fb2book instance.
- Returns an fb2book instance if the file is a valid FB2 file
- Returns None if the file is not an FB2 file
Exceptions
- FB2ReaderError: Base exception for all fb2reader errors
- InvalidFB2Error: Raised when an FB2 file is invalid or cannot be parsed
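Since FB2ReaderError is described as the base exception, the order of except clauses matters: the more specific InvalidFB2Error has to be caught first. The sketch below demonstrates this with stand-in classes. It assumes InvalidFB2Error subclasses FB2ReaderError, which the description suggests but does not state outright, and classify is a hypothetical helper.

```python
class FB2ReaderError(Exception):
    """Stand-in for the library's base exception (assumed hierarchy)."""


class InvalidFB2Error(FB2ReaderError):
    """Stand-in for the invalid-file exception (assumed subclass)."""


def classify(exc: Exception) -> str:
    """Show which except clause a given exception reaches."""
    try:
        raise exc
    except FileNotFoundError:
        return "missing file"
    except InvalidFB2Error:
        # Must come before FB2ReaderError, or the base clause would win.
        return "invalid FB2"
    except FB2ReaderError:
        return "other fb2reader error"
    except Exception:
        return "unexpected"


print(classify(InvalidFB2Error("bad xml")))  # prints: invalid FB2
print(classify(FB2ReaderError("generic")))   # prints: other fb2reader error
```

In real code, import the exceptions from fb2reader.fb2reader instead of defining stand-ins.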
Development
Running Tests
# Install development dependencies
pip install -e .
pip install pytest pytest-cov
# Run tests
pytest tests/ -v
# Run with coverage
pytest tests/ -v --cov=fb2reader --cov-report=html
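When writing new tests, a small generated FB2 file can be handier than a full book. The sketch below builds a minimal document with the standard library, using the FictionBook 2.0 namespace and the usual description/title-info layout. write_minimal_fb2 is a hypothetical helper, and whether fb2reader's validation accepts a file this small is an assumption to verify against the test suite.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

FB2_NS = "http://www.gribuser.ru/xml/fictionbook/2.0"


def write_minimal_fb2(path: Path, title: str, first_name: str,
                      last_name: str, lang: str = "en") -> Path:
    """Write a tiny FB2 file with one author and one paragraph."""
    # Serialize the FB2 namespace as the default (unprefixed) namespace.
    ET.register_namespace("", FB2_NS)

    def q(tag: str) -> str:
        return f"{{{FB2_NS}}}{tag}"

    root = ET.Element(q("FictionBook"))
    description = ET.SubElement(root, q("description"))
    title_info = ET.SubElement(description, q("title-info"))
    ET.SubElement(title_info, q("genre")).text = "sf"
    author = ET.SubElement(title_info, q("author"))
    ET.SubElement(author, q("first-name")).text = first_name
    ET.SubElement(author, q("last-name")).text = last_name
    ET.SubElement(title_info, q("book-title")).text = title
    ET.SubElement(title_info, q("lang")).text = lang
    section = ET.SubElement(ET.SubElement(root, q("body")), q("section"))
    ET.SubElement(section, q("p")).text = "Sample paragraph."

    ET.ElementTree(root).write(str(path), encoding="utf-8", xml_declaration=True)
    return path


# Round-trip check: write the file, then read the title back.
with tempfile.TemporaryDirectory() as tmp:
    sample = write_minimal_fb2(Path(tmp) / "sample.fb2", "Sample Book", "Jane", "Doe")
    parsed = ET.parse(str(sample)).getroot()
    print(parsed.find(f".//{{{FB2_NS}}}book-title").text)  # prints: Sample Book
```

In a pytest suite, the same function could back a fixture that writes into pytest's tmp_path directory.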
Project Structure
fb2reader/
├── fb2reader/ # Main package
│ ├── __init__.py # Package initialization
│ └── fb2reader.py # Core implementation
├── tests/ # Test suite
│ ├── __init__.py
│ ├── conftest.py # Pytest fixtures
│ ├── test_fb2reader.py
│ └── test_data/ # Test FB2 files
├── README.md # Documentation (English)
├── README_RU.md # Documentation (Russian)
├── setup.py # Package configuration
├── requirements.txt # Dependencies
└── LICENSE # Apache 2.0 License
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Author
Roman Kudryavskyi - devpilgrim@gmail.com
Changelog
Version 1.0.4 (Upcoming)
- Fixed critical bugs in __init__.py
- Fixed cover image decoding (base64 instead of hex)
- Added proper error handling and validation
- Added type hints to all methods
- Improved docstrings
- Added comprehensive test suite
- Updated CI/CD pipeline with automated testing
- Improved documentation
Version 1.0.3
- Bug fixes and compatibility improvements
Acknowledgments
- FB2 format specification: FictionBook
- Built with BeautifulSoup4
File details
Details for the file fb2reader-1.0.4.tar.gz.
File metadata
- Download URL: fb2reader-1.0.4.tar.gz
- Upload date:
- Size: 14.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e7f6923b5d8d2bc01aa08c789f4166334b8a6f3e3f4a31b1d4b67bc40ddab62c |
| MD5 | bc43adc50625798dd7a9dbd694d84b97 |
| BLAKE2b-256 | c1c8896fc477f2186bb6893fa01da22ee17fd5174f6c2ec8b8b24b28cb7698c5 |
File details
Details for the file fb2reader-1.0.4-py3-none-any.whl.
File metadata
- Download URL: fb2reader-1.0.4-py3-none-any.whl
- Upload date:
- Size: 11.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2a48017ee5a4abde427a377915b8da86f7ff3ec9600f6163a7438ba5486c7e02 |
| MD5 | bb4b8c4c7cd7d6dda7d4653e58c6db63 |
| BLAKE2b-256 | 00cc66c66ee74938032873cf26089134472ac7f9d77524ef4c4a0b34aa3ce118 |