mal4u: Asynchronous MyAnimeList Scraper
An unofficial, asynchronous Python library for scraping data from MyAnimeList.net. Built with aiohttp for efficient network requests and beautifulsoup4 for HTML parsing. Uses Pydantic for data validation and structuring.
Disclaimer: This is an unofficial library and is not affiliated with MyAnimeList. Please use responsibly and respect MAL's terms of service. Excessive scraping can lead to IP bans.
Features
- Asynchronous: Leverages `asyncio` and `aiohttp` for non-blocking network I/O.
- Session Management: Supports both explicit session creation/closing and automatic handling via `async with`.
- Modular Parsers: Designed with a base parser and specific sub-parsers for Manga and Anime.
- Type Hinted: Fully type-hinted codebase for better developer experience and static analysis.
- Data Validation: Uses Pydantic models (`MangaSearchResult`, `AnimeSearchResult`, `MangaDetails`, `AnimeDetails`, etc.) to structure and validate scraped data.
- Robust Detail Parsing: Extracts a wide range of information from detail pages, including titles, synopsis, background, stats, related entries, characters, themes, and more for both anime and manga.
Current Capabilities
- Search:
  - Search for Manga.
  - Search for Anime.
  - Search for Character.
- Details:
  - Get detailed information for a specific Manga by ID (using the `MangaDetails` model).
  - Get detailed information for a specific Anime by ID (using the `AnimeDetails` model).
  - Get detailed information for a specific Character by ID (using the `CharacterDetails` model).
- Browse/Lists (from overview pages like `manga.php` / `anime.php`):
  - Get available Genres (Anime & Manga).
  - Get available Themes (Anime & Manga).
  - Get available Demographics (Anime & Manga).
  - Get a preview list of Magazines (Manga).
  - Get the Studios list (Anime).
Installation
pip install mal4u
Basic Usage
Recommended: Using async with
This automatically handles session creation and closing.
import asyncio
import logging

from mal4u import MyAnimeListApi, MangaSearchResult, MangaDetails, AnimeDetails

# Optional: Configure logging for more details
logging.basicConfig(level=logging.INFO)
logging.getLogger('mal4u').setLevel(logging.DEBUG)  # See debug logs from the library


async def main():
    async with MyAnimeListApi() as api:
        # --- Manga Example ---
        print("Searching for 'Berserk' manga...")
        search_results: list[MangaSearchResult] = await api.manga.search("Berserk", limit=1)
        manga_id_to_get = 2  # Default to Berserk if search fails
        if search_results:
            print(f"- Found: {search_results[0].title} (ID: {search_results[0].mal_id})")
            manga_id_to_get = search_results[0].mal_id
        else:
            print("Search returned no results. Using default ID 2.")

        print(f"\nGetting details for Manga ID: {manga_id_to_get}")
        manga_details: MangaDetails | None = await api.manga.get(manga_id_to_get)
        if manga_details:
            print(f" Title: {manga_details.title} ({manga_details.type})")
            print(f" Status: {manga_details.status}")
            print(f" Score: {manga_details.score} (by {manga_details.scored_by} users)")
            print(f" Chapters: {manga_details.chapters}, Volumes: {manga_details.volumes}")
            print(f" Synopsis (start): {manga_details.synopsis[:100] if manga_details.synopsis else 'N/A'}...")
            print(f" Genres: {[genre.name for genre in manga_details.genres]}")
        else:
            print(f" Could not retrieve details for Manga ID: {manga_id_to_get}")

        print("\n" + "="*20 + "\n")

        # --- Anime Example ---
        anime_id_to_get = 40852  # Dr. Stone: Stone Wars
        print(f"Getting details for Anime ID: {anime_id_to_get}")
        anime_details: AnimeDetails | None = await api.anime.get(anime_id_to_get)
        if anime_details:
            print(f" Title: {anime_details.title} ({anime_details.type})")
            print(f" Status: {anime_details.status}")
            print(f" Score: {anime_details.score} (by {anime_details.scored_by} users)")
            print(f" Episodes: {anime_details.episodes}")
            print(f" Premiered: {anime_details.premiered.name if anime_details.premiered else 'N/A'}")
            print(f" Synopsis (start): {anime_details.synopsis[:100] if anime_details.synopsis else 'N/A'}...")
            print(f" Studios: {[studio.name for studio in anime_details.studios]}")
            print(f" Opening Theme(s): {anime_details.opening_themes}")
        else:
            print(f" Could not retrieve details for Anime ID: {anime_id_to_get}")


if __name__ == "__main__":
    asyncio.run(main())
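To make the `async with` behaviour concrete, here is a minimal sketch of the async context manager protocol that such a client can build on. `DemoClient` and its `session_open` flag are hypothetical stand-ins invented for this illustration; they are not part of mal4u, and the library's actual implementation may differ.

```python
import asyncio


class DemoClient:
    """Hypothetical stand-in for a client that owns a network session."""

    def __init__(self) -> None:
        self.session_open = False

    async def create_session(self) -> None:
        # A real client would create its HTTP session here.
        self.session_open = True

    async def close(self) -> None:
        # A real client would close its HTTP session here.
        self.session_open = False

    async def __aenter__(self) -> "DemoClient":
        # Called when the `async with` block is entered.
        await self.create_session()
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Called when the block exits, whether normally or via an exception.
        await self.close()


async def demo() -> tuple[bool, bool]:
    client = DemoClient()
    async with client:
        inside = client.session_open
    # Report the session state inside the block and after leaving it.
    return inside, client.session_open


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The point of the pattern is that cleanup runs even if the body of the block raises, which is why it is the recommended form.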
Manual Session Management
You need to explicitly create and close the session.
import asyncio
import logging

from mal4u import MyAnimeListApi

logging.basicConfig(level=logging.INFO)


async def main_manual():
    api = MyAnimeListApi()
    try:
        # Explicitly create the session
        await api.create_session()
        print("Session created.")

        # Perform actions (e.g., get anime details)
        anime_id = 5114  # FMA: Brotherhood
        print(f"Getting details for Anime ID: {anime_id}")
        details = await api.anime.get(anime_id)
        if details:
            print(f"- Found: {details.title} (Score: {details.score})")
        else:
            print(f"- Could not retrieve details for Anime ID: {anime_id}")
    except Exception as e:
        print(f"An error occurred: {e}")
    finally:
        # Ensure the session is closed
        print("Closing session...")
        await api.close()
        print("Session closed.")


if __name__ == "__main__":
    asyncio.run(main_manual())
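Because lookups are coroutines, several IDs can be fetched concurrently within one session. The sketch below shows one way to do that with `asyncio.gather` and a semaphore that caps the number of requests in flight. `fetch_many` and `fake_get` are hypothetical helpers written for this example, not part of mal4u; `fake_get` only imitates a lookup that returns `None` when nothing is found.

```python
import asyncio
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


async def fetch_many(
    get: Callable[[int], Awaitable[Optional[T]]],
    ids: list[int],
    max_concurrency: int = 2,
) -> dict[int, Optional[T]]:
    """Run `get` for every ID, with at most `max_concurrency` calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def fetch_one(item_id: int) -> Optional[T]:
        async with semaphore:
            return await get(item_id)

    # gather() returns results in the same order as the input IDs.
    results = await asyncio.gather(*(fetch_one(i) for i in ids))
    return dict(zip(ids, results))


async def fake_get(item_id: int) -> Optional[str]:
    """Offline stand-in for a detail lookup; ID 404 simulates a missing entry."""
    await asyncio.sleep(0)
    return None if item_id == 404 else f"title-{item_id}"


if __name__ == "__main__":
    print(asyncio.run(fetch_many(fake_get, [5114, 404])))
```

With the real library, a method such as `api.anime.get` could presumably be passed as `get`. Keep `max_concurrency` low, since excessive scraping can lead to IP bans.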
TODO
- Search Manga
- Get Manga Details (`MangaDetails`)
- Search Anime (`AnimeSearchResult`)
- Get Anime Details (`AnimeDetails`)
- Search Character
- Get Character Details (`CharacterDetails`)
- Implement Parsers for other MAL sections (People, Studios, etc.).
- Implement more robust error handling (e.g., custom exceptions for 404, parsing failures).
- Add unit and integration tests.
- Improve documentation (detailed docstrings, potentially Sphinx docs).
- Add rate limiting awareness/options.
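Until rate limiting is available in the library, callers can throttle on their own side. The following is a minimal sketch of a fixed-interval throttle, assuming a single event loop; `Throttle` and its `min_interval` parameter are hypothetical names for this example and do not exist in mal4u.

```python
import asyncio
import time


class Throttle:
    """Hypothetical helper: enforce a minimum gap between successive calls."""

    def __init__(self, min_interval: float) -> None:
        self._min_interval = min_interval
        self._lock = asyncio.Lock()
        self._last_call = float("-inf")  # so the first call never waits

    async def wait(self) -> None:
        # The lock serialises callers so that concurrent tasks queue up.
        async with self._lock:
            delay = self._last_call + self._min_interval - time.monotonic()
            if delay > 0:
                await asyncio.sleep(delay)
            self._last_call = time.monotonic()


async def demo(calls: int = 3, min_interval: float = 0.05) -> float:
    # Create the throttle inside the running loop.
    throttle = Throttle(min_interval)
    start = time.monotonic()
    for _ in range(calls):
        await throttle.wait()
        # A real caller would issue one request here.
    return time.monotonic() - start


if __name__ == "__main__":
    print(f"elapsed: {asyncio.run(demo()):.2f}s")
```

Calling `await throttle.wait()` before each request spaces requests at least `min_interval` seconds apart; a suitable interval for MyAnimeList is not stated here and would need to be chosen conservatively.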
Contributing
Contributions are welcome! Please open an issue or submit a pull request.
License
This project is licensed under the MIT License - see the LICENSE file for details.
File details
Details for the file mal4u-0.2.0.tar.gz.
File metadata
- Download URL: mal4u-0.2.0.tar.gz
- Upload date:
- Size: 41.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 766e9dfbf92018e79d5886c8210b23b46ae59248dfe965aff7d4e3161b6abfd7 |
| MD5 | 2cebe3c63819eb09e831e34ff758461d |
| BLAKE2b-256 | bedd54044503003e65d22721b216346c8529963e7f7f7bbe8c4f4b43ec07d65b |
File details
Details for the file mal4u-0.2.0-py3-none-any.whl.
File metadata
- Download URL: mal4u-0.2.0-py3-none-any.whl
- Upload date:
- Size: 46.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fac0fd897dc2fa616f4308c0e675e49068a4a504847fcca0ad7370ba7ebae103 |
| MD5 | 91c13dcd81120b454253c660a6f345ee |
| BLAKE2b-256 | 17ba5b9ab679e32fecfd2d29d142e44596edb6590a02cc8a2300d440996017b2 |