mal4u: Asynchronous MyAnimeList Scraper
An unofficial, asynchronous Python library for scraping data from MyAnimeList.net. Built with aiohttp for efficient network requests and beautifulsoup4 for HTML parsing. Uses Pydantic for data validation and structuring.
Disclaimer: This is an unofficial library and is not affiliated with MyAnimeList. Please use responsibly and respect MAL's terms of service. Excessive scraping can lead to IP bans.
Features
- Asynchronous: Leverages asyncio and aiohttp for non-blocking network I/O.
- Session Management: Supports both explicit session creation/closing and automatic handling via async with.
- Modular Parsers: Designed with a base parser and specific sub-parsers (currently Manga).
- Type Hinted: Fully type-hinted codebase for better developer experience and static analysis.
- Data Validation: Uses Pydantic models (MangaSearchResult, MangaDetails, etc.) to structure and validate scraped data.
- Current Capabilities:
  - Search for Manga.
  - Get detailed information for a specific Manga by ID.
Installation
pip install mal4u
Basic Usage
Recommended: Using async with
This automatically handles session creation and closing.
import asyncio
import logging

from mal_api import MyAnimeListApi, MangaSearchResult, MangaDetails  # Assuming types are exported

# Optional: Configure logging for more details
logging.basicConfig(level=logging.INFO)
logging.getLogger('mal_api').setLevel(logging.DEBUG)  # See debug logs from the library


async def main():
    async with MyAnimeListApi() as api:
        # Search for manga
        print("Searching for 'Berserk'...")
        search_results: list[MangaSearchResult] = await api.manga.search("Berserk", limit=3)
        if search_results:
            print(f"Found {len(search_results)} results:")
            for result in search_results:
                print(f"- ID: {result.mal_id}, Title: {result.title}, Type: {result.manga_type}, Score: {result.score}")
        else:
            print("Search returned no results.")

        print("\n" + "="*20 + "\n")

        # Get details for a specific manga (using Berserk's ID: 2)
        manga_id_to_get = 2
        print(f"Getting details for Manga ID: {manga_id_to_get}")
        details: MangaDetails | None = await api.manga.get(manga_id_to_get)
        if details:
            print(f"Title: {details.title} ({details.type})")
            print(f"Status: {details.status}")
            print(f"Score: {details.score} (by {details.scored_by} users)")
            print(f"Rank: #{details.rank}, Popularity: #{details.popularity}")
            print(f"Synopsis (first 100 chars): {details.synopsis[:100] if details.synopsis else 'N/A'}...")
            print(f"Genres: {[genre.name for genre in details.genres]}")
        else:
            print(f"Could not retrieve details for Manga ID: {manga_id_to_get}")


if __name__ == "__main__":
    asyncio.run(main())
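Because the client is asynchronous, several lookups can share one session and run concurrently. The following is a minimal sketch of that pattern using only asyncio. Here fetch_details is a hypothetical stand-in for a call such as api.manga.get, and the cap of two concurrent requests is an arbitrary assumption chosen to keep the load on MAL low.

```python
import asyncio


async def fetch_details(manga_id: int) -> dict:
    """Hypothetical stand-in for `await api.manga.get(manga_id)`."""
    await asyncio.sleep(0.01)  # simulate network latency
    return {"mal_id": manga_id}


async def fetch_many(ids: list[int], max_concurrent: int = 2) -> list[dict]:
    # The semaphore caps how many requests are in flight at once.
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(manga_id: int) -> dict:
        async with semaphore:
            return await fetch_details(manga_id)

    # gather() returns results in the same order as the input IDs.
    return await asyncio.gather(*(guarded(i) for i in ids))


if __name__ == "__main__":
    print(asyncio.run(fetch_many([2, 13, 25])))
```

In real use, the body of fetch_details would be replaced by the library call, made inside a single async with MyAnimeListApi() block so that all requests reuse the same session.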
Manual Session Management
You need to explicitly create and close the session.
import asyncio
import logging

from mal_api import MyAnimeListApi

logging.basicConfig(level=logging.INFO)


async def main_manual():
    api = MyAnimeListApi()
    try:
        # Explicitly create the session
        await api.create_session()
        print("Session created.")

        # Perform actions
        print("Searching for 'Vinland Saga'...")
        results = await api.manga.search("Vinland Saga", limit=1)
        if results:
            print(f"- Found: {results[0].title} (ID: {results[0].mal_id})")
        else:
            print("Search returned no results.")
    except Exception as e:
        print(f"An error occurred: {e}")
    finally:
        # Ensure the session is closed
        print("Closing session...")
        await api.close()
        print("Session closed.")


if __name__ == "__main__":
    asyncio.run(main_manual())
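The two styles are closely related: an async context manager typically wraps the same create/close pair in its __aenter__ and __aexit__ methods. The sketch below illustrates the general protocol with a hypothetical FakeApi class. It is not the library's actual implementation, only an assumption about how such a wrapper usually looks.

```python
import asyncio


class FakeApi:
    """Hypothetical stand-in that mimics a create_session()/close() pair."""

    def __init__(self) -> None:
        self.events: list[str] = []

    async def create_session(self) -> None:
        self.events.append("created")

    async def close(self) -> None:
        self.events.append("closed")

    async def __aenter__(self) -> "FakeApi":
        await self.create_session()
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Runs even if the body raised, like a `finally` branch.
        await self.close()


async def demo() -> list[str]:
    api = FakeApi()
    try:
        async with api:
            raise RuntimeError("simulated scraping failure")
    except RuntimeError:
        pass  # the session was still closed by __aexit__
    return api.events


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The practical consequence is that async with is the safer default, since cleanup cannot be forgotten, while manual management suits long-lived objects whose lifetime does not fit a single block.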
TODO
- Implement Anime Parser (search, get).
- Implement Character Parser (get).
- Add parsers for other MAL sections (People, Studios, etc.).
- Implement more robust error handling (e.g., custom exceptions).
- Add unit and integration tests.
- Improve documentation (detailed docstrings, potentially Sphinx docs).
- Add rate limiting awareness/options.
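Until rate limiting is built into the library, callers can space out requests themselves. The following is a minimal sketch, assuming a fixed minimum interval between calls is sufficient. The Throttle class and its min_interval parameter are hypothetical and not part of mal4u.

```python
import asyncio
import time


class Throttle:
    """Hypothetical helper: enforce a minimum delay between calls."""

    def __init__(self, min_interval: float) -> None:
        self.min_interval = min_interval
        self._lock = asyncio.Lock()
        self._last_call: float | None = None  # monotonic time of previous call

    async def wait(self) -> None:
        # The lock serializes callers so concurrent tasks queue up in turn.
        async with self._lock:
            if self._last_call is not None:
                elapsed = time.monotonic() - self._last_call
                remaining = self.min_interval - elapsed
                if remaining > 0:
                    await asyncio.sleep(remaining)
            self._last_call = time.monotonic()


async def demo() -> float:
    throttle = Throttle(min_interval=0.05)
    start = time.monotonic()
    for _ in range(3):
        await throttle.wait()  # call this before each request
    return time.monotonic() - start


if __name__ == "__main__":
    print(f"3 throttled calls took {asyncio.run(demo()):.2f}s")
```

Calling await throttle.wait() before each search or get request keeps the request rate bounded regardless of how many tasks are running.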
Contributing
Contributions are welcome! Please open an issue or submit a pull request.
License
This project is licensed under the MIT License - see the LICENSE file for details.
File details
Details for the file mal4u-0.1.0.tar.gz.
File metadata
- Download URL: mal4u-0.1.0.tar.gz
- Upload date:
- Size: 16.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3298645ed50dd06dbe9a233fc11d35bbc4f17367cb49fb8e25a7acf9a2e3d593 |
| MD5 | 12e18a7884c52e056af02acc7b4c0ac0 |
| BLAKE2b-256 | cd96ba07f2e1ed41fcbfb04dcac9c5ae84d8fb258b67653b2f4e8c286ea524d5 |
File details
Details for the file mal4u-0.1.0-py3-none-any.whl.
File metadata
- Download URL: mal4u-0.1.0-py3-none-any.whl
- Upload date:
- Size: 16.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8b35be2afb3dd92293417584b586ed4feb8d279eb65b62637be2adbc670eb393 |
| MD5 | 18b845d2ca748268efaf3092c4be10eb |
| BLAKE2b-256 | 5414ff3cece449cf778b911d0c318b04bcb3709adee546dbc9c6cfa0d227e8fc |