mal4u: Asynchronous MyAnimeList Scraper

An unofficial, asynchronous Python library for scraping data from MyAnimeList.net. Built with aiohttp for efficient network requests and beautifulsoup4 for HTML parsing. Uses Pydantic for data validation and structuring.

Disclaimer: This is an unofficial library and is not affiliated with MyAnimeList. Please use responsibly and respect MAL's terms of service. Excessive scraping can lead to IP bans.
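One way a caller might keep request volume polite is to pace calls on their own side. The sketch below is not part of mal4u; `MinIntervalLimiter` is a hypothetical helper name, and the interval value is an arbitrary assumption rather than a limit documented by MAL.

```python
import asyncio
import time


class MinIntervalLimiter:
    """Hypothetical helper: keep at least `interval` seconds between calls."""

    def __init__(self, interval: float) -> None:
        self._interval = interval
        self._lock = asyncio.Lock()
        self._last = None  # monotonic timestamp of the previous call, if any

    async def wait(self) -> None:
        # The lock serializes callers so concurrent tasks are spaced out too.
        async with self._lock:
            if self._last is not None:
                remaining = self._interval - (time.monotonic() - self._last)
                if remaining > 0:
                    await asyncio.sleep(remaining)
            self._last = time.monotonic()


async def demo() -> float:
    """Make three paced calls and return the total elapsed time in seconds."""
    limiter = MinIntervalLimiter(0.05)  # create inside the running event loop
    start = time.monotonic()
    for _ in range(3):
        await limiter.wait()  # a real caller would follow this with a request
    return time.monotonic() - start


if __name__ == "__main__":
    print(f"elapsed: {asyncio.run(demo()):.2f}s")
```

In real use, `await limiter.wait()` would sit immediately before each library call, such as `api.manga.get(...)`.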

Features

  • Asynchronous: Leverages asyncio and aiohttp for non-blocking network I/O.
  • Session Management: Supports both explicit session creation/closing and automatic handling via async with.
  • Modular Parsers: Designed with a base parser and specific sub-parsers for Manga and Anime.
  • Type Hinted: Fully type-hinted codebase for better developer experience and static analysis.
  • Data Validation: Uses Pydantic models (MangaSearchResult, AnimeSearchResult (planned), MangaDetails, AnimeDetails, etc.) to structure and validate scraped data.
  • Robust Detail Parsing: Extracts a wide range of information from detail pages, including titles, synopsis, background, stats, related entries, characters, themes, and more for both anime and manga.

Current Capabilities

  • Search:
    • Search for Manga.
    • Search for Anime.
  • Details:
    • Get detailed information for a specific Manga by ID (using MangaDetails model).
    • Get detailed information for a specific Anime by ID (using AnimeDetails model).
  • Browse/Lists (from overview pages like manga.php/anime.php):
    • Get available Genres (Anime & Manga).
    • Get available Themes (Anime & Manga).
    • Get available Demographics (Anime & Manga).
    • Get a preview list of Magazines (Manga).
    • (Planned: Get Studios list (Anime)).

Installation

pip install mal4u

Basic Usage

Recommended: Using async with

This automatically handles session creation and closing.

import asyncio
import logging
from mal4u import MyAnimeListApi, MangaSearchResult, MangaDetails, AnimeDetails

# Optional: Configure logging for more details
logging.basicConfig(level=logging.INFO)
logging.getLogger('mal4u').setLevel(logging.DEBUG) # See debug logs from the library

async def main():
    async with MyAnimeListApi() as api:
        # --- Manga Example ---
        print("Searching for 'Berserk' manga...")
        search_results: list[MangaSearchResult] = await api.manga.search("Berserk", limit=1)
        manga_id_to_get = 2 # Default to Berserk if search fails
        if search_results:
            print(f"- Found: {search_results[0].title} (ID: {search_results[0].mal_id})")
            manga_id_to_get = search_results[0].mal_id
        else:
            print("Search returned no results. Using default ID 2.")

        print(f"\nGetting details for Manga ID: {manga_id_to_get}")
        manga_details: MangaDetails | None = await api.manga.get(manga_id_to_get)

        if manga_details:
            print(f"  Title: {manga_details.title} ({manga_details.type})")
            print(f"  Status: {manga_details.status}")
            print(f"  Score: {manga_details.score} (by {manga_details.scored_by} users)")
            print(f"  Chapters: {manga_details.chapters}, Volumes: {manga_details.volumes}")
            print(f"  Synopsis (start): {manga_details.synopsis[:100] if manga_details.synopsis else 'N/A'}...")
            print(f"  Genres: {[genre.name for genre in manga_details.genres]}")
        else:
            print(f"  Could not retrieve details for Manga ID: {manga_id_to_get}")

        print("\n" + "="*20 + "\n")

        # --- Anime Example ---
        anime_id_to_get = 40852 # Dr. Stone: Stone Wars
        print(f"Getting details for Anime ID: {anime_id_to_get}")
        anime_details: AnimeDetails | None = await api.anime.get(anime_id_to_get)

        if anime_details:
            print(f"  Title: {anime_details.title} ({anime_details.type})")
            print(f"  Status: {anime_details.status}")
            print(f"  Score: {anime_details.score} (by {anime_details.scored_by} users)")
            print(f"  Episodes: {anime_details.episodes}")
            print(f"  Premiered: {anime_details.premiered.name if anime_details.premiered else 'N/A'}")
            print(f"  Synopsis (start): {anime_details.synopsis[:100] if anime_details.synopsis else 'N/A'}...")
            print(f"  Studios: {[studio.name for studio in anime_details.studios]}")
            print(f"  Opening Theme(s): {anime_details.opening_themes}")
        else:
            print(f"  Could not retrieve details for Anime ID: {anime_id_to_get}")


if __name__ == "__main__":
    asyncio.run(main())
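
Because the getters are coroutines, several IDs can be fetched concurrently. The following is a minimal sketch of bounded concurrency with `asyncio.gather` and `asyncio.Semaphore`. `fetch_many` and `fake_get` are hypothetical names, not mal4u functions; `fake_get` stands in for a getter such as `api.anime.get` so the example runs without network access.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def fetch_many(
    get: Callable[[int], Awaitable[T]],
    ids: list[int],
    max_concurrency: int = 2,
) -> dict[int, T]:
    """Call `get(id)` for every ID, with at most `max_concurrency` in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def fetch_one(item_id: int) -> T:
        async with semaphore:  # wait here while the limit is reached
            return await get(item_id)

    # gather preserves input order, so results line up with `ids`.
    results = await asyncio.gather(*(fetch_one(i) for i in ids))
    return dict(zip(ids, results))


async def fake_get(item_id: int) -> str:
    """Stand-in getter (assumption); performs no network I/O."""
    await asyncio.sleep(0.01)
    return f"details-{item_id}"


if __name__ == "__main__":
    print(asyncio.run(fetch_many(fake_get, [5114, 40852])))
```

With the real library, passing `api.anime.get` as `get` inside the `async with` block should behave the same way, assuming the getter is safe to call concurrently on one session. A small `max_concurrency` keeps the load on MAL low.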

Manual Session Management

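Without async with, you must create the session yourself and close it when you are done, ideally in a finally block so it is closed even if an error occurs.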

import asyncio
import logging
from mal4u import MyAnimeListApi

logging.basicConfig(level=logging.INFO)

async def main_manual():
    api = MyAnimeListApi()
    try:
        # Explicitly create the session
        await api.create_session()
        print("Session created.")

        # Perform actions (e.g., get anime details)
        anime_id = 5114 # FMA: Brotherhood
        print(f"Getting details for Anime ID: {anime_id}")
        details = await api.anime.get(anime_id)
        if details:
            print(f"- Found: {details.title} (Score: {details.score})")
        else:
            print(f"- Could not retrieve details for Anime ID: {anime_id}")

    except Exception as e:
        print(f"An error occurred: {e}")
    finally:
        # Ensure the session is closed
        print("Closing session...")
        await api.close()
        print("Session closed.")

if __name__ == "__main__":
    asyncio.run(main_manual())
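
The two styles differ only in who calls close. As an illustration of the general mechanism, the toy class below implements the async context manager protocol (`__aenter__` and `__aexit__`) on top of explicit open and close methods. `DemoClient` is a hypothetical stand-in written for this example; it is not mal4u's implementation, and the real class may be structured differently.

```python
import asyncio


class DemoClient:
    """Hypothetical client that records its lifecycle instead of using a network."""

    def __init__(self) -> None:
        self.events: list[str] = []

    async def create_session(self) -> None:
        self.events.append("open")

    async def close(self) -> None:
        self.events.append("close")

    async def __aenter__(self) -> "DemoClient":
        await self.create_session()
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        # Runs on normal exit and when the body raises, like a finally block.
        await self.close()


async def demo() -> list[str]:
    """Raise inside the `async with` body and record what happens."""
    client = DemoClient()
    try:
        async with client:
            client.events.append("work")
            raise RuntimeError("simulated failure")
    except RuntimeError:
        client.events.append("handled")
    return client.events


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The recorded order shows the session being closed before the exception reaches the caller, which is the guarantee the manual try/finally version provides by hand.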

TODO

  • Search Manga
  • Get Manga Details (MangaDetails)
  • Search Anime (AnimeSearchResult)
  • Get Anime Details (AnimeDetails)
  • Get Character Details (CharacterDetails)
  • Implement Parsers for other MAL sections (People, Studios, etc.).
  • Implement more robust error handling (e.g., custom exceptions for 404, parsing failures).
  • Add unit and integration tests.
  • Improve documentation (detailed docstrings, potentially Sphinx docs).
  • Add rate limiting awareness/options.

Contributing

Contributions are welcome! Please open an issue or submit a pull request.

License

This project is licensed under the MIT License - see the LICENSE file for details.
