Async-first Python library for scraping anime websites. Supports AnimeFLV and JKAnime with unified interface, detailed metadata, and download links from multiple servers.

Ani Scrapy

Ani-Scrapy is an async-first Python library for scraping anime websites. It currently supports AnimeFLV and JKAnime behind a unified interface, which makes it easy to switch between platforms.

Note: The synchronous API was removed because it was too complex to maintain. Keeping two native implementations meant duplicating code without sufficient benefit.

Ani-Scrapy helps developers automate anime downloads and build applications. It provides detailed anime and episode information, along with download links from multiple servers, supporting dynamic and static content across several sites.

🚀 Features

Core Functionality

  • Async-First Design: Built from the ground up for asynchronous Python
  • Multi-Platform Support: Unified interface for different platforms
  • Comprehensive Data: Detailed anime metadata, episode information, and download links

Content Handling

  • Static Content Extraction: Direct server links using aiohttp and BeautifulSoup
  • Dynamic Content Processing: JavaScript-rendered links using Playwright
  • Mixed Approach: Smart fallback between static and dynamic methods

Technical Capabilities

  • Concurrent Scraping: Built-in support for asynchronous batch processing
  • Automatic Resource Management: Browser instances handled automatically
  • Custom Browser Support: Configurable browser paths via executable_path

Development Experience

  • Modular Design: Easy to extend with new scrapers and platforms
  • Structured Logging: Configurable log levels with task_id tracking for correlated logs
  • Performance Optimization: Connection reuse and caching capabilities

📦 Installation

From PyPI:

pip install ani-scrapy

From GitHub:

pip install git+https://github.com/ElPitagoras14/ani-scrapy.git

Development Installation:

git clone https://github.com/ElPitagoras14/ani-scrapy.git
cd ani-scrapy
pip install -e ".[dev]"
playwright install chromium

🐍 Requirements

  • Python >= 3.10.14 (tested with 3.12)

Install browser (only once):

playwright install chromium

Recommendation: Use Brave browser for sites with excessive advertising. See Custom Browser below.

🔍 Diagnostics

Run the diagnostic tool to check your environment:

ani-scrapy doctor

This checks:

  • Python version, platform, RAM
  • Required dependencies installed
  • Playwright and Chromium available
  • Recommended browsers (Brave)
  • Network connectivity to supported sites

Options

ani-scrapy doctor --output json  # JSON output for CI/CD
ani-scrapy doctor --timeout 10   # Increase timeout for slow connections

Exit Codes

Code  Meaning
0     All checks passed
1     Warnings found
2     Errors found
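
The exit codes make the doctor command easy to script. The following is a minimal sketch, not part of Ani-Scrapy: the helper names `describe_exit_code` and `run_doctor` are hypothetical, and it assumes the `--output` and `--timeout` options can be combined in one invocation. The JSON report is returned unparsed because its schema is not described here.

```python
import subprocess

# Exit codes as listed under "Exit Codes".
EXIT_CODE_MEANINGS = {
    0: "All checks passed",
    1: "Warnings found",
    2: "Errors found",
}


def describe_exit_code(code: int) -> str:
    """Translate a doctor exit code into its documented meaning."""
    return EXIT_CODE_MEANINGS.get(code, f"Unknown exit code: {code}")


def run_doctor(timeout: int = 10) -> tuple[int, str]:
    """Run `ani-scrapy doctor` and return (exit_code, raw_stdout).

    Assumption: --output and --timeout may be passed together.
    """
    completed = subprocess.run(
        ["ani-scrapy", "doctor", "--output", "json", "--timeout", str(timeout)],
        capture_output=True,
        text=True,
        check=False,  # codes 1 and 2 are reports, so do not raise on them
    )
    return completed.returncode, completed.stdout
```

A CI step could call `run_doctor()`, print `describe_exit_code(code)`, and decide for itself whether warnings (code 1) should fail the build.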

📊 Supported Websites

Currently Supported

  • AnimeFLV: Full support
  • JKAnime: Supports search, info, table downloads, file downloads | iframe downloads

🚀 Basic Usage

Simple Example (No task_id required)

import asyncio

from ani_scrapy import AnimeFLVScraper


async def main():
    async with AnimeFLVScraper() as scraper:
        results = await scraper.search_anime(query="naruto", page=1)
        print(f"Found {len(results.animes)} results")

        info = await scraper.get_anime_info(anime_id=results.animes[0].id)
        print(f"Title: {info.title}")


if __name__ == "__main__":
    asyncio.run(main())

Advanced Example (With task_id for log correlation)

import asyncio

from ani_scrapy import AnimeFLVScraper, JKAnimeScraper
from ani_scrapy.core.base import generate_task_id


async def main():
    # Shared task_id for log correlation (generate_task_id() creates one; a custom ID also works)
    task_id = generate_task_id()

    async with AnimeFLVScraper() as animeflv_scraper:
        an_results = await animeflv_scraper.search_anime(query="naruto", page=1, task_id=task_id)
        print(f"AnimeFLV results: {len(an_results.animes)} animes found")

        an_info = await animeflv_scraper.get_anime_info(
            anime_id=an_results.animes[0].id, include_episodes=True, task_id=task_id
        )
        print(f"AnimeFLV info: {an_info.title}")

        an_table_links = await animeflv_scraper.get_table_download_links(
            anime_id=an_info.id, episode_number=1, task_id=task_id
        )
        print(f"AnimeFLV table links: {len(an_table_links.download_links)}")

    async with JKAnimeScraper() as jkanime_scraper:
        jk_results = await jkanime_scraper.search_anime(query="naruto", task_id=task_id)
        print(f"JKAnime results: {len(jk_results.animes)} animes found")

        jk_info = await jkanime_scraper.get_anime_info(
            anime_id=jk_results.animes[0].id, include_episodes=True, task_id=task_id
        )
        print(f"JKAnime info: {jk_info.title}")


if __name__ == "__main__":
    asyncio.run(main())

Note: The task_id parameter is optional. If not provided, a random ID is generated automatically. Use it to correlate logs across multiple operations.

Custom Browser (Brave Recommended)

You can configure a custom browser executable path. Brave is recommended because its native ad-blocker reduces blocking on sites with excessive advertisements, but any Chromium-based browser (Chrome, Chromium, Edge) will work.

Benefits of Brave

  • Native Ad-Block: Built-in protection reduces detection probability
  • Avoids Captchas: Sites with aggressive ads may fail with Chromium's default configuration
  • Better Success Rate: Sites with excessive advertising can fail or timeout with the default browser

Configuration

import asyncio

from ani_scrapy import AnimeFLVScraper

brave_path = "C:/Program Files/BraveSoftware/Brave-Browser/Application/brave.exe"


async def main():
    async with AnimeFLVScraper(executable_path=brave_path) as scraper:
        info = await scraper.get_anime_info(anime_id="anime-id")


asyncio.run(main())

Path Examples

# Windows: Brave (recommended)
brave_path = "C:/Program Files/BraveSoftware/Brave-Browser/Application/brave.exe"

# Windows: Chrome
chrome_path = "C:/Program Files/Google/Chrome/Application/chrome.exe"

# Windows: Chromium
chromium_path = "C:/Program Files/Chromium/Application/chrome.exe"

# Linux: Brave
brave_path = "/usr/bin/brave"

# macOS: Brave
brave_path = "/Applications/Brave Browser.app/Contents/MacOS/Brave Browser"
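
Install locations vary between machines, so it can help to probe a few candidate paths before choosing one. The sketch below is an illustration only: `find_browser` and `BROWSER_CANDIDATES` are hypothetical names, not part of Ani-Scrapy, and the candidate paths are simply the examples listed above.

```python
import sys
from pathlib import Path

# Candidate paths from the examples above, keyed by sys.platform prefix.
BROWSER_CANDIDATES = {
    "win": [
        "C:/Program Files/BraveSoftware/Brave-Browser/Application/brave.exe",
        "C:/Program Files/Google/Chrome/Application/chrome.exe",
        "C:/Program Files/Chromium/Application/chrome.exe",
    ],
    "linux": ["/usr/bin/brave"],
    "darwin": ["/Applications/Brave Browser.app/Contents/MacOS/Brave Browser"],
}


def find_browser(candidates=None, platform=None):
    """Return the first candidate path that exists as a file, else None."""
    platform = sys.platform if platform is None else platform
    if candidates is None:
        # Pick the list whose key prefixes the platform name (e.g. "win32").
        candidates = next(
            (paths for prefix, paths in BROWSER_CANDIDATES.items()
             if platform.startswith(prefix)),
            [],
        )
    for candidate in candidates:
        if Path(candidate).is_file():
            return candidate
    return None
```

When `find_browser()` returns a path, it can be passed as `executable_path`; when it returns `None`, the scraper can be created without that argument so the Chromium installed by `playwright install chromium` is used.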

📖 API Reference

For complete documentation: Docs index

Methods Overview:

  • search_anime - Search for anime
  • get_anime_info - Get detailed anime information
  • get_table_download_links - Get direct server links
  • get_iframe_download_links - Get iframe links
  • get_file_download_link - Get final download URL
  • get_new_episodes - Get new episodes since last known

Scraper Classes:

  • AnimeFLVScraper - Scraper for AnimeFLV
  • JKAnimeScraper - Scraper for JKAnime
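
Since the usage examples call `search_anime` and `get_anime_info` the same way on both scraper classes, a helper can be written once and reused with either. The sketch below shows the idea; `first_match_title` and `FakeScraper` are hypothetical names. `FakeScraper` is an offline stand-in that only mirrors the attribute names used in the examples (`animes`, `id`, `title`) and is not the library's implementation.

```python
import asyncio
from types import SimpleNamespace


async def first_match_title(scraper, query):
    """Search with any scraper and return the first hit's title, or None."""
    results = await scraper.search_anime(query=query)
    if not results.animes:
        return None
    info = await scraper.get_anime_info(anime_id=results.animes[0].id)
    return info.title


class FakeScraper:
    """Hypothetical offline stand-in so the sketch runs without network access."""

    def __init__(self, catalog):
        self.catalog = catalog  # maps anime id -> title

    async def search_anime(self, query):
        # Case-insensitive substring match against the titles.
        hits = [
            SimpleNamespace(id=anime_id)
            for anime_id, title in self.catalog.items()
            if query.lower() in title.lower()
        ]
        return SimpleNamespace(animes=hits)

    async def get_anime_info(self, anime_id):
        return SimpleNamespace(id=anime_id, title=self.catalog[anime_id])
```

With the real library, the same helper would be called inside `async with AnimeFLVScraper() as scraper:` or `async with JKAnimeScraper() as scraper:`.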

Browser Classes:

  • AsyncBrowser - Manual browser control for advanced use cases

🛠️ Advanced Usage

Manual Browser Usage

For advanced use cases where you need direct control over the browser:

import asyncio

from ani_scrapy import AsyncBrowser


async def main():
    async with AsyncBrowser(headless=True) as browser:
        page = await browser.new_page()
        await page.goto("https://example.com")
        # Your custom browser automation here


if __name__ == "__main__":
    asyncio.run(main())

Error Handling

import asyncio

from ani_scrapy import AnimeFLVScraper
from ani_scrapy.core.exceptions import (
    ScraperBlockedError,
    ScraperTimeoutError,
    ScraperParseError,
    ScraperError,
)


async def main():
    try:
        async with AnimeFLVScraper() as scraper:
            results = await scraper.search_anime(query="naruto")
            if results.animes:
                anime_info = await scraper.get_anime_info(anime_id=results.animes[0].id)
                print(f"Success: {anime_info.title}")
    # Handle the specific errors first, then fall back to broader handlers
    except ScraperBlockedError:
        print("Access blocked - try again later or use a different IP")
    except ScraperTimeoutError:
        print("Request timed out - check your connection")
    except ScraperParseError:
        print("Failed to parse response - website structure may have changed")
    except ScraperError as e:
        print(f"Scraping error occurred: {e}")
    except Exception as e:
        print(f"Unexpected error: {e}")


if __name__ == "__main__":
    asyncio.run(main())
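
Timeouts are often transient, so a caller may want to retry before giving up. The following is a generic sketch using only the standard library. `retry_async` is a hypothetical helper, not an Ani-Scrapy API, and whether a retry is appropriate for a given error is an assumption the caller has to make.

```python
import asyncio


async def retry_async(make_call, retry_on, attempts=3, base_delay=1.0):
    """Await make_call() up to `attempts` times with exponential backoff.

    make_call: zero-argument callable returning a fresh awaitable each time.
    retry_on:  exception class (or tuple of classes) that triggers a retry.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return await make_call()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
```

For example, `await retry_async(lambda: scraper.search_anime(query="naruto"), ScraperTimeoutError)` would retry only on timeouts and let every other exception propagate immediately.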

Concurrent Scraping

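import asyncio


async def scrape_multiple_animes(anime_ids, scraper):
    # Create one coroutine per anime; they all share the same scraper
    tasks = [scraper.get_anime_info(anime_id=anime_id) for anime_id in anime_ids]

    # return_exceptions=True puts failures in the result list as exception
    # objects instead of raising on the first error
    return await asyncio.gather(*tasks, return_exceptions=True)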
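
Launching every request at once can overload a site or trigger blocking, so it may be worth capping how many run in parallel. The sketch below uses `asyncio.Semaphore` for that. `scrape_with_limit` and `max_concurrent` are hypothetical names, and `FakeScraper` is an offline stand-in (not the library's class) included only so the example runs without network access.

```python
import asyncio


async def scrape_with_limit(anime_ids, scraper, max_concurrent=3):
    """Fetch info for each ID with at most max_concurrent calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def fetch(anime_id):
        async with semaphore:  # waits while max_concurrent calls are running
            return await scraper.get_anime_info(anime_id=anime_id)

    # Results keep the order of anime_ids; failures come back as
    # exception objects rather than being raised.
    return await asyncio.gather(
        *(fetch(anime_id) for anime_id in anime_ids),
        return_exceptions=True,
    )


class FakeScraper:
    """Hypothetical stand-in that records how many calls overlap."""

    def __init__(self):
        self.active = 0
        self.peak = 0

    async def get_anime_info(self, anime_id):
        self.active += 1
        self.peak = max(self.peak, self.active)
        await asyncio.sleep(0.01)  # simulate network latency
        self.active -= 1
        if anime_id == "bad-id":
            raise ValueError(f"unknown anime: {anime_id}")
        return {"id": anime_id}
```

Callers should check each result with `isinstance(result, Exception)` before using it, since one failed ID does not cancel the others.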

🤝 Contributing

Contributions to Ani-Scrapy are welcome! You can help by:

  • Reporting bugs or suggesting new features via GitHub Issues.
  • Improving documentation.
  • Adding new scrapers or enhancing existing ones.
  • Ensuring code quality and following coding standards.

How to contribute

  1. Fork the repository.
  2. Create a new branch for your feature or fix:
git checkout -b my-feature
  3. Make your changes and commit with clear messages.
  4. Push your branch to your fork.
  5. Open a Pull Request against the main branch of the original repository.

Contributions are expected to respect the license and coding style.

🧪 Development

Install development dependencies:

pip install -e ".[dev]"

🚧 Coming Soon

Support for more anime websites and further unification of scraper methods is planned.

If you want to contribute by adding new scrapers for other sites, contributions are welcome!

⚠️ Disclaimer

This library is intended for educational and personal use only. Please respect the terms of service of the websites being scraped and the applicable laws. The author is not responsible for any misuse.

📄 License

MIT © 2025 El Pitágoras
