Skip to main content

Python library for scraping anime websites, currently supporting AnimeFLV and JKAnime.

Project description

Ani Scrapy

PyPI Version

License

Ani-Scrapy is a Python library for scraping anime websites, designed to provide both synchronous and asynchronous interfaces. It currently supports AnimeFLV and JKAnime, and makes it easy to switch between different platforms.

Ani-Scrapy helps developers automate anime downloads and build applications. It provides detailed anime and episode information, along with download links from multiple servers, supporting dynamic and static content across several sites.

🚀 Features

Core Functionality

  • Dual Interface: Synchronous and asynchronous APIs for flexible integration.
  • Multi-Platform Support: Unified interface for different platforms.
  • Comprehensive Data: Detailed anime metadata, episode information, and download links.

Content Handling

  • Static Content Extraction: Direct server links using request + cloudscraper + curl_cffi + aiohttp + bs4
  • Dynamic Content Processing: JavaScript-rendered links using Playwright
  • Mixed Approach: Smart fallback between static and dynamic methods

Technical Capabilities

  • Concurrent Scraping: Built-in support for asynchronous batch processing
  • Automatic Resource Management: Browser instances handled automatically when not provided
  • Custom Browser Support: Configurable browser paths and headless/headed modes via executable_path and headless options

Development Experience

  • Modular Design: Easy to extend with new scrapers and platforms
  • Configurable Logging: Verbose mode and multiple log levels (DEBUG, INFO, SUCCESS, WARNING, ERROR)
  • Performance Optimization: Connection reuse and caching capabilities

📦 Installation

From PyPI:

pip install ani-scrapy

From GitHub:

pip install git+https://github.com/ElPitagoras14/ani-scrapy.git

Development Installation:

git clone https://github.com/ElPitagoras14/ani-scrapy.git
cd ani-scrapy
pip install -r requirements-dev.txt
pip install -e .
playwright install chromium

🐍 Requirements

  • Python >= 3.10.14 (tested with 3.12)

Install Chromium (only once):

playwright install chromium

📊 Supported Websites

Currently Supported

  • AnimeFLV: Full support
  • JKAnime: Supports search, info, table downloads, file downloads | iframe downloads

🚀 Basic Usage

Asynchronous API Example

from ani_scrapy.async_api import AnimeFLVScraper, JKAnimeScraper, AsyncBrowser
import asyncio


async def main():
    # Initialize scrapers
    animeflv_scraper = AnimeFLVScraper(verbose=True)
    jkanime_scraper = JKAnimeScraper(verbose=True)

    # Search anime
    an_results = await animeflv_scraper.search_anime(query="naruto", page=1)
    jk_results = await jkanime_scraper.search_anime(query="naruto")
    print(f"AnimeFLV results: {len(an_results.animes)} animes found")
    print(f"JKAnime results: {len(jk_results.animes)} animes found")


    # Get dynamic content
    async with AsyncBrowser(headless=False) as browser:
        # Get anime info
        an_info = await animeflv_scraper.get_anime_info(
            anime_id=an_results.animes[0].id, include_episodes=True
        )
        jk_info = await jkanime_scraper.get_anime_info(
            anime_id=jk_results.animes[0].id, include_episodes=True, browser=browser
        )
        print(f"AnimeFLV info: {an_info.title}")
        print(f"JKAnime info: {jk_info.title}")

        # Get new episodes
        an_new_episodes = await animeflv_scraper.get_new_episodes(
            anime_id=an_info.id, last_episode_number=an_info.episodes[-1].number
        )
        jk_new_episodes = await jkanime_scraper.get_new_episodes(
            anime_id=jk_info.id, last_episode_number=jk_info.episodes[-1].number, browser=browser
        )
        print(f"AnimeFLV new episodes: {len(an_new_episodes)} episodes found")
        print(f"JKAnime new episodes: {len(jk_new_episodes)} episodes found")

        # Table download links
        an_table_links = await animeflv_scraper.get_table_download_links(
            anime_id=an_info.id, episode_number=1
        )
        jk_table_links = await jkanime_scraper.get_table_download_links(
            anime_id=jk_info.id, episode_number=1, browser=browser
        )

        # Iframe download links (requires browser for JS content)
        an_iframe_links = await animeflv_scraper.get_iframe_download_links(
            anime_id=an_info.id, episode_number=1, browser=browser
        )

        # Get final file download links
        if an_iframe_links.download_links:
            file_link = await animeflv_scraper.get_file_download_link(
                download_info=an_iframe_links.download_links[0],
                browser=browser,
            )
            print(f"Download URL: {file_link}")


if __name__ == "__main__":
    asyncio.run(main())

Synchronous API Example

from ani_scrapy.sync_api import AnimeFlvScraper, JKAnimeScraper, SyncBrowser

# Initialize scrapers

animeflv_scraper = AnimeFlvScraper(verbose=True)
jkanime_scraper = JKAnimeScraper(verbose=True)

# Search anime

an_results = animeflv_scraper.search_anime(query="naruto", page=1)
jk_results = jkanime_scraper.search_anime(query="naruto")
print(f"AnimeFLV results: {len(an_results.animes)} animes found")
print(f"JKAnime results: {len(jk_results.animes)} animes found")


# Get dynamic content
with SyncBrowser(headless=False) as browser:
    # Get anime info
    an_info = animeflv_scraper.get_anime_info(anime_id=an_results.animes[0].id)
    jk_info = jkanime_scraper.get_anime_info(
        anime_id=jk_results.animes[0].id, browser=browser
    )
    print(f"AnimeFLV info: {an_info.title}")
    print(f"JKAnime info: {jk_info.title}")

    # Get new episodes
    an_new_episodes = animeflv_scraper.get_new_episodes(
        anime_id=an_info.id, last_episode_number=1
    )
    jk_new_episodes = jkanime_scraper.get_new_episodes(
        anime_id=jk_info.id, last_episode_number=1, browser=browser
    )
    print(f"AnimeFLV new episodes: {len(an_new_episodes)} episodes found")
    print(f"JKAnime new episodes: {len(jk_new_episodes)} episodes found")

    # Table download links
    an_table_links = animeflv_scraper.get_table_download_links(
    anime_id=an_info.id, episode_number=1
    )
    jk_table_links = jkanime_scraper.get_table_download_links(
    anime_id=jk_info.id, episode_number=1, browser=browser
    )

    # Iframe download links (requires browser for JS content)
    an_iframe_links = animeflv_scraper.get_iframe_download_links(
        anime_id=an_info.id, episode_number=1, browser=browser
    )

    # Get final file download links
    if an_iframe_links.download_links:
        file_link = animeflv_scraper.get_file_download_link(
            download_info=an_iframe_links.download_links[0], browser=browser
        )
        print(f"Download URL: {file_link}")

📖 API Reference

For complete documentation: API Reference

Methods Overview:

  • search_anime - Search for anime
  • get_anime_info - Get detailed anime information
  • get_table_download_links - Get direct server links
  • get_iframe_download_links - Get iframe links
  • get_file_download_link - Get final download URL

Browser Classes:

  • AsyncBrowser - Automatic resource management for async operations
  • SyncBrowser - Context manager for synchronous scraping

🛠️ Advanced Usage

Custom Browser Configuration

from ani_scrapy.async_api.browser import AsyncBrowser
from ani_scrapy.sync_api.browser import SyncBrowser

# Custom Brave browser path
brave_path = ""

# Async browser configuration
async with AsyncBrowser(
    headless=False,
    executable_path=brave_path,
) as browser:
    # Your scraping code here
    pass

# Sync browser configuration
with SyncBrowser(
    headless=False,
    executable_path=brave_path,
) as browser:
    # Your scraping code here
    pass

Error Handling Example

from ani_scrapy.core.exceptions import (
    ScraperBlockedError,
    ScraperTimeoutError,
    ScraperParseError,
    ScraperError
)

try:
    results = await scraper.search_anime("naruto")
    if results.animes:
        anime_info = await scraper.get_anime_info(results.animes[0].id)
        print(f"Success: {anime_info.title}")
except ScraperBlockedError:
    print("Access blocked - try again later or use a different IP")
except ScraperTimeoutError:
    print("Request timed out - check your connection")
except ScraperParseError:
    print("Failed to parse response - website structure may have changed")
except ScraperError as e:
    print(f"Scraping error occurred: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")
    # Implement retry logic or fallback here

Concurrent Scraping

import asyncio

async def scrape_multiple_animes(anime_ids):
    tasks = []
    for anime_id in anime_ids:
        task = scraper.get_anime_info(anime_id)
        tasks.append(task)

    results = await asyncio.gather(*tasks, return_exceptions=True)
    return results

🤝 Contributing

Contributions to Ani-Scrapy are welcome! You can help by:

  • Reporting bugs or suggesting new features via GitHub Issues.
  • Improving documentation.
  • Adding new scrapers or enhancing existing ones.
  • Ensuring code quality and following coding standards.

How to contribute

  1. Fork the repository.
  2. Create a new branch for your feature or fix:
git checkout -b my-feature
  1. Make your changes and commit with clear messages.
  2. Push your branch to your fork.
  3. Open a Pull Request against the main branch of the original repository.

Contributions are expected to respect the license and coding style.

🧪 Development

Install development dependencies:

pip install -r requirements-dev.txt

# Code formatting and linting
black src/ tests/
isort src/ tests/
mypy src/
flake8 src/ tests/

🚧 Coming Soon

Support for more anime websites and further unification of scraper methods is planned.

If you want to contribute by adding new scrapers for other sites, contributions are welcome!

⚠️ Disclaimer

This library is intended for educational and personal use only. Please respect the terms of service of the websites being scraped and the applicable laws. The author is not responsible for any misuse.

📄 License

MIT © 2025 El Pitágoras

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ani_scrapy-0.1.23.tar.gz (32.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

ani_scrapy-0.1.23-py3-none-any.whl (36.6 kB view details)

Uploaded Python 3

File details

Details for the file ani_scrapy-0.1.23.tar.gz.

File metadata

  • Download URL: ani_scrapy-0.1.23.tar.gz
  • Upload date:
  • Size: 32.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.24 {"installer":{"name":"uv","version":"0.9.24","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for ani_scrapy-0.1.23.tar.gz
Algorithm Hash digest
SHA256 bf3b2cb7e6131cbb018477d1f5b66f30d7aa1516fe08d4384d95fb0b151d7718
MD5 de6072bf95e08311bc46f8ef6824d6a3
BLAKE2b-256 f5a9dedb2c89a17ef9cfeb49b24109686153bf092a851441042c3b656117c5be

See more details on using hashes here.

File details

Details for the file ani_scrapy-0.1.23-py3-none-any.whl.

File metadata

  • Download URL: ani_scrapy-0.1.23-py3-none-any.whl
  • Upload date:
  • Size: 36.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.24 {"installer":{"name":"uv","version":"0.9.24","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for ani_scrapy-0.1.23-py3-none-any.whl
Algorithm Hash digest
SHA256 a2c7ed27e67e0613ed146c339b95488c29bf0324273388f173b561369bdd0094
MD5 bf7ec3161271995ca7fad0f6a79ccc01
BLAKE2b-256 364695aea7ee2af12a4a5cef1a5647294ea59b4927e41b3a64c7c4371ea6a931

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page