A scraper for www.myanimelist.com

Project description

my_anime_list_scraper

A package to easily scrape www.myanimelist.com. The purpose of this package is to scrape large amounts of data from MyAnimeList, not to request information about a specific item within MyAnimeList. The package can be used to request individual items, but another tool is recommended for that, since this one was not designed with that use in mind.

my_anime_list_scraper is being developed mainly for anibrain.ai. This package will be updated as needed for the site mentioned above. If you require additional information currently not available through this package, please reach out or feel free to contribute.

Installation

pip install my_anime_list_scraper

Usage

Importing the scraper

from my_anime_list_scraper import MalScraper

Instantiating the scraper

| Constructor Parameters | Type | Description |
| --- | --- | --- |
| output_type | string | How the scraper will save the data. Either 'tsv' or 'mysql'. Default Value: 'tsv' |
| output_location | string | The absolute path of the location to save the data. This value must end with a slash (either forward or backward depending on the operating system) to specify it is the location of a folder. Required if output_type='tsv' |
| db_host | string | The database host. Required if output_type='mysql' |
| db_user | string | The username to access the database. Required if output_type='mysql' |
| db_password | string | The password to access the database. Required if output_type='mysql' |
| db_database | string | The database name to write scrape data to. Required if output_type='mysql' |

Scraper saves data as TSV

mal_scraper = MalScraper(output_type='tsv', output_location='/Home/example/folder/')
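Because output_location must be absolute and end with a slash, it can help to normalize the path before passing it to the constructor. The following is a minimal sketch using only the standard library; the helper name `as_folder_path` is hypothetical and is not part of my_anime_list_scraper.

```python
import os


def as_folder_path(path):
    """Return an absolute path that ends with the OS path separator.

    Hypothetical helper, not part of my_anime_list_scraper.
    """
    # os.path.abspath drops any trailing separator; joining with an
    # empty string adds exactly one back.
    return os.path.join(os.path.abspath(path), "")


folder = as_folder_path("example/folder")
# The result could then be passed as output_location, e.g.:
# mal_scraper = MalScraper(output_type='tsv', output_location=folder)
```

Joining with an empty string uses the separator of the current operating system, so the same code should work with forward or backward slashes.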

Scraper saves data in MySQL Database

mal_scraper = MalScraper(output_type='mysql', db_host='example_host_name', db_user='example_user', db_password='example_password', db_database='example_database')
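To keep database credentials out of source code, the MySQL constructor arguments can be collected from environment variables. This is only a sketch: the function name and the `MAL_DB_*` variable names are assumptions made for illustration, and only the keyword names (db_host, db_user, db_password, db_database) come from the table above.

```python
import os

# Hypothetical environment variable names mapped to the documented
# constructor parameters.
ENV_TO_PARAM = {
    "MAL_DB_HOST": "db_host",
    "MAL_DB_USER": "db_user",
    "MAL_DB_PASSWORD": "db_password",
    "MAL_DB_DATABASE": "db_database",
}


def mysql_kwargs_from_env(env=None):
    """Build keyword arguments for MalScraper(output_type='mysql', ...)."""
    env = os.environ if env is None else env
    missing = [name for name in ENV_TO_PARAM if name not in env]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    kwargs = {param: env[name] for name, param in ENV_TO_PARAM.items()}
    kwargs["output_type"] = "mysql"
    return kwargs


# Possible usage, once the variables are set:
# mal_scraper = MalScraper(**mysql_kwargs_from_env())
```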

Methods

scrape_details()

| Parameters | Type | Description |
| --- | --- | --- |
| content_type | string | The type of page being scraped. Only 'anime' is available right now. Default Value: 'anime' |
| start_page | int | The page at which to start scraping (e.g. content_type='anime' and start_page=5 => https://myanimelist.net/anime/5 ). Default Value: 0 |
| failure_threshold | int | The number of consecutive failures allowed before the scraper stops. A failure is a 404 response, signifying that no content exists at that page. Default Value: 100 |
| print_intermediate | bool | Whether to print intermediate output during scraping to keep the user informed of progress. Default Value: False |
| num_retries | int | The number of retries the scraper will attempt for an individual page. This is a backup for IP-blocking scenarios. The time between retries is always (5 minutes * retry attempt #). Default Value: 5 |
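The interaction of failure_threshold and num_retries can be illustrated with a small standalone sketch. It does not reproduce the package's internals: both function names are hypothetical, retry attempts are assumed to be numbered from 1, and the scraper is assumed to stop as soon as the count of consecutive 404 pages reaches the threshold.

```python
def retry_delays(num_retries=5, base_minutes=5):
    """Wait (in minutes) before each retry: base_minutes * attempt number."""
    return [base_minutes * attempt for attempt in range(1, num_retries + 1)]


def pages_until_threshold(page_exists, start_page=0, failure_threshold=100):
    """Yield page ids that have content, stopping once failure_threshold
    consecutive pages in a row are missing (stand-in for 404 responses)."""
    consecutive_failures = 0
    page = start_page
    while consecutive_failures < failure_threshold:
        if page_exists(page):
            consecutive_failures = 0  # a hit resets the streak
            yield page
        else:
            consecutive_failures += 1
        page += 1


# Worst-case waiting for a single page with the default of 5 retries.
total_wait_minutes = sum(retry_delays())

# Pretend only these ids exist on the site.
existing = {1, 5, 6}
found = list(pages_until_threshold(lambda p: p in existing, failure_threshold=4))

# A call with the documented parameters might look like:
# mal_scraper.scrape_details(content_type='anime', start_page=0,
#                            failure_threshold=100, print_intermediate=True,
#                            num_retries=5)
```

Under these assumptions, a page that keeps getting blocked costs up to 75 minutes of waiting with the default settings, so a lower num_retries may be preferable for short test runs.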

Scraped Details (anime example):

{
    "AltNameEnglish": "Cowboy Bebop",
    "AltNameSynonyms": None,
    "MediaType": "TV",
    "EpisodeCount": 26,
    "CurrentStatus": "Finished Airing",
    "Aired": "Apr 3, 1998 to Apr 24, 1999",
    "Premiered": "Spring 1998",
    "Broadcast": "Saturdays at 01",
    "Producers": "Bandai Visual",
    "Licensors": "Funimation,Bandai Entertainment",
    "Studios": "Sunrise",
    "Source": "Original",
    "Genres": "Action,Adventure,Comedy,Drama,Sci-Fi,Space",
    "Duration": "24 min. per ep.",
    "Rating": "R - 17+ (violence & profanity)",
    "Score": 8.79,
    "ScoredByCount": 566904,
    "Ranked": 25,
    "Popularity": 37,
    "MembersCount": 1159776,
    "FavoritesCount": 58298,
    "Title": "Cowboy Bebop",
    "Synopsis": "In the year 2071, humanity has colonized several of the planets and moons of the solar system leaving the now uninhabitable surface of planet Earth behind. The Inter Solar System Police attempts to keep peace in the galaxy, aided in part by outlaw bounty hunters, referred to as \"Cowboys.\" The ragtag team aboard the spaceship Bebop are two such individuals. Mellow and carefree Spike Spiegel is balanced by his boisterous, pragmatic partner Jet Black as the pair makes a living chasing bounties and collecting rewards. Thrown off course by the addition of new members that they meet in their travels—Ein, a genetically engineered, highly intelligent Welsh Corgi; femme fatale Faye Valentine, an enigmatic trickster with memory loss; and the strange computer whiz kid Edward Wong—the crew embarks on thrilling adventures that unravel each member's dark and mysterious past little by little. Well-balanced with high density action and light-hearted comedy, Cowboy Bebop is a space Western classic and an homage to the smooth and improvised music it is named after. [Written by MAL Rewrite]",
    "MyAnimeListId": 1,
    "PromoVideo": "https://www.youtube.com/embed/qig4KOK2R2g?enablejsapi=1&wmode=opaque&autoplay=1",
    "PromoVideoBackgroundImage": "https://i.ytimg.com/vi/qig4KOK2R2g/mqdefault.jpg",
    "ImageSrc": "https://cdn.myanimelist.net/images/anime/4/19644.jpg"
}
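Several fields in the example above (Producers, Licensors, Studios, Genres) appear to hold comma-separated values in a single string. A rough sketch of turning them into lists for later processing follows; the helper name is hypothetical, and it assumes that individual names never contain a comma, which may not hold for every record.

```python
# Fields that look comma-separated in the example record (an assumption).
MULTI_VALUE_FIELDS = ("Producers", "Licensors", "Studios", "Genres")


def split_multi_value_fields(record, fields=MULTI_VALUE_FIELDS):
    """Return a copy of record with the given fields split into lists.

    Missing or None values become empty lists.
    """
    result = dict(record)
    for field in fields:
        value = result.get(field)
        result[field] = [part.strip() for part in value.split(",")] if value else []
    return result


# A trimmed-down version of the Cowboy Bebop record shown above.
sample = {
    "Title": "Cowboy Bebop",
    "Producers": "Bandai Visual",
    "Licensors": "Funimation,Bandai Entertainment",
    "Studios": "Sunrise",
    "Genres": "Action,Adventure,Comedy,Drama,Sci-Fi,Space",
}
parsed = split_multi_value_fields(sample)
```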


Download files

Source Distribution

my_anime_list_scraper-0.0.56.tar.gz (11.4 kB)

Uploaded Source

Built Distribution

my_anime_list_scraper-0.0.56-py2.py3-none-any.whl (12.1 kB)

Uploaded Python 2 Python 3

File details

Details for the file my_anime_list_scraper-0.0.56.tar.gz.

File metadata

  • Download URL: my_anime_list_scraper-0.0.56.tar.gz
  • Size: 11.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.51.0 CPython/3.9.0

File hashes

Hashes for my_anime_list_scraper-0.0.56.tar.gz
Algorithm Hash digest
SHA256 1b5a8047fbd8cc05ee4fe568f7561b88e6dcb6ebaa691775a95b59dada1d4cbf
MD5 30e613cd646c3f31ceb1ca13f6925efd
BLAKE2b-256 4b912a3f31aae5105875f9906c15b3cd0a4db53db8d20603291e623a15b591dc

File details

Details for the file my_anime_list_scraper-0.0.56-py2.py3-none-any.whl.

File metadata

  • Download URL: my_anime_list_scraper-0.0.56-py2.py3-none-any.whl
  • Size: 12.1 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.51.0 CPython/3.9.0

File hashes

Hashes for my_anime_list_scraper-0.0.56-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 5cb9587fc559dcd704f577784abe9e7f3d7f769ab057ffc8b56245a9a07807d0
MD5 ff37d592a3b5e9e0388760565e466535
BLAKE2b-256 c28bf30da79e41c55f285c2e49c473dd75b2f89713caffb1ab05b7321c1eb1d0
