Unofficial library for scraping Rotten Tomatoes.
Project description
Goje (گوجه) means tomato in Persian. Goje is another library for scraping movie metadata from the Rotten Tomatoes movie database. It is built mainly on native Python libraries, and believe me, it is blazing fast!
Installation
pip install Goje
Usage
Goje currently supports three main functions:
Method Name | Functionality |
---|---|
GojeScraper.extract_movie_links() | Returns all Rotten Tomatoes movie links for a given year range |
GojeScraper.extract_metadata() | Scrapes, extracts, and returns all movie information for a given movie URL |
GojeScraper.extract_reviews() | Extracts the reviews of a movie for a given Rotten Tomatoes movie URL and a specified review page |
GojeScraper.extract_movie_links()
from goje_scrapper import GojeScraper
# Instantiate Goje without a movie URL
movie_scraper = GojeScraper()
# Collect the movie links for the given year range
print(movie_scraper.extract_movie_links(2021, 2022))
GojeScraper.extract_metadata()
from goje_scrapper import GojeScraper
# Give a Rotten Tomatoes movie URL
movie_url = 'https://www.rottentomatoes.com/m/a_separation_2011'
# Instantiate Goje with the given URL
movie_scraper = GojeScraper(movie_url=movie_url)
# Scrape the movie metadata
movie_scraper.extract_metadata()
print(movie_scraper.metadata)
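The two functions shown so far can be combined to scrape metadata for a batch of movies. The sketch below is only an illustration, not part of Goje: `scrape_many` and `make_scraper` are hypothetical names, and it assumes that `extract_movie_links()` returns an iterable of URL strings and that `metadata` is filled in after `extract_metadata()` is called, as in the example above. A stand-in class replaces `GojeScraper` so the sketch runs without network access.

```python
def scrape_many(movie_urls, make_scraper):
    """Return a {url: metadata} dict, skipping URLs that fail to scrape.

    `make_scraper` is a hypothetical factory that builds a scraper-like
    object exposing extract_metadata() and a `metadata` attribute.
    """
    results = {}
    for url in movie_urls:
        scraper = make_scraper(url)
        try:
            scraper.extract_metadata()
        except Exception as error:
            # One bad page should not stop the whole batch.
            print("skipping {0}: {1}".format(url, error))
            continue
        results[url] = scraper.metadata
    return results


class FakeMetadataScraper:
    """Hypothetical stand-in for GojeScraper, used only to run offline."""

    def __init__(self, movie_url=None):
        self.movie_url = movie_url
        self.metadata = None

    def extract_metadata(self):
        if "broken" in self.movie_url:
            raise ValueError("page could not be parsed")
        self.metadata = {"url": self.movie_url}


sample_urls = ["https://example.com/m/a", "https://example.com/m/broken"]
batch = scrape_many(sample_urls, lambda url: FakeMetadataScraper(movie_url=url))
print(batch)
```

With the real library, the factory would presumably be `lambda url: GojeScraper(movie_url=url)` and `movie_urls` would be the result of `extract_movie_links()`.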
GojeScraper.extract_reviews() (single page review)
from goje_scrapper import GojeScraper
# Give a Rotten Tomatoes movie URL
movie_url = 'https://www.rottentomatoes.com/m/a_separation_2011'
# Instantiate Goje with the given URL
movie_scraper = GojeScraper(movie_url=movie_url)
# Extract a single page of reviews
all_reviews = movie_scraper.extract_reviews(page_number=1)
print(all_reviews)
GojeScraper.extract_reviews() (all reviews)
from goje_scrapper import GojeScraper
# Give a Rotten Tomatoes movie URL
movie_url = 'https://www.rottentomatoes.com/m/a_separation_2011'
# Instantiate Goje with the given URL
movie_scraper = GojeScraper(movie_url=movie_url)
# Grab every review page on Rotten Tomatoes
review_list = list()
try:
    # Assumes number_of_review_pages() returns the total number of pages
    number_of_pages = movie_scraper.number_of_review_pages()
    for i in range(1, number_of_pages + 1):
        # Request page i (not the page count) on each iteration
        review_list.append(movie_scraper.extract_reviews(page_number=i))
        print("page {0} is scraped!".format(i))
except IndexError:
    # No pagination found: fall back to a single call
    review_list.append(movie_scraper.extract_reviews())
print(review_list)
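The loop above can be wrapped in a reusable helper. The following is a minimal sketch, not part of Goje: `collect_all_reviews` and `delay_seconds` are hypothetical names, and it assumes `number_of_review_pages()` returns the total page count and raises `IndexError` when there is no pagination, as the example above suggests. A stand-in class replaces `GojeScraper` so the sketch runs offline.

```python
import time


def collect_all_reviews(scraper, delay_seconds=0.0):
    """Collect the reviews of every page from a scraper-like object.

    `scraper` only needs number_of_review_pages() and
    extract_reviews(page_number=...), the two methods used above.
    """
    try:
        total_pages = scraper.number_of_review_pages()
    except IndexError:
        # Same fallback as the example above: fetch once, no page number.
        return [scraper.extract_reviews()]

    pages = []
    for page in range(1, total_pages + 1):
        pages.append(scraper.extract_reviews(page_number=page))
        if delay_seconds:
            time.sleep(delay_seconds)  # optional pause between requests
    return pages


class FakeReviewScraper:
    """Hypothetical stand-in for GojeScraper, used only to run offline."""

    def __init__(self, total_pages):
        self.total_pages = total_pages

    def number_of_review_pages(self):
        if self.total_pages == 0:
            raise IndexError("no pagination element")
        return self.total_pages

    def extract_reviews(self, page_number=1):
        return ["review from page {0}".format(page_number)]


all_pages = collect_all_reviews(FakeReviewScraper(total_pages=3))
print(all_pages)
```

Passing a real `GojeScraper` instance in place of the stand-in should work the same way, provided the assumptions above hold; a small `delay_seconds` value keeps the request rate modest.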
Contribute, Issues and Stuff
Feel free to open an issue in the GitHub repository of Goje.