Unofficial library for scraping Rotten Tomatoes.

Project description

Goje (گوجه) means tomato in Persian. Goje is another library for scraping movie metadata from the Rotten Tomatoes movie database. It is built mainly on native Python libraries, and it is fast.

Installation

pip install Goje
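
Note that the package is installed as Goje but, going by the usage examples in this README, imported as goje_scrapper. The following is a minimal sketch, using only the standard library, for checking that a module is importable in the current environment. The helper name is_installed is hypothetical and not part of Goje.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "goje_scrapper" is the import name used by the examples in this README.
    print("goje_scrapper importable:", is_installed("goje_scrapper"))
```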

Usage

Currently Goje supports 4 main functions:

Method Name | Functionality
GojeScraper.extract_extract_movie_links() | Returns all Rotten Tomatoes movie links for a given year range
GojeScraper.extract_metadata() | Scrapes, extracts, and returns all movie information for a given movie URL
GojeScraper.extract_critic_reviews() | Extracts the critic reviews of a movie for a given Rotten Tomatoes movie URL and a specified review page
GojeScraper.extract_audience_reviews() | Extracts the audience reviews of a movie, that is, the opinions of regular viewers

GojeScraper.extract_metadata()

from goje_scrapper import GojeScraper

# Give a Rotten Tomatoes movie URL
movie_url = 'https://www.rottentomatoes.com/m/a_separation_2011'
# Instantiate Goje with the given URL
movie_scraper = GojeScraper(movie_url=movie_url)
# Scrape the movie metadata
movie_scraper.extract_metadata()
print(movie_scraper.metadata)
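
Once the metadata has been scraped, it is often convenient to store it on disk instead of only printing it. The sketch below assumes that movie_scraper.metadata is a dictionary of JSON-friendly values, which this README does not confirm. The helper names save_metadata and load_metadata are hypothetical and not part of Goje.

```python
import json
from pathlib import Path


def save_metadata(metadata: dict, path: str) -> Path:
    """Write a metadata dictionary to a UTF-8 JSON file and return its path."""
    target = Path(path)
    with target.open("w", encoding="utf-8") as handle:
        # default=str is a fallback for values that json cannot encode natively.
        json.dump(metadata, handle, ensure_ascii=False, indent=2, default=str)
    return target


def load_metadata(path: str) -> dict:
    """Read a metadata dictionary back from a JSON file."""
    with Path(path).open("r", encoding="utf-8") as handle:
        return json.load(handle)
```

Under that assumption, a call such as save_metadata(movie_scraper.metadata, "a_separation_2011.json") would store the result of the example above.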

GojeScraper.extract_critic_reviews() (single page review)

from goje_scrapper import GojeScraper

# Give a Rotten Tomatoes movie URL
movie_url = 'https://www.rottentomatoes.com/m/a_separation_2011'
# Instantiate Goje with the given URL
movie_scraper = GojeScraper(movie_url=movie_url)
# Extract a single page of critic reviews
all_reviews = movie_scraper.extract_critic_reviews(page_number=1)
print(all_reviews)

GojeScraper.extract_critic_reviews() (All reviews)

from goje_scrapper import GojeScraper

# Give a Rotten Tomatoes movie URL
movie_url = 'https://www.rottentomatoes.com/m/a_separation_2011'
# Instantiate Goje with the given URL
movie_scraper = GojeScraper(movie_url=movie_url)
# Grab every page of critic reviews on Rotten Tomatoes
review_list = []
try:
    # Look up the page count once (assumed here to be the total number of review pages)
    total_pages = movie_scraper.number_of_review_pages()
    # Request each page by its own number, including the last one
    for i in range(1, total_pages + 1):
        review_list.append(movie_scraper.extract_critic_reviews(page_number=i))
        print("page {0} is scraped!".format(i))
except IndexError:
    # On IndexError, fall back to a single request with the default page
    review_list.append(movie_scraper.extract_critic_reviews())

print(review_list)
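
When many review pages are requested in a row, it can help to pause between requests and to merge the per-page results into one flat list. The sketch below is independent of Goje: it takes any function that returns a list of items for a page number. The name collect_pages and the one-second default delay are assumptions chosen for illustration.

```python
import time
from typing import Any, Callable, List


def collect_pages(fetch_page: Callable[[int], List[Any]],
                  total_pages: int,
                  delay_seconds: float = 1.0) -> List[Any]:
    """Fetch pages 1..total_pages and return all items in one flat list."""
    items: List[Any] = []
    for page in range(1, total_pages + 1):
        # Each call is expected to return a list of items for that page.
        items.extend(fetch_page(page))
        if page < total_pages:
            # Pause between requests to avoid hammering the site.
            time.sleep(delay_seconds)
    return items
```

Assuming extract_critic_reviews() returns a list per page and number_of_review_pages() returns the total page count, it could be used as collect_pages(lambda p: movie_scraper.extract_critic_reviews(page_number=p), movie_scraper.number_of_review_pages()).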

GojeScraper.extract_audience_reviews()

from goje_scrapper import GojeScraper

# Give a Rotten Tomatoes movie URL
movie_url = 'https://www.rottentomatoes.com/m/a_separation_2011'
# Instantiate Goje with the given URL
movie_scraper = GojeScraper(movie_url=movie_url)
audience_reviews = movie_scraper.extract_audience_reviews()
print(audience_reviews)
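
Scraping calls like the one above depend on the network and on the current page layout, so they can fail intermittently. The sketch below shows a generic retry wrapper. It is not part of Goje, the name with_retries is hypothetical, and since this README does not say which exceptions Goje raises, the wrapper retries on any Exception by default.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_retries(action: Callable[[], T],
                 attempts: int = 3,
                 delay_seconds: float = 2.0,
                 retry_on: Tuple[Type[BaseException], ...] = (Exception,)) -> T:
    """Call action() up to `attempts` times, re-raising the last failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except retry_on:
            if attempt == attempts:
                # Out of attempts: let the caller see the original error.
                raise
            time.sleep(delay_seconds)
    raise RuntimeError("unreachable")
```

For example, with_retries(movie_scraper.extract_audience_reviews) would repeat the call from the example above up to three times before giving up.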

Contribute, Issues and Stuff

Feel free to open an issue in the GitHub repository of Goje.

Download files

Source Distribution

Goje-0.1.0.tar.gz (6.2 kB)

Uploaded Source

Built Distribution

Goje-0.1.0-py3-none-any.whl (6.8 kB)

Uploaded Python 3

File details

Details for the file Goje-0.1.0.tar.gz.

File metadata

  • Download URL: Goje-0.1.0.tar.gz
  • Size: 6.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/3.10.0 pkginfo/1.6.1 requests/2.24.0 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.8.5

File hashes

Hashes for Goje-0.1.0.tar.gz
Algorithm Hash digest
SHA256 6fd15bd1aedd83b5f1f5b599d07fcf33307a7c107bc0f07c837f94a0dc11c745
MD5 0924afe9476d258be219d9b20ee4ec61
BLAKE2b-256 8153608382f17269acbf650e1d9a3d0e6594a825c2050b2577920b552dc5dde9

File details

Details for the file Goje-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: Goje-0.1.0-py3-none-any.whl
  • Size: 6.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/3.10.0 pkginfo/1.6.1 requests/2.24.0 requests-toolbelt/0.9.1 tqdm/4.50.2 CPython/3.8.5

File hashes

Hashes for Goje-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 fe7ab09f690dd98509a9e4d6831d94d1a690b1534822be4e55aa3c05c67adb14
MD5 d797b5f74b078cd1444d381448b503c4
BLAKE2b-256 e3d4378ccf77cd09f7aa7437aa7a30a93953638737311b523fe35c95ce58afea
