
A Python application that provides web scraping capabilities, including fetching Google and Yelp reviews.


pydataharvest

pydataharvest is a Python application that provides web scraping capabilities, including fetching Google and Yelp reviews. The application has a user-friendly graphical user interface (GUI) for easy interaction.

Features

  • Web Scraping: Extract information from web pages based on user-provided URLs.
  • Google Reviews: Fetch reviews for a given business or location using the Google Maps API.
  • Yelp Reviews: Retrieve reviews for a business using the Yelp API.
  • OpenStreetMap Data: Extract latitude, longitude, and additional information from OpenStreetMap.

Requirements

  • Python 3.x
  • Required Python packages (install using pip install -r requirements.txt):
    • requests
    • beautifulsoup4
    • pandas
    • openpyxl
    • nltk (for text processing)
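  • tkinter (GUI toolkit): bundled with Python's standard library rather than installed with pip; some Linux distributions package it separately, for example as python3-tk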
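
Before launching the GUI, it can help to confirm that the modules listed above are importable. The following is a minimal sketch that uses only the standard library; the function name missing_modules and the mapping REQUIRED_MODULES are hypothetical and not part of pydataharvest. Note that the pip name and the import name differ for beautifulsoup4 (imported as bs4).

```python
from importlib.util import find_spec

# pip distribution name -> module name used in `import` statements.
# tkinter ships with Python itself, so it has no pip name here.
# This mapping is an illustration, not a file taken from the project.
REQUIRED_MODULES = {
    "requests": "requests",
    "beautifulsoup4": "bs4",
    "pandas": "pandas",
    "openpyxl": "openpyxl",
    "nltk": "nltk",
    "tkinter (standard library)": "tkinter",
}


def missing_modules(required=REQUIRED_MODULES):
    """Return the labels of modules that cannot be located in this environment."""
    return [label for label, module in required.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

This only checks that each module can be located; it does not verify versions or that the import succeeds.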

Usage

  1. Clone the repository:

    git clone https://github.com/arjunlimat/pydataharvest.git

  2. Navigate to the project directory:

    cd pydataharvest

  3. Install the required packages:

    pip install -r requirements.txt

  4. Run the application:

    python main.py

The GUI will appear, allowing you to choose different services and perform web scraping tasks.

Services

Web Scraping

Enter a URL and click "Search" to explore available data types.

Choose the desired data type, enter a file name, and click "Download" to save the data.
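
The "explore available data types" step could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's code: the README does not say which data types are offered, so links, images, and headings are guesses, and the names PageInventory and available_data_types are hypothetical. The project lists requests and beautifulsoup4; this sketch swaps in the standard library's html.parser so that it runs without extra packages.

```python
from html.parser import HTMLParser

HEADING_TAGS = ("h1", "h2", "h3")


class PageInventory(HTMLParser):
    """Collect a few kinds of data found in an HTML page (assumed categories)."""

    def __init__(self):
        super().__init__()
        self.found = {"links": [], "images": [], "headings": []}
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.found["links"].append(attrs["href"])
        elif tag == "img" and attrs.get("src"):
            self.found["images"].append(attrs["src"])
        elif tag in HEADING_TAGS:
            # Start a new heading entry; its text arrives in handle_data.
            self._in_heading = True
            self.found["headings"].append("")

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS:
            self._in_heading = False

    def handle_data(self, data):
        if self._in_heading:
            self.found["headings"][-1] += data.strip()


def available_data_types(html):
    """Return only the categories that actually occur in the page."""
    parser = PageInventory()
    parser.feed(html)
    parser.close()
    return {kind: items for kind, items in parser.found.items() if items}
```

For example, available_data_types('<h1>Menu</h1><a href="/about">About</a>') reports headings and links but no images, which is the kind of list a GUI could offer before the download step.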

Google Reviews

Select "Google reviews" from the services dropdown.

Enter the business or location name and address. Provide a file name and click "Download" to fetch and save Google reviews.
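
The README does not name the Google endpoint involved. As a sketch, assuming the legacy Places API Place Details endpoint and an already resolved place_id, fetching and saving reviews might be split into the helpers below. The function names, the chosen columns, and the CSV output format are assumptions; the project lists pandas and openpyxl, so it may well write Excel files instead.

```python
import csv
from urllib.parse import urlencode

# Assumed endpoint: legacy Places API, Place Details.
PLACE_DETAILS_URL = "https://maps.googleapis.com/maps/api/place/details/json"
COLUMNS = ["author", "rating", "text"]


def build_details_url(place_id, api_key):
    """Build a URL that asks Place Details for only the reviews of one place."""
    query = urlencode({"place_id": place_id, "fields": "reviews", "key": api_key})
    return f"{PLACE_DETAILS_URL}?{query}"


def reviews_to_rows(payload):
    """Flatten a Place Details JSON payload into one dict per review."""
    reviews = payload.get("result", {}).get("reviews", [])
    return [
        {
            "author": review.get("author_name", ""),
            "rating": review.get("rating"),
            "text": review.get("text", ""),
        }
        for review in reviews
    ]


def save_rows(rows, file_name):
    """Write the flattened reviews to the file name chosen by the user."""
    with open(file_name, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
```

The URL can be fetched with any HTTP client (for example requests.get(url).json()), and the decoded JSON passed to reviews_to_rows. A valid API key is required.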

Yelp Reviews

Select "Yelp reviews" from the services dropdown.

Enter the business name and address. Provide a file name and click "Download" to fetch and save Yelp reviews.
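
For the Yelp service, one plausible flow is to look up the business by name and address, then request its reviews by business ID. The sketch below assumes the Yelp Fusion API (v3) with bearer-token authentication; the helper names are hypothetical, and the requests are only built, not sent.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

# Assumed API root: Yelp Fusion v3.
YELP_API_ROOT = "https://api.yelp.com/v3"


def search_request(name, address, api_key):
    """Build a request that looks up a business by name near an address."""
    query = urlencode({"term": name, "location": address, "limit": 1})
    return Request(
        f"{YELP_API_ROOT}/businesses/search?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def reviews_request(business_id, api_key):
    """Build a request for the reviews of one business ID."""
    return Request(
        f"{YELP_API_ROOT}/businesses/{quote(business_id, safe='')}/reviews",
        headers={"Authorization": f"Bearer {api_key}"},
    )
```

Each request can be sent with urllib.request.urlopen and the body decoded as JSON. The business ID for the second call would come from the first response.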

OpenStreetMap

Select "Open Street Map" from the services dropdown.

Enter the map URL, provide a file name, and click "Download" to extract map data.
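
The README does not specify the expected form of the map URL. Assuming a standard openstreetmap.org link whose fragment looks like #map=zoom/lat/lon, the coordinates can be read straight from the URL, as in this sketch. Using Nominatim's reverse endpoint for the "additional information" is also an assumption, and both function names are hypothetical.

```python
from urllib.parse import urlencode, urlsplit


def parse_map_url(map_url):
    """Return (zoom, latitude, longitude) from a '#map=zoom/lat/lon' fragment."""
    fragment = urlsplit(map_url).fragment
    # The fragment may carry other settings, e.g. '#map=15/51.5/-0.1&layers=N'.
    for part in fragment.split("&"):
        if part.startswith("map="):
            zoom, lat, lon = part[len("map="):].split("/")
            return int(zoom), float(lat), float(lon)
    raise ValueError(f"no map position found in {map_url!r}")


def reverse_geocode_url(lat, lon):
    """Build a Nominatim reverse-geocoding URL for extra details (assumed source)."""
    query = urlencode({"lat": lat, "lon": lon, "format": "jsonv2"})
    return f"https://nominatim.openstreetmap.org/reverse?{query}"
```

A URL without a #map= fragment raises ValueError, which a GUI could surface as an input error instead of writing an empty file.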

Contributing

Contributions are welcome! If you encounter issues or have ideas for improvement, please open an issue or submit a pull request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Download files

  • Source distribution: pydatascraper-1.0.0.tar.gz (3.0 kB)
  • Built distribution: pydatascraper-1.0.0-py3-none-any.whl (3.2 kB, Python 3)
