Powerful web scraping tool.

Project description



PyScrappy: powerful Python data scraping toolkit

What is it?

PyScrappy is a Python package that provides a fast, flexible, and exhaustive way to scrape data from various sources. Being an easy and intuitive library, it aims to be the fundamental high-level building block for scraping data in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data scraping tool available.

Main Features

Here are just a few of the things that PyScrappy does well:

  • Easy scraping of data available on the internet.
  • Returns a DataFrame for further analysis and research.
  • Automatic data scraping: other than a few user-supplied parameters, the whole scraping process is automatic.
  • Powerful and flexible.
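
The "scrape, then return tabular data" idea in the list above can be sketched as follows. This is not PyScrappy's own API: `TableRowParser`, `html_table_to_records`, and `SAMPLE_HTML` are hypothetical names made up for illustration, and the sketch uses only the standard library `html.parser` so that it runs without a browser or extra installs. The resulting list of dicts is a shape that `pandas.DataFrame` accepts directly.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def html_table_to_records(html):
    """Return one dict per data row, keyed by the first (header) row."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Hypothetical sample page, standing in for a fetched document.
SAMPLE_HTML = """
<table>
  <tr><th>title</th><th>price</th></tr>
  <tr><td>Notebook</td><td>4.50</td></tr>
  <tr><td>Pencil</td><td>0.99</td></tr>
</table>
"""

records = html_table_to_records(SAMPLE_HTML)
print(records)
```

With pandas installed, `pandas.DataFrame(records)` would turn these records into a DataFrame for further analysis.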

Where to get it

The source code is currently hosted on GitHub at: https://github.com/mldsveda/PyScrappy

Binary installers for the latest released version are available at the Python Package Index (PyPI).

pip install PyScrappy
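
To confirm that the install succeeded, one option is to ask Python for the installed distribution's version. This is a minimal sketch using the standard library `importlib.metadata` (Python 3.8 or newer); the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version("PyScrappy"))
```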

Dependencies

  • selenium - Selenium is a free (open-source) automated testing framework used to validate web applications across different browsers and platforms.
  • webdriver-manager - WebDriverManager is an API that automates the handling of the driver executables (chromedriver.exe, geckodriver.exe, etc.) required by the Selenium WebDriver API.
  • beautifulsoup4 - Beautiful Soup is a Python library for getting data out of HTML, XML, and other markup languages.
  • pandas - Pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.
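
Two of these packages are imported under a different name than the one used to install them (`beautifulsoup4` imports as `bs4`, `webdriver-manager` as `webdriver_manager`). The sketch below, with the hypothetical helper `missing_dependencies`, checks which of the listed dependencies are importable in the current environment without actually importing them.

```python
from importlib.util import find_spec

# Install name (as used with pip) mapped to the top-level import name.
DEPENDENCY_MODULES = {
    "selenium": "selenium",
    "webdriver-manager": "webdriver_manager",
    "beautifulsoup4": "bs4",
    "pandas": "pandas",
}


def missing_dependencies(modules=DEPENDENCY_MODULES):
    """Return the install names whose import module cannot be found."""
    return [dist for dist, module in modules.items() if find_spec(module) is None]


# An empty list means every dependency is available.
print(missing_dependencies())
```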

License

MIT

Getting Help

For usage questions, the best place to go is Stack Overflow. General questions and discussions can also take place on GitHub in this repository.

Discussion and Development

Most development discussions take place on GitHub in this repository.

Also visit the official documentation of PyScrappy for more information.

Contributing to PyScrappy

All contributions, bug reports, bug fixes, documentation improvements, enhancements, and ideas are welcome.

If you are simply looking to start working with the PyScrappy codebase, navigate to the GitHub "issues" tab and start looking through interesting issues.

End Notes

Learn more about this package on Medium.

This package is made solely for educational and research purposes.

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

PyScrappy-0.1.1.tar.gz (17.8 kB)

Uploaded Source

Built Distribution

PyScrappy-0.1.1-py3-none-any.whl (25.9 kB)

Uploaded Python 3

File details

Details for the file PyScrappy-0.1.1.tar.gz.

File metadata

  • Download URL: PyScrappy-0.1.1.tar.gz
  • Upload date:
  • Size: 17.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/32.0 requests/2.27.1 requests-toolbelt/0.9.1 urllib3/1.26.8 tqdm/4.62.3 importlib-metadata/4.11.1 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.9.10

File hashes

Hashes for PyScrappy-0.1.1.tar.gz

Algorithm     Hash digest
SHA256        c2b681e80079ea644c95541ab59e331717b2fe54ec842899a1024489720c397b
MD5           0da96ec7ccaeb4fe9fecae331f06be0c
BLAKE2b-256   731be8dac9b9fa3b7a1d7fcf336064c2a11ae8cfc86200a5ede8035559f12782

File details

Details for the file PyScrappy-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: PyScrappy-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 25.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/32.0 requests/2.27.1 requests-toolbelt/0.9.1 urllib3/1.26.8 tqdm/4.62.3 importlib-metadata/4.11.1 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.9.10

File hashes

Hashes for PyScrappy-0.1.1-py3-none-any.whl

Algorithm     Hash digest
SHA256        e60b3d62e1301a33d96d0c56892b03d4fcd103744163b2685e0086d9fb0bb939
MD5           da73df65f44d3b4e2729454e94ef66ed
BLAKE2b-256   31868e9e57ff50c3f03849cd72d6ff5a9450f2d02eac83e852b365a40514751b
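
A downloaded file can be checked against the SHA256 digests listed above with the standard library `hashlib`. This is a minimal sketch: the helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the local file paths are assumed to be wherever the files were saved.

```python
import hashlib

# SHA256 digests copied from the file details above.
PUBLISHED_SHA256 = {
    "PyScrappy-0.1.1.tar.gz":
        "c2b681e80079ea644c95541ab59e331717b2fe54ec842899a1024489720c397b",
    "PyScrappy-0.1.1-py3-none-any.whl":
        "e60b3d62e1301a33d96d0c56892b03d4fcd103744163b2685e0086d9fb0bb939",
}


def sha256_of_file(path, chunk_size=65536):
    """Hash the file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 equals the expected hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("PyScrappy-0.1.1.tar.gz", PUBLISHED_SHA256["PyScrappy-0.1.1.tar.gz"])` should be true for an intact download saved under that name in the current directory.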
