All in one scraping library

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contributing
  6. License
  7. Contact
  8. Acknowledgements

About The Project

Scrapera provides a collection of scraper scripts for the most commonly used machine learning and data science domains, mainly scrapers for:

  • Images
  • Text
  • Audio
  • Videos

The main aim of this package is to group common scraping tasks in one place, so that ML researchers and engineers can focus on their models rather than on the data collection process.

    DISCLAIMER: Owner or Contributors do not take any responsibility for misuse of data obtained through Scrapera. Contact the owner if copyright terms are violated due to any module provided by Scrapera.

    Prerequisites

    All prerequisites can be installed separately through the requirements.txt file as below

    pip install -r requirements.txt
    

    Installation

    Scrapera is built with Python 3 and can be installed directly with pip

    pip install scrapera
    

    Alternatively, if you wish to install the latest version directly through GitHub then run

    pip install git+https://github.com/DarshanDeshpande/Scrapera.git
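
    After either install command, it can be useful to confirm which version ended up in the active environment. The following is a minimal sketch using only the Python standard library; the helper name `installed_version` is hypothetical and is not part of Scrapera, and the distribution name `scrapera` is taken from the pip command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "scrapera" is the distribution name used in the pip command above.
    version = installed_version("scrapera")
    print(version if version else "Scrapera is not installed in this environment")
```

    If the script reports that nothing is installed, the pip command was most likely run against a different interpreter or virtual environment.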
    

    Usage

    To use any sub-module, you just need to import, instantiate and execute

    from Scrapera.Video import VimeoScraper
    scraper = VimeoScraper(out_path='path/to/output/directory')
    scraper.scrape(url='https://vimeo.com/191955190', quality='720p')
    

    For more examples, please refer to the test folders in the respective modules.
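
    When several URLs need to be scraped, the import, instantiate and execute pattern can be wrapped in a small loop that keeps going if one download fails. The sketch below is not part of Scrapera: `scrape_many` and `vimeo_example` are hypothetical helpers, and the only Scrapera calls used are the ones shown in the usage example (`VimeoScraper(out_path=...)` and `scrape(url=..., quality=...)`). Creating the output directory in advance is an assumption; the library may or may not require it.

```python
from pathlib import Path
from typing import Dict, Iterable, Optional


def scrape_many(scraper, urls: Iterable[str], quality: str = "720p") -> Dict[str, Optional[str]]:
    """Call scraper.scrape() for each URL and map URL -> error message (None on success)."""
    results: Dict[str, Optional[str]] = {}
    for url in urls:
        try:
            scraper.scrape(url=url, quality=quality)
            results[url] = None
        except Exception as exc:  # record the failure and continue with the next URL
            results[url] = f"{type(exc).__name__}: {exc}"
    return results


def vimeo_example() -> Dict[str, Optional[str]]:
    """Run the usage example through scrape_many (requires Scrapera to be installed)."""
    # Import path and constructor argument are taken from the usage example.
    from Scrapera.Video import VimeoScraper

    out_dir = Path("path/to/output/directory")
    # Assumption: the output directory may need to exist before scraping.
    out_dir.mkdir(parents=True, exist_ok=True)
    scraper = VimeoScraper(out_path=str(out_dir))
    return scrape_many(scraper, ["https://vimeo.com/191955190"], quality="720p")
```

    Calling `vimeo_example()` returns a dictionary keyed by URL, so failed downloads can be retried or logged separately. Because `scrape_many` only relies on a `scrape(url=..., quality=...)` method, it should work with other scrapers that accept the same keyword arguments, though that has not been checked against every module.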

    Roadmap

    Known issues
      The Instagram Comments Scraper needs updating due to GraphQL changes

    Contributing

    Scrapera welcomes any and all contributions and scraper requests. Feel free to fork the repository and add your own scrapers to help the community!

    License

    Distributed under the MIT License. See LICENSE for more information.

    Contact

    Feel free to reach out for any issues or requests related to Scrapera

    Darshan Deshpande (Owner) - Email | LinkedIn

    Acknowledgements
