
A lightweight, JavaScript-aware, headless web scraping library for Python

JavaScript frameworks are all the rage now. Unfortunately, that means some of my favourite tools, wget and curl, are no longer up to the task: they fetch the raw HTML without running the scripts that render the page. Loading a page in the browser, opening the source, and copying it out is just too cumbersome.

I initially thought of creating a browser extension to help with the situation. Unfortunately, an extension would never have the seamless feel of wget or curl for the unfortunate ones like me who are in a relationship with the terminal. An extension also cannot be used in those quick and dirty bash/perl/python scripts.

Hence pyscrape and its HTTP sibling pyrun. Hope they help you navigate the unwieldy world of JavaScript rendering :)

## Installation

pip install pyscrape

OR

  1. Clone https://github.com/animeshkundu/pyscrape

  2. pip install -r requirements.txt

  3. python setup.py install

## Test

  1. pyscrape http://www.google.co.in/

  2. pyrun -p 1234; curl localhost:1234/scrape?url=http://www.google.co.in/
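For those quick and dirty Python scripts, the curl call to pyrun can be sketched with the standard library alone. This is a minimal sketch, not part of the package: the helper names `build_scrape_url` and `fetch_rendered` are made up for illustration, and it assumes pyrun decodes a percent-encoded `url` parameter and returns the rendered page as the response body. It is written for Python 3 and runs as a separate client, so it does not depend on the interpreter pyrun itself uses.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_scrape_url(target, host="localhost", port=1234):
    """Return the pyrun /scrape URL for `target` (hypothetical helper).

    The target is percent-encoded so that any '?' or '&' inside it
    is not mistaken for part of the outer query string.
    """
    query = urlencode({"url": target})
    return "http://{}:{}/scrape?{}".format(host, port, query)


def fetch_rendered(target, host="localhost", port=1234, timeout=60):
    """Ask a running pyrun server to render `target` (hypothetical helper).

    Assumption: the response body is the rendered page; it is decoded
    as UTF-8 here, with undecodable bytes replaced.
    """
    with urlopen(build_scrape_url(target, host, port), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

With `pyrun -p 1234` already running, `print(fetch_rendered("http://www.google.co.in/"))` would mirror the curl command above. Encoding the target matters mostly for pages that carry their own query string, such as `http://example.com/?a=1&b=2`.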

Improvements are welcome.
