Web scraping framework

These details have not been verified by PyPI

Project links

homepage

Project description

Grab

Update (2025)

Since 2018, the year of the most recent Grab release, I have tried to do a large refactoring of the Grab code base. It ended up as a semi-working product that nobody uses, including me. I have decided to reset all project files to the state of the most recent PyPI release, 0.6.41, dated June 2018. At least now the code base corresponds to the live version of the product, which, according to PyPI stats, some people still use.

I've updated the code base of Grab and of its dependencies to be compatible with Python 2.7 and Python 3.13 (and, hopefully, all Python versions between these two). I have set up a GitHub Action to run all tests on Python 2.7 and Python 3.13.

There are NO new features. It is just an updated code base that is alive again: it runs on Python 2.7 and on modern Python, its tests pass, and it has a GitHub CI config that runs the tests on new commits.

One backward-incompatible change is that I no longer use the weblib.error::DataNotFound and weblib.error::ResponseNotValid exceptions. Grab now uses the DataNotFound and InvalidResponseError exceptions, which are stored in the grab.errors module. So, if your code imports DataNotFound or ResponseNotValid from weblib, you should fix those imports. Also, if your code explicitly catches these weblib exceptions, you should change it to catch the new Grab exceptions instead.
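Code that has to run against both the old and the new releases could resolve the exception class at import time instead of hard-coding one import path. The following is only a sketch under stated assumptions, not part of Grab: the helper first_importable and the placeholder class are hypothetical names, and the candidate module paths are guesses based on the module names mentioned above, so check them against the release you actually have installed.

```python
import importlib


def first_importable(candidates, default=None):
    """Return the first attribute that can be imported from `candidates`.

    `candidates` is a sequence of (module_name, attribute_name) pairs,
    tried in order. `default` is returned when none of them resolves.
    """
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # Module (or one of its own dependencies) is not available.
            continue
        if hasattr(module, attr_name):
            return getattr(module, attr_name)
    return default


class _MissingDataNotFound(Exception):
    """Placeholder used only when none of the candidates can be imported."""


# Assumed module paths: adjust to whatever your installed release provides.
DataNotFound = first_importable(
    [
        ("grab.errors", "DataNotFound"),
        ("grab.error", "DataNotFound"),
        ("weblib.error", "DataNotFound"),
    ],
    default=_MissingDataNotFound,
)
```

The rest of the code can then catch DataNotFound without caring which package supplied it.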

The major version of the new release is 1. If you use Grab in your project and want to stay on the old release to be sure there are no backward-compatibility bugs, use this specification in your requirements file: grab<1.0

Support

You are welcome to talk about web scraping and data processing in these Telegram chat groups: @grablab (English) and @grablab_ru (Russian)

Documentation: https://grab.readthedocs.io/en/stable/

Installation

Run pip install -U grab

See details about installing Grab on different platforms here: https://grab.readthedocs.io/en/stable/usage/installation.html

What is Grab?

Grab is a Python web scraping framework. It provides a number of helpful methods to perform network requests, scrape web sites, and process the scraped content:

Automatic cookies (session) support
HTTP and SOCKS proxy with/without authorization
Keep-Alive support
IDN support
Tools to work with web forms
Easy multipart file uploading
Flexible customization of HTTP requests
Automatic charset detection
Powerful API to extract data from the DOM tree of HTML documents with XPath queries
Asynchronous API to make thousands of simultaneous queries. This part of the library is called Spider. See the list of Spider features below.
Python 3 ready

Spider is a framework for writing web-site scrapers. Features:

Rules and conventions to organize the request/parse logic in separate blocks of code
Multiple parallel network requests
Automatic processing of network errors (failed tasks go back to the task queue)
You can create network requests and parse responses with the Grab API (see above)
HTTP proxy support
Caching network results in permanent storage
Different backends for task queue (in-memory, redis, mongodb)
Tools to debug and collect statistics
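The "failed tasks go back to the task queue" behaviour from the list above can be pictured with a small standard-library sketch. This illustrates the general idea only and is not Grab's implementation: run_with_retries, flaky_fetch, and the max_attempts limit are all hypothetical names chosen for this example, and the example.com URLs are placeholders.

```python
from collections import deque


def run_with_retries(urls, fetch, max_attempts=3):
    """Process `urls` in FIFO order, re-queueing any URL whose fetch fails.

    A URL is given up on after `max_attempts` failed attempts.
    Returns (results, failed): a dict of url -> body and a list of dropped URLs.
    """
    queue = deque((url, 1) for url in urls)
    results, failed = {}, []
    while queue:
        url, attempt = queue.popleft()
        try:
            results[url] = fetch(url)
        except OSError:
            if attempt < max_attempts:
                # Put the task back at the end of the queue for another try.
                queue.append((url, attempt + 1))
            else:
                failed.append(url)
    return results, failed


# Stand-in fetcher: fails the first time it sees a URL, then succeeds.
_seen = set()


def flaky_fetch(url):
    if url not in _seen:
        _seen.add(url)
        raise OSError("simulated network error")
    return "body of " + url


results, failed = run_with_retries(
    ["http://example.com/a", "http://example.com/b"], flaky_fetch
)
```

Here both URLs fail once, return to the queue, and succeed on the second attempt, so failed ends up empty.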

Grab Example

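import logging

from grab import Grab

# Log request and response details while the example runs
logging.basicConfig(level=logging.DEBUG)

g = Grab()

# Open the login page, fill in the form fields and submit the form
g.go('https://github.com/login')
g.doc.set_input('login', '****')
g.doc.set_input('password', '****')
g.doc.submit()

# Save the response body to a file for inspection
g.doc.save('/tmp/x.html')

# Check that the sign-out button is present, i.e. the login succeeded
g.doc('//ul[@id="user-links"]//button[contains(@class, "signout")]').assert_exists()

# Build the URL of the repositories tab from the profile link
home_url = g.doc('//a[contains(@class, "header-nav-link name")]/@href').text()
repo_url = home_url + '?tab=repositories'

g.go(repo_url)

# Print the name and the absolute URL of each repository
for elem in g.doc.select('//h3[@class="repo-list-name"]/a'):
    print('%s: %s' % (elem.text(),
                      g.make_url_absolute(elem.attr('href'))))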

Grab::Spider Example

import logging

from grab.spider import Spider, Task

# Log spider activity while the example runs
logging.basicConfig(level=logging.DEBUG)


class ExampleSpider(Spider):
    def task_generator(self):
        # Yield one 'search' task per language
        for lang in 'python', 'ruby', 'perl':
            url = 'https://www.google.com/search?q=%s' % lang
            yield Task('search', url=url, lang=lang)

    def task_search(self, grab, task):
        # Handles the response of each 'search' task
        print('%s: %s' % (task.lang,
                          grab.doc('//div[@class="s"]//cite').text()))


# Run the spider with two worker threads
bot = ExampleSpider(thread_number=2)
bot.run()
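In the example above, a Task named 'search' is handled by the method task_search. That naming convention can be pictured with the following standalone sketch. It is a simplified illustration of name-based dispatch, not Grab's actual code: MiniTask, MiniSpider, Demo, and the handle method are hypothetical names, and a plain string stands in for the response object.

```python
class MiniTask(object):
    """A task is a name plus arbitrary keyword attributes."""

    def __init__(self, name, **kwargs):
        self.name = name
        for key, value in kwargs.items():
            setattr(self, key, value)


class MiniSpider(object):
    def handle(self, task, response):
        # Look up the handler method by the task name: 'search' -> task_search
        handler = getattr(self, "task_" + task.name, None)
        if handler is None:
            raise LookupError("no handler for task %r" % task.name)
        return handler(response, task)


class Demo(MiniSpider):
    def task_search(self, response, task):
        return "%s: %d bytes" % (task.lang, len(response))


out = Demo().handle(MiniTask("search", lang="python"), "<html></html>")
```

Keeping one handler per task name is what lets the request logic (task_generator) and the parse logic (task_search) live in separate blocks of code.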

Project details

Release history

Version	Release date
1.2.0 (this version)	Sep 18, 2025
1.1.0	Sep 17, 2025
1.0.1	Sep 15, 2025
1.0.0	Sep 15, 2025
0.6.41	Jun 24, 2018
0.6.40	May 14, 2018
0.6.39	May 10, 2018
0.6.38	May 17, 2017
0.6.37	May 13, 2017
0.6.36	May 13, 2017
0.6.35	Feb 6, 2017
0.6.34	Feb 4, 2017
0.6.33	Jan 27, 2017
0.6.32	Dec 31, 2016
0.6.31	Dec 31, 2016
0.6.30	Nov 22, 2015
0.6.29	Oct 15, 2015
0.6.28	Oct 13, 2015
0.6.27	Oct 13, 2015
0.6.26	Oct 9, 2015
0.6.25	Sep 20, 2015
0.6.24	Sep 9, 2015
0.6.23	Aug 27, 2015
0.6.22	Aug 14, 2015
0.6.21	Jun 20, 2015
0.6.20	Jun 8, 2015
0.6.19	Jun 5, 2015
0.6.18	Jun 5, 2015
0.6.17	Jun 5, 2015
0.6.16	Jun 3, 2015
0.6.15	May 31, 2015
0.6.14	May 18, 2015
0.6.13	May 12, 2015
0.6.12	May 7, 2015
0.6.11	May 7, 2015
0.6.10	Apr 30, 2015
0.6.9	Apr 29, 2015
0.6.8	Apr 26, 2015
0.6.7	Apr 26, 2015
0.6.6	Apr 23, 2015
0.6.5	Apr 16, 2015
0.6.4	Apr 12, 2015
0.6.3	Apr 10, 2015
0.6.2	Apr 9, 2015
0.6.1	Apr 8, 2015
0.6.0	Apr 6, 2015
0.5.5	Mar 27, 2015
0.5.4	Mar 7, 2015
0.5.3	Mar 7, 2015
0.5.2	Feb 22, 2015
0.5.1	Feb 16, 2015
0.5.0	Feb 9, 2015
0.4.13	Sep 12, 2013
0.4.12	Jul 25, 2013
0.4.11	Jun 7, 2013
0.4.10	May 1, 2013
0.4.9	Apr 27, 2013
0.4.8	Nov 18, 2012
0.4.7	Aug 31, 2012
0.4.5	Jun 27, 2012
0.4.4	Jun 21, 2012
0.4.3	Jun 10, 2012
0.4.2	May 16, 2012
0.4.1	Apr 28, 2012
0.4.0	Apr 27, 2012
0.3.33	Apr 13, 2012
0.3.32	Apr 5, 2012
0.3.31	Mar 30, 2012
0.3.30	Mar 27, 2012
0.3.29	Mar 7, 2012
0.3.28	Mar 6, 2012
0.3.27	Mar 6, 2012
0.3.26	Mar 5, 2012
0.3.25	Mar 1, 2012
0.3.24	Feb 21, 2012
0.3.23	Jan 26, 2012
0.3.22	Jan 16, 2012
0.3.21	Jan 6, 2012
0.3.20	Dec 31, 2011
0.3.19	Dec 25, 2011
0.3.18	Dec 20, 2011
0.3.17	Dec 18, 2011
0.3.16	Dec 7, 2011
0.3.15	Dec 2, 2011
0.3.14	Nov 24, 2011
0.3.13	Nov 22, 2011
0.3.12	Nov 14, 2011
0.3.11	Nov 9, 2011
0.3.10	Nov 6, 2011
0.3.9	Nov 6, 2011
0.3.8	Nov 5, 2011
0.3.7	Nov 5, 2011
0.3.6	Nov 4, 2011
0.3.4	Oct 26, 2011
0.3.3	Oct 23, 2011
0.3.2	Oct 3, 2011
0.3.1	Sep 23, 2011
0.3	Sep 2, 2011
0.2.20	Aug 21, 2011
0.2.19	Aug 14, 2011
0.2.18	Jul 31, 2011
0.2.17	Jul 31, 2011
0.2.16	Jul 23, 2011
0.2.15	Jul 23, 2011
0.2.12	Jun 17, 2011
0.2.11	Jun 13, 2011
0.2.10	May 17, 2011
0.2.9	May 11, 2011
0.2.8	May 5, 2011
0.2.7	May 5, 2011
0.2.6	Mar 23, 2011
0.2.5	Dec 5, 2010
0.2.4	Dec 5, 2010
0.2.3	Nov 10, 2010
0.2.2	Nov 8, 2010
0.2.1	Nov 1, 2010
0.2.0	Nov 1, 2010
0.1.7	Sep 12, 2010
0.1.6	Sep 8, 2010
0.1.5	Sep 8, 2010
0.1.4	Sep 4, 2010
0.1.3	Sep 4, 2010
0.1.2	Sep 3, 2010
0.1.1	Aug 14, 2010

Download files

Download the file for your platform.

Source Distribution

grab-1.2.0.tar.gz (4.3 MB)

Uploaded Sep 18, 2025 Source

Built Distribution

grab-1.2.0-py2.py3-none-any.whl (91.5 kB)

Uploaded Sep 18, 2025 Python 2, Python 3

File details

Details for the file grab-1.2.0.tar.gz.

File metadata

Download URL: grab-1.2.0.tar.gz
Upload date: Sep 18, 2025
Size: 4.3 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.2

File hashes

Hashes for grab-1.2.0.tar.gz
Algorithm	Hash digest
SHA256	`0c9159328007dbb4cbccbf3c9ad63d3acd36d0c33aa30a6e7ca31bae43d32203`
MD5	`6fc5205607f24d424aa8982b99e8d2a2`
BLAKE2b-256	`dc6032186e18b5e4219324ac9e7b28bcd23c4fff1b109627c276317aa5ee16a1`

File details

Details for the file grab-1.2.0-py2.py3-none-any.whl.

File metadata

Download URL: grab-1.2.0-py2.py3-none-any.whl
Upload date: Sep 18, 2025
Size: 91.5 kB
Tags: Python 2, Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.2

File hashes

Hashes for grab-1.2.0-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`adc0a1214054781fe67a4be0f8e7a2c2fa63ff18592f1a4da90c1d8c3dae7c5c`
MD5	`89fe56f55e81a6d063e6fa8e4dac0d6a`
BLAKE2b-256	`39e5ac8e44aa6f5ef85dcc8626b40c73897985e546323938753bb67dabe8f75a`
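A downloaded file can be checked against the SHA256 digests listed above with Python's standard hashlib module. This is a minimal sketch: the function names sha256_of_file and matches_published_hash are hypothetical, and the demo at the end hashes a throwaway temporary file so the block runs without downloading anything.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=1 << 16):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare the file's digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Demo on a temporary file with known content
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
demo_ok = matches_published_hash(tmp.name, hashlib.sha256(b"abc").hexdigest())
os.remove(tmp.name)
```

For the source distribution, the call would look like matches_published_hash("grab-1.2.0.tar.gz", "0c9159328007dbb4cbccbf3c9ad63d3acd36d0c33aa30a6e7ca31bae43d32203"), assuming the file was saved under that name in the current directory.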
