Site Scraping Framework

Project description

Grab is a Python site scraping framework. It provides a high-level interface to two libraries: lxml and pycurl. There are two ways to use Grab:

1) Use Grab to configure network requests and to process the fetched documents. In this mode you control the flow of your program manually.

2) Use Grab::Spider to build asynchronous site scrapers. This is similar to how Scrapy works.

Example of Grab usage:

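from grab import Grab

g = Grab()
# Open the login page and fill in the form fields.
g.go('https://github.com/login')
g.set_input('login', 'lorien')
g.set_input('password', '***')
g.submit()
# Print the text and href of every repository link on the resulting page.
for elem in g.doc.select('//ul[@id="repo_listing"]/li/a'):
    print('%s: %s' % (elem.text(), elem.attr('href')))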
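
The selection step above is an XPath query over the parsed document. The following minimal sketch shows the same idea without any network access. It is not Grab's API: it uses the standard library's ElementTree (which supports only a subset of XPath) instead of lxml, and the markup, link names, and the `extract_links` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for a fetched page.
PAGE = """
<html><body>
  <ul id="repo_listing">
    <li><a href="/example/project-a">project-a</a></li>
    <li><a href="/example/project-b">project-b</a></li>
  </ul>
</body></html>
"""


def extract_links(markup):
    """Return (text, href) pairs for links inside the repo_listing list."""
    root = ET.fromstring(markup)
    # ElementTree paths are relative to the root element, hence the leading dot.
    links = root.findall(".//ul[@id='repo_listing']/li/a")
    return [(a.text, a.get("href")) for a in links]


for text, href in extract_links(PAGE):
    print("%s: %s" % (text, href))
```

Real pages are rarely well-formed XML, which is one reason a lenient HTML parser such as lxml is used in practice.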

Example of Grab::Spider usage:

from grab.spider import Spider, Task
import logging

class ExampleSpider(Spider):
    def task_generator(self):
        for lang in ('python', 'ruby', 'perl'):
            url = 'https://www.google.com/search?q=%s' % lang
            yield Task('search', url=url)

    def task_search(self, grab, task):
        print(grab.doc.select('//div[@class="s"]//cite').text())


logging.basicConfig(level=logging.DEBUG)
bot = ExampleSpider()
bot.run()
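
In the spider example, a task named 'search' is handled by the method `task_search`. The sketch below illustrates that naming convention with a toy, synchronous dispatcher. It is an assumption-laden illustration, not Grab's implementation: `ToySpider`, `LangSpider`, and this `Task` class are hypothetical stand-ins, no network requests are made, and the handler receives only the task (the real handler also receives a `grab` object).

```python
from collections import deque


class Task(object):
    """Toy stand-in for a spider task: a name plus arbitrary attributes."""

    def __init__(self, name, **kwargs):
        self.name = name
        self.__dict__.update(kwargs)


class ToySpider(object):
    """Hypothetical synchronous dispatcher; not Grab's implementation."""

    def task_generator(self):
        return iter(())

    def run(self):
        queue = deque(self.task_generator())
        while queue:
            task = queue.popleft()
            # Look up the handler by the 'task_<name>' convention.
            handler = getattr(self, 'task_' + task.name)
            handler(task)


class LangSpider(ToySpider):
    def __init__(self):
        self.seen = []

    def task_generator(self):
        for lang in ('python', 'ruby', 'perl'):
            url = 'https://www.google.com/search?q=%s' % lang
            yield Task('search', url=url)

    def task_search(self, task):
        # Record the URL instead of fetching it.
        self.seen.append(task.url)


bot = LangSpider()
bot.run()
print(bot.seen)
```

Looking handlers up by name keeps each kind of page in its own method, so adding a new task type only requires adding a new `task_<name>` method.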

Installation

Pip is the recommended way to install Grab and its dependencies:

$ pip install lxml
$ pip install pycurl
$ pip install grab
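
After installation, a quick way to confirm that the three packages are importable is a small check like the one below. This is only a sketch: `check_modules` is a hypothetical helper, not part of Grab, and it reports whether each module imports, not whether its version is compatible.

```python
import importlib


def check_modules(names=('lxml', 'pycurl', 'grab')):
    """Map each module name to True if it imports, False otherwise."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status


for name, ok in sorted(check_modules().items()):
    print('%-8s %s' % (name, 'ok' if ok else 'missing'))
```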

Documentation

Russian docs: http://docs.grablib.org

English docs are in progress.

Discussion group (Russian or English): http://groups.google.com/group/python-grab/

Contribution

If you find a bug or want a new feature, please create a new issue on GitHub.

Download files

Filename: grab-0.4.13.tar.gz (149.4 kB)
File type: Source
Python version: None
Upload date: Sep 12, 2013