Ciur is a scraper layer based on a DSL for extracting data

Project description


Ciur is a scraper layer for use in code development.

Ciur is a library rather than a framework, because it has less black magic.

It moves all scraper-related code into a separate layer.

If you are annoyed by spaghetti code, SQL inside PHP, and inline CSS inside HTML, then you are probably also annoyed by XPath/CSS code inside a crawler.

Ciur gives you a taste of lasagna code by enforcing encapsulation of the scraping layer.

For more information visit the documentation.


Ciur uses its own DSL. Here is a small example of a query:

root `/html/body` +1
    name `.//h1/text()` +1
    paragraph `.//p/text()` +1

This command

$ ciur -p -r

produces the following JSON:

{
    "root": {
        "name": "Example Domain",
        "paragraph": "This domain is established to be used for illustrative
                      examples in documents. You may use this
                      domain in examples without prior coordination or
                      asking for permission."
    }
}
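To make the idea of a separate scraping layer concrete, here is a minimal sketch in plain Python of what the query above expresses. It is not Ciur's API or implementation: it uses only the standard library, the `DOCUMENT` string and the `RULES` and `extract` names are hypothetical, and it assumes that `+1` means exactly one match is expected. Because `xml.etree.ElementTree` supports only a subset of XPath, `text()` is replaced by reading the element's `.text`.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical inline document standing in for a fetched page.
DOCUMENT = """<html><body><div>
<h1>Example Domain</h1>
<p>This domain is for use in illustrative examples.</p>
</div></body></html>"""

# Extraction rules kept as data, apart from any crawling code.
# Each top-level rule has a path and a mapping of child rules.
RULES = {
    "root": ("./body", {"name": ".//h1", "paragraph": ".//p"}),
}


def find_exactly_one(node, path):
    """Return the single match for path, mirroring the assumed meaning of +1."""
    matches = node.findall(path)
    if len(matches) != 1:
        raise ValueError(f"expected 1 match for {path!r}, got {len(matches)}")
    return matches[0]


def extract(document, rules):
    """Apply the rules to the document and return a nested dict."""
    tree = ET.fromstring(document)
    result = {}
    for name, (path, children) in rules.items():
        parent = find_exactly_one(tree, path)
        result[name] = {
            child: find_exactly_one(parent, child_path).text
            for child, child_path in children.items()
        }
    return result


print(json.dumps(extract(DOCUMENT, RULES), indent=4))
```

The point of the sketch is the split: the crawler only calls `extract`, while every path lives in `RULES`, which is the role the DSL file plays in Ciur.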


pip install ciur

Install via docker

$ docker run -it python:3.9 bash
root@e4d327153f2f:/# pip install ciur
root@e4d327153f2f:/# ciur --help

root@e4d327153f2f:/# ciur --help
usage: ciur [-h] -p PARSE -r RULE [-w] [-v]

*Ciur is a scrapper layer based on DSL for extracting data*

*Ciur is a lib because it has less black magic than a framework*

If you are annoyed by `Spaghetti code` than we can taste `Lasagna code`
with help of Ciur

optional arguments:
  -h, --help            show this help message and exit
  -p PARSE, --parse PARSE
                        url or local file path required document for html, xml, pdf. (f.e. or /tmp/
  -r RULE, --rule RULE  url or local file path file with parsing dsl rule (f.e. /tmp/ or http:/host/
  -w, --ignore_warn     suppress python warning warnings and ciur warnings hints
  -v, --version         show program's version number and exit
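
If you want to call the command line tool from a Python script, a small wrapper along these lines may help. It is only a sketch built from the flags shown in the help text above: the function names `build_ciur_command` and `run_ciur` and the example paths are hypothetical, and it assumes that `ciur` is installed and on `PATH`.

```python
import shutil
import subprocess


def build_ciur_command(parse, rule, ignore_warn=False):
    """Build the argument list using only the flags listed in `ciur --help`."""
    command = ["ciur", "-p", parse, "-r", rule]
    if ignore_warn:
        command.append("-w")
    return command


def run_ciur(parse, rule, ignore_warn=False):
    """Run ciur and return whatever it writes to standard output."""
    if shutil.which("ciur") is None:
        raise RuntimeError("ciur is not installed or not on PATH")
    completed = subprocess.run(
        build_ciur_command(parse, rule, ignore_warn),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout


# Hypothetical local paths, used here only to show the resulting command.
print(build_ciur_command("/tmp/page.html", "/tmp/rules.ciur", ignore_warn=True))
```

Passing the arguments as a list avoids shell quoting problems when the document or rule location is a URL.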

Ciur uses the MIT License.

This means that code may be included in proprietary code without any additional restrictions.

Please see LICENSE.


Ciur was conceived in 2012, and its development continues.

All contributions are welcome and should be done via Bitbucket (Pull Request, Issues).

As a fallback (for example, if Bitbucket is unavailable), contributions can be sent by email to ciur[mail symbol]

Download files

Source Distribution: ciur-0.2.0.tar.gz (22.5 kB)

Built Distribution: ciur-0.2.0-py3-none-any.whl (26.3 kB)
