Combine XPath, CSS selectors, and JSONPath for web data extraction.

Quickstarts

Installation

Install the stable version from PyPI.

pip install data-extractor

Or install the latest version from GitHub.

pip install git+https://github.com/linw1995/data_extractor.git@master

Usage

from data_extractor import Field, Item, JSONExtractor


class Count(Item):
    followings = Field(JSONExtractor("countFollowings"))
    fans = Field(JSONExtractor("countFans"))


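class User(Item):
    # The attribute is called name_, but the output key is "name" (set via name=).
    name_ = Field(JSONExtractor("name"), name="name")
    # Falls back to 17 when "age" is missing from the input.
    age = Field(JSONExtractor("age"), default=17)
    # Nested item, extracted from the same user object.
    count = Count()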


assert User(JSONExtractor("data.users[*]"), is_many=True).extract(
    {
        "data": {
            "users": [
                {
                    "name": "john",
                    "age": 19,
                    "countFollowings": 14,
                    "countFans": 212,
                },
                {
                    "name": "jack",
                    "description": "",
                    "countFollowings": 54,
                    "countFans": 312,
                },
            ]
        }
    }
) == [
    {"name": "john", "age": 19, "count": {"followings": 14, "fans": 212}},
    {"name": "jack", "age": 17, "count": {"followings": 54, "fans": 312}},
]
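
To make the roles of `name=`, `default=`, and the nested `Count` item explicit, here is a plain-Python sketch that produces the same result as the example above for this input shape. It is not how the library is implemented; `extract_users` is a hypothetical helper written only for illustration.

```python
def extract_users(payload):
    """Hand-written equivalent of the User item above (illustration only)."""
    results = []
    # Corresponds to JSONExtractor("data.users[*]") with is_many=True.
    for user in payload["data"]["users"]:
        results.append(
            {
                # name_ field, emitted under the key "name".
                "name": user["name"],
                # age field: default=17 applies when the key is absent.
                "age": user.get("age", 17),
                # Nested Count item, read from the same user object.
                "count": {
                    "followings": user["countFollowings"],
                    "fans": user["countFans"],
                },
            }
        )
    return results
```

Declaring the same mapping with `Item` and `Field` keeps the extraction rules in one place instead of spreading them through loop bodies like this one.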

Changelog

v0.5.4

  • 9552c79 Fix: Simplified item’s extract_first method fails to raise ExtractError

  • 08167ab Fix: Simplified item’s extract_first method should support the default parameter

  • 6e4c269 New: More unit tests for the simplified items

  • a35b85a Chg: Update poetry.lock

  • e5ff37b Docs, Chg: Update travis-ci status source in the README.rst

Download files

Source Distribution

data_extractor-0.5.4.tar.gz (11.1 kB)

Uploaded Source

Built Distribution

data_extractor-0.5.4-py3-none-any.whl (12.9 kB)

Uploaded Python 3

File details

Details for the file data_extractor-0.5.4.tar.gz.

File metadata

  • Download URL: data_extractor-0.5.4.tar.gz
  • Size: 11.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/0.12.17 CPython/3.7.1 Linux/4.15.0-1028-gcp

File hashes

Hashes for data_extractor-0.5.4.tar.gz
Algorithm Hash digest
SHA256 8da1a0e25877c9a90ee8d7bdc18923756587c9078afff27a3361e82c3cd37627
MD5 b6f6fb8a844e503e4be9a5ade72b84b9
BLAKE2b-256 53daa2853c338776a010af5794f9ddc4823153a119a1c34955d547930e418d23

File details

Details for the file data_extractor-0.5.4-py3-none-any.whl.

File metadata

  • Download URL: data_extractor-0.5.4-py3-none-any.whl
  • Size: 12.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/0.12.17 CPython/3.7.1 Linux/4.15.0-1028-gcp

File hashes

Hashes for data_extractor-0.5.4-py3-none-any.whl
Algorithm Hash digest
SHA256 b046dafff32b252bf67dd39387484eca308b71c2cc77050314ec49dd636136e7
MD5 2b75d6047ceaa1c57d0906d50eba30c6
BLAKE2b-256 7e5c39064f290978b14eb8eeae4354f068e9f0e78c765feb39236cb749f3c350
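
One way to check a downloaded file against the SHA256 digests listed above is with Python's standard `hashlib` module. This is a minimal sketch; `sha256_matches` is a hypothetical helper name, and the file path is whatever location you saved the download to.

```python
import hashlib


def sha256_matches(path, expected_hex):
    """Return True if the file at `path` has the given SHA256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.lower()
```

Call it with the path of the downloaded archive or wheel and the SHA256 value listed for that file; a `False` result means the file does not match the published digest and should not be installed.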
