Combine XPath, CSS selectors, and JSONPath for web data extraction.
Quickstart
Installation
Install the stable version from PyPI.
pip install data-extractor
Or install the latest version from GitHub.
pip install git+https://github.com/linw1995/data_extractor.git@master
Usage
from data_extractor import Field, Item, JSONExtractor


class Count(Item):
    followings = Field(JSONExtractor("countFollowings"))
    fans = Field(JSONExtractor("countFans"))


class User(Item):
    name_ = Field(JSONExtractor("name"), name="name")
    age = Field(JSONExtractor("age"), default=17)
    count = Count()


assert User(JSONExtractor("data.users[*]"), is_many=True).extract(
    {
        "data": {
            "users": [
                {
                    "name": "john",
                    "age": 19,
                    "countFollowings": 14,
                    "countFans": 212,
                },
                {
                    "name": "jack",
                    "description": "",
                    "countFollowings": 54,
                    "countFans": 312,
                },
            ]
        }
    }
) == [
    {"name": "john", "age": 19, "count": {"followings": 14, "fans": 212}},
    {"name": "jack", "age": 17, "count": {"followings": 54, "fans": 312}},
]
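To make the result of the example above easier to read, here is a hand-written equivalent that uses only plain dict access. It is a sketch of what the declarative classes compute for this particular input, not the library's implementation. The helper names extract_user and extract_users are hypothetical and do not exist in data_extractor.

```python
# Plain-Python equivalent of the declarative example, for illustration only.
# extract_user and extract_users are hypothetical helpers, not library APIs.


def extract_user(raw: dict) -> dict:
    """Build one result record from a raw user mapping."""
    return {
        # Field(JSONExtractor("name"), name="name"):
        # the attribute is called name_, but the output key is "name".
        "name": raw["name"],
        # Field(JSONExtractor("age"), default=17):
        # fall back to 17 when the "age" key is absent.
        "age": raw.get("age", 17),
        # count = Count(): a nested item read from the same user mapping.
        "count": {
            "followings": raw["countFollowings"],
            "fans": raw["countFans"],
        },
    }


def extract_users(document: dict) -> list:
    """Mirror the "data.users[*]" path with is_many=True: one record per user."""
    return [extract_user(user) for user in document["data"]["users"]]


sample = {
    "data": {
        "users": [
            {"name": "john", "age": 19, "countFollowings": 14, "countFans": 212},
            {"name": "jack", "description": "", "countFollowings": 54, "countFans": 312},
        ]
    }
}

result = extract_users(sample)
```

The second user has no "age" key, so the default of 17 is used, and the unused "description" key is simply ignored. The declarative version expresses the same mapping without hand-written traversal code.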
Changelog
v0.6.0.dev2
b7edbae Dev,New:Use nox test in multi-py-versions, Update workflow
a043838 Fix:Can’t import JSONPathExtractor from root module
a23ece9 Test,Fix:Missing JSONPathExtractor in simple extractor tests
5903ff9 Dev,Fix:Nox changes symlink ‘.venv’ of virtualenv of development
57d03ad Dev,Fix:Install unneeded development dependencies
Source Distribution
data_extractor-0.6.0.dev2.tar.gz (12.2 kB)
Built Distribution
data_extractor-0.6.0.dev2-py3-none-any.whl
Hashes for data_extractor-0.6.0.dev2.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 78d32eada482d0fe7fd154e795275530d735aebdeacda9a45e60ebceaabd42c6
MD5 | 4945466ecacfd6f3614036c3da97a6e5
BLAKE2b-256 | cce366236c39c6247c58408de61e14d4394b553f9d5baffd250180660f86d327
Hashes for data_extractor-0.6.0.dev2-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 609606e390f97ce08d8ee02482564a7b06ffe6608613c7ce7b0c945e09808b31
MD5 | 5c5cd7782ac0cf70a740a1b7733cdab4
BLAKE2b-256 | 07a717dfaa788d27026eacf0ca2626beb70f633408a3b7461d6ef905b3824c1d