A library for parsing Jingdong (JD.com) pages.
Project description
Currently, only book pages are supported. All fields use the same encoding, 'utf-8'.
Category fields:
* name - Category name
* url - Category URL
* children - Subcategories
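Because each category carries its own subcategories in children, the result is a tree. The following is a minimal sketch of walking such a tree. It assumes each category is a plain dict with the three fields above and that children is a list of dicts of the same shape. The helper iter_categories and the sample data are hypothetical and not part of the library.

```python
def iter_categories(category, depth=0):
    """Yield (depth, name, url) for a category and all of its subcategories."""
    yield depth, category["name"], category["url"]
    for child in category.get("children", []):
        # Recurse into each subcategory, one level deeper.
        yield from iter_categories(child, depth + 1)


# Hypothetical sample data in the documented shape (not real parser output).
sample_category = {
    "name": "Books",
    "url": "https://example.com/books",
    "children": [
        {"name": "Fiction", "url": "https://example.com/books/fiction", "children": []},
        {
            "name": "Science",
            "url": "https://example.com/books/science",
            "children": [
                {
                    "name": "Physics",
                    "url": "https://example.com/books/science/physics",
                    "children": [],
                },
            ],
        },
    ],
}

# Print the tree with two spaces of indentation per level.
for depth, name, url in iter_categories(sample_category):
    print("  " * depth + name, url)
```

The generator uses yield from, so this sketch needs Python 3.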
Book list fields:
* links - A list of links in {'title': '', 'url': ''} format
* next_page_uri - Next page URI
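The next_page_uri field makes it possible to walk a listing page by page. The sketch below shows one way to do that. It assumes the last page reports an empty or missing next_page_uri, which the description above does not confirm. The function collect_links, the fetch_page callback, and the FAKE_PAGES data are hypothetical stand-ins for fetching a page and parsing it.

```python
def collect_links(first_uri, fetch_page):
    """Follow next_page_uri from page to page and gather every link.

    fetch_page is a caller-supplied function that takes a URI and returns
    a dict with the documented 'links' and 'next_page_uri' fields.
    """
    links = []
    seen = set()
    uri = first_uri
    # Stop on an empty URI, or on a URI already visited (guards against loops).
    while uri and uri not in seen:
        seen.add(uri)
        page = fetch_page(uri)
        links.extend(page["links"])
        uri = page.get("next_page_uri")
    return links


# Hypothetical parsed pages keyed by URI (not real parser output).
FAKE_PAGES = {
    "/list?page=1": {
        "links": [{"title": "Book A", "url": "/item/1"}],
        "next_page_uri": "/list?page=2",
    },
    "/list?page=2": {
        "links": [{"title": "Book B", "url": "/item/2"}],
        "next_page_uri": None,
    },
}

all_links = collect_links("/list?page=1", FAKE_PAGES.__getitem__)
for link in all_links:
    print(link["title"], link["url"])
```

In real use, fetch_page would download the page and hand its content to the book list parser.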
Book detail fields:
* title - Book title
* author - Book authors, delimited by comma
* images - Image url list
* detail - Detail key-value pairs
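Since author is a single comma-delimited string, callers will often want it as a list. A minimal sketch follows, assuming the delimiter is the ASCII comma and that the result is a plain dict with the fields above. The helper split_authors and the sample values are hypothetical.

```python
def split_authors(book_detail):
    """Split the comma-delimited 'author' field into a list of clean names."""
    return [
        name.strip()
        for name in book_detail["author"].split(",")
        if name.strip()  # drop empty entries left by stray commas
    ]


# Hypothetical sample data in the documented shape (not real parser output).
sample_detail = {
    "title": "An Example Book",
    "author": "Author One, Author Two",
    "images": ["https://example.com/img/1.jpg"],
    "detail": {"Publisher": "Example Press"},
}

print(split_authors(sample_detail))
# Each entry in 'detail' is one key-value pair taken from the page.
for key, value in sample_detail["detail"].items():
    print(key, "=", value)
```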
Content
jd_page_parser.category_parsers.BookCategoryParser
jd_page_parser.product_list_parsers.BookListParser
jd_page_parser.detail_parsers.BookDetailParser
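The three dotted paths above can be resolved to classes at runtime. The sketch below only imports them; it does not call any parser method, because the description does not document the parser interface. The helper load_parser is hypothetical and returns None when the package is not installed.

```python
import importlib

# Dotted paths taken from the list above.
PARSER_PATHS = [
    "jd_page_parser.category_parsers.BookCategoryParser",
    "jd_page_parser.product_list_parsers.BookListParser",
    "jd_page_parser.detail_parsers.BookDetailParser",
]


def load_parser(dotted_path):
    """Return the class named by dotted_path, or None if it cannot be imported."""
    module_name, _, class_name = dotted_path.rpartition(".")
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The package (or one of its dependencies) is not available.
        return None
    return getattr(module, class_name, None)


# Map each class name to the loaded class (or None if unavailable).
parsers = {path.rsplit(".", 1)[1]: load_parser(path) for path in PARSER_PATHS}
for name, cls in parsers.items():
    print(name, "available" if cls is not None else "not installed")
```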
Installation
The simplest way is to install it via pip:
pip install jd-page-parser
Run Tests
pip install -r requirements-dev.txt
tox
Built Distribution
File details
Details for the file jd_page_parser-0.0.1-py2.py3-none-any.whl.
File metadata
- Download URL: jd_page_parser-0.0.1-py2.py3-none-any.whl
- Upload date:
- Size: 8.1 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.12.1 pkginfo/1.4.2 requests/2.21.0 setuptools/38.5.0 requests-toolbelt/0.8.0 tqdm/4.28.1 CPython/3.4.3
File hashes
Algorithm | Hash digest
---|---
SHA256 | 75d5f0fb6e37a3681c9320e19056992189ba0844816f0a3defe46e222c6cd198
MD5 | 37b5e37605d84cb2dbea899837a5b550
BLAKE2b-256 | 6594a7822ac63ead685f43d28d10f711aa5f5597bb39c8b9797d7a5825e942f1