Scrape a website with a JSON template
# py-instant-crawl
A Python library for scraping websites by specifying a template in JSON.
## Installation
```
pip install pyinstantcrawl
```
## Quickstart
1. Create a template like the one below and save it as `sample.json`:
```json
{
  "tip-of-day": {
    "expression": "string(//div[@class='tip-of-day'])",
    "type": "xpath",
    "getter": "get"
  },
  "testimonial": {
    "expression": ".testimonial",
    "type": "css",
    "getter": "getall"
  }
}
```
2. Run the command below:
```shell
python main.py https://pragprog.com sample.json
```
Templates now also work with a parent + child structure. See the examples folder.
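
Before running the crawler, it can help to sanity-check a template file. The sketch below is not part of pyinstantcrawl; `check_template` and the sets of allowed values are hypothetical, inferred only from the flat sample template in step 1. It does not cover the parent + child layout from the examples folder.

```python
import json

# Assumption: these values are taken from the sample template only;
# the library itself may accept others.
KNOWN_TYPES = {"xpath", "css"}
KNOWN_GETTERS = {"get", "getall"}
REQUIRED_KEYS = {"expression", "type", "getter"}


def check_template(template):
    """Return a list of human-readable problems found in a flat template dict."""
    problems = []
    for field, rule in template.items():
        if not isinstance(rule, dict):
            problems.append(f"{field}: rule must be a JSON object")
            continue
        missing = REQUIRED_KEYS - rule.keys()
        if missing:
            problems.append(f"{field}: missing keys {sorted(missing)}")
            continue
        if rule["type"] not in KNOWN_TYPES:
            problems.append(f"{field}: unknown type {rule['type']!r}")
        if rule["getter"] not in KNOWN_GETTERS:
            problems.append(f"{field}: unknown getter {rule['getter']!r}")
    return problems


# Same content as sample.json from step 1, inlined to keep the sketch self-contained.
SAMPLE = """
{
  "tip-of-day": {
    "expression": "string(//div[@class='tip-of-day'])",
    "type": "xpath",
    "getter": "get"
  },
  "testimonial": {
    "expression": ".testimonial",
    "type": "css",
    "getter": "getall"
  }
}
"""

template = json.loads(SAMPLE)
problems = check_template(template)
print(problems or "template looks fine")
```

To check a file on disk instead, replace `json.loads(SAMPLE)` with `json.load(open("sample.json"))`.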