scrape a website with json template
Project description
# py-instant-crawl
python library for scrape websites by specifying template in json
## Installation
```
pip install pyinstantcrawl
```
## Quickstart
1. create the template like below and save is as `sample.json`
```
{
"tip-of-day": {
"expression": "string(//div[@class='tip-of-day'])",
"type": "xpath",
"getter": "get"
},
"testimonial": {
"expression": ".testimonial",
"type": "css",
"getter": "getall"
}
}
```
2. call the command below
```python
python main.py https://pragprog.com sample.json
```
now its work with parent + child structure. Check it at examples folder.
python library for scrape websites by specifying template in json
## Installation
```
pip install pyinstantcrawl
```
## Quickstart
1. create the template like below and save is as `sample.json`
```
{
"tip-of-day": {
"expression": "string(//div[@class='tip-of-day'])",
"type": "xpath",
"getter": "get"
},
"testimonial": {
"expression": ".testimonial",
"type": "css",
"getter": "getall"
}
}
```
2. call the command below
```python
python main.py https://pragprog.com sample.json
```
now its work with parent + child structure. Check it at examples folder.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
pyinstantcrawl-1.0.1.tar.gz
(2.6 kB
view hashes)
Built Distribution
Close
Hashes for pyinstantcrawl-1.0.1-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | e14bd1e0ffcf4dd68f85dd9dbee4b03422fcd563882ed3cb225981ad73e988dc |
|
MD5 | ccb3d5e663e8b319f18b95a32214e10d |
|
BLAKE2b-256 | 68fffc8d21ed9e7fe24a0e16c57b3b6c478f44bdf104191eaa2b177e44dbf4eb |