Collaborative AI for Web Scraping, Data Extraction, Crawling, and Knowledge Graphs
In.parse
=========
0.1.0
Open, collaborative, AI-driven parser builder for web scraping, data extraction, crawling, and knowledge graphs.
Installing
----------
Install and update using ``pip``:
`pip install -U inparse`
Parser Generator
-----------------
http://inparse.com
Motivation
----------
1. The most painful part of web data extraction is writing parser rules. Inparse tries to
generate the parser with AI from training web pages.
2. Commercial universal parsers work well statistically, but they failed in my case and are a black box to the user.
Inparse creates a parser for a specific website or web page category, and you can correct and improve it online yourself.
3. Creating a parser is open and free. Parser rules can be cached locally, without a remote server,
if that is a concern.
4. You are not charged by usage. The parser runs on your own CPU.
Example
===============================
```python
from inparse import Inparse

# First argument: parser number generated by the inparse.com parser builder.
# Second argument: access token, found on your user page.
p = Inparse('b45beddc', 'd50cb533f69b6a78892afbd093f95fc1')
d = p.parse_url('https://qz.com/india/1413291/trulymadly-ceo-on-how-dating-apps-like-bumble-india-must-localise/')
Inparse.pretty_print(d)
```
**Or parse raw HTML**
```python
import requests
from inparse import Inparse

# First argument: parser number generated by the inparse.com parser builder.
# Second argument: access token, found on your user page.
p = Inparse('b45beddc', 'd50cb533f69b6a78892afbd093f95fc1')

# Fetch the page yourself, then hand the raw HTML to the parser.
html = requests.get('https://qz.com/india/1413291/trulymadly-ceo-on-how-dating-apps-like-bumble-india-must-localise/').text
d = p.parse(html)
Inparse.pretty_print(d)
```
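When several pages have already been downloaded, the same `parse` call can be applied in a loop. The following is a minimal sketch, assuming only that the parser object exposes `parse(html)` as in the example above. The helper `parse_pages` and the `StubParser` class are hypothetical names introduced for illustration and are not part of inparse. This page does not say which exceptions inparse raises, so the sketch catches `Exception` broadly.

```python
from typing import Any, Dict, Tuple


def parse_pages(parser: Any, pages: Dict[str, str]) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Run parser.parse(html) over a {url: html} mapping.

    Returns (results, errors): parsed data keyed by URL, and error
    messages keyed by URL for pages that could not be parsed.
    """
    results: Dict[str, Any] = {}
    errors: Dict[str, str] = {}
    for url, html in pages.items():
        try:
            results[url] = parser.parse(html)
        except Exception as exc:  # assumption: inparse failure modes are not documented here
            errors[url] = str(exc)
    return results, errors


class StubParser:
    """Hypothetical stand-in for an Inparse instance, used only for this demo."""

    def parse(self, html: str) -> dict:
        if not html:
            raise ValueError("empty document")
        return {"length": len(html)}


results, errors = parse_pages(StubParser(), {"a": "<p>hi</p>", "b": ""})
print(results)
print(errors)
```

In real use, an `Inparse` instance would presumably be passed in place of `StubParser()`. Collecting errors per URL keeps one bad page from aborting the whole batch.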
Below is the output of the article data extraction:
```
{ 'article_body': '<div><p>Last week, American dating app <a '
'href="https://qz.com/india/1413051/priyanka-chopra-invests-in-dating-app-bumble-to-rival-tinder/">Bumble '
'It’s from smaller cities. And varied people are coming '
'from different backgrounds. So that’s really '
'encouraging.</p></div>',
'author': 'Kuwar Singh',
'publish_date': None,
'title': 'Young Indians are using dating apps for so much more than just '
'dating',
'top_image': [ 'https://cms.qz.com/wp-content/uploads/2018/10/AP_900509923043-e1538971405267.jpg?quality=75&strip=all&w=410&h=231']
}
```
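In the output above, `article_body` is an HTML fragment, not plain text. If plain text is needed, the standard library can strip the markup. This is a sketch that uses only Python's `html.parser`. The helper `article_text` and the `sample` dictionary are hypothetical, modeled on the keys shown in the output, and are not part of inparse.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML fragment and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def article_text(result: dict) -> str:
    """Return the 'article_body' of a parse result as whitespace-normalized text."""
    extractor = _TextExtractor()
    extractor.feed(result.get("article_body") or "")
    extractor.close()
    # Join the text nodes, then collapse runs of whitespace into single spaces.
    return " ".join("".join(extractor.chunks).split())


# Hypothetical result shaped like the output shown above.
sample = {
    "article_body": '<div><p>Last week, <a href="https://example.com/">an app</a> launched.</p></div>',
    "author": "Kuwar Singh",
    "publish_date": None,
}
print(article_text(sample))
```

Fields such as `publish_date` may be `None`, as in the output above, so downstream code should check for missing values before using them.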
More about Inparse
==================
Contributing
------------
You are welcome to port this SDK to Java, Go, or any other programming language.
Donate
------
Links
-----