muchu's utility module
== Install:
Install the prerequisites first (Windows, Python 2.7):
VCForPython27.msi (https://www.microsoft.com/en-us/download/details.aspx?id=44266)
lxml-3.5.0.win32-py2.7.exe (https://pypi.python.org/pypi/lxml/3.5.0)
Then install the package:
$ pip install crawlermaster
== Write csslist.json:
Example: techcrunch_news_csslist.json
{
    "rawdata-converter-name": "techcrunch_news",
    "local-html-file-pattern": {
        "strBasedir": "C:\\Users\\muchu\\Downloads",
        "strSuffixes": "news.html"
    },
    "csslist": [
        {
            "name": "techcrunch-news-title",
            "sampleUrl": "https://techcrunch.com/2016/06/02/apple-app-store-goes-down/",
            "cssRule": "header.page-title h1.tweet-title::text",
            "sampleAns": [
                "Apple App Store goes down"
            ],
            "ansType": "exact"
        },
        {
            "name": "techcrunch-news-tags",
            "sampleUrl": "https://techcrunch.com/2016/06/02/apple-app-store-goes-down/",
            "cssRule": "div.tags a.tag::text",
            "sampleAns": [
                "Apps",
                "Apple",
                "iCloud",
                "iTunes",
                "app-store"
            ],
            "ansType": "exact"
        },
        {
            "name": "techcrunch-news-pubtime",
            "sampleUrl": "https://techcrunch.com/2016/06/02/apple-app-store-goes-down/",
            "cssRule": "header.article-header div.byline time.timestamp::text",
            "sampleAns": [
                "4 hours ago"
            ],
            "ansType": "exist"
        }
    ]
}
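As a rough illustration of the file layout above, the following Python sketch builds the same structure as a dict, checks that each rule carries the keys seen in the example, and serializes it to JSON text. The helper name check_csslist and the idea that these keys are required are assumptions inferred from the example; they are not part of crawlermaster itself. It also shows why the Windows path appears with doubled backslashes once written as JSON.

```python
import json

# Keys that every rule in the example carries (assumed here to be required).
RULE_KEYS = {"name", "sampleUrl", "cssRule", "sampleAns", "ansType"}
TOP_KEYS = ("rawdata-converter-name", "local-html-file-pattern", "csslist")


def check_csslist(config):
    """Hypothetical helper: return a list of problems, empty if none found."""
    problems = []
    for key in TOP_KEYS:
        if key not in config:
            problems.append("missing top-level key: %s" % key)
    for index, rule in enumerate(config.get("csslist", [])):
        missing = sorted(RULE_KEYS - set(rule))
        if missing:
            problems.append("rule %d is missing: %s" % (index, ", ".join(missing)))
        if not isinstance(rule.get("sampleAns"), list):
            problems.append("rule %d: sampleAns should be a list" % index)
    return problems


config = {
    "rawdata-converter-name": "techcrunch_news",
    "local-html-file-pattern": {
        # A raw string keeps the single backslashes of the real path.
        "strBasedir": r"C:\Users\muchu\Downloads",
        "strSuffixes": "news.html",
    },
    "csslist": [
        {
            "name": "techcrunch-news-title",
            "sampleUrl": "https://techcrunch.com/2016/06/02/apple-app-store-goes-down/",
            "cssRule": "header.page-title h1.tweet-title::text",
            "sampleAns": ["Apple App Store goes down"],
            "ansType": "exact",
        }
    ],
}

problems = check_csslist(config)
# json.dumps escapes each backslash, producing the doubled form used in the file.
text = json.dumps(config, indent=4)
print(problems)
print(text)
```

The printed text can be saved under any file name, for example techcrunch_news_csslist.json as in the example above.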
== Run test:
$ crawlermaster csslist.json
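This README does not define the two ansType values. Judging only from the example (the relative timestamp "4 hours ago" is marked "exist", while the title and tags are marked "exact"), one plausible reading is that "exact" requires the extracted values to equal sampleAns, and "exist" only requires that the rule extracts something. The sketch below encodes that assumed reading; answer_matches is a hypothetical function, not crawlermaster's actual implementation.

```python
def answer_matches(extracted, sample_ans, ans_type):
    """Compare extracted values with sampleAns under an ASSUMED ansType meaning."""
    if ans_type == "exact":
        # Assumption: same values in the same order as sampleAns.
        return list(extracted) == list(sample_ans)
    if ans_type == "exist":
        # Assumption: any non-empty result counts, whatever its content.
        return len(extracted) > 0
    raise ValueError("unknown ansType: %r" % (ans_type,))


# The title must match exactly; the timestamp only has to be present.
print(answer_matches(["Apple App Store goes down"],
                     ["Apple App Store goes down"], "exact"))
print(answer_matches(["2 days ago"], ["4 hours ago"], "exist"))
```

Under this reading, a rule whose page content changes over time (such as a relative publish time) can still be tested without the check breaking every few hours.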
== Uninstall:
$ pip uninstall crawlermaster
Source Distribution
crawlermaster-0.1.2a6.zip (2.6 MB)
File metadata
- Download URL: crawlermaster-0.1.2a6.zip
- Upload date:
- Size: 2.6 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | c158d1828340f37644ec4e827803869dd3281dd1a47d9f8a11347a8ec7ff6d76
MD5 | a526c215bbe807adf60363acc684ea9e
BLAKE2b-256 | c39d8167b5feec10ed5e6ea777543d1312d1651855d01785f6a3b2b7eb101d79