muchu's utility module

Project description

== Install:

Prerequisites (Windows, Python 2.7):

VCForPython27.msi (https://www.microsoft.com/en-us/download/details.aspx?id=44266)
lxml-3.5.0.win32-py2.7.exe (https://pypi.python.org/pypi/lxml/3.5.0)

Then install the package:

$ pip install crawlermaster

== Write csslist.json:

Example: techcrunch_news_csslist.json

{
    "rawdata-converter-name": "techcrunch_news",
    "local-html-file-pattern": {
        "strBasedir": "C:\\Users\\muchu\\Downloads",
        "strSuffixes": "news.html"
    },
    "csslist": [
        {
            "name": "techcrunch-news-title",
            "sampleUrl": "https://techcrunch.com/2016/06/02/apple-app-store-goes-down/",
            "cssRule": "header.page-title h1.tweet-title::text",
            "sampleAns": [
                "Apple App Store goes down"
            ],
            "ansType": "exact"
        },
        {
            "name": "techcrunch-news-tags",
            "sampleUrl": "https://techcrunch.com/2016/06/02/apple-app-store-goes-down/",
            "cssRule": "div.tags a.tag::text",
            "sampleAns": [
                "Apps",
                "Apple",
                "iCloud",
                "iTunes",
                "app-store"
            ],
            "ansType": "exact"
        },
        {
            "name": "techcrunch-news-pubtime",
            "sampleUrl": "https://techcrunch.com/2016/06/02/apple-app-store-goes-down/",
            "cssRule": "header.article-header div.byline time.timestamp::text",
            "sampleAns": [
                "4 hours ago"
            ],
            "ansType": "exist"
        }
    ]
}
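
Before running the tool, it can help to sanity-check a csslist file's structure. The sketch below is not part of crawlermaster. The helper name check_csslist is hypothetical, and the set of required keys and the two ansType values are assumptions inferred only from the example above.

```python
# Hypothetical helper, not part of crawlermaster: checks that a csslist
# dict has the keys seen in the example above. The required-key set and
# the allowed ansType values are assumptions inferred from that example.
REQUIRED_RULE_KEYS = {"name", "sampleUrl", "cssRule", "sampleAns", "ansType"}
KNOWN_ANS_TYPES = {"exact", "exist"}
TOP_LEVEL_KEYS = ("rawdata-converter-name", "local-html-file-pattern", "csslist")


def check_csslist(config):
    """Return a list of human-readable problems found in a csslist dict."""
    problems = []
    for key in TOP_LEVEL_KEYS:
        if key not in config:
            problems.append("missing top-level key: %s" % key)
    for index, rule in enumerate(config.get("csslist", [])):
        missing = REQUIRED_RULE_KEYS - set(rule)
        if missing:
            problems.append(
                "rule %d missing: %s" % (index, ", ".join(sorted(missing)))
            )
        if rule.get("ansType") not in KNOWN_ANS_TYPES:
            problems.append(
                "rule %d has unknown ansType: %r" % (index, rule.get("ansType"))
            )
        if not isinstance(rule.get("sampleAns"), list):
            problems.append("rule %d: sampleAns should be a list" % index)
    return problems


# A trimmed copy of the example above, written as a Python dict.
sample = {
    "rawdata-converter-name": "techcrunch_news",
    "local-html-file-pattern": {
        "strBasedir": "C:\\Users\\muchu\\Downloads",
        "strSuffixes": "news.html",
    },
    "csslist": [
        {
            "name": "techcrunch-news-title",
            "sampleUrl": "https://techcrunch.com/2016/06/02/apple-app-store-goes-down/",
            "cssRule": "header.page-title h1.tweet-title::text",
            "sampleAns": ["Apple App Store goes down"],
            "ansType": "exact",
        }
    ],
}

print(check_csslist(sample))
```

To check a real file, load it with json.load and pass the resulting dict to check_csslist. An empty list means no structural problems were found under these assumptions.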

== Run test:

$ crawlermaster csslist.json
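
The description does not say how a rule's result is compared with sampleAns. One plausible reading, given that the pubtime rule's sample answer ("4 hours ago") changes over time, is that "exact" requires the extracted text to equal sampleAns and "exist" only requires that something was extracted. The sketch below illustrates that reading. The function answer_matches is hypothetical, and its semantics are an assumption, not crawlermaster's confirmed behavior.

```python
# Hypothetical comparison, not taken from crawlermaster's source.
# Assumed semantics: "exact" -> extracted texts equal sampleAns in order;
# "exist" -> at least one non-empty text was extracted.
def answer_matches(extracted, sample_ans, ans_type):
    # Drop surrounding whitespace and discard empty strings.
    cleaned = [text.strip() for text in extracted if text.strip()]
    if ans_type == "exact":
        return cleaned == list(sample_ans)
    if ans_type == "exist":
        return len(cleaned) > 0
    raise ValueError("unknown ansType: %r" % ans_type)


# The title is stable, so an exact comparison is meaningful.
print(answer_matches(["Apple App Store goes down"],
                     ["Apple App Store goes down"], "exact"))

# A relative timestamp drifts, so only its presence is checked.
print(answer_matches(["2 days ago"], ["4 hours ago"], "exist"))
```

Under this reading, a rule whose answer changes between page loads would use "exist", while a rule for stable content such as a headline would use "exact".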

== Uninstall:

$ pip uninstall crawlermaster

== Release history:

0.1.2a6 (this version)
0.1.2a5
0.1.2a4
0.1.2a3
0.1.2a2
0.1.2a1

== Download files:

crawlermaster-0.1.2a6.zip (2.6 MB), source distribution, uploaded Oct 27, 2016
