Skip to main content
This is a pre-production deployment of Warehouse. Changes made here affect the production instance of PyPI (
Help us improve Python packaging - Donate today!

A high-level distributed crawling framework

Project Description


Cola is a high-level distributed crawling framework, used to crawl pages and extract structured data from websites. It provides simple and fast yet flexible way to achieve your data acquisition objective. Users only need to write one piece of code which can run under both local and distributed mode.


  • Python2.7 (Python3+ will be supported later)
  • Work on Linux, Windows and Mac OSX


The quick way:

pip install cola

Or, download source code, then run:

python install

Write applications

Documents will update soon, now just refer to the wiki or weibo application.

Run applications

For the wiki or weibo app, please ensure the installation of dependencies, weibo as an example:

pip install -r /path/to/cola/app/weibo/requirements.txt

Local mode

In order to let your application support local mode, just add code to the entrance as below.

from cola.context import Context
ctx = Context(local_mode=True)

Then run the application:


Stop the local job by CTRL+C.

Distributed mode

Start master:

coca master -s [ip:port]

Start one or more workers:

coca worker -s -m [ip:port]

Then run the application(weibo as an example):

coca job -u /path/to/cola/app/weibo -r

Coca command

Coca is a convenient command-line tool for the whole cola environment.


Kill master to stop the whole cluster:

coca master -k


List all jobs:

coca job -m [ip:port] -l

Example as:

list jobs at master:
====> job id: 8ZcGfAqHmzc, job description: sina weibo crawler, status: stopped

You can run a job which shown in the list above:

coca job -r 8ZcGfAqHmzc

Actually, you don’t have to input the complete job name:

coca job -r 8Z

Part of the job name is fine if there’s no conflict.

You can know the status of a running job by:

coca job -t 8Z

The status like counters during running and so on will be output to the terminal.

You can kill a job by the kill command:

coca job -k 8Z


You can create an application by this command:

coca startproject colatest

Remember, help command will always be helpful:

coca -h


coca master -h


Cola is a non-profit project and by now maintained by myself, thus any donation will be encouragement for the further improvements of cola project.

Alipay & Paypal:

Release History

Release History

This version
History Node


History Node


History Node


Download Files

Download Files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

File Name & Checksum SHA256 Checksum Help Version File Type Upload Date
Cola-0.1.0.tar.gz (64.3 kB) Copy SHA256 Checksum SHA256 Source Mar 31, 2016

Supported By

WebFaction WebFaction Technical Writing Elastic Elastic Search Pingdom Pingdom Monitoring Dyn Dyn DNS Sentry Sentry Error Logging CloudAMQP CloudAMQP RabbitMQ Heroku Heroku PaaS Kabu Creative Kabu Creative UX & Design Fastly Fastly CDN DigiCert DigiCert EV Certificate Rackspace Rackspace Cloud Servers DreamHost DreamHost Log Hosting