A high-level web scraping framework

Project description


Okami is a high-level web scraping framework built for Python 3.6+. It uses the asynchronous model provided by the standard library asyncio module, with aiohttp as the networking layer and lxml for parsing data.

The architecture is entirely modular, and the main components can be swapped out and replaced with custom implementations.
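The asynchronous, swappable-component design described above can be sketched in a few lines. This is a minimal illustration using only the standard library, not Okami's actual API: the names `TitleParser`, `fake_downloader`, `process`, `crawl` and `run` are hypothetical, the downloader returns canned HTML where a real one would use aiohttp, and `html.parser` stands in for lxml.

```python
import asyncio
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside <title> elements (stand-in for lxml)."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


async def fake_downloader(url):
    """Hypothetical downloader: returns canned HTML, no network access."""
    await asyncio.sleep(0)  # yield to the event loop, as a real request would
    return "<html><head><title>Page {}</title></head></html>".format(url)


async def process(url, downloader):
    """Download one page with the injected downloader and parse its title."""
    html = await downloader(url)
    parser = TitleParser()
    parser.feed(html)
    return parser.title


async def crawl(urls, downloader=fake_downloader):
    """Process all URLs concurrently; results keep the input order."""
    return await asyncio.gather(*(process(u, downloader) for u in urls))


def run(urls, downloader=fake_downloader):
    """Run the crawl on a fresh event loop (works on Python 3.6+)."""
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(crawl(urls, downloader))
    finally:
        loop.close()
```

Because the downloader is passed in as an argument, a different coroutine with the same signature can replace it without touching the rest of the code, which is the general idea behind swapping a component for a custom implementation.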


  • complete website-wide page processing
  • full scraping mode, or delta mode that scrapes only unvisited pages
  • immediate, on-demand or real-time page processing over HTTP API
  • single page processing via command line
  • lots of pipelines, middlewares and signals
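The difference between full and delta mode in the list above can be sketched as a simple filter over discovered URLs. This is only an assumption about the general idea: how Okami actually records visited pages is not described here, and `select_pages` is a hypothetical helper, not part of the framework.

```python
def select_pages(discovered, visited, delta=True):
    """Return the URLs to process, preserving discovery order.

    In delta mode, URLs already present in `visited` are skipped.
    In full mode, every discovered URL is returned. Duplicate
    discoveries are dropped in both modes.
    """
    seen = set()
    selected = []
    for url in discovered:
        if url in seen:
            continue  # same URL discovered more than once
        seen.add(url)
        if delta and url in visited:
            continue  # already processed in an earlier run
        selected.append(url)
    return selected


visited = {"/", "/about"}
discovered = ["/", "/about", "/products/1", "/products/1"]

delta_pages = select_pages(discovered, visited)
full_pages = select_pages(discovered, visited, delta=False)
```

With this input, `delta_pages` contains only `/products/1`, while `full_pages` contains all three distinct URLs.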

Spiders are very simple implementations. Take a look at an example here.

Quick start

  • Install okami

    • pip install okami
  • Run example web server

    • OKAMI_SETTINGS=okami.cfg.example okami example server

Open localhost:8000 and browse around a little. Quite a remarkable website. We will run our example spider against this website shortly and process a few items.

  • Run example spider

    • OKAMI_SETTINGS=okami.cfg.example okami example spider

The example spider has now started, and you can see it processing pages. Take a look at the example spider implementation here.


Read the rest of the documentation here.


Okami is licensed under the three-clause BSD license. The full license text can be found here.

Download files

Files for okami, version 0.2.0
  • okami-0.2.0-py2.py3-none-any.whl (25.1 kB), wheel, Python version py2.py3
  • okami-0.2.0.tar.gz (20.5 kB), source
