Project Description

A simple way to access the webhose.io API from your Python code

import webhose

webhose.config(token=YOUR_API_KEY)
for post in webhose.search("github"):
    print(post.title)

API Key

To use the webhose.io API, you need a token that is sent with every request. To obtain one, create an account at https://webhose.io/auth/signup, then go to https://webhose.io/dashboard to see your token.
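
One way to keep the token out of your source code is to read it from an environment variable. The sketch below is only an illustration: the variable name WEBHOSE_TOKEN and the helper load_token are hypothetical and not part of the library.

```python
import os


def load_token(environ=os.environ, name="WEBHOSE_TOKEN"):
    """Return the API token stored under `name` (a hypothetical variable name)."""
    token = environ.get(name, "").strip()
    if not token:
        # Fail early with a clear message instead of sending an empty token.
        raise RuntimeError(
            f"Set the {name} environment variable to your webhose.io API key"
        )
    return token
```

The result can then be passed to the library as webhose.config(token=load_token()).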

Installing

You can install from source:

$ git clone https://github.com/Buzzilla/webhose-python
$ cd webhose-python
$ python setup.py install

Use the API

To get started, import the library and set your access token (replace YOUR_API_KEY with your actual API key).

>>> import webhose
>>> webhose.config(token=YOUR_API_KEY)

Now you can make a request and inspect the results:

>>> r = webhose.search("foobar")
>>> r.total
62
>>> len(r.posts)
62
>>> r.posts[0].language
'english'
>>> r.posts[0].title
'Putting quotes around dictionary keys in JS'

For your convenience, the Response object is iterable, so you can loop over it to get all the results. The iterator makes further API requests to fetch additional pages.

>>> total_words = 0
>>> for post in r:
...     total_words += len(post.text.split(" "))
...
>>> print(total_words)
56006

Warning: This method can use up your credits if your search has lots of results.
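
To limit how many results a loop consumes, you can cap the iteration with itertools.islice from the standard library. This is a minimal sketch: take_posts is a hypothetical helper, and it assumes the Response iterator fetches further pages lazily, only when the loop actually reaches them.

```python
from itertools import islice


def take_posts(response, limit):
    """Return at most `limit` posts from an iterable response."""
    # islice stops pulling from the underlying iterator once `limit` items
    # have been taken, so a lazily paging iterator is not asked for more.
    return list(islice(response, limit))
```

For example, take_posts(webhose.search("foobar"), 100) would stop after the first 100 posts.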

Full documentation

  • config(token)
    • token - your API key
  • search(query, token=None)
    • query - the search query, either as a search string, or as a Query object
    • token - you can provide the API key directly to the search function if you want

Query objects

Query objects correspond to the advanced search options that appear on https://webhose.io/use

Query objects have the following members:

  • all_terms - a list of strings, all of which must appear in the results
  • some_terms - a list of strings, some of which must appear in the results
  • phrase - a phrase that must appear verbatim in the results
  • exclude - terms that should not appear in the results
  • site_type - one or more of discussions, news, blogs
  • language - one or more language names, in lowercase English
  • site - one or more site names, top level only (e.g., yahoo.com and not news.yahoo.com)
  • title - terms that must appear in the title
  • body_text - terms that must appear in the body text

Query objects implement the __str__() method, which shows the resulting search string.
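
As an illustration of the members listed above, the following sketch fills in a Query-like object. It assumes the members are plain attributes that can be assigned directly and that the "one or more of" members accept lists. The helper fill_query and the example values are hypothetical.

```python
def fill_query(query):
    """Set a few of the documented members on a Query-like object."""
    query.all_terms = ["python", "api"]   # every term must appear
    query.phrase = "web scraping"         # must appear verbatim
    query.site_type = ["news", "blogs"]   # assumed to accept a list
    query.language = ["english"]          # lowercase language names
    return query
```

With the library installed, this might be used as q = fill_query(webhose.Query()), assuming Query needs no constructor arguments. str(q) then shows the resulting search string, and q can be passed to webhose.search.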

Response objects

Response objects have the following members:

  • total - the total number of posts which match this search
  • more - the number of posts not included in this response
  • posts - a list of Post objects
  • next - a URL for the next results page for this search
  • response - the original requests response
  • get_next() - a method to fetch the next page of results. Returns a new Response object

Response objects implement the __iter__() method, which can be used to loop over all posts matching the query, fetching additional pages automatically.
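
If you prefer to control paging yourself, the posts, more and get_next() members can be combined by hand. The sketch below is one possible approach, not the library's own iterator: iter_pages is a hypothetical helper, and it assumes that more drops to 0 on the last page.

```python
def iter_pages(response, max_pages):
    """Yield the list of posts for each page, visiting at most max_pages pages."""
    for page_number in range(max_pages):
        yield response.posts
        # `more` counts the posts not included in this response.
        if response.more <= 0 or page_number == max_pages - 1:
            return
        response = response.get_next()  # costs one additional API request
```

Capping the number of pages keeps the number of API requests, and therefore the credits spent, predictable.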

Post and Thread objects

Post and Thread objects contain the actual data returned from the API. Consult https://webhose.io/documentation to find out about their structure.

Polling

If you want to make repeated searches, performing an action whenever there are new results, use code like this:

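import time

import webhose

r = webhose.search("skyrim")
while True:
    # perform_action is your own callback; it is not part of the library
    for post in r:
        perform_action(post)
    time.sleep(300)  # wait five minutes between polls
    r = r.get_next()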
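
The same polling pattern can be wrapped in a function that stops after a fixed number of rounds, which makes it easier to test and to bound credit usage. This is a sketch under stated assumptions: poll, handle, rounds and the injectable sleep parameter are hypothetical names, and the response only needs to be iterable and to offer get_next().

```python
import time


def poll(response, handle, rounds, interval=300, sleep=time.sleep):
    """Call handle(post) for every post, polling for new results `rounds` times."""
    for round_number in range(rounds):
        for post in response:
            handle(post)
        if round_number < rounds - 1:
            sleep(interval)                 # wait before asking again
            response = response.get_next()  # fetch the newer results
    return response
```

For example, poll(webhose.search("skyrim"), perform_action, rounds=12) would poll for roughly an hour at the default interval.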

Release History

  • 0.1.5 (this version)
  • 0.1.4
  • 0.1.3
  • 0.1.2
  • 0.1.1
  • 0.1.0

Download Files

  • webhose-0.1.5.tar.gz (5.5 kB), source distribution, uploaded Sep 26, 2016
