cache_requests
Simple. Powerful. Persistent LRU caching for the requests library.
Features
Documentation: https://cache_requests.readthedocs.org
Open Source: https://github.com/bionikspoon/cache_requests
Python version agnostic: tested against Python 2.7, 3.3, 3.4, 3.5 and PyPy
MIT license
Drop-in decorator for the requests library.
Automatic timer-based expiration on stored items (optional).
Backed by Yahoo’s powerful redislite.
Scalable with redis. Optionally accepts a redis connection.
Exposes the powerful underlying Memoize decorator to decorate any function.
Tested with high coverage.
Lightweight. Simple logic.
Lightning fast.
Jump start your development cycle.
Collect and reuse entire response objects.
Installation
At the command line, via either pip or easy_install:
$ pip install cache_requests
$ easy_install cache_requests
Or, if you have virtualenvwrapper installed
$ mkvirtualenv cache_requests
$ pip install cache_requests
Uninstall
$ pip uninstall cache_requests
Usage
To use cache_requests in a project
import cache_requests
Quick Start
To use cache_requests in a project
>>> from cache_requests import Session
>>> requests = Session()
>>> # from python-requests.org
>>> r = requests.get('https://api.github.com/user', auth=('user', 'pass'))
>>> r.status_code
200
>>> r.headers['content-type']
'application/json; charset=utf8'
>>> r.encoding
'utf-8'
>>> r.text
u'{"type":"User"...'
>>> r.json()
{u'private_gists': 419, u'total_private_repos': 77, ...}
Config Options
cache_requests.config
- config.ex
sets the default expiration (seconds) for new cache entries. Can be configured with env REDIS_EX.
- config.dbfilename
sets the default location for the database. The default location is a spot in your OS’ temp directory. Can be configured with env REDIS_DBFILENAME.
- config.connection
creates the connection to the redis or redislite database. By default this is a redislite connection, but a redis connection can be dropped in for an easy upgrade. Can be configured with env REDIS_CONNECTION.
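The environment-variable fallbacks described above can be pictured with a small standard-library sketch. It is not the package's own code: `load_cache_settings` and the default file name are hypothetical, and only the variable names `REDIS_EX` and `REDIS_DBFILENAME` and the one-hour default come from this README.

```python
import os
import tempfile


def load_cache_settings(environ=None):
    """Hypothetical helper: resolve cache settings from environment variables."""
    environ = os.environ if environ is None else environ
    # Assumed default: a file in the OS temp directory (the file name is made up).
    default_db = os.path.join(tempfile.gettempdir(), "cache_requests.redislite")
    return {
        # Expiration in seconds; falls back to one hour.
        "ex": int(environ.get("REDIS_EX", 3600)),
        "dbfilename": environ.get("REDIS_DBFILENAME", default_db),
    }


settings = load_cache_settings({"REDIS_EX": "900"})
print(settings["ex"])  # 900
```

Passing a plain dict instead of `os.environ` keeps the sketch easy to try without touching the real environment.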
cache_requests.Session
Caching individual session methods is turned on and off independently.
These settings are accessed through the Session object’s cache.[method name] attributes. They can be overridden with the cache.all setting.
For example
from cache_requests import Session
requests = Session()
requests.cache.delete = True
# cached, only called once.
requests.delete('http://google.com')
requests.delete('http://google.com')
requests.cache.delete = False
# not cached, called twice.
requests.delete('http://google.com')
requests.delete('http://google.com')
# cache ALL methods
requests.cache.all = True
# don't cache any methods
requests.cache.all = False
# Use individual method cache options.
requests.cache.all = None
Default settings
Method | Cached |
---|---|
get | True |
head | True |
options | True |
post | False |
put | False |
patch | False |
delete | False |
all | None |
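One way to read the table together with the cache.all override is the following sketch. It is an illustration of the rule as described here, not the Session implementation; `DEFAULT_FLAGS` and `is_cached` are hypothetical names.

```python
# Defaults copied from the table above; "all" set to None means "defer to per-method flags".
DEFAULT_FLAGS = {
    "get": True, "head": True, "options": True,
    "post": False, "put": False, "patch": False, "delete": False,
    "all": None,
}


def is_cached(method, flags=DEFAULT_FLAGS):
    """Hypothetical helper: decide whether calls to `method` would be cached."""
    if flags["all"] is not None:
        # A True/False value for "all" overrides every individual setting.
        return bool(flags["all"])
    return bool(flags[method.lower()])


print(is_cached("get"))                                    # True
print(is_cached("delete"))                                 # False
print(is_cached("delete", dict(DEFAULT_FLAGS, all=True)))  # True
```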
Use Case Scenarios
Development: 3rd Party APIs
- Scenario:
Working on a project that uses a 3rd party API or service.
- Things you want:
A cache that persists between sessions and is lightning fast.
Ability to rapidly explore the API and its parameters.
Ability to inspect and debug response content.
Ability to focus on progress.
Perfect transition to a production environment.
- Things you don’t want:
Dependency on network and server stability for development.
Spamming the API. Especially APIs with limits.
Responses that change in non-meaningful ways.
Burning energy with copypasta or fake data to run a piece of your program.
Slow. Responses.
Make a request one time. Cache the results for the rest of your work session.
import os
if os.environ.get('ENV') == 'DEVELOP':
    from cache_requests import Session, config
    config.ex = 60 * 60  # 60 min
    requests = Session()
else:
    import requests
# strange, complicated request you might make
headers = {"accept-encoding": "gzip, deflate, sdch", "accept-language": "en-US,en;q=0.8"}
payload = dict(sourceid="chrome-instant", ion="1", espv="2", ie="UTF-8", client="ubuntu",
q="hash%20a%20dictionary%20python")
response = requests.get('http://google.com/search', headers=headers, params=payload)
# spam to prove a point
response = requests.get('http://google.com/search', headers=headers, params=payload)
response = requests.get('http://google.com/search', headers=headers, params=payload)
response = requests.get('http://google.com/search', headers=headers, params=payload)
response = requests.get('http://google.com/search', headers=headers, params=payload)
response = requests.get('http://google.com/search', headers=headers, params=payload)
response = requests.get('http://google.com/search', headers=headers, params=payload)
response = requests.get('http://google.com/search', headers=headers, params=payload)
# tweak your query, we're exploring here
payload = dict(sourceid="chrome-instant", ion="1", espv="2", ie="UTF-8", client="ubuntu",
q="hash%20a%20dictionary%20python2")
# do you see what changed? the caching tool did.
response = requests.get('http://google.com/search', headers=headers, params=payload)
response = requests.get('http://google.com/search', headers=headers, params=payload)
response = requests.get('http://google.com/search', headers=headers, params=payload)
Optionally, set up with environment variables.
$ export ENV=DEVELOP
$ export REDIS_DBFILENAME='redis/requests.redislite' # make sure directory exists
$ export REDIS_EX=3600 # 1 hour; default
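The reason a tweaked query triggers a fresh request while identical calls are served from the cache can be illustrated by keying each call on its arguments. The sketch below is an assumption about the general technique, not a description of how cache_requests builds its keys; `request_cache_key` is a hypothetical helper.

```python
import hashlib
import json


def request_cache_key(method, url, **kwargs):
    """Hypothetical helper: derive a stable key from a request's arguments."""
    # sort_keys makes the key independent of dict ordering;
    # default=repr covers values that JSON cannot serialize directly.
    payload = json.dumps([method.upper(), url, kwargs], sort_keys=True, default=repr)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


url = "http://google.com/search"
first = request_cache_key("get", url, params={"q": "hash a dictionary python", "ie": "UTF-8"})
same = request_cache_key("get", url, params={"ie": "UTF-8", "q": "hash a dictionary python"})
tweaked = request_cache_key("get", url, params={"q": "hash a dictionary python2", "ie": "UTF-8"})

print(first == same)     # True
print(first == tweaked)  # False
```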
Production: Web Scraping
Automatically expire old content.
How often? After a day? A week? A month? All of this logic is built in with the config.ex setting.
Effectively it can manage all of the time-based rotation.
Perfect if there’s more data than your API caps allow.
One line of code to use a full redis database.
Try redislite; it can handle quite a bit. The redislite API used by this module is 1:1 with the redis package. Just replace the connection parameter/config value.
redis is a drop-in:
config.connection = redis.StrictRedis(host='localhost', port=6379, db=0)
Everything else just works. There’s no magic required.
import redis
from cache_requests import Session, config
config.connection = redis.StrictRedis(host='localhost', port=6379, db=0)
config.ex = 7 * 24 * 60 * 60  # 1 week
requests = Session()
for i in range(1000):
    payload = dict(q=i)
    response = requests.get('http://google.com/search', params=payload)
    print(response.text)
Usage: memoize
from cache_requests import memoize, config
config.ex = 15 * 60  # 15 min; the default is 60 min
@memoize
def amazing_but_expensive_function(*args, **kwargs):
    print("You're going to like this")
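To make the idea of memoizing with a timed expiration concrete, here is a minimal in-memory sketch. It is not the package's Memoize decorator, which this README says is backed by redislite or redis; `memoize_with_expiry` and its `clock` parameter are hypothetical, and a plain dict stands in for the database.

```python
import functools
import time


def memoize_with_expiry(ex, clock=time.monotonic):
    """Hypothetical decorator: cache results in a dict for `ex` seconds."""
    def decorator(func):
        store = {}  # key -> (time stored, value)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            key = (args, tuple(sorted(kwargs.items())))
            now = clock()
            hit = store.get(key)
            if hit is not None and now - hit[0] < ex:
                return hit[1]  # still fresh: skip the expensive call
            value = func(*args, **kwargs)
            store[key] = (now, value)
            return value

        return wrapper
    return decorator


calls = []


@memoize_with_expiry(ex=15 * 60)
def slow_square(x):
    calls.append(x)  # records each real invocation
    return x * x


slow_square(4)
slow_square(4)
print(len(calls))  # 1
```

The injectable `clock` exists only so expiration can be exercised without waiting; arguments must be hashable for the dict key to work.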
Credits
Tools used in rendering this package:
History
Next Release
Coming Soon
2.0.0 (2015-12-12)
API completely rewritten
New API extends requests internals as opposed to monkeypatching.
Entire package is redesigned to be more maintainable, more modular, and more usable.
Dependencies are pinned.
Tests are expanded.
PY26 and PY32 support is dropped, because of dependency constraints.
PY35 support is added.
Docs are rewritten.
Move towards idiomatic code.
2.0.6 Fix broken coverage, broken rst render.
1.0.0 (2015-04-23)
First real release.
Feature/ Unit test suite, very high coverage.
Feature/ redislite integration.
Feature/ Documentation. https://cache-requests.readthedocs.org.
Feature/ Exposed the beefed up Memoize decorator.
- Feature/ Upgraded compatibility to:
PY26
PY27
PY33
PY34
PYPY
Added examples and case studies.
0.1.0 (2015-04-19)
First release on PyPI.