
A comprehensive HTTP client library modified to add response streaming support.

Httplib2

--------------------------------------------------------------------
This is a modified version of the original httplib2 library
that supports streaming large HTTP responses instead of loading
them entirely into memory, as the original library does.
See CHANGELOG for more information.

The package can be installed through pip:

pip install streaming_httplib2

A distributed cache class is available in dcache.py.
You can use it to create a multi-process or even multi-machine
web cache (using a distributed file system such as GlusterFS).

--------------------------------------------------------------------
Introduction

A comprehensive HTTP client library, httplib2.py supports many
features left out of other HTTP libraries.

HTTP and HTTPS
    HTTPS support is only available if the socket module was
    compiled with SSL support.

Keep-Alive
    Supports HTTP 1.1 Keep-Alive, keeping the socket open and
    performing multiple requests over the same connection if
    possible.

Authentication
    The following three types of HTTP Authentication are
    supported. These can be used over both HTTP and HTTPS.

    * Digest
    * Basic
    * WSSE

Caching
    The module can optionally operate with a private cache that
    understands the Cache-Control: header and uses both the ETag
    and Last-Modified cache validators.

All Methods
    The module can handle any HTTP request method, not just GET
    and POST.

Redirects
    Automatically follows 3XX redirects on GETs.

Compression
    Handles both 'deflate' and 'gzip' types of compression.

Lost update support
    Automatically adds back ETags into PUT requests to resources
    we have already cached. This implements Section 3.2 of
    Detecting the Lost Update Problem Using Unreserved Checkout.

Unit Tested
    A large and growing set of unit tests.
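
The lost update support listed above amounts to sending a cached ETag
back as an If-Match precondition on PUT. The following is a minimal
sketch of that idea, not the library's own code. The helper name
add_if_match and the dict-shaped cached headers are assumptions made
for illustration.

```python
def add_if_match(method, request_headers, cached_headers):
    """Return a copy of request_headers with If-Match set from a cached ETag.

    Hypothetical helper: only PUT requests are touched, and an If-Match
    header supplied by the caller is never overwritten.
    """
    headers = dict(request_headers)
    etag = cached_headers.get("etag")
    if method == "PUT" and etag and "if-match" not in headers:
        headers["if-match"] = etag
    return headers


# Headers remembered from an earlier, cached response (made-up ETag value).
cached = {"etag": '"abc123"'}
put_headers = add_if_match("PUT", {"content-type": "text/plain"}, cached)
get_headers = add_if_match("GET", {}, cached)
```

With the precondition in place, a server that honours If-Match can
reject the PUT if the resource changed since the cached copy was fetched.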


For more information on this module, see:

http://bitworking.org/projects/httplib2/


--------------------------------------------------------------------
Installation

The httplib2 module is shipped as a distutils package. To install
the library, unpack the distribution archive, and issue the following
command:

$ python setup.py install


--------------------------------------------------------------------
Usage
A simple retrieval:

import httplib2
h = httplib2.Http(".cache")
(resp_headers, content) = h.request("http://example.org/", "GET")

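The 'content' is the content retrieved from the URL, already
decompressed or unzipped if necessary. In this streaming version
it is returned as a file-like object rather than a string (see
the 0.7.2 entry in the changelog).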
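
To illustrate what "decompressed or unzipped" means here, the sketch
below decodes a body according to its Content-Encoding value using only
the standard library. It is an illustration of the general technique,
not the code this library uses. The function name decode_body is
hypothetical, and it works on a whole byte string rather than a stream.

```python
import gzip
import zlib


def decode_body(body, content_encoding):
    """Decode a response body according to its Content-Encoding value."""
    encoding = (content_encoding or "").strip().lower()
    if encoding == "gzip":
        return gzip.decompress(body)
    if encoding == "deflate":
        try:
            # zlib-wrapped deflate data, which is what HTTP specifies
            return zlib.decompress(body)
        except zlib.error:
            # fall back to a raw deflate stream without the zlib header,
            # which some servers send instead
            return zlib.decompress(body, -zlib.MAX_WBITS)
    # unknown or absent encoding: return the body unchanged
    return body


original = b"hello " * 100
assert decode_body(gzip.compress(original), "gzip") == original
```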

To PUT some content to a server that uses SSL and Basic authentication:

import httplib2
h = httplib2.Http(".cache")
h.add_credentials('name', 'password')
(resp, content) = h.request("https://example.org/chapter/2",
                            "PUT", body="This is text",
                            headers={'content-type': 'text/plain'})
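
For reference, Basic authentication sends the name and password joined
by a colon and base64-encoded in the Authorization header. The sketch
below builds that value by hand. The helper name basic_auth_header is
made up for illustration, and the library constructs the header itself
once add_credentials() has been called.

```python
import base64


def basic_auth_header(name, password):
    """Build the Authorization header value for HTTP Basic authentication."""
    raw = f"{name}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


header = basic_auth_header("name", "password")
scheme, token = header.split(" ", 1)
# Decoding the token gives back the original name:password pair.
assert base64.b64decode(token) == b"name:password"
```

Because the credentials are only encoded, not encrypted, Basic
authentication is normally paired with HTTPS, as in the example above.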

Use the Cache-Control: header to control how the caching operates.

import httplib2
h = httplib2.Http(".cache")
(resp, content) = h.request("http://bitworking.org/", "GET")
...
(resp, content) = h.request("http://bitworking.org/", "GET",
                            headers={'cache-control': 'no-cache'})

The first request will be cached. Since this is a request to
bitworking.org, it will be cached for two hours, because that
is how I have my server configured. Any subsequent GET to that
URI will return the value from the on-disk cache, and no request
will be made to the server. You can use the Cache-Control: header
to change the cache's behavior. In this example the second request
adds the Cache-Control: header with a value of 'no-cache', which
tells the library that the cached copy must not be used when
handling this request.
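
To make the role of the request's Cache-Control: header concrete, here
is a simplified sketch of parsing the header and deciding whether a
cached copy may be served without contacting the server. Both function
names are hypothetical, the parser does not handle quoted values that
contain commas, and the library's real logic covers more directives.

```python
def parse_cache_control(header_value):
    """Split a Cache-Control header value into a dict of directives."""
    directives = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        # Directives without a value (e.g. no-cache) are stored as True.
        directives[name.strip().lower()] = value.strip().strip('"') if sep else True
    return directives


def may_use_cached_copy(request_headers):
    """Return False when the request carries the no-cache directive."""
    cc = parse_cache_control(request_headers.get("cache-control", ""))
    return "no-cache" not in cc


assert may_use_cached_copy({})
assert not may_use_cached_copy({"cache-control": "no-cache"})
```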


--------------------------------------------------------------------
Httplib2 Software License

Copyright (c) 2006 by Joe Gregorio

Permission is hereby granted, free of charge, to any person
obtaining a copy of this software and associated documentation
files (the "Software"), to deal in the Software without restriction,
including without limitation the rights to use, copy, modify, merge,
publish, distribute, sublicense, and/or sell copies of the Software,
and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.



0.7.2 (streaming version)
Changed behaviour to return a file-like object instead of a string for content.
Changed cache to handle streaming too.
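
A sketch of how a caller might consume such a file-like content object
without holding the whole body in memory. It assumes the object
supports read(size) like an ordinary Python file, which this changelog
entry implies but does not spell out. The helper name copy_stream is
hypothetical, and io.BytesIO stands in for a real response.

```python
import io


def copy_stream(content, destination, chunk_size=64 * 1024):
    """Copy a file-like response body to destination in fixed-size chunks."""
    total = 0
    while True:
        chunk = content.read(chunk_size)
        if not chunk:
            # an empty read signals the end of the body
            break
        destination.write(chunk)
        total += len(chunk)
    return total


# Stand-in for the 'content' value; a real one would come from h.request(...).
fake_content = io.BytesIO(b"x" * 200_000)
sink = io.BytesIO()
copied = copy_stream(fake_content, sink)
```

In practice the destination would typically be a file opened in binary
mode, so that only one chunk is held in memory at a time.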

0.7.1
Fix failure to install cacerts.txt for 2.x installs.

0.7.0
The two major changes in this release are SSL Certificate
checking and App Engine support. By default the certificates
of an HTTPS connection are checked, but that can be disabled
via disable_ssl_certificate_validation. The second change
is that on App Engine there is a new connection object
that utilizes the urlfetch capabilities on App Engine, including
setting timeouts and validating certificates.

The following issues have been addressed:

Fixes issue 72. Always lowercase authorization header.
Fix issue 47. Redirects that become a GET should not have a body.
Fixes issue 19. Set Content-location on redirected HEAD requests
Fixes issue 139. Redirect with a GET on 302 regardless of the originating method.
Fixes issue 138. Handle unicode in headers when writing and retrieving cache entries. Who says headers have to be ASCII!
Add certificate validation. Work initially started by Christoph Kern.
Set a version number. Fixes issue # 135.
Sync to latest version of socks.py
Add gzip to the user-agent, in case we are making a request to an app engine project: http://code.google.com/appengine/kb/general.html#compression
Uses a custom httplib shim on App Engine to wrap urlfetch, as opposed
Add default support for optimistic concurrency on PATCH requests
Fixes issue 126. IPv6 under various conditions would fail.
Fixes issue 131. Handle socket.timeout's that occur during send.
proxy support: degrade gracefully when socket.socket is unavailable


0.6.0

The following issues have been addressed:

#51 - Failure to handle server legitimately closing connection before request body is fully sent
#77 - Duplicated caching test
#65 - Transform _normalize_headers into a method of Http class
#45 - Vary header
#73 - All files in Mercurial are executable
#81 - Have a useful .hgignore
#78 - Add release tags to the Mercurial repository
#67 - HEAD requests cause next request to be retried

Mostly bug fixes; the big enhancement is the addition of proper Vary: header
handling. Thanks to Chris Dent for that change.
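
As background on what Vary: handling involves, a cache may reuse a
stored response only if the request headers named in Vary match those
of the request that produced it. The sketch below shows that check in
isolation. It is not the code from this change, the function name
vary_matches is hypothetical, and header dicts are assumed to use
lowercase keys.

```python
def vary_matches(vary_value, cached_request_headers, new_request_headers):
    """Return True if a cached response may be reused for the new request."""
    for name in vary_value.split(","):
        name = name.strip().lower()
        if not name:
            continue
        if name == "*":
            # "Vary: *" means the stored response cannot be matched by headers
            return False
        if cached_request_headers.get(name) != new_request_headers.get(name):
            return False
    return True


stored = {"accept-encoding": "gzip", "accept": "text/html"}
assert vary_matches("Accept-Encoding", stored, {"accept-encoding": "gzip"})
assert not vary_matches("Accept-Encoding", stored, {"accept-encoding": "deflate"})
```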

The other big change is the build process for distributions so that both python2 and python3
are included in the same .tar.gz/.zip file.

0.5.0

Added Python 3 support

Fixed the following bugs:

#12 - Cache-Control: only-if-cached incorrectly does request if item not in cache
#39 - Deprecation warnings in Python 2.6
#54 - Http.request fails accessing Google account via http proxy
#56 - Block on response.read() for HEAD requests.
#57 - Timeout ignored for Python 2.6
#58 - Fixed parsing of Cache-Control: header to make it more robust

Also fixed a deprecation warning that appeared between Python 3.0 and 3.1.

0.4.0

Added support for proxies if the Socksipy module is installed.

Fixed bug with some HEAD responses having content-length set to
zero incorrectly.

Fixed most except's to catch a specific exception.

Added 'connection_type' parameter to Http.request().

The default for 'force_exception_to_status_code' was changed to False. Defaulting
to True was causing quite a bit of confusion.


0.3.0
Calling Http.request() with a relative URI, as opposed to an absolute URI,
will now throw a specific exception.

Http() now has an additional optional parameter for the socket timeout.

Exceptions can now be forced into responses. That is, instead of
throwing an exception, a good httplib2.Response object is returned
that describes the error with an appropriate status code.

Many improvements to the file cache:

1. The names in the cache are now much less
opaque, which should help with debugging.

2. The disk cache is now Apache mod_asis compatible.

3. A Content-Location: header is supplied and stored in the
cache which points to the original requested URI.

User supplied If-* headers now override httplib2 supplied
versions.

IRIs are now fully supported. Note that they MUST be passed in
as unicode objects.

Http.add_credentials() now takes an optional domain to restrict
the credentials to being only used on that domain.

Added Http.add_certificate() which allows setting
a key and cert for SSL connections.

Many other bugs fixed.


0.2.0
Added support for Google Auth.

Added experimental support for HMACDigest.

Added support for a pluggable caching system. It supports
the old system of using the file system and, newly, memcached.
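
In upstream httplib2, a pluggable cache is an object that provides
get(key), set(key, value) and delete(key). The in-memory class below
sketches that shape. The class name MemoryCache is made up, and whether
this streaming fork expects the same value types is an assumption that
should be checked against its source.

```python
class MemoryCache:
    """Minimal in-memory cache with get/set/delete methods."""

    def __init__(self):
        self._entries = {}

    def get(self, key):
        # Return None when the key has not been stored.
        return self._entries.get(key)

    def set(self, key, value):
        self._entries[key] = value

    def delete(self, key):
        # Deleting a missing key is not an error.
        self._entries.pop(key, None)


cache = MemoryCache()
cache.set("http://example.org/", b"cached bytes")
assert cache.get("http://example.org/") == b"cached bytes"
cache.delete("http://example.org/")
assert cache.get("http://example.org/") is None
```

With upstream httplib2, such an object can be passed to Http() in place
of the ".cache" directory name.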

Added httplib2.debuglevel which turns on debugging.

Change Response._previous to Response.previous.

Added Http.follow_all_redirects which forces
httplib2 to follow all redirects, as opposed to
following only the safe redirects. This makes the
GData protocol easier to use.

All known bugs fixed to date.

0.1.1

Fixed several bugs raised by James Antill:
1. HEAD didn't get an Accept: header added like GET.
2. HEAD requests did not use the cache.
3. GET requests with Range: headers would erroneously return a full cached response.
4. Subsequent requests to resources that had timed out would raise an exception.
And one feature request for 'method' to default to GET.

Xavier Verges Farrero supplied what I needed to make the
library work with Python 2.3.

I added distutils based setup.py.

0.1 Rev 86

Initial Release

Source Distribution

streaming_httplib2-0.7.6.tar.gz (40.3 kB)
