
Python Client for the European Patent Office's Open Patent Services API


python-epo-ops-client


python-epo-ops-client is an Apache2-licensed client library for accessing the European Patent Office's ("EPO") Open Patent Services ("OPS") API v3.2 (based on v1.3.16 of the OPS reference guide).

import epo_ops

client = epo_ops.Client(key='abc', secret='xyz')  # Instantiate client
response = client.published_data(  # Retrieve bibliography data
    reference_type='publication',  # publication, application, or priority
    input=epo_ops.models.Docdb('1000000', 'EP', 'A1'),  # Original, Docdb, or Epodoc model
    endpoint='biblio',  # optional; defaults to 'biblio' for published_data
    constituents=[]  # optional list of constituents
)

Features

python-epo-ops-client abstracts away the complexities of accessing EPO OPS:

  • Format the requests properly
  • Bubble up quota problems as proper HTTP errors
  • Handle token authentication and renewals automatically
  • Handle throttling properly
  • Add optional caching to minimize impact on the OPS servers

There are two main layers to python-epo-ops-client: Client and Middleware.

Client

The Client contains all the formatting and token-handling logic and is what you'll mostly interact with.

When you issue a request, the response is a requests.Response object. If response.status_code != 200, a requests.HTTPError exception is raised; handling it is your responsibility. The one case handled for you is an expired access token: the client detects the resulting HTTP 400 and renews the token automatically.
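A typical guard around a client call therefore looks like the sketch below. The helper name and the synthetic requests.Response are illustrative only (they let the snippet run offline); in practice the response comes from a client method such as published_data.

```python
import requests

def body_or_raise(response):
    """Raise requests.HTTPError for non-2xx, mirroring the client's behaviour."""
    response.raise_for_status()  # no-op for 2xx, raises HTTPError otherwise
    return response.text

# Synthetic 404 standing in for an OPS reply, so the snippet runs standalone
resp = requests.Response()
resp.status_code = 404

caught = None
try:
    body_or_raise(resp)
except requests.exceptions.HTTPError as err:
    caught = err.response.status_code  # the failing Response is attached to the error
```

The attached `err.response` gives you the status code and body, which is useful for telling quota errors apart from ordinary failures.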

Note that the Client does not attempt to interpret the data supplied by OPS, so it's your responsibility to parse the XML or JSON payload for your own purposes.
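For example, a biblio payload can be parsed with the standard library alone. The sketch below assumes the OPS exchange namespace and the invention-title element from the OPS XML schema; verify both against the payload you actually receive.

```python
import xml.etree.ElementTree as ET

EXCHANGE_NS = "http://www.epo.org/exchange"

def invention_titles(xml_text):
    """Map language code -> invention title from a published-data biblio payload."""
    root = ET.fromstring(xml_text)
    return {
        el.get("lang"): el.text
        for el in root.iter(f"{{{EXCHANGE_NS}}}invention-title")
    }
```

With a real request this would be called as `invention_titles(response.text)`.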

The following custom exceptions are raised when OPS quotas are exceeded. They live in the epo_ops.exceptions module and are subclasses of requests.HTTPError, so they behave the same way:

  • IndividualQuotaPerHourExceeded
  • RegisteredQuotaPerWeekExceeded

Again, it's up to you to parse the response and decide what to do.
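One possible policy is to pause and retry once when a quota error surfaces. The helper below is hypothetical (not part of the library) and is generic over the exception classes, so you can pass it the epo_ops.exceptions classes listed above.

```python
import time

def with_quota_retry(request_fn, quota_errors, wait_seconds=60):
    """Call request_fn(); on a quota error, sleep once and retry.

    quota_errors is a tuple of exception classes, e.g.
    (IndividualQuotaPerHourExceeded, RegisteredQuotaPerWeekExceeded)
    from epo_ops.exceptions.
    """
    try:
        return request_fn()
    except quota_errors:
        time.sleep(wait_seconds)
        return request_fn()
```

A single retry is deliberately conservative; for the weekly quota you would likely want to give up rather than wait.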

Currently the Client knows how to issue requests for the following services:

Client method                                                                 API end point          throttle
family(reference_type, input, endpoint=None, constituents=None)               family                 inpadoc
image(path, range=1, extension='tiff')                                        published-data/images  images
number(reference_type, input, output_format)                                  number-service         other
published_data(reference_type, input, endpoint='biblio', constituents=None)   published-data         retrieval
published_data_search(cql, range_begin=1, range_end=25, constituents=None)    published-data/search  search
register(reference_type, input, constituents=['biblio'])                      register               other
register_search(cql, range_begin=1, range_end=25)                             register/search        other

Bulk operations can be achieved by passing a list of valid models as the input argument of published_data.

See the OPS guide or use the Developer's Area for more information on how to use each service.

Please submit pull requests for the following services by enhancing the epo_ops.api.Client class:

  • Legal service

Middleware

All requests and responses are passed through each middleware object listed in client.middlewares. Requests are processed in the order listed, and responses are processed in the reverse order.
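The ordering can be sketched as follows. TraceMiddleware and dispatch are illustrative names, not the library's API; they only demonstrate the "onion" in which requests flow top-down and responses flow bottom-up.

```python
class TraceMiddleware:
    """Toy middleware that records when it sees requests and responses."""

    def __init__(self, name, trace):
        self.name, self.trace = name, trace

    def process_request(self, request):
        self.trace.append(f"req:{self.name}")
        return request

    def process_response(self, response):
        self.trace.append(f"resp:{self.name}")
        return response

def dispatch(middlewares, request, send):
    for m in middlewares:                # requests: in listed order
        request = m.process_request(request)
    response = send(request)
    for m in reversed(middlewares):      # responses: in reverse order
        response = m.process_response(response)
    return response
```

With middlewares [A, B], the observed order is req:A, req:B, resp:B, resp:A.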

Each middleware should subclass middlewares.Middleware and implement the process_request and process_response methods.

There are two middleware classes out of the box: Throttler and Dogpile. Throttler is in charge of the OPS throttling rules and will delay requests accordingly. Dogpile is an optional cache which will cache all HTTP status 200, 404, 405, and 413 responses.

By default, only the Throttler middleware is enabled. If you want to enable caching:

import epo_ops

middlewares = [
    epo_ops.middlewares.Dogpile(),
    epo_ops.middlewares.Throttler(),
]
client = epo_ops.Client(
    key='key',
    secret='secret',
    middlewares=middlewares,
)

You'll also need to install the caching dependency in your project, e.g. pip install dogpile.cache.

Note that caching middleware should come first in most cases, so that a cached response can be served without going through the throttler.

Dogpile

Dogpile is based on (surprise) dogpile.cache. By default it is instantiated with a DBMBackend region with a timeout of two weeks.

Dogpile takes three optional instantiation parameters:

  • region: any valid dogpile.cache Region you want to back the cache with
  • kwargs_handlers: a list of keyword-argument handlers used to process the kwargs passed to the request object and extract the elements that go into the cache key. Currently one handler is implemented (and instantiated by default); it makes sure the range request header is part of the cache key.
  • http_status_codes: a list of HTTP status codes you would like cached. By default 200, 404, 405, and 413 responses are cached.
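The role of a kwargs handler can be illustrated with a toy cache-key function. Everything here is hypothetical: the function name, the kwargs shape, and the X-OPS-Range header name (an assumption based on the OPS range header); check the real middleware for the exact key derivation.

```python
import hashlib
import json

def cache_key(url, kwargs):
    """Toy sketch: fold the range header into the cache key so that
    different ranges of the same resource are cached separately."""
    parts = {
        "url": url,
        "range": kwargs.get("headers", {}).get("X-OPS-Range"),
    }
    return hashlib.sha1(json.dumps(parts, sort_keys=True).encode()).hexdigest()
```

Without the range element, page 2 of a search would be served from the cached page 1.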

Note: dogpile.cache is not installed by default. If you want to use it, pip install dogpile.cache in your project.

Throttler

Throttler contains all the logic for handling different throttling scenarios. Since OPS throttling is based on a one-minute rolling window, we must persist historical throttling data (at least for the past minute) in order to know the proper request frequency. Each Throttler must be instantiated with a Storage object.

Storage

The Storage object is responsible for:

  1. Knowing how to update the historical record with each request (Storage.update()), making sure to observe the one minute rolling window rule.
  2. Calculating how long to wait before issuing the next request (Storage.delay_for()).

Currently the only Storage backend provided is SQLite, but you can easily write your own (file-based, Redis, etc.). To use a custom Storage, simply pass the object when instantiating a Throttler. See epo_ops.middlewares.throttle.storages.Storage for implementation details.
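As an illustration of the rolling-window idea, a minimal in-memory Storage might look like the sketch below. The class and its method signatures are hypothetical; the real interface in epo_ops.middlewares.throttle.storages.Storage takes additional arguments, so treat this purely as a conceptual outline.

```python
import time
from collections import deque

class InMemoryStorage:
    """Sketch: track request timestamps in a one-minute rolling window."""

    def __init__(self, limit_per_minute=60):
        self.limit = limit_per_minute
        self.timestamps = deque()

    def update(self):
        """Record a request and drop entries older than one minute."""
        now = time.time()
        self.timestamps.append(now)
        while self.timestamps and now - self.timestamps[0] > 60:
            self.timestamps.popleft()

    def delay_for(self):
        """Seconds to wait so the next request stays within the limit."""
        if len(self.timestamps) < self.limit:
            return 0
        # Wait until the oldest request in the window ages out
        return max(0.0, 60 - (time.time() - self.timestamps[0]))
```

A real backend would persist the window (SQLite, Redis, a file) so the history survives across processes.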

Change Log

4.0.0 (2021-09-19)

  • Upgrade dependencies
  • Drop support for Python 2.7 and Python 3.5
  • Add support for Python 3.9

3.1.3 (2020-09-23)

  • Upgrade dependencies

3.1.2 (2020-07-04)

  • Upgrade dependencies

3.1.1 (2019-10-28)

  • GET instead of POST for family services, thanks to amotl. See #33 for more info.

3.1.0 (2019-10-27)

  • Add support for bulk retrieval, thanks to mmath

3.0.0 (2019-10-27)

  • Drop support for PyPy, Python 3.4 (probably still works)
  • Add support for Python 3.7, 3.8
  • Invalid and expired tokens are now treated the same, since OPS doesn't distinguish between the two use cases.

2.3.2 (2018-01-15)

  • Bug fix: Cache 4xx results as well, thanks to amotl

2.3.1 (2017-11-10)

  • Bug fix: explicitly declare content-type during request

2.3.0 (2017-10-22)

  • Drop support for Python 2.6
  • Officially support Python 3.6
  • Update to latest dependencies
  • Add image retrieval service, thanks to rfaga

2.2.0 (2017-03-30)

  • EPO OPS v3.2 compatibility, thanks to eltermann

2.1.0 (2016-02-21)

2.0.0 (2015-12-11)

  • Dropping support for Python 3.3 (although it probably still works).
  • Update to latest dependencies, no new features.

1.0.0 (2015-09-20)

  • Allow no middleware to be specified
  • Minor tweaks to development infrastructure, no new features.
  • This has been working for a while now, let's call it 1.0!

0.1.9 (2015-07-21)

  • No new features, just updating third party dependencies

0.1.8 (2015-01-24)

  • No new features, just updating third party dependencies

0.1.7 (2015-01-24)

  • Created default Dogpile DBM path if it doesn't exist

0.1.6 (2014-12-12)

  • Fixed bug with how service URL is constructed

0.1.5 (2014-10-17)

  • Added support for register retrieval and search

0.1.4 (2014-10-10)

  • Verified PyPy3 support
  • Updated various dependency packages

0.1.3 (2014-05-21)

  • Python 3.4 compatibility
  • Updated requests dependency to 2.3.0

0.1.2 (2014-03-04)

  • Python 2.6 and 3.3 compatibility

0.1.1 (2014-03-01)

  • Allow configuration of which HTTP responses (based on status code) to cache

0.1.0 (2014-02-20)

  • Introduced dogpile.cache for caching HTTP 200 responses
  • Introduced the concept of middleware

0.0.1 (2014-01-21)

  • Initial release
