The Google Refine Python Client Library provides an interface for communicating with a Google Refine server.

The following parts of the Refine API are currently supported:

  • project creation/import, deletion, export

  • facet computation

    • text

    • text filter

    • numeric

    • blank

    • starred & flagged

    • … extensible class

  • ‘engine’: managing multiple facets and their computation results

  • sorting & reordering

  • clustering

  • transforms

  • transposes

  • single and mass edits

  • annotation (star/flag)

  • column

    • move

    • add

    • split

    • rename

    • reorder

    • remove
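
To make the facet terminology in the list above concrete, here is a minimal sketch of what a text facet computes: a count of each distinct value in a column, which can then be used to filter rows. This is plain Python for illustration only. It does not use or describe this library's API, and the names `text_facet` and `filter_rows` are hypothetical.

```python
from collections import Counter


def text_facet(rows, column):
    """Count how often each distinct value appears in one column.

    Hypothetical helper: rows are plain dicts, not Refine objects.
    Missing cells are counted under the empty string.
    """
    return Counter(row.get(column, "") for row in rows)


def filter_rows(rows, column, selected):
    """Keep only the rows whose value in `column` is one of `selected`."""
    return [row for row in rows if row.get(column, "") in selected]


if __name__ == "__main__":
    rows = [{"party": "A"}, {"party": "B"}, {"party": "A"}]
    print(text_facet(rows, "party"))
    print(filter_rows(rows, "party", {"B"}))
```

In Refine itself the server computes facets; a client only describes which facets it wants and reads back the results.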

Configuration

By default, the Google Refine server URL is http://127.0.0.1:3333. The environment variables GOOGLE_REFINE_HOST and GOOGLE_REFINE_PORT override the host and port.
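
As a rough illustration of that fallback behaviour, the sketch below builds a server URL from the two environment variables and uses the documented defaults when they are unset. The function name `refine_server_url` is hypothetical and is not part of the library. How the library itself reads these variables may differ.

```python
import os

# Defaults taken from the documented server URL http://127.0.0.1:3333
DEFAULT_HOST = "127.0.0.1"
DEFAULT_PORT = "3333"


def refine_server_url(environ=None):
    """Build the Refine server URL, preferring the environment over defaults.

    Hypothetical helper for illustration; pass a dict to override os.environ.
    """
    if environ is None:
        environ = os.environ
    host = environ.get("GOOGLE_REFINE_HOST", DEFAULT_HOST)
    port = environ.get("GOOGLE_REFINE_PORT", DEFAULT_PORT)
    return "http://{0}:{1}".format(host, port)


if __name__ == "__main__":
    print(refine_server_url())
```

Setting only one of the two variables overrides just that part of the URL.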

Running the full test suite requires a live Refine server. The tests do not affect existing projects.

Installation

(Someone with more familiarity with Python’s byzantine collection of installation frameworks is very welcome to improve/“best practice” all this.)

  1. Install the dependencies (currently only urllib2_file):

    sudo pip install -r requirements.txt

  2. Ensure you have a Refine server running somewhere and, if necessary, set the GOOGLE_REFINE_HOST and GOOGLE_REFINE_PORT environment variables.

  3. Run tests, build, and install:

    python setup.py test # to do a subset, e.g., --test-suite tests.test_facet

    python setup.py build

    python setup.py install

There is a Makefile that will do this too, and more.
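
Because the tests need a live server, it can help to confirm that something is listening on the configured host and port before running them. The sketch below is one way to do that with the standard `socket` module. The function name `refine_server_reachable` is hypothetical and not part of the library. The check only confirms that a TCP connection can be opened, not that the listener is actually Refine.

```python
import socket


def refine_server_reachable(host="127.0.0.1", port=3333, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical pre-flight check; defaults match the documented server URL.
    """
    try:
        conn = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        # Connection refused, host unreachable, or timed out.
        return False
    conn.close()
    return True


if __name__ == "__main__":
    if refine_server_reachable():
        print("Something is listening on the default Refine port.")
    else:
        print("No server reachable; start Refine before running the tests.")
```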

TODO

So far, the API has been filled out by building a test suite that carries out the actions in David Huynh’s Refine tutorial. The tutorial shows off a wide range of Refine features but does not cover the entire suite. Notable omissions currently include:

  • reconciliation

  • undo/redo

  • Freebase

  • join columns

  • columns from URL

Credits

Paul Makepeace, author, <paulm@paulm.com>

David Huynh, initial cut

Artfinder, inspiration

Some data used in the test suite has been taken from publicly available sources.
