Project Description

The Google Refine Python Client Library provides an interface for communicating with a Google Refine server.

Currently, the following parts of the API are supported:

  • project creation/import, deletion, export
  • facet computation
    • text
    • text filter
    • numeric
    • blank
    • starred & flagged
    • … extensible class
  • ‘engine’: managing multiple facets and their computation results
  • sorting & reordering
  • clustering
  • transforms
  • transposes
  • single and mass edits
  • annotation (star/flag)
  • column
    • move
    • add
    • split
    • rename
    • reorder
    • remove
  • reconciliation
    • reconciliation judgment facet
    • guessing column type
    • querying reconciliation services preferences
    • perform reconciliation

Configuration

By default the Google Refine server URL is http://127.0.0.1:3333. The environment variables GOOGLE_REFINE_HOST and GOOGLE_REFINE_PORT override the host and port.
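
As a rough illustration of how the default and the two environment variables combine, here is a minimal sketch. The function name refine_server_url and its environ parameter are made up for this example; this is not the library's own code, just one way the lookup could work.

```python
import os


def refine_server_url(environ=None):
    """Build the server URL from the environment (hypothetical helper).

    GOOGLE_REFINE_HOST and GOOGLE_REFINE_PORT are the variables named in
    this README; the fallbacks match the documented default URL.
    """
    env = os.environ if environ is None else environ
    host = env.get('GOOGLE_REFINE_HOST', '127.0.0.1')
    port = env.get('GOOGLE_REFINE_PORT', '3333')
    return 'http://{}:{}'.format(host, port)


if __name__ == '__main__':
    # Prints the URL that would be used with the current environment.
    print(refine_server_url())
```

Passing a plain dict instead of os.environ makes it easy to try out different settings without touching the real environment.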

In order to run all tests, a live Refine server is needed. No existing projects are affected.

Installation

(Someone with more familiarity with Python’s byzantine collection of installation frameworks is very welcome to improve/“best practice” all this.)

  1. Install the dependencies (currently just urllib2_file):

    sudo pip install -r requirements.txt

  2. Ensure you have a Refine server running somewhere and, if necessary, set the envvars as above.

  3. Run tests, build, and install:

    python setup.py test # to do a subset, e.g., --test-suite tests.test_facet

    python setup.py build

    python setup.py install

There is a Makefile that will do this too, and more.
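
Since the tests need a live server, it can save time to check that something is listening before running them. The following is a minimal sketch of such a check; server_is_reachable is a hypothetical helper, not part of this library, and it only confirms that a TCP connection can be opened, not that the listener is actually Refine.

```python
import os
import socket


def server_is_reachable(host=None, port=None, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds (hypothetical helper).

    Falls back to GOOGLE_REFINE_HOST / GOOGLE_REFINE_PORT and then to the
    documented defaults when host or port is not given.
    """
    host = host or os.environ.get('GOOGLE_REFINE_HOST', '127.0.0.1')
    port = int(port or os.environ.get('GOOGLE_REFINE_PORT', '3333'))
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == '__main__':
    print('reachable' if server_is_reachable() else 'not reachable')
```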

TODO

So far the API has been filled out by building a test suite that carries out the actions in David Huynh’s Refine tutorial. The tutorial shows off a wide range of Refine features but doesn’t cover the entire suite. Notable gaps currently include:

  • reconciliation support is useful but not complete
  • undo/redo
  • Freebase
  • join columns
  • columns from URL

Contribute

Patches welcome! Source is at https://github.com/PaulMakepeace/refine-client-py

Useful Tools

One aspect of development is watching HTTP transactions. To that end, I found Fiddler on Windows and HTTPScoop invaluable. The latter neither URL-decodes nor nicely formats JSON, but the Online JavaScript Beautifier does.
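
If a captured request body needs decoding by hand, a few lines of standard-library Python can stand in for those tools. This is only a sketch: decode_form_body is a made-up helper, and the sample body in the usage section is invented for illustration rather than taken from a real Refine request.

```python
import json
from urllib.parse import parse_qsl


def decode_form_body(body):
    """URL-decode a form-encoded body and pretty-print JSON values.

    Returns a dict mapping each parameter name to its decoded value.
    Values that parse as JSON are re-serialised with indentation;
    anything else is returned as plain decoded text.
    """
    decoded = {}
    for key, value in parse_qsl(body, keep_blank_values=True):
        try:
            decoded[key] = json.dumps(json.loads(value), indent=2, sort_keys=True)
        except ValueError:
            # Not JSON, keep the URL-decoded string as it is.
            decoded[key] = value
    return decoded


if __name__ == '__main__':
    # Invented sample body, for illustration only.
    sample = 'engine=%7B%22facets%22%3A%5B%5D%7D&format=tsv'
    for name, text in decode_form_body(sample).items():
        print(name + ':')
        print(text)
```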

Credits

Paul Makepeace, author, <paulm@paulm.com>

David Huynh, initial cut

Artfinder, inspiration

Some data used in the test suite has been taken from publicly available sources.

Release History

0.2.1

This version

0.1.0

Download Files

refine-client-0.2.1.tar.gz (550.9 kB), file type: Source, uploaded Jul 22, 2011
