Media Cloud CLIFF API Client Library

Media Cloud CLIFF API Client

This is a simple Python client for the Media Cloud CLIFF-CLAVIN geocoder.

Usage

To use this library with a CLIFF server you already have running, first install it:

pip install mediacloud-cliff

Then instantiate and use it like this:

from cliff.api import Cliff
my_cliff = Cliff('http://myserver.com:8080')
my_cliff.parse_text("This is about Einstien at the IIT in New Delhi.")

This will return results like this:

{
  "results": {
    "organizations": [
      {
        "count": 1,
        "name": "IIT"
      }
    ],
    "places": {
      "focus": {
        "cities": [
          {
            "id": 1261481,
            "lon": 77.22445,
            "name": "New Delhi",
            "score": 1,
            "countryGeoNameId": "1269750",
            "countryCode": "IN",
            "featureCode": "PPLC",
            "featureClass": "P",
            "stateCode": "07",
            "lat": 28.63576,
            "stateGeoNameId": "1273293",
            "population": 317797
          }
        ],
        "states": [
          {
            "id": 1273293,
            "lon": 77.1,
            "name": "National Capital Territory of Delhi",
            "score": 1,
            "countryGeoNameId": "1269750",
            "countryCode": "IN",
            "featureCode": "ADM1",
            "featureClass": "A",
            "stateCode": "07",
            "lat": 28.6667,
            "stateGeoNameId": "1273293",
            "population": 16787941
          }
        ],
        "countries": [
          {
            "id": 1269750,
            "lon": 79,
            "name": "Republic of India",
            "score": 1,
            "countryGeoNameId": "1269750",
            "countryCode": "IN",
            "featureCode": "PCLI",
            "featureClass": "A",
            "stateCode": "00",
            "lat": 22,
            "stateGeoNameId": "",
            "population": 1173108018
          }
        ]
      },
      "mentions": [
        {
          "id": 1261481,
          "lon": 77.22445,
          "source": {
            "charIndex": 37,
            "string": "New Delhi"
          },
          "name": "New Delhi",
          "countryGeoNameId": "1269750",
          "countryCode": "IN",
          "featureCode": "PPLC",
          "featureClass": "P",
          "stateCode": "07",
          "confidence": 1,
          "lat": 28.63576,
          "stateGeoNameId": "1273293",
          "population": 317797
        }
      ]
    },
    "people": [
      {
        "count": 1,
        "name": "Einstien"
      }
    ]
  },
  "status": "ok",
  "milliseconds": 22,
  "version": "2.6.1"
}
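
The response is a plain nested dictionary, so pulling out the pieces you need is ordinary dict access. Here is a minimal sketch, assuming the response has the shape shown above. `summarize_places` is a hypothetical helper and not part of this library, and the `response` literal is a trimmed copy of the sample output so the snippet runs without a server:

```python
# Trimmed copy of the sample response above (only the fields used below).
response = {
    "results": {
        "organizations": [{"count": 1, "name": "IIT"}],
        "places": {
            "focus": {
                "cities": [{"id": 1261481, "name": "New Delhi", "countryCode": "IN"}],
                "states": [{"id": 1273293, "name": "National Capital Territory of Delhi", "countryCode": "IN"}],
                "countries": [{"id": 1269750, "name": "Republic of India", "countryCode": "IN"}],
            },
            "mentions": [
                {
                    "id": 1261481,
                    "name": "New Delhi",
                    "source": {"charIndex": 37, "string": "New Delhi"},
                    "lat": 28.63576,
                    "lon": 77.22445,
                }
            ],
        },
        "people": [{"count": 1, "name": "Einstien"}],
    },
    "status": "ok",
}


def summarize_places(resp):
    """Hypothetical helper: collect focus place names and mention offsets."""
    if resp.get("status") != "ok":
        raise ValueError("CLIFF did not return an ok status")
    places = resp["results"]["places"]
    focus = places.get("focus", {})
    return {
        "countries": [c["name"] for c in focus.get("countries", [])],
        "cities": [c["name"] for c in focus.get("cities", [])],
        # Each mention records the matched string and its character offset in the input text.
        "mentions": [
            (m["source"]["string"], m["source"]["charIndex"])
            for m in places.get("mentions", [])
        ],
    }


summary = summarize_places(response)
print(summary["countries"])
print(summary["mentions"])
```

The `charIndex` value is the offset of the mention in the text you submitted, so it can be used to highlight the match in the original string.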

You can also look up a place by its GeoNames ID in the GeoNames database inside CLIFF:

from cliff.api import Cliff
my_cliff = Cliff('http://myserver.com:8080')
my_cliff.geonames_lookup(4943351)

This will give you results like this:

{
  "results": {
    "id": 4943351,
    "lon": -71.09172,
    "name": "Massachusetts Institute of Technology",
    "countryGeoNameId": "6252001",
    "countryCode": "US",
    "featureCode": "SCH",
    "featureClass": "S",
    "parent": {
      "id": 4943909,
      "lon": -71.39184,
      "name": "Middlesex County",
      "countryGeoNameId": "6252001",
      "countryCode": "US",
      "featureCode": "ADM2",
      "featureClass": "A",
      "parent": {
        "id": 6254926,
        "lon": -71.10832,
        "name": "Massachusetts",
        "countryGeoNameId": "6252001",
        "countryCode": "US",
        "featureCode": "ADM1",
        "featureClass": "A",
        "parent": {
          "id": 6252001,
          "lon": -98.5,
          "name": "United States",
          "countryGeoNameId": "6252001",
          "countryCode": "US",
          "featureCode": "PCLI",
          "featureClass": "A",
          "stateCode": "00",
          "lat": 39.76,
          "stateGeoNameId": "",
          "population": 310232863
        },
        "stateCode": "MA",
        "lat": 42.36565,
        "stateGeoNameId": "6254926",
        "population": 6433422
      },
      "stateCode": "MA",
      "lat": 42.48555,
      "stateGeoNameId": "6254926",
      "population": 1503085
    },
    "stateCode": "MA",
    "lat": 42.35954,
    "stateGeoNameId": "6254926",
    "population": 0
  },
  "status": "ok",
  "version": "2.6.1"
}
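
Each place in the lookup result nests its ancestor under a `parent` key, so you can recover the full hierarchy by following that chain. This is a minimal sketch, assuming the nesting shown above. `ancestry` is a hypothetical helper, not a library function, and `lookup` is a trimmed copy of the sample result:

```python
# Trimmed copy of the sample lookup result above (id, name, featureCode, parent only).
lookup = {
    "results": {
        "id": 4943351,
        "name": "Massachusetts Institute of Technology",
        "featureCode": "SCH",
        "parent": {
            "id": 4943909,
            "name": "Middlesex County",
            "featureCode": "ADM2",
            "parent": {
                "id": 6254926,
                "name": "Massachusetts",
                "featureCode": "ADM1",
                "parent": {
                    "id": 6252001,
                    "name": "United States",
                    "featureCode": "PCLI",
                },
            },
        },
    },
    "status": "ok",
}


def ancestry(place):
    """Hypothetical helper: list names from the place up to its top-level ancestor."""
    names = []
    while place is not None:
        names.append(place["name"])
        place = place.get("parent")  # None once the chain ends
    return names


# Print the chain from the broadest ancestor down to the place itself.
print(" > ".join(reversed(ancestry(lookup["results"]))))
```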

Development

If you want to work on this API client, first clone the source repo from GitHub and install the dependencies:

make install

Then create a .env file in this directory and put the URL of your CLIFF server in it:

CLIFF_URL=http://localhost:8080
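
How the project itself loads this file is not stated here. If you need the value in your own scripts, one option is to read it by hand. The sketch below is an assumption-laden illustration: `read_cliff_url` is a hypothetical helper, it prefers a real `CLIFF_URL` environment variable over the file, and it falls back to the URL from the example above.

```python
import os
from pathlib import Path


def read_cliff_url(env_path=".env", default="http://localhost:8080"):
    """Hypothetical helper: return CLIFF_URL from the environment or a .env file."""
    # A real environment variable takes priority over the file.
    if "CLIFF_URL" in os.environ:
        return os.environ["CLIFF_URL"]
    path = Path(env_path)
    if path.is_file():
        for line in path.read_text().splitlines():
            line = line.strip()
            # Skip blanks, comments, and lines that are not KEY=VALUE pairs.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            if key.strip() == "CLIFF_URL":
                return value.strip().strip("'\"")
    return default


print(read_cliff_url())
```

The returned string can then be passed to the `Cliff` constructor in place of a hard-coded URL.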

Distribution

  1. Run make test to make sure all the tests pass
  2. Update the version number in cliff/__init__.py
  3. Add a brief note about the changes to the version history section below
  4. Run make build-release to create an install package
  5. Run make release-test to upload it to PyPI's test platform
  6. Run make release to upload it to PyPI

Version History

  • v2.6.2: add timeout to constructor, bubble up any exceptions instead of swallowing them silently
  • v2.6.1: upgrade to CLIFF v2.6.1 (internal build changes)
  • v2.6.0: upgrade to CLIFF v2.6.0 (adds multi-lingual support at query level and upgrades NER models)
  • v2.5.0: upgrade to CLIFF v2.5.0 (and keep version numbers roughly in sync)
  • v2.1.0: upgrade to CLIFF v2.4.2
  • v2.0.2: update examples in readme file
  • v2.0.1: init with url instead of host/port
  • v2.0.0: move to mediacloud naming, underscored method names, remove deprecated NLP endpoint
  • v1.4.0: upgrade to CLIFF v2.4.1, add support for extractContent endpoint
  • v1.3.1: updates for python3
  • v1.3.0: updates for python3, support for client-side text replacements
  • v1.2.0: points at CLIFF v2.3.0 (updates Stanford NER & has new plugin architecture)
  • v1.1.0: points at CLIFF v2.2.0 (adds ancestry to geonamesLookup helper)
  • v1.0.2: first release to PyPI
