REST extension for the Cognite extractor-utils framework

Cognite extractor-utils REST extension

The REST extension for Cognite extractor-utils provides a way to easily write your own extractors for RESTful source systems.

The library is currently under development, and should not be used in production environments yet.

Overview

The REST extension for extractor-utils templatizes how the extractor makes HTTP requests to the source, automatically deserializes the response into user-defined DTO classes, and handles uploading of data to CDF.

The only parts of the extractor a user needs to implement are:

  • Describing how HTTP requests should be constructed using pre-built function decorators
  • Describing the response schema using Python dataclasses
  • Implementing a mapping from the source data model to the CDF data model

For example, consider CDF's Events API as a source. We could describe the response schema as an EventsList dataclass:

from dataclasses import dataclass
from typing import Dict, List, Optional


# One event as returned by the source API. Field names mirror the JSON keys.
@dataclass
class RawEvent:
    externalId: Optional[str]
    dataSetId: Optional[int]
    startTime: Optional[int]
    endTime: Optional[int]
    type: Optional[str]
    subtype: Optional[str]
    description: Optional[str]
    metadata: Optional[Dict[str, str]]
    assetIds: Optional[List[Optional[int]]]
    source: Optional[str]
    id: Optional[int]
    lastUpdatedTime: Optional[int]
    createdTime: Optional[int]


# The full response body: a page of events plus an optional cursor.
@dataclass
class EventsList:
    items: List[RawEvent]
    nextCursor: Optional[str]
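
To make the shape of these DTOs concrete, the sketch below builds an EventsList from an already decoded JSON payload using only the standard library. The extension performs this step for you, and its internal mechanism may differ. The parse_events_list helper and the sample payload are hypothetical and not part of the library, and RawEvent is trimmed to a few fields for brevity.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, List, Optional


@dataclass
class RawEvent:
    # Trimmed to a few fields for brevity; the full schema is shown above.
    id: Optional[int]
    type: Optional[str]
    description: Optional[str]
    metadata: Optional[Dict[str, str]]


@dataclass
class EventsList:
    items: List[RawEvent]
    nextCursor: Optional[str]


def parse_events_list(payload: Dict[str, Any]) -> EventsList:
    """Hypothetical helper: build an EventsList from a decoded JSON object."""
    names = [f.name for f in fields(RawEvent)]
    # Keys missing from the payload become None, matching the Optional[...] annotations.
    items = [
        RawEvent(**{name: raw.get(name) for name in names})
        for raw in payload["items"]
    ]
    return EventsList(items=items, nextCursor=payload.get("nextCursor"))


# Made-up sample data, shaped like the schema above.
payload = {
    "items": [
        {"id": 1, "type": "maintenance", "description": "Pump inspection"},
        {"id": 2, "type": "alarm", "metadata": {"severity": "high"}},
    ],
    "nextCursor": None,
}

events = parse_events_list(payload)
print(events.items[0].description)
print(events.items[1].metadata)
```

Declaring every field as Optional, as the schema above does, lets a record with missing keys still map onto the dataclass.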

We can then write a handler that takes an EventsList instance and returns CDF Events, represented by instances of the Event class from the cognite.extractorutils.rest.typing module.

import os
from typing import Generator

from cognite.extractorutils.rest.typing import Event
# RestExtractor must also be imported from the extension package;
# see example.py for the exact import path.

extractor = RestExtractor(
    name="Event extractor",
    description="Extractor from CDF events to CDF events",
    version="1.0.0",
    base_url=f"https://api.cognitedata.com/api/v1/projects/{os.environ['COGNITE_PROJECT']}/",
    headers={"api-key": os.environ["COGNITE_API_KEY"]},
)

# GET <base_url>/events, parse the body as an EventsList, and pass it to the handler.
@extractor.get("events", response_type=EventsList)
def get_events(events: EventsList) -> Generator[Event, None, None]:
    # Map each source event to a CDF event.
    for event in events.items:
        yield Event(
            external_id=f"testy-{event.id}",
            description=event.description,
            start_time=event.startTime,
            end_time=event.endTime,
            type=event.type,
            subtype=event.subtype,
            metadata=event.metadata,
            source=event.source,
        )

with extractor:
    extractor.run()

A full example is provided in the example.py file.
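
For readers unfamiliar with decorator-based registration, the sketch below shows the general pattern in isolation: a decorator records a handler together with its response type, and a runner later feeds the handler a typed object. It is not the extension's implementation. MiniExtractor, run_with, Ping and handle_ping are all hypothetical names, and no HTTP request is made.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, List, Tuple, Type


@dataclass
class Ping:
    message: str


class MiniExtractor:
    """Hypothetical stand-in that only illustrates decorator registration."""

    def __init__(self) -> None:
        # Maps an endpoint path to (response type, handler function).
        self.endpoints: Dict[str, Tuple[Type[Any], Callable]] = {}

    def get(self, path: str, response_type: Type[Any]) -> Callable:
        def decorator(handler: Callable[[Any], Iterable[Any]]) -> Callable:
            self.endpoints[path] = (response_type, handler)
            return handler  # the decorated function itself is left unchanged

        return decorator

    def run_with(self, path: str, decoded_json: Dict[str, Any]) -> List[Any]:
        # Stand-in for "make the request, build the typed object, call the handler".
        response_type, handler = self.endpoints[path]
        return list(handler(response_type(**decoded_json)))


mini = MiniExtractor()


@mini.get("ping", response_type=Ping)
def handle_ping(ping: Ping):
    yield ping.message.upper()


result = mini.run_with("ping", {"message": "hello"})
print(result)
```

Keeping the handler a plain function of a typed object means the mapping logic can be read and tested without any knowledge of the HTTP layer.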

Lists at the root

Python dataclasses cannot express JSON structures whose root element is a list. To get around that, responses of this nature are automatically converted to something that can be modeled with Python dataclasses.

A JSON structure with a list as its root element is converted to an object containing a single key, "items", which has the original JSON list as its value, as in the example below.

[{"object_id": 1}, {"object_id": 2}, {"object_id": 3}]

will be converted to

{
    "items": [{"object_id": 1}, {"object_id": 2}, {"object_id": 3}]
}
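
The conversion above can be sketched in a few lines of standard-library Python. The wrap_root_list helper and the SourceObject and SourceObjectList dataclasses are hypothetical names used only for illustration. The extension does the wrapping itself, so in practice you would only need to declare a response type with an items field.

```python
import json
from dataclasses import dataclass
from typing import Any, List


@dataclass
class SourceObject:
    object_id: int


@dataclass
class SourceObjectList:
    # The single "items" key holds the original root-level list.
    items: List[SourceObject]


def wrap_root_list(decoded: Any) -> Any:
    """Hypothetical helper mirroring the conversion described above."""
    if isinstance(decoded, list):
        return {"items": decoded}
    return decoded  # objects pass through untouched


body = '[{"object_id": 1}, {"object_id": 2}, {"object_id": 3}]'
wrapped = wrap_root_list(json.loads(body))
objects = SourceObjectList(items=[SourceObject(**item) for item in wrapped["items"]])
print([obj.object_id for obj in objects.items])
```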

Contributing

We use poetry to manage dependencies and virtual environments. To develop the REST extension, follow these steps to set up your local environment:

  1. Install poetry: (add --user if desirable)
    $ pip install poetry
    
  2. Clone repository:
    $ git clone git@github.com:cognitedata/python-extractor-utils-rest.git
    
  3. Move into the newly created local repository:
    $ cd python-extractor-utils-rest
    
  4. Create virtual environment and install dependencies:
    $ poetry install
    

All code must pass typing and style checks to be merged. It is recommended to install pre-commit hooks to ensure that these checks pass before committing code:

$ poetry run pre-commit install

This project adheres to the Contributor Covenant v2.0 as a code of conduct.
