REST extension for the Cognite extractor-utils framework

Cognite extractor-utils REST extension

The REST extension for Cognite extractor-utils provides a way to easily write your own extractors for RESTful source systems.

The library is currently under development, and should not be used in production environments yet.

Overview

The REST extension for extractor-utils templatizes how the extractor makes HTTP requests to the source, automatically deserializes each response into user-defined DTO classes, and handles uploading the data to CDF.

The only parts of the extractor a user needs to implement are:

  • Describing how HTTP requests should be constructed using pre-built function decorators
  • Describing the response schema using Python dataclasses
  • Implementing a mapping from the source data model to the CDF data model

For example, consider CDF's Events API as a source. We could describe the response schema as an EventsList dataclass:

from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class RawEvent:
    externalId: Optional[str]
    dataSetId: Optional[int]
    startTime: Optional[int]
    endTime: Optional[int]
    type: Optional[str]
    subtype: Optional[str]
    description: Optional[str]
    metadata: Optional[Dict[str, str]]
    assetIds: Optional[List[Optional[int]]]
    source: Optional[str]
    id: Optional[int]
    lastUpdatedTime: Optional[int]
    createdTime: Optional[int]


@dataclass
class EventsList:
    items: List[RawEvent]
    nextCursor: Optional[str]
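
The extension performs the conversion from JSON to these dataclasses automatically. To make the idea concrete, the sketch below shows one way a decoded JSON object could be turned into nested dataclasses using only the standard library. It is an illustration under stated assumptions, not the library's actual implementation: `from_dict` and `_convert` are hypothetical helpers, and `RawEvent` is trimmed to three fields for brevity.

```python
import json
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Dict, List, Optional, Union, get_args, get_origin, get_type_hints


@dataclass
class RawEvent:
    # Trimmed version of the RawEvent schema, for illustration only
    id: Optional[int]
    type: Optional[str]
    description: Optional[str]


@dataclass
class EventsList:
    items: List[RawEvent]
    nextCursor: Optional[str]


def from_dict(cls: type, data: Dict[str, Any]) -> Any:
    """Hypothetical helper: build a (possibly nested) dataclass from a decoded JSON object."""
    hints = get_type_hints(cls)
    # Keys missing from the JSON object are treated as None
    kwargs = {f.name: _convert(hints[f.name], data.get(f.name)) for f in fields(cls)}
    return cls(**kwargs)


def _convert(tp: Any, value: Any) -> Any:
    """Hypothetical helper: convert one decoded JSON value according to its annotation."""
    if value is None:
        return None
    origin = get_origin(tp)
    if origin is Union:
        # Optional[X] is Union[X, None]; convert using the non-None member
        inner = [arg for arg in get_args(tp) if arg is not type(None)]
        return _convert(inner[0], value)
    if origin is list:
        (item_type,) = get_args(tp)
        return [_convert(item_type, item) for item in value]
    if is_dataclass(tp):
        return from_dict(tp, value)
    # Plain JSON scalars (and dicts of scalars) are passed through unchanged
    return value


payload = json.loads(
    '{"items": [{"id": 1, "type": "alarm", "description": null}], "nextCursor": null}'
)
events = from_dict(EventsList, payload)
print(events.items[0].type)  # prints "alarm"
```

The field names in the dataclasses match the JSON keys exactly (for example `nextCursor`), which is why the schema above uses camelCase attribute names.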

We can then write a handler that takes one of these EventsList instances and yields CDF events, represented by instances of the Event class from the cognite.extractorutils.rest.typing module.

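import os
from typing import Generator

# Import path for RestExtractor is assumed here; see example.py for the exact imports
from cognite.extractorutils.rest import RestExtractor
from cognite.extractorutils.rest.typing import Event

extractor = RestExtractor(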
    name="Event extractor",
    description="Extractor from CDF events to CDF events",
    version="1.0.0",
    base_url=f"https://api.cognitedata.com/api/v1/projects/{os.environ['COGNITE_PROJECT']}/",
    headers={"api-key": os.environ["COGNITE_API_KEY"]},
)

@extractor.get("events", response_type=EventsList)
def get_events(events: EventsList) -> Generator[Event, None, None]:
    for event in events.items:
        yield Event(
            external_id=f"testy-{event.id}",
            description=event.description,
            start_time=event.startTime,
            end_time=event.endTime,
            type=event.type,
            subtype=event.subtype,
            metadata=event.metadata,
            source=event.source,
        )

with extractor:
    extractor.run()

A full example is provided in the example.py file.

Lists at the root

Python dataclasses cannot express JSON structures whose root element is a list. To get around this, responses of this kind are automatically converted into a form that can be modeled with dataclasses.

A JSON structure with a list as its root element is converted to an object containing a single key, "items", which has the original JSON list as its value, as in the example below.

[{"object_id": 1}, {"object_id": 2}, {"object_id": 3}]

will be converted to

{
    "items": [{"object_id": 1}, {"object_id": 2}, {"object_id": 3}]
}
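
The conversion described above can be sketched in a few lines. This is a minimal illustration of the rule, not the library's own code; `wrap_root_list` is a hypothetical name introduced here.

```python
import json
from typing import Any


def wrap_root_list(decoded: Any) -> Any:
    """Hypothetical helper: wrap a root-level JSON list in an object with an "items" key."""
    if isinstance(decoded, list):
        return {"items": decoded}
    # Anything that is not a root-level list is returned unchanged
    return decoded


body = '[{"object_id": 1}, {"object_id": 2}, {"object_id": 3}]'
wrapped = wrap_root_list(json.loads(body))
print(wrapped["items"][1])  # prints {'object_id': 2}
```

In practice this means the response type for such an endpoint would be a dataclass with a single `items` field holding a list of the element type.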

Contributing

We use poetry to manage dependencies and virtual environments. To develop the REST extension, follow these steps to set up your local environment:

  1. Install poetry (add --user if desired):
    $ pip install poetry
    
  2. Clone repository:
    $ git clone git@github.com:cognitedata/python-extractor-utils-rest.git
    
  3. Move into the newly created local repository:
    $ cd python-extractor-utils-rest
    
  4. Create virtual environment and install dependencies:
    $ poetry install
    

All code must pass typing and style checks to be merged. We recommend installing the pre-commit hooks so that these checks run before you commit code:

$ poetry run pre-commit install

This project adheres to the Contributor Covenant v2.0 as a code of conduct.
