REST extension for the Cognite extractor-utils framework
Cognite extractor-utils
REST extension
The REST extension for Cognite extractor-utils provides a way to easily write your own extractors for RESTful source systems.
The library is currently under development, and should not be used in production environments yet.
Overview
The REST extension for extractor-utils templatizes how the extractor makes HTTP requests to the source, automatically deserializes the response into user-defined DTO classes, and handles uploading data to CDF.
The only parts of the extractor a user needs to implement are:
- Describing how HTTP requests should be constructed, using pre-built function decorators
- Describing the response schema using Python dataclasses
- Implementing a mapping from the source data model to the CDF data model
For example, consider CDF's Events API as a source. We could describe the response schema as an EventsList dataclass:
from dataclasses import dataclass
from typing import Dict, List, Optional


# One event as returned by the source; field names match the JSON keys.
@dataclass
class RawEvent:
    externalId: Optional[str]
    dataSetId: Optional[int]
    startTime: Optional[int]
    endTime: Optional[int]
    type: Optional[str]
    subtype: Optional[str]
    description: Optional[str]
    metadata: Optional[Dict[str, str]]
    assetIds: Optional[List[Optional[int]]]
    source: Optional[str]
    id: Optional[int]
    lastUpdatedTime: Optional[int]
    createdTime: Optional[int]


# The full response body: a page of events plus an optional cursor.
@dataclass
class EventsList:
    items: List[RawEvent]
    nextCursor: Optional[str]
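To make the relationship between the JSON payload and these dataclasses concrete, the sketch below builds an EventsList by hand from a decoded JSON object. The library performs this step automatically and may do so differently; the helper parse_events_list is hypothetical, and RawEvent is trimmed to three fields to keep the example short.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class RawEvent:
    # Trimmed to three fields for this sketch only.
    id: Optional[int]
    type: Optional[str]
    description: Optional[str]


@dataclass
class EventsList:
    items: List[RawEvent]
    nextCursor: Optional[str]


def parse_events_list(payload: Dict[str, Any]) -> EventsList:
    """Hypothetical helper: build an EventsList from a decoded JSON object."""
    items = [
        RawEvent(
            id=raw.get("id"),
            type=raw.get("type"),
            description=raw.get("description"),
        )
        for raw in payload.get("items", [])
    ]
    # Keys missing from the payload become None, matching the Optional fields.
    return EventsList(items=items, nextCursor=payload.get("nextCursor"))


payload = {
    "items": [{"id": 1, "type": "alarm"}, {"id": 2, "description": "pump stopped"}],
    "nextCursor": None,
}
events = parse_events_list(payload)
```

Declaring every field as Optional, as the schema above does, lets records with missing keys still be represented.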
We can then write a handler that takes in one of these EventsList objects and returns CDF Events, represented by instances of the Event class from the cognite.extractorutils.rest.typing module.
import os
from typing import Generator

from cognite.extractorutils.rest.typing import Event

# RestExtractor is provided by this package; example.py shows the exact import.
extractor = RestExtractor(
    name="Event extractor",
    description="Extractor from CDF events to CDF events",
    version="1.0.0",
    base_url=f"https://api.cognitedata.com/api/v1/projects/{os.environ['COGNITE_PROJECT']}/",
    headers={"api-key": os.environ["COGNITE_API_KEY"]},
)


# GET <base_url>/events, parse the response as an EventsList, and map it to CDF Events.
@extractor.get("events", response_type=EventsList)
def get_events(events: EventsList) -> Generator[Event, None, None]:
    for event in events.items:
        yield Event(
            external_id=f"testy-{event.id}",
            description=event.description,
            start_time=event.startTime,
            end_time=event.endTime,
            type=event.type,
            subtype=event.subtype,
            metadata=event.metadata,
            source=event.source,
        )


with extractor:
    extractor.run()
A full example is provided in the example.py file.
The return type
If the return type is set to cognite.extractorutils.rest.http.JsonBody, the raw JSON payload is passed to the handler. This is useful when the payload is hard or impossible to describe with dataclasses.
If the return type is set to requests.Response, the raw response message itself is passed to the handler.
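As a rough illustration of when a raw JSON payload is the easier choice, the sketch below shows the kind of logic a handler might contain when records have no fixed shape. It is a standalone function with no decorator so that it runs without the library; the Json alias is a stand-in for this example only (the real JsonBody type may be defined differently), and both the function name iter_descriptions and the "data" key are hypothetical.

```python
from typing import Any, Dict, Iterator, List, Union

# Stand-in alias for this sketch only; the real JsonBody type lives in
# cognite.extractorutils.rest.http and may be defined differently.
Json = Union[Dict[str, Any], List[Any]]


def iter_descriptions(body: Json) -> Iterator[str]:
    """Hypothetical handler logic: yield descriptions from an irregular payload."""
    # The payload may be a bare list, or an object wrapping the records in "data".
    records = body if isinstance(body, list) else body.get("data", [])
    for record in records:
        # Entries that are not objects, or that lack a description, are skipped.
        if isinstance(record, dict) and "description" in record:
            yield str(record["description"])


as_list = list(iter_descriptions([{"description": "a"}, 42, {"other": 1}]))
as_object = list(iter_descriptions({"data": [{"description": "b"}]}))
```

A payload like this, where the root may be either a list or an object and entries vary in type, is awkward to express as a single dataclass, which is the situation the raw JSON option is meant for.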
Lists at the root
Python dataclasses cannot express JSON structures where the root element is a list. To get around this, responses of this nature are automatically converted to something that can be modeled with dataclasses.
A JSON structure with a list as its root element is converted to an object containing a single key, "items", which has the original JSON list as its value. For example,
[{"object_id": 1}, {"object_id": 2}, {"object_id": 3}]
will be converted to
{
    "items": [{"object_id": 1}, {"object_id": 2}, {"object_id": 3}]
}
This does not apply if the return type is set to JsonBody.
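The conversion described above can be sketched in a few lines. This is an illustration of the behavior, not the library's actual code; wrap_root_list, ObjectRef, and ObjectList are hypothetical names introduced for the example.

```python
import json
from dataclasses import dataclass
from typing import Any, List


@dataclass
class ObjectRef:
    object_id: int


@dataclass
class ObjectList:
    # Dataclass that models the wrapped form of a root-level list.
    items: List[ObjectRef]


def wrap_root_list(payload: Any) -> Any:
    """Hypothetical helper: wrap a root-level list in an object with an "items" key."""
    if isinstance(payload, list):
        return {"items": payload}
    # Anything that is not a list is returned unchanged.
    return payload


raw = '[{"object_id": 1}, {"object_id": 2}, {"object_id": 3}]'
converted = wrap_root_list(json.loads(raw))
unchanged = wrap_root_list({"object_id": 1})

# The wrapped form can now be described by a dataclass with an "items" field.
parsed = ObjectList(items=[ObjectRef(**item) for item in converted["items"]])
```

In practice this means a response type for a list-rooted endpoint would be declared with a single items field, as ObjectList is here.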
Contributing
We use poetry to manage dependencies and to administer virtual environments. To develop extractor-utils, follow these steps to set up your local environment:
- Install poetry (add --user if desirable):
$ pip install poetry
- Clone repository:
$ git clone git@github.com:cognitedata/python-extractor-utils-rest.git
- Move into the newly created local repository:
$ cd python-extractor-utils-rest
- Create virtual environment and install dependencies:
$ poetry install
All code must pass typing and style checks to be merged. It is recommended to install pre-commit hooks to ensure that these checks pass before committing code:
$ poetry run pre-commit install
This project adheres to the Contributor Covenant v2.0 as a code of conduct.