C8 Connector Interface

Implementing C8 Connectors.

Users can extend the C8Connector interface and develop three types of connectors:

  1. Source Connectors (Connectors that ingest data)
  2. Target Connectors (Connectors that export data)
  3. Integration Connectors (Generic integrations for other services)

When developing these connectors, developers must adhere to the guidelines below.

Naming the Connector

  • The package name of the connector must be in the macrometa-{type}-{connector} format (e.g. macrometa-source-postgres).
  • The module name of the connector must be in the macrometa_{type}_{connector} format (e.g. macrometa_source_postgres).

Project structure (package names and structure)

  • Project source code must follow the below structure.
.
├── LICENSE
├── README.md
├── GETTING_STARTED.md
├── macrometa_{type}_{connector}
│   ├── __init__.py
│   ├── main.py
│   └── {other source files or modules}
├── pyproject.toml
└── setup.cfg
  • Within /macrometa_{type}_{connector}/__init__.py there must be a class that implements the C8Connector interface, as sketched below.
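
A minimal sketch of such a class follows, assuming a hypothetical macrometa_source_postgres package. The exact set of abstract methods is defined by the c8connector package itself; the methods shown here are an illustrative subset, so consult the interface for the authoritative signatures.

# Illustrative subset of a C8Connector implementation; consult the
# c8connector package for the full set of abstract methods and signatures.
from c8connector import C8Connector


class PostgresSourceConnector(C8Connector):
    """Hypothetical source connector for PostgreSQL."""

    def name(self) -> str:
        return 'PostgresSource'

    def package_name(self) -> str:
        return 'macrometa-source-postgres'

    def version(self) -> str:
        return '0.0.1'

    def type(self) -> str:
        return 'source'

    def description(self) -> str:
        return 'Ingests data from a PostgreSQL database.'

    def validate(self, integration: dict) -> None:
        # Raise an exception here if the supplied configuration is invalid.
        pass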

Dependencies/libraries and versions to use

  • Connectors must only use the following dependencies/libraries, at the versions listed, when developing.
python = ">=3.7"
c8connector = "latest"
pipelinewise-singer-python = "1.2.0"
  • Developers must not use singer-sdk or any Singer SDK variant other than pipelinewise-singer-python.

Connector-specific documentation

  • Every connector project should have a GETTING_STARTED.md file documenting the connector configuration and all other requirements for the connector. It should be written as a user-facing document and provide the instructions an end user needs to use the connector.

    Developers can follow the Generic Template available here and apply any changes required on top of it for the specific connector.

Resolving reserved-key conflicts between Macrometa collections and external DBs

  • For source connectors: A Macrometa collection document has the following reserved keys: _key, _id, and _rev. _key is the primary key, so it holds the value of the source data's primary key, while _id and _rev are always autogenerated. If _key, _id, or _rev also exist in the source data (assuming _key is not the primary key of the source data), those values from the source data would be lost. To avoid this, append an additional _ to any of these reserved keys that are present in the source data. If _key is itself the primary key of the source data, there is no need to rename it. Also check that each newly generated key does not already exist in the source data; if it does, keep appending _ until it is unique (see the sketch after this list).

    During the actual workflow run, this logic is implemented at the target level (macrometa-target-collection), but the same logic must also be implemented at the source connector level for the samples and schemas APIs. Refer to this [PR](https://github.com/Macrometacorp/macrometa-source-postgres/).

  • For target connectors: As seen above for source connectors (with macrometa-target-collection as the target), the same reserved-key conflict can arise for target connectors: the external database may have fixed reserved keys of its own, which might be its primary key, autogenerated, or internal. If such reserved keys also exist in the source collection, those values from the source collection will be lost in the target data.

    In such cases, first specify all the reserved keys as a list of strings in the reserved_keys field of the target connector. If there is a fixed primary key, it must always be the first element of the list; if there is no fixed primary key but there are other reserved keys, the first element should be an empty string followed by the reserved keys, for example: ["", "reservedkey1", "reservedkey2"]. If no reserved keys exist, return an empty list []. Refer to this [PR](https://github.com/Macrometacorp/macrometa-target-collection/pull/9).

    In addition, the logic of appending _ to the reserved keys (only if they exist in the source collection) must also be implemented at the target connector level, before the data is written to the external DB. If _key is also the reserved primary key of the target external DB, there is no need to rename it. Again, check that each newly generated key does not already exist in the source collection; if it does, keep appending _. Refer to this [PR](https://github.com/Macrometacorp/macrometa-target-collection/pull/10).

    NOTE: This is applicable only when there are certain keys reserved in the external database.
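
As a rough illustration of the renaming logic described in both cases above (the function and variable names are hypothetical, and whether the extra _ is prepended or appended is left to the referenced PRs; this sketch prepends):

# Hypothetical helper illustrating the reserved-key renaming described above.
def rename_reserved_keys(record: dict, reserved_keys: list, primary_key: str = None) -> dict:
    result = dict(record)
    for key in reserved_keys:
        # Skip empty placeholders, the reserved primary key, and absent keys.
        if not key or key == primary_key or key not in result:
            continue
        new_key = '_' + key
        while new_key in result:  # keep adding '_' until the key is unique
            new_key = '_' + new_key
        result[new_key] = result.pop(key)
    return result

# Example: '_id' exists in the source data, so its value is preserved as '__id'.
# rename_reserved_keys({'_key': '1', '_id': 'x'}, ['_key', '_id', '_rev'], '_key')
# -> {'_key': '1', '__id': 'x'}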

Adding metrics to the connectors

  • For source connectors: We support the following ingest metrics for source connectors: ingested_bytes, ingested_documents, ingest_errors, and ingest_lag. Of these four, the source connector itself only needs to increment ingest_errors whenever an error occurs and start the Prometheus client HTTP server on port 8000; the rest are calculated at the macrometa-target-collection level. However, for ingest_lag to be calculated, the source connector must send the current timestamp in the time_extracted property of the Singer RECORD message. You can refer to other source connectors, for example: https://github.com/Macrometacorp/macrometa-source-postgres/pull/14/files

  • For target connectors: We support the following export metrics for target connectors: exported_bytes, exported_documents, export_errors, and export_lag. Of these four, exported_documents and exported_bytes are calculated at the Macrometa source collection level. The target connector needs to increment export_errors whenever an error occurs, start the Prometheus client HTTP server on port 8001, and calculate export_lag. export_lag is simply the difference in seconds between the time_extracted property of the Singer RECORD message (sent by the Macrometa source collection connector) and the current timestamp (UTC). You can refer to other target connectors, for example: https://github.com/Macrometacorp/macrometa-target-postgres/pull/31/files (a sketch of both wirings follows this list).
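
Below is a minimal sketch of how these metrics might be wired up with prometheus_client. The metric names and ports come from the list above; the function names (run_source, run_target), the write step, and the assumption that time_extracted is a timezone-aware datetime are illustrative, not part of any fixed API.

# Illustrative sketch of the metrics wiring described above.
from datetime import datetime, timezone
from prometheus_client import Counter, Gauge, start_http_server

# --- In a source connector ---
ingest_errors = Counter('ingest_errors', 'Total ingest errors')

def run_source():
    start_http_server(8000)  # sources expose metrics on port 8000
    try:
        ...  # emit Singer RECORD messages with time_extracted set to now (UTC)
    except Exception:
        ingest_errors.inc()  # the only metric incremented at the source level
        raise

# --- In a target connector ---
export_errors = Counter('export_errors', 'Total export errors')
export_lag = Gauge('export_lag', 'Seconds between extraction and export')

def run_target(records):
    start_http_server(8001)  # targets expose metrics on port 8001
    for record in records:
        try:
            # ... write the record to the external DB here ...
            lag = (datetime.now(timezone.utc) - record.time_extracted).total_seconds()
            export_lag.set(lag)  # export_lag = now (UTC) - time_extracted
        except Exception:
            export_errors.inc()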

State management for connectors

We support state management for connectors. Please refer here for a comprehensive guide on managing state with connectors.
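
As a rough illustration of the Singer state mechanism that underpins this, using pipelinewise-singer-python (the stream name, bookmark layout, and updated_at column are assumptions; see the guide for the actual conventions):

# Illustrative sketch of Singer state handling; bookmark layout is an assumption.
import singer  # provided by pipelinewise-singer-python

state = {'bookmarks': {'public-users': {'last_replicated_at': None}}}

def sync_stream(rows):
    for row in rows:
        singer.write_record('public-users', row)
        state['bookmarks']['public-users']['last_replicated_at'] = row['updated_at']
    # Emit a STATE message so an interrupted run can resume from this point.
    singer.write_state(state)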

Sample Connectors
