C8 Connector Interface
Implementing C8 Connectors
Users can extend the `C8Connector` interface to develop three types of connectors:
- Source Connectors (Connectors that ingest data)
- Target Connectors (Connectors that export data)
- Integration Connectors (Generic integrations for other services)
Developers must adhere to the guidelines below when building these connectors.
Naming the Connector
- The package name of the connector must follow the `macrometa-{type}-{connector}` format (e.g. `macrometa-source-postgres`).
- The module name of the connector must follow the `macrometa_{type}_{connector}` format (e.g. `macrometa_source_postgres`).
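The two naming rules differ only in the separator, so the module name can be derived from the package name. The following is a minimal sketch of that check, not part of `c8connector`: the helper name `module_name_for` and the list of type tokens are assumptions (only `source` appears in the examples above; `target` and `integration` are inferred from the three connector types).

```python
import re

# Assumption: the {type} token is one of these three values.
CONNECTOR_TYPES = ("source", "target", "integration")

# macrometa-{type}-{connector}, lowercase, hyphen-separated.
_PACKAGE_RE = re.compile(
    r"^macrometa-(?:" + "|".join(CONNECTOR_TYPES) + r")-[a-z0-9]+(?:-[a-z0-9]+)*$"
)


def module_name_for(package_name: str) -> str:
    """Validate a package name and return the matching module name."""
    if _PACKAGE_RE.match(package_name) is None:
        raise ValueError(
            f"{package_name!r} does not follow macrometa-{{type}}-{{connector}}"
        )
    # The module name uses underscores where the package name uses hyphens.
    return package_name.replace("-", "_")
```

For example, `module_name_for("macrometa-source-postgres")` yields the module name used in the project structure below, and a name without the `macrometa-` prefix raises `ValueError`.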
Project structure (package names and structure)
- Project source code must follow the structure below.
.
├── LICENSE
├── README.md
├── GETTING_STARTED.md
├── macrometa_{type}_{connector}
│   ├── __init__.py
│   ├── main.py
│   └── {other source files or modules}
├── pyproject.toml
└── setup.cfg
- Within `/macrometa_{type}_{connector}/__init__.py` there must be a class that implements the `C8Connector` interface.
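The layout above can be checked mechanically before publishing a connector. This is a small sketch using only the standard library; the function name `missing_paths` and the two constant tuples are hypothetical helpers, and the check covers only the files named in the tree, not the requirement that `__init__.py` contains a `C8Connector` implementation.

```python
from pathlib import Path
from typing import List

# Files the structure above expects at the project root.
REQUIRED_TOP_LEVEL = (
    "LICENSE",
    "README.md",
    "GETTING_STARTED.md",
    "pyproject.toml",
    "setup.cfg",
)
# Files the structure above expects inside the connector module.
REQUIRED_IN_MODULE = ("__init__.py", "main.py")


def missing_paths(project_root: Path, module_name: str) -> List[str]:
    """Return the expected relative paths that are not present as files."""
    expected = list(REQUIRED_TOP_LEVEL)
    expected += [f"{module_name}/{name}" for name in REQUIRED_IN_MODULE]
    return [rel for rel in expected if not (project_root / rel).is_file()]
```

Calling `missing_paths(Path("."), "macrometa_source_postgres")` from the project root returns an empty list when every expected file is in place.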
Dependencies and library versions
- Connectors must use only the following dependencies, at the versions listed.
python = ">=3.7"
c8connector = "latest"
pipelinewise-singer-python = "1.2.0"
- Developers must not use `singer-sdk` or any Singer SDK variant other than `pipelinewise-singer-python`.
Connector specific documentation
- Every connector project should have a GETTING_STARTED.md file that documents the connector configuration and all other requirements for the connector. Write it as a user-facing document that gives end users the instructions they need to use the connector.
Developers can start from the Generic Template available here and adapt it as needed for the specific connector.
Resolving reserved key conflicts between Macrometa collection and external DB
- For Source connectors: A Macrometa collection (document) has the reserved keys `_key`, `_id` and `_rev`. `_key` is the primary key, so it takes the value of the primary key of the source data, while `_id` and `_rev` are always autogenerated. If `_key`, `_id` or `_rev` also exist in the source data (assuming `_key` is not the primary key of the source data), their values from the source data would be lost. To avoid this, append an additional `_` to each reserved key that is present in the source data. If `_key` is itself the primary key of the source data, there is no need to append `_` to `_key`. Also check that the newly generated key does not already exist in the source data; if it does, keep appending `_`.
During the actual workflow run, this logic is implemented at the target level (macrometa-target-collection), but source connectors must implement the same logic for the `samples` and `schemas` API. Refer to [PR](https://github.com/Macrometacorp/macrometa-source-postgres/).
- For Target connectors: The same reserved key conflict can arise for target connectors. The external database might have fixed reserved keys, which may be its primary key, autogenerated keys or internal keys. If such reserved keys also exist in the source collection, their values from the source collection will be lost in the target data.
In such cases, first specify all the reserved keys as a list of strings in the `reserved_keys` field of the target connector. If there is a fixed primary key, it must always be the first element of the list. If there is no fixed primary key but there are other reserved keys, the first element must be an empty string followed by the reserved keys, for example `["", "reservedkey1", "reservedkey2"]`. If no reserved keys exist, return an empty list `[]`. Refer to [PR](https://github.com/Macrometacorp/macrometa-target-collection/pull/9).
In addition, the target connector must append `_` to the reserved keys (only if they exist in the source collection) before writing the data to the external DB. If `_key` is also the reserved primary key of the target external DB, there is no need to append `_` to this reserved primary key. Also check that the newly generated key does not already exist in the source collection; if it does, keep appending `_`. Refer to [PR](https://github.com/Macrometacorp/macrometa-target-collection/pull/10).
NOTE: This is applicable only when certain keys are reserved in the external database.
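The renaming rule described above (append `_` to a conflicting reserved key, repeat until the new name is unused, and leave the primary key alone when it is exempt) can be sketched as follows. This is an illustration under stated assumptions, not the code from the referenced PRs: the function name `rename_reserved_keys` and its parameters are hypothetical, and the caller decides which key, if any, is the exempt primary key.

```python
from typing import Any, Dict, Iterable, Optional


def rename_reserved_keys(
    record: Dict[str, Any],
    reserved_keys: Iterable[str],
    primary_key: Optional[str] = None,
) -> Dict[str, Any]:
    """Return a copy of `record` with conflicting reserved keys renamed."""
    # An empty string is only a "no fixed primary key" placeholder, so drop it.
    reserved = {key for key in reserved_keys if key}
    if primary_key:
        # The exempt primary key keeps its name.
        reserved.discard(primary_key)

    taken = set(record)  # names a renamed key must not collide with
    renamed: Dict[str, Any] = {}
    for key, value in record.items():
        new_key = key
        if key in reserved:
            new_key = key + "_"
            while new_key in taken:  # keep appending "_" until unused
                new_key += "_"
            taken.add(new_key)
        renamed[new_key] = value
    return renamed
```

A source connector would pass `("_key", "_id", "_rev")` as `reserved_keys` and set `primary_key="_key"` only when `_key` is the primary key of the source data. A target connector would pass its own `reserved_keys` list instead.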
Adding metrics to the connectors
- For Source connectors: We support the following ingest metrics for source connectors: `ingested_bytes`, `ingested_documents`, `ingest_errors` and `ingest_lag`. Of these four metrics, the source connector only needs to increment `ingest_errors` whenever there is an error, and start the Prometheus client HTTP server on port 8000. The rest are calculated at the Macrometa target collection level. However, to calculate `ingest_lag`, the source connector must send the current timestamp in the `time_extracted` property of the Singer Record Message. You can refer to other source connectors, for example: https://github.com/Macrometacorp/macrometa-source-postgres/pull/14/files
- For Target connectors: We support the following export metrics for target connectors: `exported_bytes`, `exported_documents`, `export_errors` and `export_lag`. Of these four metrics, `exported_documents` and `exported_bytes` are calculated at the Macrometa source collection level. The target connector must increment `export_errors` whenever there is an error, start the Prometheus client HTTP server on port 8001, and calculate `export_lag`. `export_lag` is the time difference in seconds between the `time_extracted` property of the Singer Record Message, which is sent by the Macrometa source collection connector, and the current timestamp (UTC). You can refer to other target connectors, for example: https://github.com/Macrometacorp/macrometa-target-postgres/pull/31/files
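The lag calculation depends only on `time_extracted`, so it can be illustrated with the standard library alone. In this sketch the RECORD message is a plain dictionary standing in for the message object that `pipelinewise-singer-python` provides, and the function names are hypothetical. The error counters and the Prometheus HTTP server are left out, and a timestamp without an offset is assumed to be UTC.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def build_record_message(
    stream: str, record: Dict[str, Any], now: Optional[datetime] = None
) -> Dict[str, Any]:
    """Source side: stamp a Singer RECORD message with the extraction time."""
    now = now or datetime.now(timezone.utc)
    return {
        "type": "RECORD",
        "stream": stream,
        "record": record,
        "time_extracted": now.isoformat(),
    }


def parse_time_extracted(value: str) -> datetime:
    """Parse an ISO 8601 timestamp into a timezone-aware datetime."""
    if value.endswith("Z"):
        # Older Python versions do not accept a trailing "Z" in fromisoformat.
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return parsed


def export_lag_seconds(time_extracted: str, now: Optional[datetime] = None) -> float:
    """Target side: seconds between time_extracted and the current UTC time."""
    now = now or datetime.now(timezone.utc)
    return (now - parse_time_extracted(time_extracted)).total_seconds()
```

The `now` parameter exists only so the functions can be exercised with fixed timestamps; in a connector both sides would use the current UTC time.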
State management for connectors
Connectors support state management. Please refer here for a comprehensive guide on managing state in connectors.
Sample Connectors
- Postgres Source Connector: Git Repository
- Oracle Source Connector: Git Repository
- C8 Collections Target Connector: Git Repository