Python package for replicating data across CDF tenants. Copyright 2021 Cognite AS

Cognite Python Replicator

Cognite Replicator is a Python package for replicating data across Cognite Data Fusion (CDF) projects. This package is built on top of the Cognite Python SDK.

Copyright 2019 Cognite AS

Prerequisites

In order to start using the Replicator, you need:

  • Python 3 (>= 3.6)
  • Two API keys, one for your source tenant and one for your destination tenant. Never include an API key directly in your code or upload it to GitHub. Instead, set each key as an environment variable.

This is how you set the API keys as environment variables on macOS and Linux:

$ export COGNITE_SOURCE_API_KEY=<your source API key>
$ export COGNITE_DESTINATION_API_KEY=<your destination API key>
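
Once the variables are exported, it can help to fail early when one of them is missing rather than passing an empty key to the SDK. The following is a minimal sketch of such a check; `require_env` is a hypothetical helper written for this example and is not part of cognite-replicator or the Cognite SDK.

```python
import os


def require_env(name, environ=None):
    """Return the value of an environment variable, or raise if it is unset or empty.

    Hypothetical helper for illustration; `environ` defaults to os.environ
    and can be replaced with a plain dict when testing.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

In a script, you might call `require_env("COGNITE_SOURCE_API_KEY")` and `require_env("COGNITE_DESTINATION_API_KEY")` before creating any clients.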

Installation

The replicator is available on PyPI, and can also be executed as a standalone script.

To install it and run it from the command line:

pip install cognite-replicator
python -m cognite.replicator config/filepath.yml

If no file is specified, the replicator uses config/default.yml.
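
The fallback behaviour described above can be pictured with a short sketch. This is not the replicator's actual implementation; `resolve_config_path` and `DEFAULT_CONFIG` are hypothetical names used only to illustrate "first argument, else default".

```python
import sys

# Default path as stated in this README; the constant name is an assumption.
DEFAULT_CONFIG = "config/default.yml"


def resolve_config_path(argv):
    """Return the first positional argument, or the default path when none is given."""
    return argv[1] if len(argv) > 1 else DEFAULT_CONFIG


if __name__ == "__main__":
    print(resolve_config_path(sys.argv))
```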

Alternatively, build and run it as a Docker container. The image is available on Docker Hub:

docker build -t cognite-replicator .
docker run -it cognite-replicator

For Databricks, you can install it on a cluster. First, click Libraries, then Install New. Choose PyPI as the library type and enter cognite-replicator as the package. Let the new library install and you are ready to replicate!

Usage

Set up as a Python library

import os

from cognite.client import CogniteClient

SRC_API_KEY = os.environ.get("COGNITE_SOURCE_API_KEY")
DST_API_KEY = os.environ.get("COGNITE_DESTINATION_API_KEY")
PROJECT_SRC = "Name of source tenant"
PROJECT_DST = "Name of destination tenant"
CLIENT_NAME = "cognite-replicator"
BATCH_SIZE = 10000  # maximum size of a batch to be posted
NUM_THREADS = 10  # maximum number of threads to use
SRC_BASE_URL = "https://api.cognitedata.com"
DST_BASE_URL = "https://api.cognitedata.com"
TIMEOUT = 90

if __name__ == "__main__":  # this guard is necessary because of threading
    from cognite.replicator import assets, events, files, time_series, datapoints

    CLIENT_SRC = CogniteClient(api_key=SRC_API_KEY, project=PROJECT_SRC, base_url=SRC_BASE_URL, client_name=CLIENT_NAME)
    CLIENT_DST = CogniteClient(api_key=DST_API_KEY, project=PROJECT_DST, base_url=DST_BASE_URL, client_name=CLIENT_NAME, timeout=TIMEOUT)

    assets.replicate(CLIENT_SRC, CLIENT_DST)
    events.replicate(CLIENT_SRC, CLIENT_DST, BATCH_SIZE, NUM_THREADS)
    files.replicate(CLIENT_SRC, CLIENT_DST, BATCH_SIZE, NUM_THREADS)
    time_series.replicate(CLIENT_SRC, CLIENT_DST, BATCH_SIZE, NUM_THREADS)
    datapoints.replicate(CLIENT_SRC, CLIENT_DST)
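
To give a feel for what BATCH_SIZE and NUM_THREADS control, here is a minimal sketch of splitting items into batches and posting them with a thread pool. It is an illustration under assumptions, not the replicator's internals: `chunk`, `post_in_batches`, and the `post_batch` callback are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk(items, batch_size):
    """Split a list into consecutive batches of at most batch_size items."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def post_in_batches(items, post_batch, batch_size, num_threads):
    """Call post_batch on each batch using at most num_threads worker threads.

    Results are returned in batch order, because Executor.map preserves
    the order of its inputs.
    """
    batches = chunk(items, batch_size)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(post_batch, batches))
```

With a batch size of 10000, for example, 25000 items would be posted as three batches of 10000, 10000, and 5000 items.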

Run it from a Databricks notebook

import logging

from cognite.client import CogniteClient
from cognite.replicator import assets, configure_databricks_logger

SRC_API_KEY = dbutils.secrets.get("cdf-api-keys", "source-tenant")
DST_API_KEY = dbutils.secrets.get("cdf-api-keys", "destination-tenant")

CLIENT_SRC = CogniteClient(api_key=SRC_API_KEY, client_name="cognite-replicator")
CLIENT_DST = CogniteClient(api_key=DST_API_KEY, client_name="cognite-replicator")

configure_databricks_logger(log_level=logging.INFO)
assets.replicate(CLIENT_SRC, CLIENT_DST)

Changelog

Wondering about upcoming or previous changes? Take a look at the CHANGELOG.

Contributing

Want to contribute? Check out CONTRIBUTING.
