A project manager for Python-based extractors
cogex
cogex is a tool for managing Python-based extractors for Cognite Data Fusion. It provides utilities for initializing a new extractor project and for building self-contained executables of Python-based extractors.
Important note for users running pyenv
pyenv is a neat tool for managing Python installations. Since cogex uses PyInstaller to build executables, we need Python to be installed with a shared instance of libpython, which pyenv does not do by default. To fix this, make sure to add the --enable-shared flag when installing new Python versions with pyenv, like so:
env PYTHON_CONFIGURE_OPTS="--enable-shared" pyenv install 3.9.0
You can read more about it in the PyInstaller documentation.
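To check whether an existing Python installation was built with a shared libpython, you can query the build configuration from the interpreter itself. The following is a minimal sketch and not part of cogex. The helper name has_shared_libpython is made up for illustration, and it relies on the Py_ENABLE_SHARED build variable, which is only populated on Unix-like builds (on Windows it is typically None, so the check is not meaningful there).

```python
import sysconfig


def has_shared_libpython() -> bool:
    """Return True if this interpreter was configured with --enable-shared."""
    # Py_ENABLE_SHARED is 1 for shared builds and 0 otherwise; it may be None
    # on platforms where the variable is not defined (for example Windows).
    return bool(sysconfig.get_config_var("Py_ENABLE_SHARED"))


if __name__ == "__main__":
    if has_shared_libpython():
        print("This Python was built with a shared libpython.")
    else:
        print("No shared libpython detected; consider reinstalling with --enable-shared.")
```

Running this with the interpreter you intend to use for cogex build should tell you whether a reinstall through pyenv is needed.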
Overview of features
Start a new extractor project
To start a new extractor project, move to the desired directory and run
cogex init
You will first be prompted for some information, after which cogex will initialize the new project.
Add dependencies
Extractor projects initiated with cogex will use poetry for managing dependencies. Running cogex init will automatically install the Cognite SDK and extractor-utils framework, but if your extractor needs any other dependency, simply add it using poetry, like so:
poetry add requests
Type checking and code style
It is recommended that you run code checkers on your extractor, in particular:
- black is an opinionated code formatter that enforces a consistent code style throughout your project. This helps avoid unnecessary changes and keeps PR diffs small.
- isort is a tool that sorts your imports, which also contributes to a consistent code style and minimal PR diffs.
- mypy is a static type checker for Python. It catches type errors in your code that would otherwise go unnoticed until they suddenly break your extractor in production.
cogex will install all of these and automatically run them on every commit. If you for some reason need to perform a commit despite one of these failing, you can run git commit --no-verify, although this is not recommended.
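As an illustration of the kind of mistake mypy is meant to catch, consider the sketch below. The function name and the config value it parses are hypothetical and not taken from cogex or extractor-utils.

```python
from typing import Optional


def parse_interval(raw: Optional[str], default: int = 60) -> int:
    """Parse an interval in seconds from an optional config value."""
    # Without this None check, mypy would flag the call to raw.strip(),
    # because raw may be None. At runtime the same mistake would only
    # surface once a deployment actually omits the value.
    if raw is None:
        return default
    return int(raw.strip())


if __name__ == "__main__":
    print(parse_interval(" 30 "))  # prints 30
    print(parse_interval(None))    # prints 60
```

Removing the None check leaves code that still runs fine in most tests, which is exactly the class of error a static checker is useful for.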
Build and package an extractor project
Packaging a binary of your extractor
It is not always possible to rely on a Python installation on the machine where your extractor will be deployed. For those scenarios it is useful to package the extractor, including its dependencies and the Python runtime, into a single self-contained executable. To do this, run
cogex build
This will create a new executable (for the operating system you ran cogex build from) in the dist directory.
Making docker images
To build a docker image, you first need to add a [tool.cogex.docker] section to your pyproject.toml file. The required fields are
- tags: A list of tags to tag the resulting image with. These support some simple templating: if you include {version} in your tag, it will be replaced with the current version of the extractor, and {major} will be replaced with the current major version.
- entrypoint: If your [tool.poetry.scripts] includes multiple entries, you need to specify which one to use in the docker image with the entrypoint field.
In addition, there are some optional fields:
- base-image: Which base image to use. By default, the debian-slim based python image for the python version currently running will be chosen.
- install-dir: Where in the image the extractor should be installed.
- preamble: Additional Dockerfile statements to run at the beginning of the dockerfile.
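The field rules above can be summarized in a small validation sketch. This is an illustration only and not how cogex itself checks the configuration. The function name and the messages are assumptions, and the two dictionaries stand in for the parsed [tool.cogex.docker] and [tool.poetry.scripts] tables.

```python
from typing import Any, Dict, List


def check_docker_config(docker: Dict[str, Any], scripts: Dict[str, str]) -> List[str]:
    """Return a list of problems found in a [tool.cogex.docker] table."""
    problems: List[str] = []

    # tags is the one field that is always required
    if not isinstance(docker.get("tags"), list):
        problems.append("'tags' is required and must be a list")

    # entrypoint is only needed when there are several scripts to choose from
    if len(scripts) > 1 and "entrypoint" not in docker:
        problems.append("'entrypoint' is required when multiple scripts are defined")

    return problems


if __name__ == "__main__":
    docker_table = {"tags": ["cognite/my-extractor:{version}"]}
    scripts_table = {"my-extractor": "my_extractor.__main__:main"}
    print(check_docker_config(docker_table, scripts_table))  # prints []
```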
Minimal example:
[tool.cogex.docker]
tags = ["cognite/my-extractor:{version}"]
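To make the tag templating concrete, here is a rough sketch of how the {version} and {major} placeholders could be expanded. It is not cogex's actual implementation. expand_tags is a hypothetical helper, and it assumes a plain MAJOR.MINOR.PATCH version string.

```python
from typing import List


def expand_tags(tags: List[str], version: str) -> List[str]:
    """Replace {version} and {major} placeholders in each tag."""
    # The major version is assumed to be everything before the first dot.
    major = version.split(".")[0]
    # Tags without placeholders pass through unchanged.
    return [tag.format(version=version, major=major) for tag in tags]


if __name__ == "__main__":
    tags = ["cognite/my-extractor:{version}", "cognite/my-extractor:{major}"]
    print(expand_tags(tags, "1.2.3"))
    # prints ['cognite/my-extractor:1.2.3', 'cognite/my-extractor:1']
```

Tagging with both {version} and {major} lets deployments either pin an exact release or follow the latest release within a major version.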
Larger example (from the DB Extractor):
[tool.cogex.docker]
base-image = "python:3.10"
preamble = """
RUN apt-get update \
&& apt-get dist-upgrade -y dirmngr gnupg gnupg-l10n gnupg-utils gpg gpg-agent gpg-wks-client gpg-wks-server \
&& gpgconf gpgsm gpgv libssl-dev libssl1.1 openssl
RUN apt-get install -y apt-utils build-essential
RUN apt-get install -y unixodbc-dev unixodbc
"""
tags = [
"eu.gcr.io/cognite-registry/db-extractor-base:latest",
"eu.gcr.io/cognite-registry/db-extractor-base:{version}",
"cognite/db-extractor-base:{version}",
]
You can now build and tag docker images with
cogex build --dockerimage
If you just want to see the generated dockerfile, instead run
cogex build --dockerfile
Creating a new version of your extractor
To keep track of which version of the code base is running at a given deployment it is very useful to version your extractor. When releasing a new version, run
poetry version [patch/minor/major]
This automatically bumps the corresponding version number. Note that this only updates the version number in pyproject.toml. When running cogex build, this new version number will be propagated through the rest of the code base.
Any extractor project should follow semantic versioning, which means you should bump
- patch for any minor bug fixes or improvements
- minor for new features or bigger improvements that don't break compatibility
- major for new features or improvements that break compatibility with previous versions, in other words for those scenarios where the new version is not a drop-in replacement for an old version. For example:
  - When adding a new required config field
  - When removing a config field
  - When changing defaults in a way that could break existing deployments
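The effect of each bump on a version number can be sketched as follows. This only illustrates the semantic versioning rule for simple MAJOR.MINOR.PATCH versions. It is not Poetry's implementation, and bump_version is a hypothetical helper.

```python
def bump_version(version: str, part: str) -> str:
    """Bump one part of a MAJOR.MINOR.PATCH version string."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        # Breaking change: reset minor and patch
        return f"{major + 1}.0.0"
    if part == "minor":
        # Backwards-compatible feature: reset patch
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        # Bug fix or small improvement
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


if __name__ == "__main__":
    print(bump_version("1.1.1", "patch"))  # prints 1.1.2
    print(bump_version("1.1.1", "minor"))  # prints 1.2.0
    print(bump_version("1.1.1", "major"))  # prints 2.0.0
```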