dlt is an open-source, Python-native, scalable data loading framework that does not require any DevOps effort to run.
data load tool (dlt)
import dlt
from chess import chess # a utility function that grabs data from the chess.com API
# create a dlt pipeline that will load chess game data to the DuckDB destination
pipeline = dlt.pipeline(
    pipeline_name='chess_pipeline',
    destination='duckdb',
    dataset_name='games_data'
)
# use chess.com API to grab data about a few players
data = chess(['magnuscarlsen', 'rpragchess'], start_month='2022/11', end_month='2022/12')
# extract, normalize, and load the data
pipeline.run(data)
data load tool (dlt) is a simple, open source Python library that makes data loading easy
- Automatically turn the JSON returned by any API into a live dataset stored wherever you want it
- pip install python-dlt and then include import dlt to use it in your Python loading script
- The dlt library is licensed under the Apache License 2.0, so you can use it for free forever
Read more about it on the dlt Docs
semantic versioning
python-dlt will follow semantic versioning with the MAJOR.MINOR.PATCH pattern. Currently we do pre-release versioning with the major version being 0.
- a minor version change means breaking changes
- a patch version change means new features that should be backward compatible
- any suffix change, e.g. a10 -> a11, is a patch
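The rules above can be sketched as a small classifier. This is only an illustration: the helper names parse_version and classify_change are hypothetical and not part of python-dlt, and the parser assumes plain version strings of the form 0.2.0 or 0.2.0a32.

```python
import re
from typing import Optional, Tuple

# MAJOR.MINOR.PATCH with an optional pre-release suffix such as "a32".
# Assumption: only this simple shape is handled, not every valid version string.
_VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)(?:([a-z]+)(\d+))?")


def parse_version(version: str) -> Tuple[int, int, int, Optional[Tuple[str, int]]]:
    """Split a version string into (major, minor, patch, suffix)."""
    match = _VERSION_RE.fullmatch(version)
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    major, minor, patch, tag, number = match.groups()
    suffix = (tag, int(number)) if tag is not None else None
    return int(major), int(minor), int(patch), suffix


def classify_change(old: str, new: str) -> str:
    """Describe the change between two 0.x versions under the rules above."""
    old_major, old_minor, old_patch, old_suffix = parse_version(old)
    new_major, new_minor, new_patch, new_suffix = parse_version(new)
    if (old_major, new_major) != (0, 0):
        raise ValueError("these rules only describe the 0.x pre-release series")
    if old_minor != new_minor:
        # A minor version change means breaking changes.
        return "breaking"
    if old_patch != new_patch or old_suffix != new_suffix:
        # A patch change, or any suffix change, counts as a patch.
        return "backward-compatible"
    return "unchanged"
```

For example, classify_change("0.2.0a10", "0.2.0a11") reports a backward-compatible change, while a move from 0.2.x to 0.3.x is reported as breaking.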
development
python-dlt uses poetry to manage, build and version the package. It also uses make to automate tasks. To start:
make install-poetry # will install poetry, to be run outside virtualenv
then
make dev # will install all deps including dev
Executing poetry shell and working in it is very convenient at this moment.
python version
Use Python 3.8 for development, which is the lowest version supported by python-dlt. You'll need distutils and venv:
sudo apt-get install python3.8
sudo apt-get install python3.8-distutils
sudo apt install python3.8-venv
You may also use pyenv, as poetry suggests.
bumping version
Please use poetry version prerelease to bump the patch and then make build-library to apply the changes. The source of the version is pyproject.toml and we use poetry to manage it.
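To make the effect of a pre-release bump concrete, here is a minimal sketch of what happens to the version string. It is not how poetry implements the command: bump_prerelease is a hypothetical helper, and it assumes the version already carries a suffix such as a32.

```python
import re


def bump_prerelease(version: str) -> str:
    """Increment the numeric part of a pre-release suffix, e.g. a10 -> a11.

    Assumption: the version looks like MAJOR.MINOR.PATCH followed by a
    letter tag and a number. Anything else is rejected.
    """
    match = re.fullmatch(r"(\d+\.\d+\.\d+)([a-z]+)(\d+)", version)
    if match is None:
        raise ValueError(f"no pre-release suffix in {version!r}")
    base, tag, number = match.groups()
    return f"{base}{tag}{int(number) + 1}"
```

Calling bump_prerelease("0.2.0a32") gives the next version string in the series. The real source of truth stays in pyproject.toml, so use poetry for the actual bump.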
testing and linting
python-dlt uses mypy and flake8 with several plugins for linting. We do not reorder imports or reformat code.
pytest is used as the test harness. make test-common will run tests of common components and does not require any external resources.
testing destinations
To test destinations use make test. You will need the following external resources:
- BigQuery project
- Redshift cluster
- Postgres instance. You can find a docker compose for the Postgres instance in tests/load/postgres/. When run, the instance is configured to work with the tests.
cd tests/load/postgres/
docker-compose up --build -d
See tests/.example.env for the expected environment variables and a command line example to run the tests. Then create tests/.env from it. You configure the tests as you would configure the dlt pipeline.
We'll provide you with access to the resources above if you wish to test locally.
To test local destinations (duckdb and postgres) run make test-local. You can run these tests without additional credentials (just copy .example.env into .env).
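The copy step can also be scripted. The sketch below uses a hypothetical helper, ensure_env_file, and assumes both files live in the same directory (tests/ in this repository). It leaves an existing .env untouched so local credentials are not overwritten.

```python
import shutil
from pathlib import Path


def ensure_env_file(tests_dir: Path) -> Path:
    """Create tests_dir/.env from tests_dir/.example.env unless it already exists."""
    example = tests_dir / ".example.env"
    target = tests_dir / ".env"
    if not target.exists():
        # copyfile raises FileNotFoundError if the example file is missing
        shutil.copyfile(example, target)
    return target
```

Run it as ensure_env_file(Path("tests")) from the repository root before make test-local.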
publishing
- Make sure that you are on the devel branch and you have the newest code that passed all tests on CI.
- Verify the current version with poetry version
- You'll need a pypi access token: use poetry config pypi-token.pypi your-api-token, then make publish-library
- Make a release on github, use version and git tag as release name
contributing
To contribute via pull request:
- Create an issue with your idea for a feature etc.
- Write your code and tests
- Lint your code with make lint. Test the common modules with make test-common
- If you work on destination code, contact us to get access to test destinations
- Create a pull request