# Viadot

Documentation: https://dyvenia.github.io/viadot/

Source Code: https://github.com/dyvenia/viadot

A simple data ingestion library to guide data flows from some places to other places.
## Getting Data from a Source

Viadot supports several API and RDBMS sources, both private and public. The examples below use the public UK Carbon Intensity API.
```python
from viadot.sources.uk_carbon_intensity import UKCarbonIntensity

# Query the current carbon intensity and load the result into a DataFrame.
ukci = UKCarbonIntensity()
ukci.query("/intensity")
df = ukci.to_df()

print(df)
```
Output:

|   | from              | to                | forecast | actual | index    |
|---|-------------------|-------------------|----------|--------|----------|
| 0 | 2021-08-10T11:00Z | 2021-08-10T11:30Z | 211      | 216    | moderate |
The above `df` is a pandas `DataFrame` object. It contains data downloaded by `viadot` from the Carbon Intensity UK API.
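Because the result is a regular pandas `DataFrame`, you can post-process it with standard pandas operations. For example (a generic pandas sketch, not viadot-specific; the data below mirrors the output shown above):

```python
import pandas as pd

# A DataFrame with the same shape as the UKCarbonIntensity output above.
df = pd.DataFrame(
    {
        "from": ["2021-08-10T11:00Z"],
        "to": ["2021-08-10T11:30Z"],
        "forecast": [211],
        "actual": [216],
        "index": ["moderate"],
    }
)

# Parse the timestamp columns and compute the absolute forecast error.
df["from"] = pd.to_datetime(df["from"])
df["to"] = pd.to_datetime(df["to"])
df["error"] = (df["actual"] - df["forecast"]).abs()

print(df[["from", "error"]])
```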
## Loading Data to a Source

Depending on the source, `viadot` provides different methods of uploading data. For instance, for SQL sources, this would be bulk inserts. For data lake sources, it would be a file upload. For ready-made pipelines, including data validation steps using `dbt`, see prefect-viadot.
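To illustrate the bulk-insert pattern mentioned above (a generic sketch using pandas and the standard-library SQLite driver, not viadot's own loader; table and column names here are made up for the example):

```python
import sqlite3

import pandas as pd

# Example data to load, mirroring the carbon intensity output above.
df = pd.DataFrame(
    {
        "from_ts": ["2021-08-10T11:00Z"],
        "to_ts": ["2021-08-10T11:30Z"],
        "forecast": [211],
        "actual": [216],
    }
)

# Bulk-insert the whole DataFrame into a SQL table in a single call,
# rather than issuing one INSERT statement per row.
with sqlite3.connect(":memory:") as conn:
    df.to_sql("carbon_intensity", conn, index=False, if_exists="replace")
    rows = conn.execute("SELECT COUNT(*) FROM carbon_intensity").fetchone()[0]
    print(rows)
```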
## Getting started

### Prerequisites

We assume that you have Docker installed.
### Installation

Clone the `2.0` branch, then set up and run the environment:

```bash
git clone https://github.com/dyvenia/viadot.git -b 2.0 && \
cd viadot/docker && \
sh update.sh && \
sh run.sh && \
cd ../
```
### Configuration

In order to start using sources, you must configure them with the required credentials. Credentials can be specified either in the viadot config file (by default, `$HOME/.config/viadot/config.yaml`) or passed directly to each source's `credentials` parameter.

You can find specific information about each source's credentials in the documentation.
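For a rough idea of what such a config file could contain, here is a purely illustrative fragment. The source name, class, and credential keys below are hypothetical; consult the documentation for the actual schema required by each source:

```yaml
# $HOME/.config/viadot/config.yaml -- illustrative only, not the real schema.
sources:
  - my_sql_source:              # hypothetical source name
      class: SQLServer          # hypothetical source class
      credentials:
        user: my_user
        password: "***"
        server: my_server.example.com
        db_name: my_database
```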