A library used to fetch data from deltalake tables locally.

DeltaAgent - Deltalake Agent

This library can be used to fetch data from deltalake tables without depending on a Spark cluster. It is built on the pandas, adlfs and Office365-REST-Python-Client libraries.

Use cases and benefits

To use the library, first install it with pip:

pip install DeltaAgent

The agent needs the datalake account_name and account_key to set up a connection to a Gen2 Azure blob storage account.

from DeltaAgent import DeltaAgent

da = DeltaAgent(account_name="account_name", account_key="account_key")
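Since the account key is a secret, it may be preferable not to hard-code it in scripts. The following is a minimal sketch of reading both values from environment variables before building the agent. The variable names DELTAAGENT_ACCOUNT_NAME and DELTAAGENT_ACCOUNT_KEY and the helpers load_credentials and make_agent are hypothetical and not part of DeltaAgent; only the account_name and account_key keyword arguments come from the example above.

```python
import os
from typing import Mapping, Optional

# Hypothetical environment variable names; pick whatever fits your setup.
ACCOUNT_NAME_VAR = "DELTAAGENT_ACCOUNT_NAME"
ACCOUNT_KEY_VAR = "DELTAAGENT_ACCOUNT_KEY"


def load_credentials(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return the keyword arguments for DeltaAgent, read from the environment."""
    env = os.environ if env is None else env
    missing = [name for name in (ACCOUNT_NAME_VAR, ACCOUNT_KEY_VAR) if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {
        "account_name": env[ACCOUNT_NAME_VAR],
        "account_key": env[ACCOUNT_KEY_VAR],
    }


def make_agent():
    """Build a DeltaAgent from environment credentials (needs DeltaAgent installed)."""
    # Imported lazily so load_credentials can be used and tested on its own.
    from DeltaAgent import DeltaAgent
    return DeltaAgent(**load_credentials())
```

With both variables exported in the shell, da = make_agent() would then replace the hard-coded call above.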

With the connection established, the parse_log_as_df method parses the paths of the valid parquet files and their corresponding partition information. The result is returned as a pandas DataFrame with an additional fetch_data method. At this stage we can inspect the result and use the normal DataFrame loc method for efficient filtering.

df_log = da.parse_log_as_df(container_name='container_name', table_path='deltatable_name')

df_log_filtered = df_log.loc[df_log.partition=='partition_value']
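To give an idea of what "valid parquet files" means, here is a rough sketch of the general Delta Lake log-replay idea using only the standard library. It is not DeltaAgent's actual implementation, which is not shown in this description. It assumes each commit file under _delta_log is newline-delimited JSON containing "add" and "remove" actions, and it ignores checkpoint files. The function name replay_delta_log and the sample commits are made up for illustration.

```python
import json
from typing import Dict, Iterable, List


def replay_delta_log(commits: Iterable[str]) -> List[dict]:
    """Replay commits (oldest first) and return the files that are still active.

    Each commit is the text of one _delta_log/*.json file: one JSON action per line.
    """
    active: Dict[str, dict] = {}
    for commit in commits:
        for line in commit.splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                add = action["add"]
                # Remember the partition values recorded for this file.
                active[add["path"]] = dict(add.get("partitionValues") or {})
            elif "remove" in action:
                # A later remove makes the file invalid for readers.
                active.pop(action["remove"]["path"], None)
            # Other actions (commitInfo, metaData, protocol) leave the file set unchanged.
    return [{**partitions, "path": path} for path, partitions in active.items()]


# Two made-up commits: the second one rewrites the 2024-01-01 partition.
commit_0 = "\n".join([
    json.dumps({"add": {"path": "date=2024-01-01/part-0000.parquet",
                        "partitionValues": {"date": "2024-01-01"}}}),
    json.dumps({"add": {"path": "date=2024-01-02/part-0001.parquet",
                        "partitionValues": {"date": "2024-01-02"}}}),
])
commit_1 = "\n".join([
    json.dumps({"remove": {"path": "date=2024-01-01/part-0000.parquet"}}),
    json.dumps({"add": {"path": "date=2024-01-01/part-0002.parquet",
                        "partitionValues": {"date": "2024-01-01"}}}),
])

rows = replay_delta_log([commit_0, commit_1])
# Partition pruning: keep only the files of one partition before reading any data.
wanted = [row for row in rows if row["date"] == "2024-01-02"]
```

Passing rows to pandas.DataFrame would give one row per active file with a column per partition key, which is the kind of table that loc filtering operates on.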

Calling the fetch_data method on the filtered DataFrame of parquet file paths (df_log_filtered) fetches the actual data from the deltalake table.

df_delta = df_log_filtered.fetch_data()

Note that the values for container_name and table_path can also be assigned when setting up the agent connection, as below:

da = DeltaAgent(account_name="account_name", account_key="account_key", container_name='container_name', table_path='deltatable_name')
