A library used to fetch data from deltalake tables locally.

Project description

DeltaAgent - Deltalake Agent

This library can be used to fetch data from deltalake tables without depending on a Spark cluster. It is built on the pandas, adlfs and Office365-REST-Python-Client libraries.

Use cases and benefits

To use the library, first install it:

pip install DeltaAgent

The agent requires the data lake account_name and account_key to set up the connection to a Gen2 Azure Blob Storage account.

from DeltaAgent import DeltaAgent

da = DeltaAgent(account_name="account_name", account_key="account_key")
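Hard-coding the account key in a script is easy to leak. As a minimal sketch, the credentials could be read from environment variables instead. The variable names DELTA_ACCOUNT_NAME and DELTA_ACCOUNT_KEY and the helper load_credentials are assumptions made for this illustration and are not part of DeltaAgent.

```python
import os


def load_credentials(env=os.environ):
    """Build the DeltaAgent keyword arguments from environment variables.

    DELTA_ACCOUNT_NAME and DELTA_ACCOUNT_KEY are hypothetical names chosen
    for this sketch; use whatever your deployment defines.
    """
    required = ("DELTA_ACCOUNT_NAME", "DELTA_ACCOUNT_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {
        "account_name": env["DELTA_ACCOUNT_NAME"],
        "account_key": env["DELTA_ACCOUNT_KEY"],
    }


# Usage (requires the DeltaAgent package to be installed):
# from DeltaAgent import DeltaAgent
# da = DeltaAgent(**load_credentials())
```

The helper only builds the keyword arguments shown above, so it works unchanged if further arguments are passed to DeltaAgent alongside it.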

With the connection established, the parse_log_as_df method parses the delta log and returns the paths of the valid parquet files together with their partition information. The result is a pandas DataFrame with an additional fetch_data method. At this stage we can inspect the log and use the normal DataFrame loc method to filter it efficiently, before any table data is downloaded.

df_log = da.parse_log_as_df(container_name='container_name', table_path='deltatable_name')

df_log_filtered = df_log.loc[df_log.partition=='partition_value']
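To give a feel for what "valid parquet files" means, here is a rough sketch of how a Delta transaction log is replayed in general: each commit is a file of newline-delimited JSON actions, where an add action registers a data file and a remove action retires one. This is not DeltaAgent's actual implementation, which the description above does not show. The commit contents, file paths and the replay_log helper are made up for illustration, and checkpoints and other action types are ignored.

```python
import json

# Hypothetical contents of two commit files from a table's _delta_log folder,
# one JSON action per line.
COMMITS = [
    # Version 0: two files are written, one per partition.
    '{"add": {"path": "country=NL/part-0000.parquet", "partitionValues": {"country": "NL"}}}\n'
    '{"add": {"path": "country=DE/part-0001.parquet", "partitionValues": {"country": "DE"}}}',
    # Version 1: the NL file is replaced by a new one.
    '{"remove": {"path": "country=NL/part-0000.parquet"}}\n'
    '{"add": {"path": "country=NL/part-0002.parquet", "partitionValues": {"country": "NL"}}}',
]


def replay_log(commits):
    """Replay commits in order and return one row per file that is still valid."""
    active = {}  # path -> partition values
    for commit in commits:
        for line in commit.splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                add = action["add"]
                active[add["path"]] = add.get("partitionValues", {})
            elif "remove" in action:
                active.pop(action["remove"]["path"], None)
    return [{"path": path, **partition} for path, partition in active.items()]


rows = replay_log(COMMITS)
# Filtering on the partition column leaves only the files worth downloading.
nl_files = [row["path"] for row in rows if row.get("country") == "NL"]
print(nl_files)
```

Filtering the log before fetching means only the parquet files of the selected partition need to be read from storage.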

By calling the fetch_data method on the above delta log DataFrame, we can fetch the actual data from a deltalake table.

df_delta = df_log_filtered.fetch_data()

Note that the values for container_name and table_path can also be assigned when setting up the agent connection, as below:

da = DeltaAgent(account_name="account_name", account_key="account_key", container_name='container_name', table_path='deltatable_name')

Download files

Source Distribution

deltaagent-0.0.9.tar.gz (7.0 kB view details)

Uploaded Source

Built Distribution

deltaagent-0.0.9-py3-none-any.whl (10.5 kB view details)

Uploaded Python 3

File details

Details for the file deltaagent-0.0.9.tar.gz.

File metadata

  • Download URL: deltaagent-0.0.9.tar.gz
  • Size: 7.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.5

File hashes

Hashes for deltaagent-0.0.9.tar.gz
Algorithm Hash digest
SHA256 b6830e72b59f48488f1f7429ba224ba027de69550dcba8762998ad402300c197
MD5 0af9a1c42216d4c7f4c46bb74a69677a
BLAKE2b-256 a2aaae56fb2a353a9ae9f46d34fbcf1cc19e5b742747cb545400e6a76eefd7bb

File details

Details for the file deltaagent-0.0.9-py3-none-any.whl.

File metadata

  • Download URL: deltaagent-0.0.9-py3-none-any.whl
  • Size: 10.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.5

File hashes

Hashes for deltaagent-0.0.9-py3-none-any.whl
Algorithm Hash digest
SHA256 aed8d82a65f7268253392709d6b8c9ca8b76e4ae8cf5762d991f1aebefecdac2
MD5 a7887ac1c7d2f5802f5cf36bad9ba2b4
BLAKE2b-256 f7ccebe1b6aafe7ed1a72732884098b1c6c2918c0dabe218e91e5fc7ca518f6f
