A library used to fetch data from deltalake tables locally.

Project description

DeltaAgent - Deltalake Agent

This library can be used to fetch data from deltalake tables without depending on a Spark cluster. It is built on the pandas, adlfs and Office365-REST-Python-Client libraries.

Use cases and benefits

To use the library, first install it with pip:

pip install DeltaAgent

The agent needs the data lake account_name and account_key to set up the connection to a Gen2 Azure blob storage account.

from DeltaAgent import DeltaAgent

da = DeltaAgent(account_name="account_name", account_key="account_key")
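Because the account key is a secret, it is usually better not to hard-code it in a script. The following is a minimal sketch of one way to supply the two values from environment variables. The helper load_storage_credentials and the variable names AZURE_STORAGE_ACCOUNT_NAME and AZURE_STORAGE_ACCOUNT_KEY are assumptions made for this example and are not part of DeltaAgent.

```python
import os


def load_storage_credentials(env=os.environ):
    """Read the storage account name and key from environment variables.

    The variable names below are a convention assumed for this sketch;
    DeltaAgent itself does not require or read them.
    """
    try:
        return {
            "account_name": env["AZURE_STORAGE_ACCOUNT_NAME"],
            "account_key": env["AZURE_STORAGE_ACCOUNT_KEY"],
        }
    except KeyError as missing:
        # Fail early with a readable message instead of a bare KeyError.
        raise RuntimeError(f"missing environment variable: {missing}") from None


# Usage (requires DeltaAgent to be installed and both variables to be set):
# from DeltaAgent import DeltaAgent
# da = DeltaAgent(**load_storage_credentials())
```

The returned dictionary uses the same keyword names as the constructor call above, so it can be unpacked directly into DeltaAgent.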

With the connection agent established, the parse_log_as_df method parses the paths of the valid parquet files and their corresponding partition information. The result is returned as a pandas DataFrame with an additional method, fetch_data. At this stage we can inspect the result and use the normal DataFrame loc method to filter it efficiently before any data is downloaded.

df_log = da.parse_log_as_df(container_name='container_name', table_path='deltatable_name')

df_log_filtered = df_log.loc[df_log.partition=='partition_value']
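To illustrate what "valid parquet files" means, here is a rough sketch of how a Delta transaction log can be replayed using only the standard library. In the Delta Lake format, each commit file under _delta_log holds one JSON action per line; "add" actions register a data file together with its partitionValues, and "remove" actions retire one. This is not DeltaAgent's actual implementation: the function replay_delta_log and the sample commits are hypothetical, and checkpoint files are ignored for brevity.

```python
import json


def replay_delta_log(commits):
    """Return the data files that are still valid after replaying the log.

    `commits` is an iterable of commit-file contents in version order.
    Each commit is newline-delimited JSON with one action per line.
    """
    active = {}  # parquet path -> partition values
    for commit in commits:
        for line in commit.splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                add = action["add"]
                active[add["path"]] = add.get("partitionValues", {})
            elif "remove" in action:
                # A removed file is no longer part of the table.
                active.pop(action["remove"]["path"], None)
            # Other actions (metaData, protocol, commitInfo) are skipped.
    return [{"path": path, **parts} for path, parts in active.items()]


# Hypothetical sample log: two commits, the second rewrites one file.
commit_0 = "\n".join([
    json.dumps({"add": {"path": "date=2024-01-01/part-0.parquet",
                        "partitionValues": {"date": "2024-01-01"}}}),
    json.dumps({"add": {"path": "date=2024-01-02/part-1.parquet",
                        "partitionValues": {"date": "2024-01-02"}}}),
])
commit_1 = "\n".join([
    json.dumps({"remove": {"path": "date=2024-01-01/part-0.parquet"}}),
    json.dumps({"add": {"path": "date=2024-01-01/part-2.parquet",
                        "partitionValues": {"date": "2024-01-01"}}}),
])

valid_files = replay_delta_log([commit_0, commit_1])
```

Each resulting record pairs a file path with its partition values, which is the kind of table that can be filtered by partition before any parquet file is read.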

Calling the fetch_data method on the DataFrame of parquet file paths above fetches the actual data from the deltalake table.

df_delta = df_log_filtered.fetch_data()
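For readers curious about what such a fetch step might involve, the sketch below builds abfs:// URLs for the selected files and reads them with pandas through adlfs. It is an assumption about the general approach, not a description of how fetch_data works internally. The helpers build_parquet_urls and read_files are hypothetical names, and read_files needs pandas, pyarrow and adlfs installed as well as real credentials.

```python
def build_parquet_urls(container_name, table_path, file_paths):
    """Turn log-relative file paths into abfs:// URLs that adlfs understands."""
    base = f"abfs://{container_name}/{table_path.strip('/')}"
    return [f"{base}/{path.lstrip('/')}" for path in file_paths]


def read_files(urls, account_name, account_key):
    """Read each parquet file and concatenate the results into one DataFrame."""
    import pandas as pd  # imported lazily; also needs pyarrow and adlfs

    storage_options = {"account_name": account_name, "account_key": account_key}
    frames = [pd.read_parquet(url, storage_options=storage_options) for url in urls]
    return pd.concat(frames, ignore_index=True)


urls = build_parquet_urls(
    "container_name",
    "deltatable_name",
    ["partition=partition_value/part-0.parquet"],
)
```

Filtering the file list first keeps the number of downloads small, since only the files in the selected partitions are ever read.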

Note that the values for container_name and table_path can also be assigned when setting up the agent connection, as below:

da = DeltaAgent(account_name="account_name", account_key="account_key", container_name='container_name', table_path='deltatable_name')
