A library used to fetch data from deltalake tables locally.

Project description

DeltaAgent - Deltalake Agent

This library can be used to fetch data from Delta Lake tables without depending on a Spark cluster. It is built on the pandas, adlfs and Office365-REST-Python-Client libraries.

Use cases and benefits

To use the library, first install it with:

pip install DeltaAgent

Setting up the connection to a Gen2 Azure Blob Storage account requires the data lake account_name and account_key.

from DeltaAgent import DeltaAgent

da = DeltaAgent(account_name="account_name", account_key="account_key")
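Hard-coding the account key in a script is easy to leak, so one option is to read both values from environment variables and pass them to the constructor. The following is a minimal sketch, not part of DeltaAgent itself: the helper load_credentials and the variable names AZURE_STORAGE_ACCOUNT_NAME and AZURE_STORAGE_ACCOUNT_KEY are assumptions chosen for illustration.

```python
import os

# Hypothetical environment variable names; pick whatever fits your setup.
NAME_VAR = "AZURE_STORAGE_ACCOUNT_NAME"
KEY_VAR = "AZURE_STORAGE_ACCOUNT_KEY"


def load_credentials(env=None):
    """Return the keyword arguments for DeltaAgent from an environment mapping."""
    env = os.environ if env is None else env
    missing = [var for var in (NAME_VAR, KEY_VAR) if not env.get(var)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {"account_name": env[NAME_VAR], "account_key": env[KEY_VAR]}


# Usage, assuming DeltaAgent is installed and the variables are set:
# from DeltaAgent import DeltaAgent
# da = DeltaAgent(**load_credentials())
```

The keyword names account_name and account_key match the constructor call shown above; everything else in the sketch is illustrative.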

With the connection agent established, the parse_log_as_df method parses the paths of the valid parquet files and their corresponding partition information. The result is returned as a pandas DataFrame with an additional method, fetch_data. At this stage we can inspect the log and use the normal DataFrame loc method for efficient filtering.

df_log = da.parse_log_as_df(container_name='container_name', table_path='deltatable_name')

df_log_filtered = df_log.loc[df_log.partition=='partition_value']
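To make "valid parquet files" concrete: a Delta table records every change as JSON actions in its _delta_log directory, where an "add" action registers a data file (with its partitionValues) and a "remove" action retires one. The sketch below replays such actions using only the standard library. It is an illustration of the idea under simplifying assumptions (commits passed in as strings, oldest first, checkpoints ignored), not DeltaAgent's actual implementation; active_files and the sample paths are hypothetical.

```python
import json


def active_files(commits):
    """Replay Delta commit files (oldest first); return {path: partition values}."""
    files = {}
    for commit in commits:
        for line in commit.splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                add = action["add"]
                files[add["path"]] = add.get("partitionValues", {})
            elif "remove" in action:
                # A removed file is no longer part of the current table state.
                files.pop(action["remove"]["path"], None)
    return files


# Two made-up commits: the second one rewrites the 2024-01-01 partition.
commit_0 = "\n".join([
    json.dumps({"add": {"path": "date=2024-01-01/part-0000.parquet",
                        "partitionValues": {"date": "2024-01-01"}}}),
    json.dumps({"add": {"path": "date=2024-01-02/part-0001.parquet",
                        "partitionValues": {"date": "2024-01-02"}}}),
])
commit_1 = "\n".join([
    json.dumps({"remove": {"path": "date=2024-01-01/part-0000.parquet"}}),
    json.dumps({"add": {"path": "date=2024-01-01/part-0002.parquet",
                        "partitionValues": {"date": "2024-01-01"}}}),
])

files = active_files([commit_0, commit_1])

# Filtering on partition values mirrors the loc filter on the log DataFrame.
selected = [path for path, parts in files.items() if parts.get("date") == "2024-01-01"]
print(selected)  # ['date=2024-01-01/part-0002.parquet']
```

Filtering at the log level means only the files of the selected partition need to be read afterwards, which is why the filter is applied before any data is fetched.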

By calling the fetch_data method on the above delta log DataFrame, we can fetch the actual data from a deltalake table.

df_delta = df_log_filtered.fetch_data()

Note that the values for container_name and table_path can also be assigned when setting up the agent connection, as below:

da = DeltaAgent(account_name="account_name", account_key="account_key", container_name='container_name', table_path='deltatable_name')
