A library used to fetch data from deltalake tables locally.
Project description
DeltaAgent - Deltalake Agent
This library can be used to fetch data from deltalake tables without depending on a Spark cluster. It is built on the pandas, adlfs and Office365-REST-Python-Client libraries.
Use cases and benefits
To use the library, first install it with pip:
pip install DeltaAgent
The agent requires the datalake account_name and account_key to set up the connection to a Gen2 Azure blob storage account.
from DeltaAgent import DeltaAgent
da = DeltaAgent(account_name="account_name", account_key="account_key")
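To keep the account key out of source code, the two values can be read from the environment before the agent is created. The following is a minimal sketch: the helper `load_storage_credentials` and the environment variable names are hypothetical and not part of DeltaAgent.

```python
import os


def load_storage_credentials(env=os.environ):
    """Return (account_name, account_key) read from a mapping of environment variables.

    The variable names are assumptions made for this example; use whatever
    names your deployment defines.
    """
    try:
        return env["AZURE_STORAGE_ACCOUNT_NAME"], env["AZURE_STORAGE_ACCOUNT_KEY"]
    except KeyError as missing:
        # Fail early with a readable message instead of a bare KeyError.
        raise RuntimeError(f"missing environment variable: {missing}") from None
```

The returned pair can then be passed to the constructor shown above as account_name and account_key.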
With the connection established, the parse_log_as_df method parses the delta log and returns the paths of the valid parquet files together with their partition information. The result is a pandas DataFrame with an additional fetch_data method. At this stage we can inspect the log and use the normal DataFrame loc method to filter it efficiently before fetching any data.
df_log = da.parse_log_as_df(container_name='container_name', table_path='deltatable_name')
df_log_filtered = df_log.loc[df_log.partition=='partition_value']
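For background on what "valid parquet files" means: a Delta table records its history as JSON commit files under `_delta_log`, where each line is an action such as `add` or `remove`. The sketch below replays those actions with the standard library only. It illustrates the idea and is not DeltaAgent's implementation; the function name `active_files` is made up for this example, and checkpoint files are ignored for brevity.

```python
import json
import tempfile
from pathlib import Path


def active_files(log_dir):
    """Replay Delta commit files in order and return {file path: partition values}."""
    files = {}
    # Commit file names are zero-padded version numbers, so sorting by name
    # yields commit order.
    for commit in sorted(Path(log_dir).glob("*.json")):
        for line in commit.read_text().splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                add = action["add"]
                files[add["path"]] = add.get("partitionValues", {})
            elif "remove" in action:
                # A removed file is no longer part of the current table state.
                files.pop(action["remove"]["path"], None)
    return files


def write_commit(log_dir, version, actions):
    """Write one commit file as newline-delimited JSON (demo helper)."""
    body = "\n".join(json.dumps(a) for a in actions) + "\n"
    (Path(log_dir) / f"{version:020d}.json").write_text(body)


with tempfile.TemporaryDirectory() as tmp:
    log_dir = Path(tmp) / "_delta_log"
    log_dir.mkdir()
    # Commit 0 adds two files; commit 1 removes the first one again.
    write_commit(log_dir, 0, [
        {"add": {"path": "partition=a/part-0.parquet", "partitionValues": {"partition": "a"}}},
        {"add": {"path": "partition=b/part-1.parquet", "partitionValues": {"partition": "b"}}},
    ])
    write_commit(log_dir, 1, [
        {"remove": {"path": "partition=a/part-0.parquet"}},
    ])
    result = active_files(log_dir)

print(result)
```

Only the file that was added and never removed remains in the result, together with its partition values. This is the kind of information that can be filtered on before any parquet file is read.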
By calling the fetch_data method on the above delta log DataFrame, we can fetch the actual data from a deltalake table.
df_delta = df_log_filtered.fetch_data()
Note that the values for container_name and table_path can also be assigned when setting up the agent connection, as below:
da = DeltaAgent(account_name="account_name", account_key="account_key", container_name='container_name', table_path='deltatable_name')
Download files
File details
Details for the file deltaagent-0.0.9.tar.gz.
File metadata
- Download URL: deltaagent-0.0.9.tar.gz
- Upload date:
- Size: 7.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b6830e72b59f48488f1f7429ba224ba027de69550dcba8762998ad402300c197 |
| MD5 | 0af9a1c42216d4c7f4c46bb74a69677a |
| BLAKE2b-256 | a2aaae56fb2a353a9ae9f46d34fbcf1cc19e5b742747cb545400e6a76eefd7bb |
File details
Details for the file deltaagent-0.0.9-py3-none-any.whl.
File metadata
- Download URL: deltaagent-0.0.9-py3-none-any.whl
- Upload date:
- Size: 10.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | aed8d82a65f7268253392709d6b8c9ca8b76e4ae8cf5762d991f1aebefecdac2 |
| MD5 | a7887ac1c7d2f5802f5cf36bad9ba2b4 |
| BLAKE2b-256 | f7ccebe1b6aafe7ed1a72732884098b1c6c2918c0dabe218e91e5fc7ca518f6f |