lakeFS Python SDK Wrapper
lakeFS High-Level Python SDK
The lakeFS High-Level SDK for Python provides developers with the following features:
- Simpler programming interface with less configuration
- Inferring identity from environment
- Better abstractions for common, more complex operations (I/O, transactions, imports)
Requirements
Python 3.9+
Installation & Usage
pip install
pip install lakefs
Import the package
import lakefs
Getting Started
Please follow the installation procedure and afterward refer to the following example snippet for a quick start:
import lakefs
from lakefs.client import Client
# The default client attempts to authenticate with the lakeFS server using configured credentials,
# taken from environment variables or a .lakectl.yaml file if either exists
repo = lakefs.repository(repository_id="my-repo")
# Or explicitly initialize and provide a Client object
clt = Client(username="<lakefs_access_key_id>", password="<lakefs_secret_access_key>", host="<lakefs_endpoint>")
repo = lakefs.Repository(repository_id="my-repo", client=clt)
# From this point, proceed using the package according to documentation
main_branch = repo.create(storage_namespace="<storage_namespace>").branch(branch_id="main")
...
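The credential lookup mentioned in the comments above can be pictured with a small sketch. It assumes the environment variable names used by lakectl (LAKECTL_SERVER_ENDPOINT_URL, LAKECTL_CREDENTIALS_ACCESS_KEY_ID, LAKECTL_CREDENTIALS_SECRET_ACCESS_KEY). The helper client_kwargs_from_env is hypothetical and not part of the package, and the lookup the SDK actually performs may differ.

```python
import os
from typing import Mapping, Optional

# Assumed mapping from the Client keyword arguments used above to the
# environment variables read by lakectl. Verify against your lakeFS version.
ENV_VARS = {
    "host": "LAKECTL_SERVER_ENDPOINT_URL",
    "username": "LAKECTL_CREDENTIALS_ACCESS_KEY_ID",
    "password": "LAKECTL_CREDENTIALS_SECRET_ACCESS_KEY",
}


def client_kwargs_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Hypothetical helper: collect Client keyword arguments from the environment.

    Raises KeyError naming the missing variables if any is unset or empty.
    """
    env = os.environ if env is None else env
    missing = [var for var in ENV_VARS.values() if not env.get(var)]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return {kwarg: env[var] for kwarg, var in ENV_VARS.items()}
```

The resulting dictionary could then be passed on as Client(**client_kwargs_from_env()) when an explicit client is preferred over the default one.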
Examples
Print sizes of all objects in lakefs://repo/main~2
ref = lakefs.Repository("repo").ref("main~2")
for o in ref.objects():
    print(f"{o.path}: {o.size_bytes}")
Difference between two branches
for i in lakefs.Repository("repo").ref("main").diff("twig"):
    print(i)
You can also use ref expressions here; for instance, .diff("main~2") also works. Ref expressions are the lakeFS analogues of how Git specifies revisions.
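To make the "main~2" notation concrete, here is a minimal sketch that splits such an expression into its base ref and the number of commits to step back. It handles only the plain name and the ~N form; split_ref_expression is a hypothetical helper for illustration only. The SDK does not need this parsing, since the expression is passed to it as a string, as in the examples above.

```python
import re
from typing import Tuple

# Matches "<base>" or "<base>~<N>"; other modifiers are deliberately not handled.
_REF_EXPR = re.compile(r"(?P<base>[^~^]+)(?:~(?P<steps>\d+))?")


def split_ref_expression(expr: str) -> Tuple[str, int]:
    """Hypothetical helper: return (base_ref, steps_back) for "base" or "base~N"."""
    match = _REF_EXPR.fullmatch(expr)
    if match is None:
        raise ValueError(f"unsupported ref expression: {expr!r}")
    # A missing "~N" suffix means the base ref itself, i.e. zero steps back.
    return match.group("base"), int(match.group("steps") or 0)
```

Under these assumptions, split_ref_expression("main~2") gives the base "main" and a step count of 2, which reads as "two commits before the head of main".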
Search a stored object for a string
with lakefs.Repository("repo").ref("main").object("path/to/data").reader(mode="r") as f:
    for line in f:
        if "quick" in line:
            print(line)
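Because the loop above only iterates over text lines, the search logic can be factored into a plain function and tried locally without a lakeFS server. The sketch below is one way to do that; find_lines is a hypothetical name, and io.StringIO stands in for the reader.

```python
import io
from typing import Iterable, Iterator, Tuple


def find_lines(lines: Iterable[str], needle: str) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, line) for every line containing needle (1-based)."""
    for number, line in enumerate(lines, start=1):
        if needle in line:
            # Strip the trailing newline so callers can print the line as-is.
            yield number, line.rstrip("\n")


# Local stand-in for the reader; any iterable of text lines works.
sample = io.StringIO("the quick brown fox\njumps over\nthe quick dog\n")
matches = list(find_lines(sample, "quick"))
```

With a reader opened as in the example above, find_lines(f, "quick") could be used the same way, assuming iteration yields text lines as it does there.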
Upload and commit some data
with lakefs.Repository("golden").branch("main").object("path/to/new").writer(mode="wb") as f:
    f.write(b"my data")
# Returns a Reference
lakefs.Repository("golden").branch("main").commit("added my data using lakeFS high-level SDK")
# Prints "my data"
with lakefs.Repository("golden").branch("main").object("path/to/new").reader(mode="r") as f:
    for l in f:
        print(l)
Unlike references, branches are writable. This example would not work if we used a ref.
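The write-then-commit steps above can be wrapped in a small function so the branch object is built once and reused. This is only a sketch: write_and_commit is a hypothetical helper, and it relies solely on the calls shown in the example (object(...).writer(mode="wb") and commit(...)).

```python
def write_and_commit(branch, path: str, data: bytes, message: str):
    """Write data to path on branch, then commit with message.

    `branch` is expected to behave like the object returned by
    lakefs.Repository(...).branch(...) in the example above.
    Returns whatever branch.commit() returns.
    """
    # The writer is used as a context manager so the upload is finished
    # before the commit is attempted.
    with branch.object(path).writer(mode="wb") as f:
        f.write(data)
    return branch.commit(message)
```

For instance, write_and_commit(lakefs.Repository("golden").branch("main"), "path/to/new", b"my data", "added my data") would mirror the steps above.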
Tests
To run the tests using pytest, first clone the lakeFS git repository:
git clone https://github.com/treeverse/lakeFS.git
cd lakeFS/clients/python-wrapper
Unit Tests
Inside the tests folder, execute pytest utests to run the unit tests.
Integration Tests
See the testing documentation for more information.