Pick, choose and chain your RAG tools in a pythonic way
wurzel
wurzel is the German word for root. The framework aims to combine the best tools from the retrieval domain and beyond.
wurzel is an open-source Python library built to address advanced Extract, Transform, Load (ETL) needs for Retrieval-Augmented Generation (RAG) systems. It is designed to streamline ETL processes while offering essential features like multi-tenancy, cloud-native deployment support, and job scheduling.
The repository includes initial implementations for widely-used frameworks in the RAG ecosystem, such as Qdrant, Milvus, and Hugging Face, providing users with a strong starting point for building scalable and efficient RAG pipelines.
Features
- Advanced ETL Pipelines: Tailored for the specific needs of RAG systems.
- Multi-Tenancy: Easily manage multiple tenants or projects within a single system.
- Cloud-Native Deployment: Designed for seamless integration with Kubernetes, Docker, and other cloud platforms.
- Scheduling Capabilities: Schedule and manage ETL tasks using built-in or external tools.
- Framework Integrations: Pre-built support for popular tools like Qdrant, Milvus, and Hugging Face.
- Type Security: Data types are validated with pydantic and pandera.
Installation
To get started with wurzel, install the library using pip:
pip install wurzel
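To confirm the installation from Python, one option is to query the installed distribution metadata. The sketch below uses only the standard library; the helper name `installed_version` is hypothetical and not part of wurzel.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("wurzel")
    print(f"wurzel version: {found}" if found else "wurzel is not installed")
```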
Run a Step (Two Ways)
1. CLI-based Execution
Run a step using the CLI:
wurzel run <step_file_path> --inputs ./data --output ./out
To inspect the step requirements:
wurzel inspect wurzel.<step_path>
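If the CLI call needs to be scripted, it can be assembled as an argument list and passed to `subprocess`. This is a minimal sketch that uses only the `run` subcommand and the `--inputs` and `--output` flags shown above; the helper names `build_run_command` and `run_step` are hypothetical, and the sketch assumes the `wurzel` executable is on the `PATH`.

```python
import subprocess
from pathlib import Path


def build_run_command(step: str, inputs: Path, output: Path) -> list[str]:
    """Build the argument list for `wurzel run` with the flags from the example above."""
    return ["wurzel", "run", step, "--inputs", str(inputs), "--output", str(output)]


def run_step(step: str, inputs: Path, output: Path) -> None:
    """Run the step through the CLI; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_run_command(step, inputs, output), check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.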
2. Programmatic Execution (Python)
Run a step using the snippet below:
import os
from pathlib import Path
from wurzel.step_executor import BaseStepExecutor
from wurzel.steps.manual_markdown import ManualMarkdownStep
# Create input dir and set folder (required by ManualMarkdownStep)
input_dir = Path("./input")
input_dir.mkdir(exist_ok=True)
abs_input = input_dir.resolve()
os.environ["MANUALMARKDOWNSTEP__FOLDER_PATH"] = str(abs_input)
with BaseStepExecutor() as ex:
    ex(ManualMarkdownStep, {abs_input}, Path("./output"))
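The snippet above starts from an empty input folder. As a sketch, the helper below creates one sample file and exports the folder path before the executor runs. It assumes that `ManualMarkdownStep` reads Markdown files from that folder, which the text above does not state explicitly; `prepare_input`, `run_manual_markdown`, and the file name `example.md` are hypothetical.

```python
import os
from pathlib import Path


def prepare_input(folder: Path, name: str = "example.md") -> Path:
    """Create the input folder with one sample Markdown file and export its path."""
    folder.mkdir(parents=True, exist_ok=True)
    sample = folder / name
    sample.write_text("# Example\n\nSample content for the step.\n", encoding="utf-8")
    # Environment variable name taken from the snippet above.
    os.environ["MANUALMARKDOWNSTEP__FOLDER_PATH"] = str(folder.resolve())
    return sample


def run_manual_markdown(input_dir: Path, output_dir: Path) -> None:
    """Run ManualMarkdownStep as in the snippet above (requires wurzel to be installed)."""
    # Imported lazily so that prepare_input stays usable without wurzel.
    from wurzel.step_executor import BaseStepExecutor
    from wurzel.steps.manual_markdown import ManualMarkdownStep

    prepare_input(input_dir)
    with BaseStepExecutor() as ex:
        ex(ManualMarkdownStep, {input_dir.resolve()}, output_dir)
```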
Building your own step
For detailed instructions and examples on how to use wurzel, please refer to our official documentation.
Code of Conduct
This project has adopted version 2.1 of the Contributor Covenant as its code of conduct. Please see the details in our CODE_OF_CONDUCT.md.
By participating in this project, you agree to abide by the code of conduct at all times.
Licensing
This project follows the REUSE standard for software licensing. Each file contains copyright and license information, and license texts can be found in the ./LICENSES folder. For more information visit https://reuse.software/. You can find a guide for developers at https://telekom.github.io/reuse-template/.
Download files
File details
Details for the file wurzel-2.5.1.tar.gz.
File metadata
- Download URL: wurzel-2.5.1.tar.gz
- Upload date:
- Size: 10.3 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 867d277f85c548643c0899eba80e40c416d594b5adb8dc53c31369f1a40e0fa0 |
| MD5 | 5685b80e47d01c3ac43f3bfafb51ae67 |
| BLAKE2b-256 | af36e9b70c0b2fa1d61f7de1cbeb5cd9d3cc965f9915be45b8dc2a4e9cf3d950 |
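A downloaded file can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the standard library; the helper names `sha256_of` and `matches_digest` are hypothetical, and the local file path is whatever location the archive was saved to.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with a published hex digest, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_digest(Path("wurzel-2.5.1.tar.gz"), "867d277f...")` with the full digest from the table would return `True` only for an unmodified download.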
Provenance
The following attestation bundles were made for wurzel-2.5.1.tar.gz:
Publisher: publish-to-pypi.yml on telekom/wurzel
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: wurzel-2.5.1.tar.gz
- Subject digest: 867d277f85c548643c0899eba80e40c416d594b5adb8dc53c31369f1a40e0fa0
- Sigstore transparency entry: 1202865812
- Sigstore integration time:
- Permalink: telekom/wurzel@3a00a77d2fb3951de55c0a7bf4d04b40abdeafc0
- Branch / Tag: refs/heads/main
- Owner: https://github.com/telekom
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-to-pypi.yml@3a00a77d2fb3951de55c0a7bf4d04b40abdeafc0
- Trigger Event: workflow_dispatch
File details
Details for the file wurzel-2.5.1-py3-none-any.whl.
File metadata
- Download URL: wurzel-2.5.1-py3-none-any.whl
- Upload date:
- Size: 10.3 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d3b04fc20a4c229055c7caf898c045aaa8adcea204fc2703caac9d137e0c4692 |
| MD5 | 87ecdec14b6fc2a087b62251dbc8a663 |
| BLAKE2b-256 | 71c0b97450e8c957e8d8f7154950618a4a93bd05413caa6067f706583c7a4576 |
Provenance
The following attestation bundles were made for wurzel-2.5.1-py3-none-any.whl:
Publisher: publish-to-pypi.yml on telekom/wurzel
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: wurzel-2.5.1-py3-none-any.whl
- Subject digest: d3b04fc20a4c229055c7caf898c045aaa8adcea204fc2703caac9d137e0c4692
- Sigstore transparency entry: 1202865817
- Sigstore integration time:
- Permalink: telekom/wurzel@3a00a77d2fb3951de55c0a7bf4d04b40abdeafc0
- Branch / Tag: refs/heads/main
- Owner: https://github.com/telekom
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-to-pypi.yml@3a00a77d2fb3951de55c0a7bf4d04b40abdeafc0
- Trigger Event: workflow_dispatch