Pick, choose and chain your RAG tools in a pythonic way

wurzel

wurzel is the German word for root. The name reflects the goal of this framework: to combine the best of the whole retrieval domain and beyond.

wurzel is an open-source Python library built to address advanced Extract, Transform, Load (ETL) needs for Retrieval-Augmented Generation (RAG) systems. It is designed to streamline ETL processes while offering essential features like multi-tenancy, cloud-native deployment support, and job scheduling.

The repository includes initial implementations for widely used frameworks in the RAG ecosystem, such as Qdrant, Milvus, and Hugging Face, giving users a strong starting point for building scalable and efficient RAG pipelines.

(Figure: Sample Pipeline)

Features

  • Advanced ETL Pipelines: Tailored for the specific needs of RAG systems.
  • Multi-Tenancy: Easily manage multiple tenants or projects within a single system.
  • Cloud-Native Deployment: Designed for seamless integration with Kubernetes, Docker, and other cloud platforms.
  • Scheduling Capabilities: Schedule and manage ETL tasks using built-in or external tools.
  • Framework Integrations: Pre-built support for popular tools like Qdrant, Milvus, and Hugging Face.
  • Type Safety: Uses pydantic and pandera to enforce type safety.

Installation

To get started with wurzel, install the library using pip:

pip install wurzel

Run a Step (Two Ways)

1. CLI-based Execution

Run a step using the CLI:

wurzel run <step_file_path> --inputs ./data --output ./out

To inspect the step requirements:

wurzel inspect wurzel.<step_path>
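For scripted runs, the `wurzel run` command shown above can also be assembled and launched from Python. The following is a minimal sketch using only the standard library. The helper names `build_run_command` and `run_step` are hypothetical, and only the `--inputs` and `--output` flags shown above are used.

```python
import subprocess
from pathlib import Path


def build_run_command(step_file: str, inputs: Path, output: Path) -> list[str]:
    """Assemble the argv list for `wurzel run` (hypothetical helper)."""
    return [
        "wurzel", "run", step_file,
        "--inputs", str(inputs),
        "--output", str(output),
    ]


def run_step(step_file: str, inputs: Path, output: Path) -> None:
    """Launch the CLI; requires the `wurzel` executable to be on PATH."""
    cmd = build_run_command(step_file, inputs, output)
    # check=True raises CalledProcessError if the CLI exits with a non-zero status.
    subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when paths contain spaces.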

2. Programmatic Execution (Python)

Run a step using the snippet below:

import os
from pathlib import Path

from wurzel.step_executor import BaseStepExecutor
from wurzel.steps.manual_markdown import ManualMarkdownStep

# Create the input directory; ManualMarkdownStep reads its folder from the environment
input_dir = Path("./input")
input_dir.mkdir(exist_ok=True)
abs_input = input_dir.resolve()
os.environ["MANUALMARKDOWNSTEP__FOLDER_PATH"] = str(abs_input)

# Execute the step: the input folders are passed as a set, followed by the output folder
with BaseStepExecutor() as ex:
    ex(ManualMarkdownStep, {abs_input}, Path("./output"))
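The snippet above presumably expects the input folder to contain the Markdown files the step should pick up; that is an assumption based on the step's name and its `FOLDER_PATH` setting. A minimal sketch for preparing such a folder with only the standard library follows. The helper `prepare_markdown_input`, the sample file name, and the sample content are hypothetical.

```python
import os
from pathlib import Path


def prepare_markdown_input(folder: Path, filename: str = "example.md") -> Path:
    """Create `folder`, write one sample Markdown file, and export its path.

    Hypothetical helper: the file name and content are placeholders.
    """
    folder.mkdir(parents=True, exist_ok=True)
    sample = folder / filename
    sample.write_text("# Example\n\nSome manually curated content.\n", encoding="utf-8")
    # Same environment variable as in the snippet above.
    os.environ["MANUALMARKDOWNSTEP__FOLDER_PATH"] = str(folder.resolve())
    return sample
```

Calling `prepare_markdown_input(Path("./input"))` before entering the executor would leave one file in place for the step to read.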

Building your own step

For detailed instructions and examples on how to use wurzel, please refer to our official documentation.

Code of Conduct

This project has adopted version 2.1 of the Contributor Covenant as its code of conduct. See CODE_OF_CONDUCT.md for details. All contributors must abide by the code of conduct.

By participating in this project, you agree to abide by its Code of Conduct at all times.

Licensing

This project follows the REUSE standard for software licensing. Each file contains copyright and license information, and license texts can be found in the ./LICENSES folder. For more information visit https://reuse.software/. You can find a guide for developers at https://telekom.github.io/reuse-template/.
