Python client for RagFlow API

This project has been archived.

The maintainers of this project have marked this project as archived. No new releases are expected.

Project description

RagFlow Client

A Python client library for interacting with the RagFlow API. This package provides a clean interface for creating and managing datasets, documents, and chat assistants through the RagFlow platform.

Installation

You can install the RagFlow client package using pip:

pip install ragflow-client

Configuration

The client requires RagFlow API credentials to function. You can provide these in two ways:

Using environment variables

Create a .env file in your project root:

RAGFLOW_API_KEY=your_api_key_here
RAGFLOW_BASE_URL=https://your.ragflow.instance

The client then loads these credentials automatically:

from ragflow_client import RagFlowClient

client = RagFlowClient()  # Loads credentials from environment variables
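
Before constructing the client, it can help to confirm that both settings are actually present. The following is a minimal sketch and not part of the package: parse_env_text, load_settings, and missing_ragflow_vars are hypothetical helpers, the parser handles only plain KEY=VALUE lines, and letting the process environment override the file is a choice made for this sketch, not documented client behavior.

```python
import os
from pathlib import Path

# Variable names taken from the .env example above.
REQUIRED_VARS = ("RAGFLOW_API_KEY", "RAGFLOW_BASE_URL")


def parse_env_text(text):
    """Parse plain KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def load_settings(env_path=".env"):
    """Merge the .env file (if present) with the process environment."""
    file = Path(env_path)
    file_values = {}
    if file.is_file():
        file_values = parse_env_text(file.read_text(encoding="utf-8"))
    # Assumption for this sketch: process environment wins over the file.
    return {**file_values, **os.environ}


def missing_ragflow_vars(settings):
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_VARS if not settings.get(name)]
```

Calling missing_ragflow_vars(load_settings()) returns an empty list when both values are set, and the names of the missing ones otherwise.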

Passing credentials directly

from ragflow_client import RagFlowClient

client = RagFlowClient(
    api_key="your_api_key_here",
    base_url="https://your.ragflow.instance"
)

Getting Started

Here's a quick example to get started with the RagFlow client:

from ragflow_client import RagFlowClient

# Initialize client
client = RagFlowClient()

# Create a dataset
dataset_name = "my_dataset"
client.create_dataset(dataset_name)

# Upload documents
file_paths = ["document1.pdf", "document2.docx"]
client.upload_document(dataset_name, file_paths)

# Create a session for chat
session_name = "my_session"
client.create_session(dataset_name, session_name)

# Chat with the documents
response = client.chat(dataset_name, session_name, "What information can you provide about these documents?")
print(response)

API Reference

Dataset Management

Create a dataset

dataset_result = client.create_dataset(dataset_name)

Parameters:

  • dataset_name (str): Name of the dataset to create

Returns:

  • dict: API response containing dataset information

Raises:

  • ValueError: If required credentials are missing
  • ResponseError: If API returns an error response

Get dataset information

dataset = client.get_dataset(dataset_name)

Parameters:

  • dataset_name (str): Name of the dataset to retrieve

Returns:

  • dict: Dataset information including ID and other properties

Raises:

  • ValueError: If required credentials are missing or dataset is not found
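
Because get_dataset is documented to raise ValueError when the dataset is not found, a "get or create" step can be sketched as below. ensure_dataset is a hypothetical helper, not part of the package. It assumes the ValueError means "not found"; if the cause is missing credentials instead, the create_dataset call will raise again.

```python
def ensure_dataset(client, dataset_name):
    """Return dataset information, creating the dataset if lookup fails.

    `client` is expected to provide get_dataset() and create_dataset()
    as described in this reference.
    """
    try:
        return client.get_dataset(dataset_name)
    except ValueError:
        # Assumed to mean "dataset not found"; missing credentials would
        # make create_dataset() raise ValueError as well.
        client.create_dataset(dataset_name)
        return client.get_dataset(dataset_name)
```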

Delete a dataset

success = client.delete_dataset(dataset_name)

Parameters:

  • dataset_name (str): Name of the dataset to delete

Returns:

  • bool: True if deletion was successful, False otherwise

Raises:

  • ValueError: If required credentials are missing

Document Management

Upload documents

# Upload a single document
result = client.upload_document(dataset_name, "path/to/document.pdf")

# Upload multiple documents
result = client.upload_document(dataset_name, ["doc1.pdf", "doc2.docx"])

# Upload without progress bar
result = client.upload_document(dataset_name, file_paths, show_progress=False)

Parameters:

  • dataset_name (str): Name of the dataset to upload documents to
  • file_paths (str or list): Path to file or list of file paths to upload
  • show_progress (bool, optional): Whether to show progress bar. Defaults to True.

Returns:

  • dict: Upload status and list of uploaded documents

Raises:

  • ValueError: If required credentials are missing or dataset is not found
  • FileNotFoundError: If any of the files don't exist
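
Since upload_document raises FileNotFoundError if any file is missing, checking the paths first gives a clearer report of which ones are the problem. This is a minimal sketch using only the standard library; split_existing is a hypothetical helper, not part of the package.

```python
from pathlib import Path


def split_existing(file_paths):
    """Return (existing, missing) lists for a single path or a list of paths."""
    if isinstance(file_paths, (str, Path)):
        file_paths = [file_paths]
    existing, missing = [], []
    for path in file_paths:
        # Only regular files count; directories are reported as missing.
        target = existing if Path(path).is_file() else missing
        target.append(str(path))
    return existing, missing


# Usage sketch (client and dataset_name as in the examples above):
#   existing, missing = split_existing(["doc1.pdf", "doc2.docx"])
#   if missing:
#       print("Skipping missing files:", missing)
#   if existing:
#       client.upload_document(dataset_name, existing)
```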

List documents

documents = client.list_documents(dataset_name)

Parameters:

  • dataset_name (str): Name of the dataset to list documents from

Returns:

  • list: List of document information dictionaries

Raises:

  • ValueError: If required credentials are missing or dataset is not found

Delete documents

# Delete specific documents
success = client.delete_documents(dataset_name, document_ids=["doc_id1", "doc_id2"])

# Delete all documents in dataset
success = client.delete_documents(dataset_name)

Parameters:

  • dataset_name (str): Name of the dataset containing documents to delete
  • document_ids (list, optional): List of document IDs to delete. If None, deletes all documents.

Returns:

  • bool: True if deletion was successful, False otherwise

Raises:

  • ValueError: If required credentials are missing
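
Because omitting document_ids deletes every document in the dataset, a selective delete may be worth guarding against an accidentally empty selection. The wrapper below is a sketch under that assumption; delete_selected is a hypothetical helper, not part of the package, and it only calls delete_documents as documented above.

```python
def delete_selected(client, dataset_name, document_ids):
    """Delete only the given documents; refuse an empty or missing selection."""
    ids = list(document_ids or [])
    if not ids:
        # Without IDs, delete_documents() would remove all documents.
        raise ValueError("No document IDs given; refusing to delete")
    return client.delete_documents(dataset_name, document_ids=ids)
```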

Chat Assistant Management

Create a chat assistant

assistant = client.create_chat_assistant(
    dataset_name, 
    temperature=0.3, 
    top_p=1.0, 
    presence_penalty=0.4
)

Parameters:

  • dataset_name (str): Name of the dataset to create assistant for
  • temperature (float, optional): LLM temperature parameter. Defaults to 0.3.
  • top_p (float, optional): LLM top_p parameter. Defaults to 1.0.
  • presence_penalty (float, optional): LLM presence penalty. Defaults to 0.4.

Returns:

  • dict: API response containing chat assistant information

Raises:

  • ValueError: If required credentials are missing or dataset is not found
  • ResponseError: If API returns an error response

Note: In most cases, you don't need to explicitly create a chat assistant as it will be automatically created when needed by the chat method.

Session Management

Create a session

session = client.create_session(dataset_name, session_name)

Parameters:

  • dataset_name (str): Name of the dataset to create session for
  • session_name (str): Name of the session to create

Returns:

  • dict: API response containing session information

Raises:

  • ValueError: If required credentials are missing or chat assistant is not found
  • ResponseError: If API returns an error response

List sessions

sessions = client.list_sessions(dataset_name)

Parameters:

  • dataset_name (str): Name of the dataset to list sessions for

Returns:

  • dict: API response containing list of sessions

Raises:

  • ValueError: If required credentials are missing or chat assistant not found
  • ResponseError: If API returns an error response

Delete a session

result = client.delete_session(dataset_name, session_name)

Parameters:

  • dataset_name (str): Name of the dataset containing the session
  • session_name (str): Name of the session to delete

Returns:

  • dict: API response or status information

Raises:

  • ValueError: If required credentials are missing or chat session not found
  • ResponseError: If API returns an error response

Chat Functionality

Send a chat message

# Simple usage: returns the answer string on success
answer = client.chat(dataset_name, session_name, user_message)

# Same call with stream passed explicitly (False is the default)
response = client.chat(dataset_name, session_name, user_message, stream=False)

Parameters:

  • dataset_name (str): Name of the dataset to chat with
  • session_name (str): Name of the chat session to use
  • user_message (str): User's message to send
  • stream (bool, optional): Whether to stream the response. Defaults to False.

Returns:

  • str or dict: The answer string on success; otherwise the full response dict.

Raises:

  • ValueError: If required credentials are missing or chat assistant/session not found
  • ResponseError: If API returns an error response

Note: If the session doesn't exist, it will be created automatically. If the chat assistant doesn't exist, it will also be created automatically.
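
Since chat can return either a string or a dict, calling code that prints or logs the result may want to handle both shapes. The sketch below makes no assumption about the keys in the dict and simply renders it as JSON; answer_text is a hypothetical helper, not part of the package.

```python
import json


def answer_text(result):
    """Return a printable string for either return shape of chat()."""
    if isinstance(result, str):
        return result
    # The dict layout is not documented here, so show it whole
    # rather than guess at key names.
    return json.dumps(result, indent=2, ensure_ascii=False, default=str)


# Usage sketch:
#   print(answer_text(client.chat(dataset_name, session_name, "Hello")))
```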

Examples

Complete Workflow Example

from ragflow_client import RagFlowClient

# Initialize client
client = RagFlowClient()

# Create a dataset
dataset_name = "research_papers"
client.create_dataset(dataset_name)
print(f"Dataset '{dataset_name}' created")

# Upload documents
files = ["document1.pdf", "document2.docx", "document3.txt", "document4.xlsx"]
upload_result = client.upload_document(dataset_name, files)
print(f"Uploaded {upload_result['count']} documents")

# Name the session; chat() creates it automatically on first use
session_name = "research_session"

# Ask questions
questions = [
    "What are the main findings from these research papers?",
    "Summarize the methodologies used in these papers",
    "What are the common limitations mentioned in these studies?"
]

for question in questions:
    print(f"\nQ: {question}")
    answer = client.chat(dataset_name, session_name, question)
    print(f"A: {answer}")

# Cleanup
client.delete_session(dataset_name, session_name)
client.delete_documents(dataset_name)
client.delete_dataset(dataset_name)
print("Cleanup completed")
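
In the workflow above, the cleanup calls are skipped if an earlier step raises. One way to make cleanup unconditional is a small context manager, sketched here with the standard library. temporary_dataset is a hypothetical helper, not part of the package; it uses only create_dataset, delete_documents, and delete_dataset as documented above, and leaves session deletion to the caller.

```python
from contextlib import contextmanager


@contextmanager
def temporary_dataset(client, dataset_name):
    """Create a dataset, then remove its documents and the dataset on exit."""
    client.create_dataset(dataset_name)
    try:
        yield dataset_name
    finally:
        # Runs even if the body of the with-block raises.
        client.delete_documents(dataset_name)
        client.delete_dataset(dataset_name)


# Usage sketch:
#   with temporary_dataset(client, "research_papers") as name:
#       client.upload_document(name, ["document1.pdf"])
#       print(client.chat(name, "research_session", "Summarize the paper"))
```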

Error Handling Example

from ragflow_client import RagFlowClient
from ragflow_client.utils.api_utils import ResponseError

client = RagFlowClient()

try:
    # Look up a dataset that may not exist. get_dataset is documented to raise
    # ValueError when the dataset is not found; that case is handled below.
    dataset = client.get_dataset("non_existent_dataset")

    # Guard against a response that carries no dataset ID
    if not dataset.get("id"):
        print("Dataset not found, creating it...")
        client.create_dataset("non_existent_dataset")

    # chat() creates the session automatically if it does not exist
    response = client.chat("non_existent_dataset", "new_session", "Hello")
    print(f"Response: {response}")

except ValueError as e:
    print(f"Validation error: {e}")
except ResponseError as e:
    print(f"API error (Status {e.status_code}): {e.message}")
except Exception as e:
    print(f"Unexpected error: {e}")

CLI Usage

The RagFlow client package also includes a command-line interface for interacting with the API:

# Show help
ragflow --help

# Create a dataset
ragflow dataset create my_dataset

# Upload documents
ragflow document upload my_dataset document1.pdf document2.pdf

# Create a session
ragflow chat create-session my_dataset my_session

# List all sessions
ragflow chat list-sessions my_dataset

# Send a chat message
ragflow chat send my_dataset my_session "What information is in these documents?"

# Interactive chat mode
ragflow chat interactive my_dataset my_session

You can provide API credentials via environment variables or command-line options:

ragflow --api-key YOUR_API_KEY --base-url YOUR_BASE_URL dataset create my_dataset
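
If the CLI needs to be driven from a Python script, the commands above can be assembled as argument lists and run with subprocess. This is a sketch that uses only the subcommands and the --api-key and --base-url options shown in this section; ragflow_command and run_ragflow are hypothetical helpers. Note that credentials passed as command-line options are visible in the process list, so environment variables may be preferable.

```python
import subprocess


def ragflow_command(*args, api_key=None, base_url=None):
    """Build the argument list for one `ragflow` invocation."""
    command = ["ragflow"]
    if api_key:
        command += ["--api-key", api_key]
    if base_url:
        command += ["--base-url", base_url]
    command += list(args)
    return command


def run_ragflow(*args, **options):
    """Run the CLI and return the completed process; raises on non-zero exit."""
    return subprocess.run(
        ragflow_command(*args, **options),
        capture_output=True,
        text=True,
        check=True,
    )


# Usage sketch:
#   result = run_ragflow("dataset", "create", "my_dataset")
#   print(result.stdout)
```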

Troubleshooting

Common Issues

  1. Authentication Errors:

    • Ensure your API key is correct
    • Check that your base URL has the correct format without trailing slashes
  2. Document Upload Issues:

    • Verify file paths are correct and the files exist
    • Check file permissions
    • Ensure document formats are supported (PDF, DOCX, TXT, etc.)
  3. Chat Response Issues:

    • Verify the dataset contains properly parsed documents
    • Ensure the question is clear and relevant to the document content
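
For the base URL check in item 1, a small normalization step can be applied before the value is passed to the client. This is a sketch, not the client's own validation: normalize_base_url is a hypothetical helper, and requiring an http or https scheme is an assumption based on the example URL in the Configuration section.

```python
from urllib.parse import urlsplit


def normalize_base_url(url):
    """Strip whitespace and trailing slashes; reject URLs without scheme or host."""
    cleaned = url.strip().rstrip("/")
    parts = urlsplit(cleaned)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"Not a usable base URL: {url!r}")
    return cleaned


# Usage sketch:
#   client = RagFlowClient(api_key="your_api_key_here",
#                          base_url=normalize_base_url("https://your.ragflow.instance/"))
```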

Enabling Debug Logging

To help troubleshoot issues, you can enable debug logging:

import logging

from ragflow_client import RagFlowClient

# Configure logging before creating the client
logging.basicConfig(level=logging.DEBUG)

client = RagFlowClient()
# The client will now log detailed debug information

Contributing

Contributions to RagFlow Client are welcome! Please feel free to submit a Pull Request.
