
Python client for RagFlow API

This project has been archived by its maintainers. No new releases are expected.


RagFlow Client

A Python client library for interacting with the RagFlow API. This package provides a clean interface for creating and managing datasets, documents, and chat assistants through the RagFlow platform.


Installation

You can install the RagFlow client package using pip:

pip install ragflow-client

Configuration

The client requires RagFlow API credentials to function. You can provide these in two ways:

Using environment variables

Create a .env file in your project root:

RAGFLOW_API_KEY=your_api_key_here
RAGFLOW_BASE_URL=https://your.ragflow.instance

Then the client will automatically load these credentials:

from ragflow_client import RagFlowClient

client = RagFlowClient()  # Loads credentials from environment variables

Passing credentials directly

from ragflow_client import RagFlowClient

client = RagFlowClient(
    api_key="your_api_key_here",
    base_url="https://your.ragflow.instance"
)

Getting Started

Here's a quick example to get started with the RagFlow client:

from ragflow_client import RagFlowClient

# Initialize client
client = RagFlowClient()

# Create a dataset
dataset_name = "my_dataset"
client.create_dataset(dataset_name)

# Upload documents
file_paths = ["document1.pdf", "document2.docx"]
client.upload_document(dataset_name, file_paths)

# Create a session for chat
session_name = "my_session"
client.create_session(dataset_name, session_name)

# Chat with the documents
response = client.chat(dataset_name, session_name, "What information can you provide about these documents?")
print(response)

API Reference

Dataset Management

Create a dataset

dataset_result = client.create_dataset(dataset_name)

Parameters:

  • dataset_name (str): Name of the dataset to create

Returns:

  • dict: API response containing dataset information

Raises:

  • ValueError: If required credentials are missing
  • ResponseError: If API returns an error response

Get dataset information

dataset = client.get_dataset(dataset_name)

Parameters:

  • dataset_name (str): Name of the dataset to retrieve

Returns:

  • dict: Dataset information including ID and other properties

Raises:

  • ValueError: If required credentials are missing or dataset is not found

Delete a dataset

success = client.delete_dataset(dataset_name)

Parameters:

  • dataset_name (str): Name of the dataset to delete

Returns:

  • bool: True if deletion was successful, False otherwise

Raises:

  • ValueError: If required credentials are missing

Document Management

Upload documents

# Upload a single document
result = client.upload_document(dataset_name, "path/to/document.pdf")

# Upload multiple documents
result = client.upload_document(dataset_name, ["doc1.pdf", "doc2.docx"])

# Upload without progress bar
result = client.upload_document(dataset_name, file_paths, show_progress=False)

Parameters:

  • dataset_name (str): Name of the dataset to upload documents to
  • file_paths (str or list): Path to file or list of file paths to upload
  • show_progress (bool, optional): Whether to show a progress bar during upload. Defaults to True.

Returns:

  • dict: Upload status and list of uploaded documents

Raises:

  • ValueError: If required credentials are missing or dataset is not found
  • FileNotFoundError: If any of the files don't exist
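
Because the upload raises FileNotFoundError if any path is missing, it can help to pre-check the paths first. A minimal sketch; the `missing_files` helper is not part of this package:

```python
import os

def missing_files(file_paths):
    """Return the subset of paths that do not exist on disk."""
    return [p for p in file_paths if not os.path.isfile(p)]

file_paths = ["doc1.pdf", "doc2.docx"]
absent = missing_files(file_paths)
if absent:
    print(f"Skipping upload; missing files: {absent}")
# With a configured client, the happy path would then be:
# result = client.upload_document(dataset_name, file_paths)
```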

List documents

documents = client.list_documents(dataset_name)

Parameters:

  • dataset_name (str): Name of the dataset to list documents from

Returns:

  • list: List of document information dictionaries

Raises:

  • ValueError: If required credentials are missing or dataset is not found

Delete documents

# Delete specific documents
success = client.delete_documents(dataset_name, document_ids=["doc_id1", "doc_id2"])

# Delete all documents in dataset
success = client.delete_documents(dataset_name)

Parameters:

  • dataset_name (str): Name of the dataset containing documents to delete
  • document_ids (list, optional): List of document IDs to delete. If None, deletes all documents.

Returns:

  • bool: True if deletion was successful, False otherwise

Raises:

  • ValueError: If required credentials are missing
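
To delete only some documents, `list_documents` can be combined with `delete_documents`. This sketch assumes each entry returned by `list_documents` is a dict exposing `id` and `name` keys, which the reference above does not spell out:

```python
def ids_with_suffix(documents, suffix):
    """Pick document IDs whose file name ends with the given suffix.

    Assumes each document dict has 'id' and 'name' keys (an assumption,
    not confirmed by the API reference).
    """
    return [d["id"] for d in documents if d["name"].endswith(suffix)]

docs = [
    {"id": "doc_id1", "name": "report.pdf"},
    {"id": "doc_id2", "name": "notes.txt"},
]
pdf_ids = ids_with_suffix(docs, ".pdf")
print(pdf_ids)  # ['doc_id1']
# In practice: docs = client.list_documents(dataset_name)
# client.delete_documents(dataset_name, document_ids=pdf_ids)
```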

Chat Assistant Management

Create a chat assistant

assistant = client.create_chat_assistant(
    dataset_name, 
    temperature=0.3, 
    top_p=1.0, 
    presence_penalty=0.4
)

Parameters:

  • dataset_name (str): Name of the dataset to create assistant for
  • temperature (float, optional): LLM temperature parameter. Defaults to 0.3.
  • top_p (float, optional): LLM top_p parameter. Defaults to 1.0.
  • presence_penalty (float, optional): LLM presence penalty. Defaults to 0.4.

Returns:

  • dict: API response containing chat assistant information

Raises:

  • ValueError: If required credentials are missing or dataset is not found
  • ResponseError: If API returns an error response

Note: In most cases, you don't need to explicitly create a chat assistant as it will be automatically created when needed by the chat method.

Session Management

Create a session

session = client.create_session(dataset_name, session_name)

Parameters:

  • dataset_name (str): Name of the dataset to create session for
  • session_name (str): Name of the session to create

Returns:

  • dict: API response containing session information

Raises:

  • ValueError: If required credentials are missing or chat assistant is not found
  • ResponseError: If API returns an error response

List sessions

sessions = client.list_sessions(dataset_name)

Parameters:

  • dataset_name (str): Name of the dataset to list sessions for

Returns:

  • dict: API response containing list of sessions

Raises:

  • ValueError: If required credentials are missing or chat assistant not found
  • ResponseError: If API returns an error response

Delete a session

result = client.delete_session(dataset_name, session_name)

Parameters:

  • dataset_name (str): Name of the dataset containing the session
  • session_name (str): Name of the session to delete

Returns:

  • dict: API response or status information

Raises:

  • ValueError: If required credentials are missing or chat session not found
  • ResponseError: If API returns an error response

Chat Functionality

Send a chat message

# Simple usage - returns just the answer string
answer = client.chat(dataset_name, session_name, user_message)

# Get full response details
response = client.chat(dataset_name, session_name, user_message, stream=False)

Parameters:

  • dataset_name (str): Name of the dataset to chat with
  • session_name (str): Name of the chat session to use
  • user_message (str): User's message to send
  • stream (bool, optional): Whether to stream the response. Defaults to False.

Returns:

  • str or dict: The answer string when the call succeeds; otherwise, the full response dict.

Raises:

  • ValueError: If required credentials are missing or chat assistant/session not found
  • ResponseError: If API returns an error response

Note: If the session or the chat assistant doesn't exist, it will be created automatically.
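
Since `chat` may return either a string or a dict, callers can normalize the result before displaying it. The `"answer"` key used below is an assumption about the response dict's shape, not documented above:

```python
def answer_text(response):
    """Normalize a chat return value to plain text.

    `chat` may return the answer string or the full response dict;
    the 'answer' key here is an assumed field of that dict.
    """
    if isinstance(response, str):
        return response
    return response.get("answer", str(response))

print(answer_text("Paris."))                  # Paris.
print(answer_text({"answer": "Paris."}))      # Paris.
```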

Examples

Complete Workflow Example

import os
from ragflow_client import RagFlowClient

# Initialize client
client = RagFlowClient()

# Create a dataset
dataset_name = "research_papers"
client.create_dataset(dataset_name)
print(f"Dataset '{dataset_name}' created")

# Upload documents
pdf_folder = "research_pdfs"
pdf_files = [os.path.join(pdf_folder, f) for f in os.listdir(pdf_folder) if f.endswith('.pdf')]
upload_result = client.upload_document(dataset_name, pdf_files)
print(f"Uploaded {upload_result['count']} documents")

# Create a session
session_name = "research_session"
client.create_session(dataset_name, session_name)
print(f"Session '{session_name}' created")

# Ask questions
questions = [
    "What are the main findings from these research papers?",
    "Summarize the methodologies used in these papers",
    "What are the common limitations mentioned in these studies?"
]

for question in questions:
    print(f"\nQ: {question}")
    answer = client.chat(dataset_name, session_name, question)
    print(f"A: {answer}")

# Cleanup
client.delete_session(dataset_name, session_name)
client.delete_documents(dataset_name)
client.delete_dataset(dataset_name)
print("Cleanup completed")

Error Handling Example

from ragflow_client import RagFlowClient
from ragflow_client.utils.api_utils import ResponseError

client = RagFlowClient()

try:
    # Try to get a non-existent dataset
    dataset = client.get_dataset("non_existent_dataset")
    
    # Check if dataset was found
    if not dataset.get("id"):
        print("Dataset not found, creating it...")
        client.create_dataset("non_existent_dataset")
    
    # Try to chat with a non-existent session
    response = client.chat("non_existent_dataset", "new_session", "Hello")
    print(f"Response: {response}")
    
except ValueError as e:
    print(f"Validation error: {str(e)}")
except ResponseError as e:
    print(f"API error (Status {e.status_code}): {e.message}")
except Exception as e:
    print(f"Unexpected error: {str(e)}")

CLI Usage

The RagFlow client package also includes a command-line interface for interacting with the API:

# Show help
ragflow --help

# Create a dataset
ragflow dataset create my_dataset

# Upload documents
ragflow document upload my_dataset document1.pdf document2.pdf

# Create a session
ragflow chat create-session my_dataset my_session

# List all sessions
ragflow chat list-sessions my_dataset

# Send a chat message
ragflow chat send my_dataset my_session "What information is in these documents?"

# Interactive chat mode
ragflow chat interactive my_dataset my_session

You can provide API credentials via environment variables or command-line options:

ragflow --api-key YOUR_API_KEY --base-url YOUR_BASE_URL dataset create my_dataset
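
The environment-variable route from the Configuration section works for the CLI too; exporting once avoids repeating the flags on every command:

```shell
# Export credentials once per shell session
export RAGFLOW_API_KEY=your_api_key_here
export RAGFLOW_BASE_URL=https://your.ragflow.instance

# Subsequent commands pick the credentials up automatically, e.g.:
# ragflow dataset create my_dataset
```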

Troubleshooting

Common Issues

  1. Authentication Errors:

    • Ensure your API key is correct
    • Check that your base URL is well formed and has no trailing slash
  2. Document Upload Issues:

    • Verify file paths are correct and the files exist
    • Check file permissions
    • Ensure document formats are supported (PDF, DOCX, TXT, etc.)
  3. Chat Response Issues:

    • Verify the dataset contains properly parsed documents
    • Ensure the question is clear and relevant to the document content
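
For the base-URL issue under Authentication Errors, a small normalization step before constructing the client can help; `normalize_base_url` is a hypothetical helper, not part of this package:

```python
def normalize_base_url(url):
    """Strip any trailing slashes so endpoint paths join cleanly."""
    return url.rstrip("/")

print(normalize_base_url("https://your.ragflow.instance/"))  # https://your.ragflow.instance
# Hypothetical usage:
# client = RagFlowClient(api_key=api_key, base_url=normalize_base_url(raw_url))
```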

Enabling Debug Logging

To help troubleshoot issues, you can enable debug logging:

import logging
logging.basicConfig(level=logging.DEBUG)

client = RagFlowClient()
# The client will now log detailed debug information
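
Global DEBUG logging can be noisy, since it also surfaces output from underlying HTTP libraries. One option is to narrow debug output to this package; the logger name "ragflow_client" is an assumption based on the package name:

```python
import logging

# Keep everything else at WARNING, but show DEBUG output from the
# client package. The logger name "ragflow_client" is assumed from
# the package name; adjust if the library logs under another name.
logging.basicConfig(level=logging.WARNING)
logging.getLogger("ragflow_client").setLevel(logging.DEBUG)
```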

Contributing

Contributions to RagFlow Client are welcome! Please feel free to submit a Pull Request.
