BotRun Ask Folder

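This project provides a tool that downloads files from a Google Drive folder, processes them into embedding vectors, and uploads the vectors to Qdrant. The sections below explain how to use it.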


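Installation

Make sure Python and pip are installed first. You can then install this package, together with its dependencies, using the following command: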

pip install botrun-ask-folder

Usage

Calling botrun_ask_folder

The botrun_ask_folder function downloads the files in the specified Google Drive folder, processes them, and uploads the result to Qdrant.

from botrun_ask_folder import botrun_ask_folder

# Google Drive folder ID
google_drive_folder_id = "your_google_drive_folder_id"

botrun_ask_folder(google_drive_folder_id)
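
A Google Drive folder ID is the last path segment of the folder's URL (https://drive.google.com/drive/folders/<ID>). The sketch below shows one way to pull the ID out of a pasted URL before passing it to botrun_ask_folder. The helper name extract_folder_id is hypothetical and is not part of this package.

```python
import re

# Matches the ID segment that follows "/folders/" in a Drive folder URL.
_FOLDER_URL = re.compile(r"/folders/([A-Za-z0-9_-]+)")


def extract_folder_id(url_or_id):
    """Return the folder ID from a Drive folder URL (hypothetical helper).

    If the input does not look like a folder URL, it is assumed to already
    be a bare ID and is returned with surrounding whitespace removed.
    """
    match = _FOLDER_URL.search(url_or_id)
    return match.group(1) if match else url_or_id.strip()
```

The returned string can then be used as google_drive_folder_id in the call above.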

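Required environment variables

Set the following environment variables before running this tool:

Environment variable            Description
GOOGLE_APPLICATION_CREDENTIALS  Path to the Google service account credentials file
QDRANT_HOST                     Hostname of the Qdrant server (default: "qdrant")
QDRANT_PORT                     Port of the Qdrant server (default: 6333)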
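
The documented defaults can be resolved in Python as sketched below. This is only an illustration of how the two Qdrant settings and their defaults fit together, not the package's own code; load_qdrant_settings is a hypothetical helper.

```python
import os


def load_qdrant_settings(env=None):
    """Return (host, port), falling back to the documented defaults.

    Hypothetical helper for illustration. `env` defaults to os.environ
    and can be replaced with a plain dict for testing.
    """
    env = os.environ if env is None else env
    host = env.get("QDRANT_HOST", "qdrant")
    # Environment variables are always strings, so convert the port to int.
    port = int(env.get("QDRANT_PORT", "6333"))
    return host, port
```

Calling load_qdrant_settings() with neither variable set returns the defaults, which suit a setup where the Qdrant server is reachable under the hostname "qdrant".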

Detailed usage of each function

drive_download

Downloads files from a Google Drive folder.

from drive_download import drive_download

google_service_account_key_path = "/path/to/google_service_account_key.json"
google_drive_folder_id = "your_google_drive_folder_id"
max_results = 9999999
output_folder = "./data/your_google_drive_folder_id"

drive_download(google_service_account_key_path, google_drive_folder_id, max_results, output_folder)
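
The example above names the output directory after the folder ID, under ./data. A small sketch of that naming convention follows. It assumes the convention comes only from these examples and is not enforced by the package; output_folder_for is a hypothetical helper.

```python
from pathlib import Path


def output_folder_for(folder_id, base_dir="./data"):
    """Build the per-folder output path used in the examples (hypothetical helper)."""
    # Reject empty IDs and anything that would escape base_dir.
    if not folder_id or "/" in folder_id or "\\" in folder_id:
        raise ValueError("folder_id must be a non-empty ID without path separators")
    return Path(base_dir) / folder_id
```

Keeping one directory per folder ID means the later steps can find their input from the ID alone.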

run_split_txts

Splits the downloaded files into text chunks of a specified size.

from run_split_txts import run_split_txts

input_folder = "./data/your_google_drive_folder_id"
split_size = 2000  # maximum number of characters per text chunk
verbose = False

run_split_txts(input_folder, split_size, verbose)
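
To illustrate what a character limit such as split_size = 2000 means, here is a minimal sketch of fixed-size splitting. It is an assumption about the general idea only: the actual run_split_txts may split on other boundaries (pages or paragraphs, for example), and split_text is a hypothetical helper.

```python
def split_text(text, split_size=2000):
    """Split `text` into consecutive chunks of at most `split_size` characters.

    Hypothetical helper; the real run_split_txts may behave differently.
    """
    if split_size <= 0:
        raise ValueError("split_size must be positive")
    # Step through the text in strides of split_size; the last chunk may be shorter.
    return [text[i:i + split_size] for i in range(0, len(text), split_size)]
```

With this scheme a 4,500-character file yields three chunks of 2,000, 2,000, and 500 characters, and joining the chunks reproduces the original text.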

embeddings_to_qdrant

Converts the text chunks into embedding vectors and uploads them to Qdrant.

import asyncio
from embeddings_to_qdrant import embeddings_to_qdrant

input_folder = "./data/your_google_drive_folder_id"
embedding_model_name = "openai/text-embedding-3-large"
batch_size = 3072
concurrency = 30
collection_name = "your_google_drive_folder_id"
qdrant_host = "qdrant"
qdrant_port = 6333

asyncio.run(embeddings_to_qdrant(input_folder, embedding_model_name, batch_size, concurrency, collection_name, qdrant_host, qdrant_port))
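
The source does not say how batch_size and concurrency are applied internally. One common pattern is to group items into batches and cap the number of batches in flight with an asyncio.Semaphore, as sketched below. All names here (make_batches, process_batches, fake_handle_batch) are hypothetical, and reading batch_size as "items per batch" is an assumption.

```python
import asyncio


def make_batches(items, batch_size):
    """Group `items` into lists of at most `batch_size` elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


async def process_batches(items, batch_size, concurrency, handle_batch):
    """Run `handle_batch` on every batch, at most `concurrency` at a time.

    `handle_batch` is a hypothetical async callable standing in for the
    embed-and-upload step. Results are returned in batch order.
    """
    semaphore = asyncio.Semaphore(concurrency)

    async def guarded(batch):
        # Only `concurrency` coroutines can hold the semaphore at once.
        async with semaphore:
            return await handle_batch(batch)

    batches = make_batches(items, batch_size)
    return await asyncio.gather(*(guarded(b) for b in batches))


async def fake_handle_batch(batch):
    """Stand-in for a real embedding call: just reports the batch size."""
    await asyncio.sleep(0)
    return len(batch)
```

For example, asyncio.run(process_batches(list(range(10)), 4, 2, fake_handle_batch)) processes three batches with no more than two running at the same time.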

botrun_drive_manager

Manages and updates the templates and copies of .botrun prompt-engineering files.

from botrun_drive_manager import botrun_drive_manager

botrun_template_name = "your_botrun_template_name"
google_drive_folder_id = "your_google_drive_folder_id"

botrun_drive_manager(botrun_template_name, google_drive_folder_id)

