A Python package for image captioning with Zhipu AI's GLM4V-Plus model
shiertier_caption
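A Python package for image captioning with Zhipu AI's GLM4V family of models. It supports several models, including GLM4V-Flash, GLM4V, and GLM4V-Plus.
Features
- Supports multiple GLM4V models
- Accepts both local image files and image URLs
- Processes images in parallel across multiple API keys
- Supports MongoDB storage and batch processing
- Handles JSON-formatted responses automatically
- Comprehensive error handling
Installation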
pip install shiertier_caption
Basic usage
Processing a single image
from shiertier_caption import GLM4V
# Initialize the client
client = GLM4V(api_key="your_api_key", model="glm-4v-plus-0111")
# Caption a local image
response = client.prompt(
    image_path_or_url="path/to/your/image.jpg",
    prompt="Describe this image"
)
print(response)
# Caption an image from a URL
response = client.prompt(
    image_path_or_url="https://example.com/image.jpg",
    prompt="Describe this image",
    is_url=True
)
print(response)
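Because `prompt` takes an explicit `is_url` flag, a caller handling mixed inputs may want to derive that flag from the input string. The following is a minimal sketch, assuming that only `http` and `https` inputs should be treated as URLs. `looks_like_url` is a hypothetical helper and is not part of shiertier_caption.

```python
from urllib.parse import urlparse


def looks_like_url(image_path_or_url: str) -> bool:
    """Return True when the input has an http(s) scheme and a host."""
    parsed = urlparse(image_path_or_url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


# Hypothetical usage with a GLM4V client:
# source = "https://example.com/image.jpg"
# response = client.prompt(
#     image_path_or_url=source,
#     prompt="Describe this image",
#     is_url=looks_like_url(source),
# )
```

Checking the host as well as the scheme keeps malformed inputs such as `http://` from being sent as URLs.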
Batch processing
from shiertier_caption import MultiGLM4V
# Initialize with multiple API keys
api_keys = ["key1", "key2", "key3"]
client = MultiGLM4V(api_keys=api_keys, max_workers=64)
# Process every image in a folder
results = client.prompt_folder("path/to/image/folder")
# Process a specific list of images
image_paths = ["image1.jpg", "image2.jpg", "image3.jpg"]
results = client.prompt_images(image_paths)
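To illustrate the general idea of spreading requests over several API keys, here is a small sketch that assigns keys round-robin and runs the requests in a thread pool. It is not the package's implementation: `caption_in_parallel` and the `caption_one` callback are hypothetical names, and how `MultiGLM4V` actually schedules keys is not documented here.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle
from typing import Callable, Dict, List


def caption_in_parallel(
    image_paths: List[str],
    api_keys: List[str],
    caption_one: Callable[[str, str], str],
    max_workers: int = 8,
) -> Dict[str, str]:
    """Caption images concurrently, assigning API keys round-robin.

    caption_one(api_key, image_path) is a caller-supplied function
    (hypothetical here) that performs a single captioning request.
    """
    if not api_keys:
        raise ValueError("at least one API key is required")
    # Pair each image with the next key in the rotation.
    jobs = list(zip(image_paths, cycle(api_keys)))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with image_paths.
        captions = pool.map(lambda job: caption_one(job[1], job[0]), jobs)
        return dict(zip(image_paths, captions))
```

Round-robin assignment keeps the load on each key roughly even, which matters when each key has its own rate limit.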
MongoDB support
from shiertier_caption import MultiGLM4V_Mongo
# Initialize a MongoDB-backed client
api_keys = ["key1", "key2", "key3"]
client = MultiGLM4V_Mongo(
    api_keys=api_keys,
    mongo_url="mongodb://localhost:27017",
    max_workers=64
)
# Process tasks
results = client.prompt_images()
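The storage schema used by `MultiGLM4V_Mongo` is not described here. As a rough sketch of how caption results could be persisted idempotently, the helper below upserts one record per image. The function name `save_caption` and the field names `image_path`, `caption`, and `updated_at` are assumptions made for illustration; `collection` is expected to behave like a pymongo `Collection`.

```python
from datetime import datetime, timezone
from typing import Any


def save_caption(collection: Any, image_path: str, caption: str) -> None:
    """Upsert one caption, keyed by image path.

    The field names are illustrative assumptions, not the package's schema.
    """
    collection.update_one(
        {"image_path": image_path},  # match an existing record, if any
        {"$set": {"caption": caption,
                  "updated_at": datetime.now(timezone.utc)}},
        upsert=True,  # insert when no record matches
    )
```

Using an upsert means that re-running a batch overwrites earlier captions instead of creating duplicate records.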
API documentation
GLM4V class
GLM4V(api_key: str, model: str = "glm-4v-plus-0111")
Parameters:
- api_key: Zhipu AI API key
- model: model name; one of:
  - glm-4v-flash
  - glm-4v
  - glm-4v-plus
  - glm-4v-plus-0111
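prompt method
prompt(
    image_path_or_url: str,
    prompt: str = "",
    need_json: bool = True,
    temperature: float = 0.8,
    is_url: bool = False
) -> str
Parameters:
- image_path_or_url: local path or URL of the image
- prompt: the question to ask about the image; if left empty, a built-in description prompt is used
- need_json: whether the response should be in JSON format
- temperature: sampling temperature, which controls the randomness of the output
- is_url: whether image_path_or_url is a URL
Returns:
- If need_json is True, a description in JSON format
- If need_json is False, a description in plain text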
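Since `prompt` is annotated as returning `str`, a caller who wants structured data needs to parse the JSON description. Below is a minimal sketch, assuming the string may occasionally arrive wrapped in a Markdown code fence (a common habit of chat models; whether the package already strips such fences is not stated here). `parse_json_response` is a hypothetical helper, not part of shiertier_caption.

```python
import json
import re
from typing import Any

# Three backticks, built programmatically to keep this example readable.
TICKS = "`" * 3
FENCED = re.compile(
    r"^" + TICKS + r"(?:json)?\s*(.*?)\s*" + TICKS + r"$", re.DOTALL
)


def parse_json_response(response: str) -> Any:
    """Parse a JSON string, tolerating a surrounding Markdown code fence."""
    text = response.strip()
    match = FENCED.match(text)
    if match:
        text = match.group(1)  # keep only the fenced payload
    return json.loads(text)  # raises json.JSONDecodeError on invalid JSON
```

Letting `json.JSONDecodeError` propagate makes malformed responses visible to the caller, who can then retry or fall back to `need_json=False`.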
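MultiGLM4V class
MultiGLM4V(api_keys: Union[List[str], str], max_workers: int = 64, model: str = "glm-4v-plus-0111")
Parameters:
- api_keys: a list of API keys, or a single string containing multiple keys
- max_workers: maximum number of parallel worker threads
- model: name of the model to use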
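The `api_keys` parameter accepts either a list or a single string. The delimiter expected inside such a string is not specified here, so the sketch below simply assumes commas and/or whitespace. `normalize_api_keys` is a hypothetical helper showing one way to bring both input forms into a clean list before constructing a client.

```python
import re
from typing import List, Union


def normalize_api_keys(api_keys: Union[List[str], str]) -> List[str]:
    """Return a de-duplicated list of keys, preserving order.

    Assumption: a string input separates keys with commas and/or whitespace.
    """
    if isinstance(api_keys, str):
        candidates = re.split(r"[,\s]+", api_keys)
    else:
        candidates = list(api_keys)
    # Drop empty entries and surrounding whitespace.
    cleaned = [key.strip() for key in candidates if key and key.strip()]
    # dict.fromkeys removes duplicates while keeping first-seen order.
    return list(dict.fromkeys(cleaned))
```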
MultiGLM4V_Mongo class
MultiGLM4V_Mongo(api_keys: Union[List[str], str], mongo_url: str, max_workers: int = 64, model: str = "glm-4v-plus-0111")
SiliconFlow class
from shiertier_caption import SiliconFlow
SiliconFlow(api_key: str, model: str = "Qwen/Qwen2-VL-72B-Instruct")
prompt method
prompt(
    image_path_or_url: str,
    prompt: str = "",
    is_url: bool = False
) -> str
Parameters:
- api_key: SiliconFlow API key
- model: name of the model to use
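Both `GLM4V.prompt` and `SiliconFlow.prompt` accept `image_path_or_url`, `prompt`, and `is_url`, so the two clients can be tried in sequence. The following is a sketch of one possible fallback pattern, not a feature of the package. `prompt_with_fallback` is a hypothetical helper, and it catches `Exception` broadly because the exception types the clients raise are not documented here.

```python
from typing import Any, Sequence


def prompt_with_fallback(
    clients: Sequence[Any],
    image_path_or_url: str,
    prompt: str = "",
    is_url: bool = False,
) -> str:
    """Try each client in order and return the first successful response."""
    errors = []
    for client in clients:
        try:
            return client.prompt(
                image_path_or_url=image_path_or_url,
                prompt=prompt,
                is_url=is_url,
            )
        except Exception as exc:  # exception types are not documented
            errors.append(f"{type(client).__name__}: {exc}")
    raise RuntimeError("all clients failed: " + "; ".join(errors))


# Hypothetical usage:
# glm = GLM4V(api_key="your_zhipu_key")
# sf = SiliconFlow(api_key="your_siliconflow_key")
# text = prompt_with_fallback([glm, sf], "path/to/your/image.jpg")
```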
License
MIT License