
A library for managing LLMs and other models


Description

📦 ModelhubClient: A Python client for Modelhub. It supports a variety of models, including LLMs, embedding models, audio models, and multi-modal models, backed by either third-party APIs or self-hosted instances.

Installation

pip install puyuan_modelhub --user

Quick Start

ModelhubClient

Initialization

from modelhub import ModelhubClient

client = ModelhubClient(
    host="https://modelhub.puyuan.tech/api/",
    user_name="xxxx",
    user_password="xxxx",
    model="xxx",  # optional: default model for subsequent calls
)

Get supported models

client.supported_models

Create a stateless chat

response = client.chat(
    query,
    model="xxx",  # optional: falls back to the model set at initialization
    history=history,
    parameters=dict(
        key1=value1,
        key2=value2,
    ),
)

Get embeddings

client.get_embeddings(["你好", "Hello"], model="m3e")

Context Compression/Distillation

Chatting with the lingua model returns a compressed/distilled context. Currently we use Llama-2-7B-Chat-GPTQ as the LLMLingua backend. Theoretically, any local model (Baichuan, ChatGLM, etc.) that can be loaded with AutoModelForCausalLM could serve as the backend, so a compress API should eventually be provided for every local model; this remains future work since LLMLingua does not support it natively.

Parameters for the lingua model (defaults shown as values, types as comments):

client.chat(
    prompt,             # str
    model="lingua",
    history=history,    # List[Dict[str, str]]
    parameters=dict(
        question="",                            # str
        ratio=0.5,                              # float
        target_token=-1,                        # float
        iterative_size=200,                     # int
        force_context_ids=None,                 # List[int]
        force_context_number=None,              # int
        use_sentence_level_filter=False,        # bool
        use_context_level_filter=True,          # bool
        use_token_level_filter=True,            # bool
        keep_split=False,                       # bool
        keep_first_sentence=0,                  # int
        keep_last_sentence=0,                   # int
        keep_sentence_number=0,                 # int
        high_priority_bonus=100,                # int
        context_budget="+100",                  # str
        token_budget_ratio=1.4,                 # float
        condition_in_question="none",           # str
        reorder_context="original",             # str
        dynamic_context_compression_ratio=0.0,  # float
        condition_compare=False,                # bool
        add_instruction=False,                  # bool
        rank_method="llmlingua",                # str
        concate_question=True,                  # bool
    )
)
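
For example, a minimal sketch of compressing a long context (the file name and parameter choices here are illustrative assumptions, not part of the API):

# Hypothetical sketch: compress a long retrieved document with lingua.
long_context = open("long_document.txt").read()  # any long text to compress
response = client.chat(
    long_context,
    model="lingua",
    parameters=dict(
        question="What are the key findings?",  # question to condition compression on
        ratio=0.5,                              # keep roughly half of the tokens
    ),
)
print(response.generated_text)  # the compressed/distilled context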

Async Support

Every sync method has a corresponding async variant prefixed with "a" (see the API documentation below). For example, you can use the async methods to make concurrent requests.

Note: unlike API-backed models, local models are currently single-threaded, so concurrent requests to them are queued even when using async. Adopting a more flexible inference pipeline is future work.

import anyio

async def main():
    async def query(question):
        print(await client.achat(question, model="gpt-3.5-turbo"))

    # run all queries concurrently in a task group
    async with anyio.create_task_group() as tg:
        for q in ["hello", "nihao", "test", "test1", "test2"]:
            tg.start_soon(query, q)

anyio.run(main)

gemini-pro embeddings require extra parameters

Use the embed_content method to generate embeddings. It handles embedding for the following task types (task_type):

Task Type            Description
RETRIEVAL_QUERY      The given text is a query in a search/retrieval setting.
RETRIEVAL_DOCUMENT   The given text is a document in a search/retrieval setting. Using this task type requires a title.
SEMANTIC_SIMILARITY  The given text will be used for Semantic Textual Similarity (STS).
CLASSIFICATION       The embeddings will be used for classification.
CLUSTERING           The embeddings will be used for clustering.
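
A minimal sketch, assuming the extra Gemini parameters are forwarded through a parameters argument of get_embeddings (the exact pass-through keywords are an assumption, not confirmed API):

# Hypothetical sketch: task_type/title mirror Gemini's embed_content
# parameters; check the Modelhub API docs for the exact mechanism.
embeddings = client.get_embeddings(
    ["How does context compression work?"],
    model="gemini-pro",
    parameters=dict(
        task_type="RETRIEVAL_DOCUMENT",
        title="Context compression notes",  # required for RETRIEVAL_DOCUMENT
    ),
)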

Response

generated_text: the response text from the model
history: the updated chat history, **currently only chatglm3 returns this.**
details: generation details, including tokens used, request duration, etc.
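
For example, after a chat call you can read these fields directly:

# The fields below follow the response description above.
response = client.chat("Hello!", model="ChatGLM3")
print(response.generated_text)  # the model's reply
print(response.details)         # tokens used, request duration, ...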

History Parameter

You can use either the pre-defined message types or raw dicts with role and content keys as history.

Note that not every model supports role types such as system.

# import some pre-defined message types
from modelhub.common.types import SystemMessage, AIMessage, UserMessage
# construct history of your own
history = [
    SystemMessage(content="xxx", other_value="xxxx"),
    UserMessage(content="xxx", other="xxxx"),
]
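
The equivalent raw-dict form looks like this:

# Raw dicts with "role" and "content" keys, as described above.
history = [
    {"role": "system", "content": "xxx"},
    {"role": "user", "content": "xxx"},
]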

VLMClient (Deprecated)

No Visual Language Models (cogvlm specifically) are currently hosted, and this client will be migrated into ModelhubClient in the future.

Initialization

from modelhub import VLMClient
client = VLMClient(...)
client.chat(prompt=..., image_path=..., parameters=...)

Chat with model

VLMClient.chat adds an image_path parameter on top of ModelhubClient.chat; the other parameters are the same.

client.chat("Hello?", image_path="xxx", model="cogvlm")

OpenAI Client

Only a small subset of models can currently be used this way; others will raise an exception.

from openai import OpenAI

client = OpenAI(
    api_key=f"{user_name}:{user_password}",
    base_url="https://modelhub.puyuan.tech/api/v1"
)

client.chat.completions.create(..., model="self-host-models")
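
A minimal sketch of a complete call (the model name here is a placeholder; substitute one of the supported self-hosted models):

completion = client.chat.completions.create(
    model="ChatGLM3",  # placeholder for a supported self-hosted model
    messages=[{"role": "user", "content": "Hello!"}],
)
print(completion.choices[0].message.content)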

Examples

Use ChatGLM3 for tool calling

import json

from modelhub import ModelhubClient
from modelhub.common.types import SystemMessage

client = ModelhubClient(
    host="https://xxxxx/api/",
    user_name="xxxxx",
    user_password="xxxxx",
)
tools = [
    {
        "name": "track",
        "description": "追踪指定股票的实时价格",  # track the real-time price of a given stock
        "parameters": {
            "type": "object",
            "properties": {"symbol": {"description": "需要追踪的股票代码"}},  # the stock symbol to track
            "required": ["symbol"],
        },
    },
    {
        "name": "text-to-speech",
        "description": "将文本转换为语音",  # convert text to speech
        "parameters": {
            "type": "object",
            "properties": {
                "text": {"description": "需要转换成语音的文本"},  # the text to convert
                "voice": {"description": "要使用的语音类型(男声、女声等)"},  # voice type (male, female, etc.)
                "speed": {"description": "语音的速度(快、中等、慢等)"},  # speech speed (fast, medium, slow, etc.)
            },
            "required": ["text"],
        },
    },
]

# construct system history
history = [
    SystemMessage(
        content="Answer the following questions as best as you can. You have access to the following tools:",
        tools=tools,
    )
]
query = "帮我查询股票10111的价格"  # "Look up the price of stock 10111 for me"

# call ChatGLM3
response = client.chat(query, model="ChatGLM3", history=history)
history = response.history
print(response.generated_text)
Output:
{"name": "track", "parameters": {"symbol": "10111"}}
# generate a fake result for the track function call

result = {"price": 1232}

res = client.chat(
    json.dumps(result),
    parameters=dict(role="observation"), # Tell ChatGLM3 this is a function call result
    model="ChatGLM3",
    history=history,
)
print(res.generated_text)
Output:
根据API调用结果,我得知当前股票的价格为1232。请问您需要我为您做什么?
(Translation: "Based on the API call result, the current stock price is 1232. Is there anything else I can do for you?")

Contact
