# langchain_bytez

Bytez LangChain integration
This package lets you use the Bytez API from LangChain. Note: only the `text-generation`, `chat`, `image-text-to-text`, `video-text-to-text`, and `audio-text-to-text` tasks are currently supported.

Streaming and native async are fully supported.

Curious what else Bytez has to offer? Check out Bytez here. Want to know more about our API? Check out the docs!
## Chat example
```python
import os

from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.schema import HumanMessage, SystemMessage
from langchain_bytez import BytezChatModel

API_KEY = os.environ.get("API_KEY")

bytez_chat_model_phi = BytezChatModel(
    model_id="microsoft/Phi-3-mini-4k-instruct",
    api_key=API_KEY,
    capacity={
        "min": 1,
        "max": 1,  # up to 10 instances
    },
    params={"max_new_tokens": 64},
    timeout=10,  # minutes of inactivity before the cluster shuts down
    streaming=True,
    callbacks=[StreamingStOutCallbackHandler()] if False else [StreamingStdOutCallbackHandler()],
)

messages = [
    SystemMessage(
        content="You are a helpful assistant that answers questions clearly and concisely."
    ),
    HumanMessage(content="List the phylums in the biological taxonomy"),
]

results = bytez_chat_model_phi.invoke(messages)
```
## Text generation (LLM) example
```python
import os

from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain_bytez import BytezLLM

API_KEY = os.environ.get("API_KEY")

bytez_llm_phi = BytezLLM(
    model_id="microsoft/phi-2",
    api_key=API_KEY,
    capacity={
        "min": 1,
        "max": 1,  # up to 10 instances
    },
    params={"max_new_tokens": 64},
    timeout=10,  # minutes of inactivity before the cluster shuts down
    streaming=True,
    callbacks=[StreamingStdOutCallbackHandler()],
)

# LLMs take a plain prompt string rather than chat messages
results = bytez_llm_phi.invoke("List the phylums in the biological taxonomy")
```
## Extending callback handlers for better observability
NOTE: this feature is experimental and we're working to enhance it. In the meantime, it should help bootstrap whatever you need to do with a model's "run" lifecycle.
```python
import os

from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.schema import HumanMessage, SystemMessage
from langchain_bytez import BytezChatModel, BytezStdOutCallbackHandler

API_KEY = os.environ.get("API_KEY")

bytez_chat_model_phi = BytezChatModel(
    model_id="microsoft/Phi-3-mini-4k-instruct",
    api_key=API_KEY,
    capacity={
        "min": 1,
        "max": 1,  # up to 10 instances
    },
    params={"max_new_tokens": 64},
    timeout=10,  # minutes of inactivity before the cluster shuts down
    streaming=True,
    callbacks=[StreamingStdOutCallbackHandler(), BytezStdOutCallbackHandler()],
)

messages = [
    SystemMessage(
        content="You are a helpful assistant that answers questions clearly and concisely."
    ),
    HumanMessage(content="List the phylums in the biological taxonomy"),
]

results = bytez_chat_model_phi.invoke(messages)
```
To roll your own implementation that better suits your needs, check out the implementation here.
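To get a feel for what a custom handler involves, here is a minimal, dependency-free sketch that collects streamed tokens. It is modeled on the standard LangChain callback hooks (`on_llm_start`, `on_llm_new_token`, `on_llm_end`); in real use you would subclass LangChain's `BaseCallbackHandler` instead of writing a bare class, and the framework would invoke these methods for you.

```python
class TokenCollectingHandler:
    """Collects streamed tokens so the full text can be inspected after a run.

    Standalone sketch; in practice, subclass langchain's BaseCallbackHandler.
    """

    def __init__(self):
        self.tokens = []

    def on_llm_start(self, serialized, prompts, **kwargs):
        # A new run is beginning; discard tokens from any previous run
        self.tokens.clear()

    def on_llm_new_token(self, token, **kwargs):
        # Called once per streamed token when streaming=True
        self.tokens.append(token)

    def on_llm_end(self, response, **kwargs):
        # The run finished; self.text now holds the full streamed output
        pass

    @property
    def text(self):
        return "".join(self.tokens)
```

You would pass an instance of a handler like this in the `callbacks=[...]` list alongside (or instead of) `StreamingStdOutCallbackHandler`.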
## Shut down your cluster

```python
bytez_chat_model_phi.shutdown_cluster()
```
## Update your cluster

```python
bytez_chat_model_phi.capacity = {
    "min": 2,  # we've increased the minimum number of instances
    "max": 3,  # up to 10 instances
}
bytez_chat_model_phi.update_cluster()
```
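If you set capacity programmatically, it can help to enforce the invariants implied above before assigning: `min` must not exceed `max`, and the examples note a cap of 10 instances. This is a hypothetical helper, not part of `langchain_bytez`:

```python
def make_capacity(min_instances: int, max_instances: int, cap: int = 10) -> dict:
    """Build a capacity dict with min <= max and max clamped to `cap`.

    Hypothetical convenience helper; the 10-instance cap comes from the
    comments in the examples above.
    """
    if min_instances < 1:
        raise ValueError("min must be at least 1")
    max_instances = min(max_instances, cap)  # clamp to the instance cap
    if min_instances > max_instances:
        raise ValueError("min cannot exceed max")
    return {"min": min_instances, "max": max_instances}
```

Then `bytez_chat_model_phi.capacity = make_capacity(2, 3)` before calling `update_cluster()`.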
## kwargs for BytezChatModel and BytezLLM

```python
model_id: str = Field(..., description="The unique model ID for the Bytez LLM.")
api_key: str = Field(..., description="The API key for accessing the Bytez LLM.")
capacity: dict = Field(
    default_factory=dict,
    description="Controls the scaling behavior; contains any of the keys 'desired': int, 'min': int, and 'max': int",
)
timeout: int = Field(
    None,
    description="Controls how many minutes to wait after the last inference before shutting down the cluster",
)
streaming: bool = Field(
    False, description="Enable streaming responses from the API."
)
params: dict = Field(
    default_factory=dict, description="Parameters passed to the Bytez API."
)
headers: dict = Field(
    default_factory=dict,
    description="Additional headers for the Bytez API. Matching keys override the defaults.",
)
http_timeout_s: float = Field(
    60 * 5.0,
    description="How long to wait in seconds for a response from the model before timing out",
)
```
## API Playground

Explore our API endpoints in the documentation here.

## Status

Check out the status of our API.

## Resources

Get to know our story, our mission, and our roadmap here.

## Feedback

We're committed to building the best developer experience for AI builders. Have feedback? Let us know on Discord or open an issue on GitHub.
## File details

Details for the file `langchain_bytez-0.0.7.tar.gz`.

### File metadata

- Download URL: langchain_bytez-0.0.7.tar.gz
- Upload date:
- Size: 14.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.21

### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | 7d89bbc06d88e321c89a7532ad1cbaa8517d6692280231699a69421b4e460a9c |
| MD5 | 3bb19e20d00ddaa2257aaa388e84cbf3 |
| BLAKE2b-256 | 0de9d6b90d8f8b777536a129109efe5b3a012a641f14c0fd3f442cdf6a64a2f1 |
## File details

Details for the file `langchain_bytez-0.0.7-py3-none-any.whl`.

### File metadata

- Download URL: langchain_bytez-0.0.7-py3-none-any.whl
- Upload date:
- Size: 14.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.21

### File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | 01a3ad052da839d036b25decd6fd9d84cc0572a85179cea0eeaaa273331ebb64 |
| MD5 | fe2e80e4a0c5aaa020ead23f258de18d |
| BLAKE2b-256 | 891ee85fdb06ff59eeff8bd5925dd290c25ae940d070b21b07fde1805d666203 |