# Semantix GenAI Serve
Semantix GenAI Serve is a library that lets users of Semantix GenAI Hub create their own servers running AI models. It provides an easy-to-use interface for serving models on the Hub's scalable infrastructure.
## Features
- Easy integration with Hugging Face Transformers models
- Support for serving Seq2Seq models
- Customizable server settings
- Built-in support for GPU acceleration
## Installation
To install the Semantix GenAI Serve library, run the following command:

```bash
pip install semantix-genai-serve
```
## Usage
### Basic Example
Here's a simple example of how to use the Semantix GenAI Serve library to serve a Hugging Face Transformers model:
```python
from semantix_genai_serve import SemantixTorchKserve
from semantix_genai_serve.huggingface import ServeAutoSeq2SeqLM

class MyModel(ServeAutoSeq2SeqLM):
    def predict(self, payload, headers):
        # Implement your custom inference logic here
        pass

model = MyModel(checkpoint="facebook/bart-large-cnn")
model.start_server()
```
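Once `start_server()` is running, clients send inference requests over HTTP. As a minimal sketch, assuming the server follows KServe's V1 REST protocol on the default port 8080 (neither detail is confirmed by this README), a client call might look like:

```python
import requests

# Hypothetical client call. Assumes a KServe-style V1 endpoint on
# localhost:8080 and the default predictor name; the payload keys are
# whatever your predict() method expects.
response = requests.post(
    "http://localhost:8080/v1/models/predictor:predict",
    json={"input_example": "Summarize this article ..."},
)
print(response.json())
```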
### Customizing Server Settings
You can customize the server settings by passing additional arguments to the `SemantixTorchKserve` constructor:
```python
model = MyModel(
    checkpoint="facebook/bart-large-cnn",
    name="my_model",
    base_cache_dir="/path/to/cache",
    force_local_load=True
)
```
- `name`: The name of the predictor (default: `"predictor"`).
- `base_cache_dir`: The base directory for caching models (default: `"/mnt/models"`). Do not change this when deploying to Semantix GenAI Hub.
- `force_local_load`: If set to `True`, the model is loaded from the local cache directory instead of being downloaded from Hugging Face's model hub (default: `False`). Do not change this when deploying to Semantix GenAI Hub.
### Implementing Custom Inference Logic
To implement custom inference logic, override the `predict` method in your model class:
```python
class MyModel(ServeAutoSeq2SeqLM):
    def predict(self, payload, headers):
        # Implement your custom inference logic here
        input_example = payload["input_example"]
        # Run inference on the input
        # The output must be a dictionary so it is automatically converted to JSON
        return {"response": "output"}
```
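For a more concrete sketch, the override below runs a summarization pass with Hugging Face Transformers. It assumes `ServeAutoSeq2SeqLM` exposes the loaded model and tokenizer as `self.model` and `self.tokenizer`; those attribute names are an assumption, not something this README documents:

```python
from semantix_genai_serve.huggingface import ServeAutoSeq2SeqLM

class SummarizerModel(ServeAutoSeq2SeqLM):
    def predict(self, payload, headers):
        # Assumption: the base class stores the loaded Transformers objects
        # on self.model / self.tokenizer after loading the checkpoint.
        text = payload["input_example"]
        inputs = self.tokenizer(text, return_tensors="pt", truncation=True)
        output_ids = self.model.generate(**inputs, max_new_tokens=128)
        summary = self.tokenizer.decode(output_ids[0], skip_special_tokens=True)
        # Return a dict so the response is serialized to JSON automatically
        return {"response": summary}
```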
## Contributing
We welcome contributions to the Semantix GenAI Serve library! If you have any suggestions, bug reports, or feature requests, please open an issue on our GitHub repository.
## License
Semantix GenAI Serve is released under the MIT License.