
Semantix GenAI Serve

Semantix GenAI Serve is a library for users of Semantix GenAI Hub who want to run their own AI model servers. It provides an easy-to-use interface for serving AI models on Semantix GenAI Hub's scalable infrastructure.

Features

  • Easy integration with Hugging Face Transformers models
  • Support for serving Seq2Seq models
  • Customizable server settings
  • Built-in support for GPU acceleration

Installation

To install the Semantix GenAI Serve library, run the following command:

pip install semantix-genai-serve

Usage

Basic Example

Here's a simple example of how to use the Semantix GenAI Serve library to serve a Hugging Face Transformers model:

from semantix_genai_serve import SemantixTorchKserve
from semantix_genai_serve.huggingface import ServeAutoSeq2SeqLM

class MyModel(ServeAutoSeq2SeqLM):
    def predict(self, payload, headers):
        # Implement your custom inference logic here.
        # The return value must be a JSON-serializable dictionary.
        return {"response": "output"}

model = MyModel(checkpoint="facebook/bart-large-cnn")
model.start_server()
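Once the server is running, clients call it over HTTP. The exact route depends on your deployment; the sketch below assumes a KServe-style v1 endpoint (`/v1/models/<name>:predict`) on localhost, which is an assumption based on the KServe protocol, not something this library's documentation guarantees. The helper name `build_predict_request` is hypothetical.

```python
import json

# Hypothetical client-side helper. The endpoint path and payload keys are
# assumptions based on KServe's v1 inference protocol.
def build_predict_request(model_name, payload, host="http://localhost:8080"):
    """Build the URL and JSON body for a KServe-style predict call."""
    url = f"{host}/v1/models/{model_name}:predict"
    body = json.dumps(payload)
    return url, body

url, body = build_predict_request("predictor", {"input_example": "Summarize this text."})
# To actually send it, one could use:
#   requests.post(url, data=body, headers={"Content-Type": "application/json"})
```

The default model name here mirrors the library's documented default predictor name, `"predictor"`.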

Customizing Server Settings

You can customize the server settings by passing additional arguments to the SemantixTorchKserve constructor:

model = MyModel(
    checkpoint="facebook/bart-large-cnn",
    name="my_model",
    base_cache_dir="/path/to/cache",
    force_local_load=True
)
  • name: The name of the predictor (default: "predictor")
  • base_cache_dir: The base directory for caching models (default: "/mnt/models"). Do not change this when deploying to Semantix GenAI Hub.
  • force_local_load: If True, the model is loaded from the local cache directory instead of being downloaded from the Hugging Face model hub (default: False). Do not change this when deploying to Semantix GenAI Hub.

Implementing Custom Inference Logic

To implement custom inference logic, you need to override the predict method in your model class:

class MyModel(ServeAutoSeq2SeqLM):
    def predict(self, payload, headers):
        # Implement your custom inference logic here
        input_example = payload["input_example"]
        # Run your inference on input_example here.
        # The return value must be a dictionary so it is automatically converted to JSON.
        return {"response": "output"}
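To see the payload-in, dictionary-out contract in isolation, here is a standalone sketch with no dependency on the library. The key names (`input_example`, `response`) mirror the example above; the "inference" step is a placeholder, not a real model call.

```python
import json

def predict(payload, headers=None):
    """Standalone sketch of the predict contract: dict in, JSON-serializable dict out."""
    input_example = payload["input_example"]
    # Placeholder for real inference: here we just upper-case the input.
    output = input_example.upper()
    # The returned dictionary is what the server serializes to JSON.
    return {"response": output}

result = predict({"input_example": "hello"})
json.dumps(result)  # succeeds only if the result is JSON-serializable
```

Whatever your real inference produces (strings, numbers, nested lists), keep the return value JSON-serializable, since the server converts it to JSON for the response body.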

Contributing

We welcome contributions to the Semantix GenAI Serve library! If you have any suggestions, bug reports, or feature requests, please open an issue on our GitHub repository.

License

Semantix GenAI Serve is released under the MIT License.
