Aims to be the Simplest Machine Learning Model Inference Server
Simple, but Powerful.
Help wanted: translations, rap lyrics, anything. Feel free to create an issue.
Pinferencia tries to be the simplest machine learning inference server ever!
Three extra lines and your model goes online.
Serving a model with GUI and REST API has never been so easy.
If you want to
- give your model a GUI and REST API
- find a simple but robust way to serve your model
- write minimal code while keeping control over your service
- avoid any heavy-weight solutions
- stay compatible with other tools/platforms
You're at the right place.
Pinferencia features include:
- Fast to code, fast to go live. Minimal code and minimal transformation needed, just based on what you have.
- 100% Test Coverage: both statement and branch coverage, no kidding. Have you ever seen a model-serving tool tested this seriously?
- Easy to use, easy to understand.
- A pretty and clean GUI out of the box.
- Automatic API documentation page. All APIs are explained in detail, with an online try-out feature.
- Serve any model; even a single function can be served.
- Supports the Kserve API and is compatible with Kubeflow, TF Serving, Triton, and TorchServe (see the request sketch below). Switching to or from them is painless, and Pinferencia is much faster for prototyping!
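For a taste of what Kserve-style requests look like, here is a rough sketch of a Kserve v2 inference request body. This follows the open inference protocol itself; exactly how Pinferencia maps onto it may vary between versions, so treat it as illustrative only.

```python
# A Kserve v2-style inference request body (illustrative sketch).
# POST this as JSON to /v2/models/<model_name>/infer on a v2-compatible server.
kserve_request = {
    "inputs": [
        {
            "name": "input-0",    # tensor name
            "shape": [1, 3],      # one batch of a 3-element vector
            "datatype": "INT32",  # Kserve v2 datatype string
            "data": [[1, 2, 3]],
        }
    ]
}
```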
pip install "pinferencia[streamlit]"
pip install "pinferencia"
Serve Any Model
```python
from pinferencia import Server


class MyModel:
    def predict(self, data):
        return sum(data)


model = MyModel()

service = Server()
service.register(model_name="mymodel", model=model, entrypoint="predict")
```
Save the code above as app.py and start the service (see the run command at the end of this section). Hooray, your service is alive. Go to http://127.0.0.1:8501/ and have fun.
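Prefer calling the REST API directly? Below is a minimal sketch using the `requests` library. The port (8000) and the v1 predict path are assumptions based on Pinferencia's default API; check the automatic API documentation page for your exact endpoints.

```python
import requests

# Ask the registered "mymodel" model for a prediction.
# Its predict() sums the list we send.
response = requests.post(
    "http://127.0.0.1:8000/v1/models/mymodel/predict",
    json={"data": [1, 2, 3]},
)
print(response.json())  # the response body should contain the prediction: 6
```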
Any deep learning models? Just as easy. Simply train or load your model, and register it with the service. It goes live immediately.
**HuggingFace Pipeline - Vision**
```python
from transformers import pipeline

from pinferencia import Server

vision_classifier = pipeline(task="image-classification")


def predict(data):
    return vision_classifier(images=data)


service = Server()
service.register(model_name="vision", model=predict)
```
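Once this service is up, calling it could look like the sketch below. The image URL is a placeholder, and the endpoint path assumes the same default v1 API as in the earlier sketch; the image-classification pipeline accepts image URLs, so the request data can simply be a URL string.

```python
import requests

# Classify an image by sending its URL to the "vision" model.
# Replace the placeholder URL with a real, reachable image.
response = requests.post(
    "http://127.0.0.1:8000/v1/models/vision/predict",
    json={"data": "https://example.com/cat.jpg"},
)
print(response.json())  # expected: labels with confidence scores
```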
**PyTorch**

```python
import torch

from pinferencia import Server

# train your models
model = "..."

# or load your models
# from a state_dict
model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH))

# entire model
model = torch.load(PATH)

# torchscript
model = torch.jit.load('model_scripted.pt')

model.eval()

service = Server()
service.register(model_name="mymodel", model=model)
```
**TensorFlow**

```python
import tensorflow as tf

from pinferencia import Server

# train your models
model = "..."

# or load your models
# saved_model
model = tf.keras.models.load_model('saved_model/model')

# HDF5
model = tf.keras.models.load_model('model.h5')

# from weights
model = create_model()
model.load_weights('./checkpoints/my_checkpoint')
loss, acc = model.evaluate(test_images, test_labels, verbose=2)

service = Server()
service.register(model_name="mymodel", model=model, entrypoint="predict")
```
Any model of any framework will work the same way. Now run:

```bash
uvicorn app:service --reload
```

and enjoy!
If you'd like to contribute, details are here.