# FastMindAPI

An easy-to-use, high-performance backend for serving LLMs and other AI models, built on FastAPI.
## 🚀 Quick Start
### Install

```shell
pip install fastmindapi
```
### Use

#### Run the server

```shell
# in Shell
fastmindapi-server --port 8000
```

```python
# in Python
import fastmindapi as FM

server = FM.Server()
server.run()
```
#### Access via client / HTTP requests

```shell
curl http://IP:PORT/docs#/
```

```python
import fastmindapi as FM

client = FM.Client(IP="x.x.x.x", PORT=xxx)  # defaults to 127.0.0.1:8000

# Register the models the server should know about, load one, and generate.
client.add_model_info_list(model_info_list)
client.load_model(model_name)
client.generate(model_name, generation_request)
```
> 🪧 We primarily maintain the backend server; the client is provided for reference only. The main usage is sending HTTP requests directly. (We may release FM-GUI in the future.)
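Because the main interface is plain HTTP, any HTTP client works. Below is a minimal sketch using `requests`; the route path and payload shape are assumptions for illustration only, so check the server's generated `/docs#/` page for the actual endpoints:

```python
import requests

BASE_URL = "http://127.0.0.1:8000"  # adjust to your server's IP and port

# Assumed route and payload for illustration only; the real endpoints are
# listed on the server's interactive docs page at /docs#/.
response = requests.post(f"{BASE_URL}/model/load", json={"model_name": "my-model"})
print(response.status_code, response.text)
```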
## ✨ Features
- **Model**: Support for models with various backends (minimal load sketch after this list)
  - `TransformersCausalLM` (`AutoModelForCausalLM`), `PeftCausalLM` (`PeftModelForCausalLM`)
  - `LlamacppLM` (`Llama`)
  - ...
- **Modules**: More than just chatting with models
  - Function calling (extra tools in Python; tool sketch after this list)
  - Retrieval
  - Agent
  - ...
- **Flexibility**: Easy to use & highly customizable
  - Load models at coding time or at runtime
  - Add any APIs you want
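For context on the first backend above, `TransformersCausalLM` wraps Hugging Face's `AutoModelForCausalLM`. A minimal sketch of the underlying load, with `gpt2` as an illustrative checkpoint:

```python
# Minimal sketch of what the TransformersCausalLM backend wraps:
# loading a causal LM directly with Hugging Face Transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # illustrative checkpoint; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("Hello, world", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```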
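The function-calling module exposes extra Python tools to the model. How FastMindAPI registers tools is not documented here; the sketch below only illustrates the general shape of such a tool, with `get_weather` and the `tools` list as hypothetical names:

```python
# Generic sketch of a Python "tool" for function calling.
# The registration step below is hypothetical, not FastMindAPI's API.
def get_weather(city: str) -> str:
    """Return a short weather summary for a city (dummy implementation)."""
    return f"Sunny in {city}, 22°C"

tools = [get_weather]  # hypothetical: hand such callables to the serving layer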