FastMindAPI
An easy-to-use, high-performance(?) backend for serving LLMs and other AI models, built on FastAPI.
🚀 Quick Start
Install
pip install fastmindapi
Use
Run the server
```shell
# in Shell
fastmindapi-server --port 8000
```

```python
# in Python
import fastmindapi as FM

server = FM.Server()
server.run()
```
Access via client / HTTP requests

```shell
curl http://IP:PORT/docs#/
```

```python
import fastmindapi as FM

client = FM.Client(IP="x.x.x.x", PORT=xxx)  # 127.0.0.1:8000 by default
client.add_model_info_list(model_info_list)
client.load_model(model_name)
client.generate(model_name, generation_request)
```
🪧 We primarily maintain the backend server; the client is provided for reference only. The main usage is through sending HTTP requests. (We might release FM-GUI in the future.)
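Since the main usage is plain HTTP requests, here is a minimal sketch of assembling a generation request. The route (`/model/generate/{model_name}`) and payload fields (`input_text`, `max_new_tokens`) are illustrative assumptions, not FastMindAPI's documented schema; check the server's `/docs#/` page for the real one.

```python
import json

# Hypothetical request builder: the endpoint path and payload field names
# below are assumptions for illustration, not the documented API schema.
def build_generation_request(base_url: str, model_name: str, prompt: str,
                             max_new_tokens: int = 128):
    """Return the URL and JSON body for a generation call."""
    url = f"{base_url}/model/generate/{model_name}"  # assumed route
    body = json.dumps({
        "input_text": prompt,              # assumed field name
        "max_new_tokens": max_new_tokens,  # assumed field name
    })
    return url, body

url, body = build_generation_request("http://127.0.0.1:8000",
                                     "my-model", "Hello")
print(url)   # the endpoint to POST the JSON body to
```

The returned pair can then be sent with any HTTP client, e.g. `requests.post(url, data=body, headers={"Content-Type": "application/json"})`.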
✨ Features
Model: Support models with various backends

- TransformersCausalLM (AutoModelForCausalLM)
- PeftCausalLM (PeftModelForCausalLM)
- LlamacppLM (Llama)
- ...
Modules: More than just chatting with models
- Function Calling (extra tools in Python)
- Retrieval
- Agent
- ...
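The Function Calling module exposes extra Python tools to the model. As a hedged illustration of the general pattern (not FastMindAPI's actual interface), a tiny registry can dispatch a model-emitted call by name:

```python
import json

# Minimal tool-calling sketch (illustrative only; FastMindAPI's real
# module may differ). Tools are plain Python functions registered by name.
TOOLS = {}

def tool(fn):
    """Register a function so a model can call it by name."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b

def dispatch(call_json: str):
    """Execute a model-emitted call like {"name": ..., "arguments": {...}}."""
    call = json.loads(call_json)
    return TOOLS[call["name"]](**call["arguments"])

result = dispatch('{"name": "add", "arguments": {"a": 2, "b": 3}}')
print(result)  # → 5
```

The server-side module would parse the model's output into such a call object and return the tool's result back into the conversation.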
Flexibility: Easy to Use & Highly Customizable

- Load models at coding time or at runtime
- Add any APIs you want
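Loading a model at runtime boils down to lazy loading: register cheap model info up front and materialize the weights only on first use. A minimal sketch of that idea (illustrative names, not FastMindAPI's internals):

```python
# Lazy model registry sketch (illustrative; not FastMindAPI's internals).
# Models are registered cheaply up front and loaded on first access.
class ModelRegistry:
    def __init__(self):
        self._info = {}    # name -> loader callable
        self._loaded = {}  # name -> loaded model object

    def register(self, name, loader):
        """Record how to build the model without loading it yet."""
        self._info[name] = loader

    def get(self, name):
        """Load on first use, then serve the cached instance."""
        if name not in self._loaded:
            self._loaded[name] = self._info[name]()
        return self._loaded[name]

registry = ModelRegistry()
registry.register("demo", lambda: "<fake model object>")  # stand-in loader
assert "demo" not in registry._loaded   # nothing loaded yet
model = registry.get("demo")            # loading happens here, at runtime
print(model)
```

This is why `client.add_model_info_list(...)` and `client.load_model(...)` are separate steps: registration is cheap, loading is deferred until requested.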