MicroLlama
The smallest possible LLM API. Build a question-and-answer interface to your own content in a few minutes. Uses OpenAI embeddings, gpt-3.5, and FAISS, via Langchain.
Usage
- Combine your source documents into a single JSON file called source.json. It should look like this:

      [
        {
          "source": "Reference to the source of your content. This could be a URL, a title, or a filename.",
          "content": "Your content as a single string. If there's a title or summary, put these first, separated by new lines."
        },
        ...
      ]

  See example.source.json for an example.
- Install dependencies:

      pip install langchain faiss-cpu openai fastapi "uvicorn[standard]"

- Get an OpenAI API key and add it to the environment, e.g. export OPENAI_API_KEY=sk-etc. Note that indexing and querying require OpenAI credits, which aren't free.
- Run your server with uvicorn serve:app. If the search index doesn't exist, it'll be created and stored.
- Query your documents at /api/ask?your question, or use the simple front-end at /.
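Since /api/ask takes the question directly in the query string, it needs to be URL-encoded. A minimal client sketch, assuming the server is running locally on uvicorn's default port; the ask helper is hypothetical and the JSON response shape depends on what serve.py actually returns:

```python
import json
import urllib.parse
import urllib.request

def build_ask_url(question, base_url="http://127.0.0.1:8000"):
    # The question rides in the raw query string, so percent-encode it.
    return f"{base_url}/api/ask?{urllib.parse.quote(question)}"

def ask(question):
    # Hypothetical convenience wrapper; adjust parsing to match
    # whatever serve.py returns.
    with urllib.request.urlopen(build_ask_url(question)) as response:
        return json.load(response)
```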
Deploying your API
On Fly.io
Sign up for a Fly.io account and install flyctl. Then:
fly launch # answer no to Postgres, Redis and deploying now
fly secrets set OPENAI_API_KEY=sk-etc
fly deploy
On Google Cloud Run
gcloud run deploy --source . --set-env-vars="OPENAI_API_KEY=sk-etc"
For Cloud Run and other serverless platforms, you should probably generate the FAISS index at container build time to reduce cold starts. See the two commented lines in Dockerfile.
Based on
- Langchain
- Simon Willison's blog post, datasette-openai and datasette-faiss.
- FastAPI
- GPT Index
- Dagster blog post
TODO
- Use splitting that generates more meaningful fragments, e.g.

      from langchain.text_splitter import SpacyTextSplitter

      text_splitter = SpacyTextSplitter(chunk_size=700, chunk_overlap=200, separator=" ")
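The chunk_size and chunk_overlap parameters above control how long each fragment is and how much it shares with its neighbour. A toy character-window splitter (not the project's code, and ignoring the sentence-aware splitting SpaCy would add) makes the mechanics concrete:

```python
def split_text(text, chunk_size=700, chunk_overlap=200):
    # Naive character-window splitter: each chunk starts
    # (chunk_size - chunk_overlap) characters after the previous one,
    # so consecutive chunks share chunk_overlap characters.
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - chunk_overlap, 1), step)]
```

A 1000-character document with these defaults yields two chunks, 700 and 500 characters long, sharing their middle 200 characters. The overlap keeps context that straddles a chunk boundary retrievable from either fragment.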