# LLM API Documentation
This API allows interaction with a distributed LLM architecture using RabbitMQ and Redis. Requests are processed asynchronously by a worker system (LLM-core) that generates responses and saves them to Redis. The API retrieves results from Redis and sends them back to the user.
## Endpoints
### `/generate`
- Method: `POST`
- Description: Sends a prompt for single message generation.
- Request Body:

  ```json
  {
    "job_id": "string",
    "meta": {
      "temperature": 0.2,
      "tokens_limit": 8096,
      "stop_words": ["string"],
      "model": "string"
    },
    "content": "string"
  }
  ```
  - `job_id` (string): Unique identifier for the task.
  - `meta` (object): Metadata for generation:
    - `temperature` (float): The degree of randomness in generation (default 0.2).
    - `tokens_limit` (integer): Maximum tokens for the response (default 8096).
    - `stop_words` (list of strings): Words to stop generation.
    - `model` (string): Model to use for generation.
  - `content` (string): The input text for generation.
- Response:

  ```json
  {
    "content": "string"
  }
  ```
  - `content` (string): The generated text.
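For example, a minimal Python client for this endpoint might look like the sketch below. It assumes the API is reachable at `http://localhost:8000` and that the `requests` package is installed; the payload values are illustrative.

```python
import requests

payload = {
    "job_id": "12345",  # unique task identifier (illustrative value)
    "meta": {
        "temperature": 0.2,
        "tokens_limit": 8096,
        "stop_words": ["stop"],
        "model": "gpt-model",  # illustrative model name
    },
    "content": "What is AI?",
}

response = requests.post("http://localhost:8000/generate", json=payload)
response.raise_for_status()
print(response.json()["content"])  # the generated text
```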
### `/chat_completion`
- Method: `POST`
- Description: Sends a conversation history for chat-based completions.
- Request Body:

  ```json
  {
    "job_id": "string",
    "meta": {
      "temperature": 0.2,
      "tokens_limit": 8096,
      "stop_words": ["string"],
      "model": "string"
    },
    "messages": [
      {
        "role": "string",
        "content": "string"
      }
    ]
  }
  ```
  - `job_id` (string): Unique identifier for the task.
  - `meta` (object): Metadata for chat completion:
    - `temperature` (float): The degree of randomness in responses (default 0.2).
    - `tokens_limit` (integer): Maximum tokens for the response (default 8096).
    - `stop_words` (list of strings): Words to stop the generation.
    - `model` (string): Model to use for chat completion.
  - `messages` (list of objects): Conversation history:
    - `role` (string): Role of the message sender (`"user"`, `"assistant"`, etc.).
    - `content` (string): Message content.
- Response:

  ```json
  {
    "content": "string"
  }
  ```
  - `content` (string): The generated response.
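Calling this endpoint from Python follows the same pattern, with `messages` replacing `content` (again a sketch assuming the API runs on `localhost:8000` and illustrative values):

```python
import requests

payload = {
    "job_id": "12346",  # illustrative value
    "meta": {"temperature": 0.2, "tokens_limit": 8096, "stop_words": [], "model": "gpt-model"},
    "messages": [
        {"role": "user", "content": "What is AI?"},
    ],
}

response = requests.post("http://localhost:8000/chat_completion", json=payload)
response.raise_for_status()
print(response.json()["content"])  # the generated response
```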
## Environment Variables
These variables must be configured and synchronized with the LLM-core system:
### RabbitMQ Configuration
- `RABBIT_MQ_HOST`: RabbitMQ server hostname or IP.
- `RABBIT_MQ_PORT`: RabbitMQ server port.
- `RABBIT_MQ_LOGIN`: RabbitMQ login username.
- `RABBIT_MQ_PASSWORD`: RabbitMQ login password.
- `QUEUE_NAME`: Name of the RabbitMQ queue to process tasks.
### Redis Configuration
- `REDIS_HOST`: Redis server hostname or IP.
- `REDIS_PORT`: Redis server port.
- `REDIS_PREFIX`: Key prefix for task results in Redis.
### Internal LLM-core Configuration
- `INNER_LLM_URL`: URL for the LLM-core worker service.
### Example `.env` File

```env
# API
CELERY_BROKER_URL=amqp://admin:admin@127.0.0.1:5672/
CELERY_RESULT_BACKEND=redis://127.0.0.1:6379/0
REDIS_HOST=redis
REDIS_PORT=6379
RABBIT_MQ_HOST=rabbitmq
RABBIT_MQ_PORT=5672
RABBIT_MQ_LOGIN=admin
RABBIT_MQ_PASSWORD=admin
WEB_RABBIT_MQ=15672
API_PORT=6672

# RabbitMQ
RABBITMQ_DEFAULT_USER=admin
RABBITMQ_DEFAULT_PASS=admin
```
## System Architecture
Below is the architecture diagram showing the interaction between the API, RabbitMQ, LLM-core, and Redis:
```
+---------+       +------------+       +------------+       +---------+
|         |       |            |       |            |       |         |
|   API   +------>+  RabbitMQ  +------>+  LLM-core  +------>+  Redis  |
|         |       |            |       |            |       |         |
+---------+       +------------+       +------------+       +---------+
     ^                  ^                    ^                    ^
     |                  |                    |                    |
 results are       requests are        worker retrieves     results are
 polled by API     queued here         tasks from queue     stored here
```
## Flow
- API:
  - Receives requests via the endpoints (`/generate`, `/chat_completion`).
  - Publishes tasks to RabbitMQ.
  - Polls Redis for results based on task IDs.
- RabbitMQ:
  - Acts as a queue for task distribution.
  - LLM-core workers subscribe to the queue to process tasks.
- LLM-core:
  - Retrieves tasks from RabbitMQ.
  - Processes prompts or chat completions using LLM models (see the worker sketch after this list).
  - Stores results in Redis.
- Redis:
  - Acts as the result storage.
  - The API retrieves results from Redis when tasks are completed.
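The LLM-core worker itself is not part of this package, but its role in the flow can be illustrated with a minimal sketch. The snippet below assumes the `pika` and `redis` Python clients, a hypothetical `generate_text()` model call, and a Redis key format of `<REDIS_PREFIX>:<job_id>`; the actual LLM-core message and key formats may differ.

```python
import json
import os

import pika
import redis


def generate_text(content: str, meta: dict) -> str:
    """Hypothetical placeholder for the actual model call in LLM-core."""
    raise NotImplementedError


# Connect to Redis, where results are stored for the API to poll.
store = redis.Redis(host=os.environ["REDIS_HOST"], port=int(os.environ["REDIS_PORT"]))
prefix = os.environ.get("REDIS_PREFIX", "llm")

# Connect to RabbitMQ and declare the task queue.
credentials = pika.PlainCredentials(
    os.environ["RABBIT_MQ_LOGIN"], os.environ["RABBIT_MQ_PASSWORD"]
)
connection = pika.BlockingConnection(
    pika.ConnectionParameters(
        host=os.environ["RABBIT_MQ_HOST"],
        port=int(os.environ["RABBIT_MQ_PORT"]),
        credentials=credentials,
    )
)
channel = connection.channel()
queue = os.environ["QUEUE_NAME"]
channel.queue_declare(queue=queue, durable=True)


def on_message(ch, method, properties, body):
    task = json.loads(body)
    result = generate_text(task["content"], task.get("meta", {}))
    # Store the result under a key the API can poll (key format is an assumption).
    store.set(f"{prefix}:{task['job_id']}", json.dumps({"content": result}))
    ch.basic_ack(delivery_tag=method.delivery_tag)


channel.basic_consume(queue=queue, on_message_callback=on_message)
channel.start_consuming()
```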
## Usage
### Running the API
1. Configure environment variables in the `.env` file.
2. Start the API using:

```python
from fastapi import FastAPI

# Config and get_router are provided by protollm_api; the exact import
# paths are not shown in this README.

app = FastAPI()
config = Config.read_from_env()
app.include_router(get_router(config))
```
### Running the API Locally (without Docker)
To run the API locally using Uvicorn, use the following command:
```bash
uvicorn protollm_api.backend.main:app --host 127.0.0.1 --port 8000 --reload
```
Alternatively, use a main file like this:

```python
import uvicorn
from fastapi import FastAPI

# Config and get_router are provided by protollm_api; the exact import
# paths are not shown in this README.

app = FastAPI()
config = Config.read_from_env()
app.include_router(get_router(config))

if __name__ == "__main__":
    uvicorn.run("protollm_api.backend.main:app", host="127.0.0.1", port=8000, reload=True)
```
## Example Requests
### Generate
```bash
curl -X POST "http://localhost:8000/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "job_id": "12345",
    "meta": {
      "temperature": 0.5,
      "tokens_limit": 1000,
      "stop_words": ["stop"],
      "model": "gpt-model"
    },
    "content": "What is AI?"
  }'
```
### Chat Completion
```bash
curl -X POST "http://localhost:8000/chat_completion" \
  -H "Content-Type: application/json" \
  -d '{
    "job_id": "12345",
    "meta": {
      "temperature": 0.5,
      "tokens_limit": 1000,
      "stop_words": ["stop"],
      "model": "gpt-model"
    },
    "messages": [
      {"role": "user", "content": "What is AI?"},
      {"role": "assistant", "content": "Artificial Intelligence is..."}
    ]
  }'
```
## Notes
- Ensure that `RABBIT_MQ_HOST`, `RABBIT_MQ_PORT`, `REDIS_HOST`, and other variables are synchronized between the API and LLM-core containers.
- The system supports distributed scaling by adding more LLM-core workers to the RabbitMQ queue.
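If the workers run under Docker Compose, scaling can be as simple as `docker compose up -d --scale llm-core=4`; the service name `llm-core` is hypothetical here, so use the one defined in your compose file.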