
A sequence-based LLM orchestration framework

Project description

Sequence-LLM

Sequence-LLM is a terminal-first CLI for running local LLMs through llama-server (llama.cpp) with profile-based model switching and automatic server lifecycle management.

It is designed for developers who run multiple local models and want a simple, reproducible workflow without writing shell scripts.

Cross-platform: Windows, Linux, macOS.

Why Sequence-LLM

Running local models often involves:

  • Manually starting and stopping servers
  • Remembering model paths and ports
  • Managing multiple configurations
  • Writing ad-hoc scripts to switch models

Sequence-LLM solves this by providing:

  • Named model profiles
  • Automatic start and shutdown of servers
  • Interactive chat interface
  • Consistent configuration across machines

Features

  • Interactive CLI built with Typer and Rich
  • Profile-based model switching (/brain, /coder, etc.)
  • Automatic shutdown of previous server before starting a new one
  • Health checking with readiness polling (see the sketch after this list)
  • Cross-platform process management using subprocess and psutil
  • OS-aware configuration directory creation
  • Conversation history management
  • Status panel showing active model and server info
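
Readiness polling can be pictured as a short loop against llama-server's /health endpoint, which answers 200 once the model has finished loading. A minimal sketch, not necessarily the project's actual implementation; the timeout and interval values are illustrative:

import time
import urllib.request

def wait_until_ready(port: int, timeout: float = 60.0, interval: float = 0.5) -> bool:
    """Poll llama-server's /health endpoint until the model is loaded or we give up."""
    url = f"http://127.0.0.1:{port}/health"
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                if resp.status == 200:  # llama-server returns 200 once ready
                    return True
        except OSError:  # covers connection refused, timeouts, and HTTP 503 while loading
            pass
        time.sleep(interval)
    return False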

Installation

Requirements

  • Python 3.9+
  • llama-server binary from llama.cpp

Install from PyPI:

pip install sequence-llm

Quick Start

Run the CLI:

seq-llm

On first launch, a configuration file is created automatically.

Config locations:

  • Windows: %APPDATA%\sequence-llm\config.yaml
  • Linux: ~/.config/sequence-llm/config.yaml
  • macOS: ~/Library/Application Support/sequence-llm/config.yaml
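
These locations follow each platform's usual convention: %APPDATA% on Windows, the XDG config directory on Linux, and Application Support on macOS. A minimal sketch of how such a path can be resolved; the function name is illustrative, not the project's API:

import os
import sys
from pathlib import Path

def config_path() -> Path:
    """Return the conventional per-user config file location for each OS."""
    if sys.platform == "win32":
        base = Path(os.environ.get("APPDATA", Path.home() / "AppData" / "Roaming"))
    elif sys.platform == "darwin":
        base = Path.home() / "Library" / "Application Support"
    else:  # Linux and other POSIX systems
        base = Path(os.environ.get("XDG_CONFIG_HOME", Path.home() / ".config"))
    return base / "sequence-llm" / "config.yaml"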

Configuration Example

llama_server: "/path/to/llama-server"

defaults:
  threads: 6
  threads_batch: 8
  batch_size: 512

profiles:
  brain:
    name: "Brain Model"
    model_path: "/path/to/model.gguf"
    system_prompt: "/path/to/system.txt"
    port: 8081
    ctx_size: 16384
    temperature: 0.7

  coder:
    name: "Coder Model"
    model_path: "/path/to/coder.gguf"
    system_prompt: "/path/to/coder.txt"
    port: 8082
    ctx_size: 32768
    temperature: 0.3
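
Each profile maps naturally onto a llama-server command line: model_path becomes -m, port becomes --port, ctx_size becomes -c, and the defaults become -t, -tb, and -b (all standard llama.cpp flags). A sketch of that mapping, assuming this is roughly how Sequence-LLM assembles the command; the helper name is hypothetical:

def build_command(llama_server: str, defaults: dict, profile: dict) -> list:
    """Assemble the llama-server argv for one profile; flag names are llama.cpp's."""
    return [
        llama_server,
        "-m", profile["model_path"],            # model_path  -> -m
        "--port", str(profile["port"]),         # port        -> --port
        "-c", str(profile["ctx_size"]),         # ctx_size    -> -c (context size)
        "--temp", str(profile["temperature"]),
        "-t", str(defaults["threads"]),
        "-tb", str(defaults["threads_batch"]),
        "-b", str(defaults["batch_size"]),
    ]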

CLI Usage

/status   → show active model and server status
/brain    → switch to brain profile
/coder    → switch to coder profile
/clear    → clear conversation history
/quit     → stop server and exit

Typing any text sends a message to the active model.
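
llama-server exposes an OpenAI-compatible /v1/chat/completions endpoint, so sending a message amounts to a single HTTP POST carrying the running conversation. A minimal sketch, assuming a plain blocking request; the project's actual client may differ, for example by streaming tokens:

import json
import urllib.request

def send_message(port: int, history: list, text: str) -> str:
    """POST the conversation to llama-server's OpenAI-compatible chat endpoint."""
    history.append({"role": "user", "content": text})
    req = urllib.request.Request(
        f"http://127.0.0.1:{port}/v1/chat/completions",
        data=json.dumps({"messages": history}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.load(resp)["choices"][0]["message"]["content"]
    history.append({"role": "assistant", "content": reply})
    return reply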

Example Workflow

  1. Start CLI
  2. Automatically load default model
  3. Switch between models using commands
  4. Chat interactively without restarting processes manually

Architecture

Core components:

  • CLI: interactive interface and command routing
  • Server Manager: lifecycle control of llama-server (sketched after this list)
  • API Client: communication with local inference server
  • Config System: YAML-based profiles and defaults
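
Put together, a profile switch in the Server Manager amounts to: terminate the old llama-server process tree with psutil, launch the new command with subprocess, and poll /health until ready. A simplified sketch reusing the wait_until_ready and build_command helpers from earlier sections; not the project's actual code:

import subprocess
from typing import Optional

import psutil

def switch_profile(current: Optional[subprocess.Popen], command: list) -> subprocess.Popen:
    """Stop the running llama-server and its children, then start and await the new one."""
    if current is not None and current.poll() is None:
        procs = [psutil.Process(current.pid)]
        procs += procs[0].children(recursive=True)  # llama-server may spawn helper processes
        for p in procs:
            p.terminate()
        psutil.wait_procs(procs, timeout=10)
    new = subprocess.Popen(command)
    port = int(command[command.index("--port") + 1])  # set by build_command above
    if not wait_until_ready(port):  # readiness polling, as sketched under Features
        new.terminate()
        raise RuntimeError("llama-server did not become ready in time")
    return new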

Development

Clone repository:

git clone https://github.com/Ananay28425/Sequence-LLM.git
cd Sequence-LLM
pip install -e .

Run tests:

pytest -v

License

MIT License. See LICENSE file for details.

Contributing

Pull requests and issues are welcome.

GitHub: https://github.com/Ananay28425/Sequence-LLM


Sequence-LLM provides a lightweight and predictable way to manage local LLM workflows from the terminal.



