
Sequence-LLM

A sequence-based LLM orchestration framework

Sequence-LLM is a terminal-first orchestration tool for running local large language models through llama.cpp (llama-server) with automatic server lifecycle management, profile-based switching, and reproducible workflows.

It removes the need to manually start servers, remember commands, or write shell scripts when working with multiple models.

Sequence-LLM works with any hardware supported by llama.cpp — CPU, CUDA GPUs, ROCm, Metal, and more.

Cross-platform: Windows, Linux, macOS.


Why Sequence-LLM

Running local models often involves:

  • Manually starting and stopping servers
  • Remembering model paths and ports
  • Managing multiple configurations
  • Writing ad-hoc scripts to switch models
  • Repeating setup across machines

Sequence-LLM solves this by providing:

  • Named model profiles
  • Automatic start and shutdown of servers
  • Interactive chat interface
  • Consistent configuration across machines
  • Deterministic, script-free workflows

Who Is This For

  • Developers running multiple local models
  • AI engineers building local pipelines
  • Researchers comparing architectures
  • Self-hosting enthusiasts
  • GPU workstation users
  • Anyone who prefers CLI-first workflows

If you currently drive llama.cpp by hand or through custom scripts, Sequence-LLM simplifies that workflow; support for Ollama and LM Studio is planned.


Features

  • Interactive CLI built with Typer and Rich
  • Profile-based model switching (/brain, /coder, etc.)
  • Automatic shutdown of previous server before starting a new one
  • Health checking with readiness polling (see the sketch after this list)
  • Context-window safety guard (prevents overflow / crashes)
  • Cross-platform process management using subprocess and psutil
  • OS-aware configuration directory creation
  • Conversation history management
  • Status panel showing active model and server info
  • First-run configuration wizard
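
The readiness check is essentially a poll against llama-server's /health endpoint, which returns 200 once the model is loaded. A minimal standard-library sketch of that pattern (illustrative only; Sequence-LLM's actual implementation may differ):

import time
import urllib.error
import urllib.request

def wait_until_ready(port: int, timeout: float = 60.0) -> bool:
    """Poll llama-server's /health endpoint until the model is loaded."""
    url = f"http://127.0.0.1:{port}/health"
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # not accepting connections yet, or 503 while loading
        time.sleep(0.5)
    return False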

Hardware Support

Sequence-LLM does not perform inference itself.

It orchestrates llama-server, meaning it works with:

  • CPU inference
  • NVIDIA CUDA GPUs
  • AMD ROCm GPUs
  • Apple Metal
  • Any backend supported by llama.cpp
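
Concretely, Sequence-LLM assembles an ordinary llama-server command from your profile, so a launch might look something like this (the flags shown are standard llama.cpp options; the exact command the tool builds is internal):

llama-server -m /path/to/model.gguf --port 8081 -c 16384 -t 6

Backend selection (CPU, CUDA, ROCm, Metal) is determined by how your llama-server binary was built; GPU offload is controlled by the usual llama.cpp flags such as -ngl.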

Comparison with Other Tools

Tool            Primary Focus    Sequence-LLM Advantage
Ollama          Easy installs    Multi-model orchestration workflow
LM Studio       GUI              Lightweight CLI automation
Raw llama.cpp   Flexible         No manual scripts needed
Open-WebUI      Web UI           Minimal-overhead terminal workflow

Sequence-LLM sits between the two extremes: more automation than raw llama.cpp, less overhead than a GUI.


Installation

Requirements

  • Python 3.9+
  • llama-server binary from llama.cpp

Install from PyPI:

pip install sequence-llm

Quick Start

Run the CLI:

seq-llm

On first launch, a configuration file is created automatically.

Config locations:

  • Windows: %APPDATA%\sequence-llm\config.yaml
  • Linux: ~/.config/sequence-llm/config.yaml
  • macOS: ~/Library/Application Support/sequence-llm/config.yaml
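
A standard-library sketch of how this kind of OS-aware path resolution can be implemented (illustrative only; the helper name is hypothetical and the package may resolve paths differently):

import os
import platform
from pathlib import Path

def config_path() -> Path:
    """Return the per-OS config.yaml location listed above."""
    system = platform.system()
    if system == "Windows":
        base = Path(os.environ["APPDATA"])  # %APPDATA%
    elif system == "Darwin":  # macOS
        base = Path.home() / "Library" / "Application Support"
    else:  # Linux and other POSIX
        base = Path(os.environ.get("XDG_CONFIG_HOME", Path.home() / ".config"))
    return base / "sequence-llm" / "config.yaml"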

Configuration Example

llama_server: "/path/to/llama-server"

defaults:
  threads: 6
  threads_batch: 8
  batch_size: 512

profiles:
  brain:
    name: "Brain Model"
    model_path: "/path/to/model.gguf"
    system_prompt: "/path/to/system.txt"
    port: 8081
    ctx_size: 16384
    temperature: 0.7

  coder:
    name: "Coder Model"
    model_path: "/path/to/coder.gguf"
    system_prompt: "/path/to/coder.txt"
    port: 8082
    ctx_size: 32768
    temperature: 0.3
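
Profile values override the shared defaults block. A sketch of loading and merging such a config with PyYAML (hypothetical helper, not Sequence-LLM's internals):

import yaml  # PyYAML

def load_profile(config_file: str, profile: str) -> dict:
    """Merge the defaults block with one named profile (profile wins)."""
    with open(config_file, encoding="utf-8") as f:
        cfg = yaml.safe_load(f)
    merged = dict(cfg.get("defaults", {}))
    merged.update(cfg["profiles"][profile])
    merged["llama_server"] = cfg["llama_server"]
    return merged

settings = load_profile("config.yaml", "coder")
print(settings["port"])     # 8082
print(settings["threads"])  # 6, inherited from defaults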

CLI Usage

/status   → show active model and server status
/brain    → switch to brain profile
/coder    → switch to coder profile
/clear    → clear conversation history
/quit     → stop server and exit

Typing any text sends a message to the active model.
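
Under the hood, each message becomes an HTTP request to the running llama-server, which exposes an OpenAI-compatible API. A standard-library sketch of one chat-completion call (Sequence-LLM's API client may build its requests differently):

import json
import urllib.request

def chat(port: int, messages: list, temperature: float = 0.7) -> str:
    """Send one chat turn to llama-server's /v1/chat/completions endpoint."""
    payload = json.dumps({"messages": messages, "temperature": temperature})
    req = urllib.request.Request(
        f"http://127.0.0.1:{port}/v1/chat/completions",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

history = [{"role": "user", "content": "Hello!"}]
print(chat(8081, history))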


Example Workflow

  1. Start the CLI
  2. The default model loads automatically
  3. Switch between models with slash commands
  4. Chat interactively without restarting processes manually

Architecture

User → CLI → ServerManager → llama-server → Model
           ↑
        Config + Profiles

Core components:

  • CLI - interactive interface and command routing
  • Server Manager - lifecycle control of llama-server (see the sketch after this list)
  • API Client - communication with the local inference server
  • Config System - YAML-based profiles and defaults
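
The invariant the Server Manager enforces is "at most one llama-server at a time": terminate the previous process tree, launch the new profile, then poll for readiness. A simplified sketch using subprocess and psutil (hypothetical class; the real ServerManager has more error handling):

import subprocess
from typing import Optional

import psutil  # cross-platform process management

class ServerManager:
    def __init__(self, llama_server: str):
        self.llama_server = llama_server
        self.proc: Optional[subprocess.Popen] = None

    def stop(self) -> None:
        """Terminate the current server and any children it spawned."""
        if self.proc and self.proc.poll() is None:
            parent = psutil.Process(self.proc.pid)
            for child in parent.children(recursive=True):
                child.terminate()
            parent.terminate()
            parent.wait(timeout=10)
        self.proc = None

    def switch(self, profile: dict) -> None:
        """Stop the previous server, then start one for the new profile."""
        self.stop()
        self.proc = subprocess.Popen([
            self.llama_server,
            "-m", profile["model_path"],
            "--port", str(profile["port"]),
            "-c", str(profile["ctx_size"]),
        ])
        # a readiness poll (e.g. against /health) would follow here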

Roadmap

Planned evolution:

  • v0.3 — Multi-model named workflows
  • v0.4 — TUI interface
  • v0.5 — Hardware auto-optimization
  • v1.0 — Production stability

For Development and Contributors

Clone repository:

git clone https://github.com/Ananay28425/Sequence-LLM.git
cd Sequence-LLM
pip install -e .

Run tests:

pytest -v

License

AGPL-3.0 License. See the LICENSE file for details.


Contributing

Pull requests and issues are welcome.

GitHub: https://github.com/Ananay28425/Sequence-LLM


Sequence-LLM provides a lightweight and predictable way to manage local LLM workflows from the terminal.

Sequence-LLM - Orchestrate LLM workflows with ease.
