
Sequence-LLM

A sequence-based LLM orchestration framework

Sequence-LLM is a terminal-first CLI for running local LLMs through llama-server (llama.cpp) with profile-based model switching and automatic server lifecycle management.

It is designed for developers who run multiple local models and want a simple, reproducible workflow without writing shell scripts.

Cross-platform: Windows, Linux, macOS.

Why Sequence-LLM

Running local models often involves:

  • Manually starting and stopping servers
  • Remembering model paths and ports
  • Managing multiple configurations
  • Writing ad-hoc scripts to switch models

Sequence-LLM solves this by providing:

  • Named model profiles
  • Automatic start and shutdown of servers
  • Interactive chat interface
  • Consistent configuration across machines

Features

  • Interactive CLI built with Typer and Rich
  • Profile-based model switching (/brain, /coder, etc.)
  • Automatic shutdown of previous server before starting a new one
  • Health checking with readiness polling (see the sketch after this list)
  • Cross-platform process management using subprocess and psutil
  • OS-aware configuration directory creation
  • Conversation history management
  • Status panel showing active model and server info
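
For illustration, the readiness polling above might look like the following minimal sketch, assuming the /health endpoint that llama-server exposes; wait_until_ready and its timings are hypothetical, not the package's actual implementation:

import time

import requests  # assumed HTTP client; the package may use another

def wait_until_ready(port: int, timeout: float = 60.0) -> bool:
    """Poll llama-server's /health endpoint until it responds or we time out."""
    url = f"http://127.0.0.1:{port}/health"
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            if requests.get(url, timeout=2).status_code == 200:
                return True  # server is up and ready for requests
        except requests.ConnectionError:
            pass  # process not accepting connections yet
        time.sleep(0.5)
    return False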

Installation

Requirements

  • Python 3.9+
  • llama-server binary from llama.cpp

Install from PyPI:

pip install sequence-llm

Quick Start

Run the CLI:

seq-llm

On first launch, a configuration file is created automatically.

Config locations:

  • Windows: %APPDATA%\sequence-llm\config.yaml
  • Linux: ~/.config/sequence-llm/config.yaml
  • macOS: ~/Library/Application Support/sequence-llm/config.yaml
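
The per-platform paths above follow each OS's conventions. A hypothetical sketch of the resolution logic (config_path is an illustrative name, not necessarily the package's API):

import os
import sys
from pathlib import Path

def config_path() -> Path:
    """Return the OS-appropriate config location listed above."""
    if sys.platform == "win32":
        base = Path(os.environ["APPDATA"])
    elif sys.platform == "darwin":
        base = Path.home() / "Library" / "Application Support"
    else:
        xdg = os.environ.get("XDG_CONFIG_HOME")
        base = Path(xdg) if xdg else Path.home() / ".config"
    return base / "sequence-llm" / "config.yaml"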

Configuration Example

llama_server: "/path/to/llama-server"

defaults:
  threads: 6
  threads_batch: 8
  batch_size: 512

profiles:
  brain:
    name: "Brain Model"
    model_path: "/path/to/model.gguf"
    system_prompt: "/path/to/system.txt"
    port: 8081
    ctx_size: 16384
    temperature: 0.7

  coder:
    name: "Coder Model"
    model_path: "/path/to/coder.gguf"
    system_prompt: "/path/to/coder.txt"
    port: 8082
    ctx_size: 32768
    temperature: 0.3
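
To make the mapping from profile to server invocation concrete, here is a hedged sketch of how defaults and a profile could merge into a llama-server command line. build_command is hypothetical; the flags shown are standard llama-server options, but the exact set the package passes may differ:

import yaml  # PyYAML, assumed for reading the config above

def build_command(cfg: dict, profile: str) -> list:
    """Merge global defaults with one profile into a llama-server command."""
    p = {**cfg.get("defaults", {}), **cfg["profiles"][profile]}
    return [
        cfg["llama_server"],
        "--model", p["model_path"],
        "--port", str(p["port"]),
        "--ctx-size", str(p["ctx_size"]),
        "--threads", str(p["threads"]),
        "--threads-batch", str(p["threads_batch"]),
        "--batch-size", str(p["batch_size"]),
        "--temp", str(p["temperature"]),
    ]

# e.g. build_command(yaml.safe_load(open("config.yaml")), "brain")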

CLI Usage

/status   → show active model and server status
/brain    → switch to brain profile
/coder    → switch to coder profile
/clear    → clear conversation history
/quit     → stop server and exit

Typing any text sends a message to the active model.
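
Under the hood, each message is an HTTP request to the local server. A minimal sketch, assuming llama-server's OpenAI-compatible /v1/chat/completions endpoint (send_message is illustrative, not the package's API):

import requests

def send_message(port: int, history: list) -> str:
    """POST the running conversation and return the assistant's reply."""
    resp = requests.post(
        f"http://127.0.0.1:{port}/v1/chat/completions",
        json={"messages": history},
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

# e.g. send_message(8081, [{"role": "user", "content": "hello"}])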

Example Workflow

  1. Start the CLI
  2. The default model loads automatically
  3. Switch between models using slash commands
  4. Chat interactively without restarting processes by hand

Architecture

Core components:

  • CLI: interactive interface and command routing
  • Server Manager: lifecycle control of llama-server (see the sketch below)
  • API Client: communication with local inference server
  • Config System: YAML-based profiles and defaults
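
How the pieces fit is easiest to see in the Server Manager. A hedged sketch of its lifecycle control with subprocess and psutil (class and method names are hypothetical, not the package's actual code):

import subprocess
from typing import Optional

import psutil

class ServerManager:
    """Owns at most one llama-server child process at a time."""

    def __init__(self) -> None:
        self.proc: Optional[subprocess.Popen] = None

    def start(self, cmd: list) -> None:
        self.stop()  # shut down the previous server before starting a new one
        self.proc = subprocess.Popen(cmd)

    def stop(self) -> None:
        if self.proc is not None and self.proc.poll() is None:
            parent = psutil.Process(self.proc.pid)
            for child in parent.children(recursive=True):
                child.terminate()  # take down any workers llama-server spawned
            parent.terminate()
            psutil.wait_procs([parent], timeout=5)
        self.proc = None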

Development

Clone the repository and install in editable mode:

git clone https://github.com/Ananay28425/Sequence-LLM.git
cd Sequence-LLM
pip install -e .

Run tests:

pytest -v

License

MIT License. See LICENSE file for details.

Contributing

Pull requests and issues are welcome.

GitHub: https://github.com/Ananay28425/Sequence-LLM


Sequence-LLM provides a lightweight and predictable way to manage local LLM workflows from the terminal.


