Remote CLI client for Ollama servers - Network-ready chat interface with history tracking and inference control

Ollama Remote Chat CLI

A network-ready command-line chat client for Ollama servers. Connect to local or remote Ollama instances from any machine on your network.

Quick Start

Installation

Install directly from PyPI:

pip install ollama-remote-chat-cli

Run the client:

orc

Or use the full command:

ollama-remote-chat-cli

First Run Setup

The first time you run orc, a setup wizard will guide you through:

  1. Configuring your Ollama server URL
  2. Selecting your preferred model
  3. Setting up your preferences

orc
# Follow the interactive setup wizard

Features

  • 🌐 Network-ready - Connect to local or remote Ollama servers
  • 🎨 Retro terminal UI - Clean, colorful command-line interface with dynamic spinner messages
  • 🧹 Smart screen clearing - /clear and /new preserve header while clearing chat history
  • 🎨 Rich markdown rendering - Beautiful syntax highlighting and formatting
  • 🧠 Thinking process support - See how reasoning models think (DeepSeek R1, QwQ, etc.)
  • 📝 Chat history - Session management with search functionality
  • ⚙️ Inference control - Configurable temperature, top_p, context window, and more
  • 📊 Real-time metrics - Token usage, thinking metrics, and generation speed
  • 🔄 Model management - Easy switching, pulling, and deletion of models
  • 🔍 History search - Search through past conversations
  • 💻 System monitoring - View running models and memory usage
  • 🔒 Secure configuration - Environment-based settings with .env support

Requirements

  • Python 3.7 or higher
  • Ollama server (local or remote)
  • Required packages: requests, python-dotenv, wcwidth, rich

Connection Methods

Method 1: Local Connection (Default)

Ollama running on the same machine:

OLLAMA_HOST=http://localhost:11434
OLLAMA_MODEL=llama2

Setup:

  1. Install Ollama from ollama.ai
  2. Run: ollama serve
  3. Pull a model: ollama pull llama2
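
Under the hood, the client speaks Ollama's HTTP API over the URL in OLLAMA_HOST. A minimal sketch of what "connecting" means, using the requests dependency (illustrative only, not the client's actual code):

import requests

# /api/tags lists the models installed on an Ollama server; a successful
# call confirms the URL in OLLAMA_HOST is reachable.
host = "http://localhost:11434"
models = requests.get(f"{host}/api/tags", timeout=5).json()["models"]
for m in models:
    print(m["name"])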

Method 2: Hostname Connection (.local / mDNS)

Connect to another computer on your local network:

OLLAMA_HOST=http://my-computer.local:11434
OLLAMA_MODEL=llama3.3

Setup:

  1. Find your server's hostname:

    • Windows: hostname in CMD
    • Mac: System Preferences → Sharing → Computer Name
    • Linux: hostname in terminal
  2. Ensure mDNS/Bonjour is enabled (built into macOS; Avahi on most Linux
     distributions; Windows 10+ resolves mDNS natively, or install Bonjour)

  3. Test: ping my-computer.local

  4. Configure firewall to allow port 11434
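
If ping works, you can also check resolution from Python's side. A quick sketch (whether .local names resolve this way depends on your OS resolver):

import socket

# Resolve the server's mDNS hostname; replace with your own .local name.
hostname = "my-computer.local"
try:
    print(hostname, "->", socket.gethostbyname(hostname))
except socket.gaierror as err:
    print("Could not resolve", hostname, "-", err, "(fall back to the IP)")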


Method 3: Static IP Address

For computers with fixed network IPs:

OLLAMA_HOST=http://192.168.1.100:11434
OLLAMA_MODEL=mistral

Setup:

  1. Set static IP on your Ollama server

  2. Find IP address:

    • Windows: ipconfig
    • Mac: ifconfig en0 | grep inet
    • Linux: ip addr show
  3. Configure firewall to allow port 11434


Method 4: Docker

Ollama running in a Docker container:

OLLAMA_HOST=http://localhost:11434
OLLAMA_MODEL=llama2

Docker setup:

docker run -d -p 11434:11434 --name ollama ollama/ollama
docker exec -it ollama ollama pull llama2

Method 5: WSL (Windows Subsystem for Linux)

Accessing a Windows Ollama host from WSL:

OLLAMA_HOST=http://host.docker.internal:11434
OLLAMA_MODEL=llama2

Or find Windows IP from WSL:

ip route | grep default | awk '{print $3}'
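
In the default WSL2 NAT setup, the Windows host is usually also the nameserver in /etc/resolv.conf, so the IP can be read from there as well. A sketch under that assumption (custom DNS or mirrored networking will break it):

# Read the Windows host IP from WSL2's resolv.conf and build the host URL.
def windows_host_ip(path="/etc/resolv.conf"):
    with open(path) as f:
        for line in f:
            if line.startswith("nameserver"):
                return line.split()[1]
    return None

print(f"OLLAMA_HOST=http://{windows_host_ip()}:11434")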

Available Commands

Command          Description

General
/help            Show all available commands
/multi           Enter multi-line input mode
/clear           Clear conversation context
/new             Start a new chat session
/exit            Exit the chat

Model Management
/models          List available models
/switch          Switch to a different model
/pull            Download a new model
/delete          Delete a model
/create          Create a custom model from a Modelfile
/modelinfo       Show detailed model information

Generation
/generate        Generate a raw completion without chat context

Configuration
/config          Show current configuration
/settings        Configure inference settings
/host            Change the Ollama host URL

History
/history         View chat history
/search          Search chat history
/showthinking    View thinking from the last AI response

System Monitoring
/version         Show Ollama server version
/ps              List running models (memory usage)
/ping            Test connection latency to the server

Inference Settings

Fine-tune AI behavior with the /settings command:

Temperature (0.0-2.0) - Controls response creativity

  • 0.1-0.3: Precise (coding, math, facts)
  • 0.6-0.8: Balanced (general chat)
  • 1.0-1.5: Creative (writing, brainstorming)

Top P (0.0-1.0) - Nucleus sampling threshold (default: 0.9)

Top K (1-100) - Limits token choices (default: 40)

Context Window (128-32768) - Conversation memory in tokens (default: 2048)

Max Output (1-4096) - Maximum response length (default: 512)

Repeat Penalty (0.0-2.0) - Reduces repetition (default: 1.1)
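
These settings map onto the "options" object that Ollama's API accepts with each request. A sketch of the mapping (field names are Ollama's API option keys; the values shown follow the defaults and examples in this README):

import requests

options = {
    "temperature": 0.8,     # response creativity
    "top_p": 0.9,           # nucleus sampling threshold
    "top_k": 40,            # limit on token choices
    "num_ctx": 2048,        # context window, in tokens
    "num_predict": 512,     # maximum output length
    "repeat_penalty": 1.1,  # repetition reduction
}

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "llama2",
        "messages": [{"role": "user", "content": "Hello!"}],
        "options": options,
        "stream": False,
    },
    timeout=120,
)
print(resp.json()["message"]["content"])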


Advanced Features

Thinking Process Visualization

For reasoning models like DeepSeek R1, QwQ, and others that use chain-of-thought:

  • Hidden by default: Thinking process is hidden during streaming for cleaner output
  • View thinking: Use /showthinking to see the AI's reasoning after the response
  • Live thinking: Enable in /settings to watch the model think in real-time
  • Saved in history: Thinking is automatically saved for later review

Settings control:

  • show_thinking_live - Display thinking as it happens (default: false)
  • save_thinking - Save thinking to chat history (default: true)
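
Many reasoning models mark their chain-of-thought with <think>...</think> tags in the response text. A sketch of separating thinking from the final answer (one common convention, not necessarily the client's exact logic):

import re

def split_thinking(text):
    # Collect everything inside <think> tags, then strip them from the answer.
    thinking = "\n".join(re.findall(r"<think>(.*?)</think>", text, re.DOTALL))
    answer = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()
    return thinking.strip(), answer

thinking, answer = split_thinking("<think>2 + 2 = 4</think>The answer is 4.")
print(thinking)  # 2 + 2 = 4
print(answer)    # The answer is 4.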

Rich Markdown Rendering

AI responses are automatically formatted with:

  • Syntax highlighting for code blocks
  • Styled headers and emphasis (bold, italic)
  • Formatted lists and blockquotes
  • Colored output optimized for all terminals

Toggle in /settings:

  • use_markdown - Enable/disable Rich rendering (default: true)
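
Rendering is built on the Rich dependency. Roughly what markdown output looks like in code (a sketch, not the client's exact implementation):

from rich.console import Console
from rich.markdown import Markdown

# Rich turns markdown into styled terminal output, including syntax
# highlighting for fenced code blocks.
console = Console()
console.print(Markdown("# Title\n\nSome `code`, **bold**, and a list:\n\n- item"))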

System Monitoring

The /ps command shows:

  • Currently loaded models in memory
  • VRAM usage per model
  • Model expiration times
  • Total memory consumption

Useful for:

  • Monitoring resource usage
  • Debugging slow responses
  • Managing multiple models
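
The data behind /ps comes from the server's /api/ps endpoint. A sketch of reading it directly (field names are from the Ollama API):

import requests

host = "http://localhost:11434"
for m in requests.get(f"{host}/api/ps", timeout=5).json().get("models", []):
    vram_gb = m.get("size_vram", 0) / 1e9
    print(f"{m['name']}: {vram_gb:.1f} GB VRAM, expires {m.get('expires_at')}")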

Remote Access & Performance Metrics

Optimized for remote server usage with transparent diagnostics:

  • Live timing: Spinner shows elapsed time during processing (e.g., "Thinking... 4.2s")
  • TTFT tracking: Measures Time To First Token for responsiveness analysis
  • Connection testing: /ping command with color-coded latency status
    • 🟢 Excellent: <100ms (LAN/local)
    • 🟡 Good: 100-300ms (fast remote)
    • 🔴 Slow: >800ms (high latency)
  • Detailed metrics: Two-line performance breakdown
    • Line 1: Total time, first token time, generation speed
    • Line 2: Context size, input/output tokens, thinking words

Perfect for Tailscale/VPN setups - Metrics help diagnose whether slowness is from network or model processing.

Example output:

Total: 6.3s | First token: 4.1s | Speed: 23.0 tok/s
Context: 2048 tokens | Input: 15 | Output: 118
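
Numbers like these can be derived from Ollama's streaming API: timestamp the first chunk for TTFT, then read the duration counters (reported in nanoseconds) from the final chunk. A sketch, not the client's exact implementation:

import json
import time
import requests

host = "http://localhost:11434"
payload = {"model": "llama2",
           "messages": [{"role": "user", "content": "Hi"}],
           "stream": True}

start = time.monotonic()
ttft = None
with requests.post(f"{host}/api/chat", json=payload, stream=True) as resp:
    for line in resp.iter_lines():
        if not line:
            continue
        chunk = json.loads(line)
        if ttft is None:
            ttft = time.monotonic() - start  # time to first token
        if chunk.get("done"):
            total = chunk["total_duration"] / 1e9
            speed = chunk["eval_count"] / (chunk["eval_duration"] / 1e9)
            print(f"Total: {total:.1f}s | First token: {ttft:.1f}s "
                  f"| Speed: {speed:.1f} tok/s")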

Use Cases

Home Network Setup

  • Powerful Desktop/Server: Runs Ollama with GPU acceleration
  • Laptop/Weak PC: Runs orc client, connects remotely
  • Benefit: Use AI from any device without local GPU

Development Environment

  • Server: Dedicated AI inference machine
  • Workstation: Lightweight client for coding assistance
  • Benefit: Consistent model across team

Multi-Device Access

  • Gaming PC: Hosts Ollama when not gaming
  • Any Device: Connect from laptop, tablet, or work PC
  • Benefit: Centralized AI without duplicate models

Troubleshooting

"Could not connect to Ollama"

1. Verify Ollama is running:

curl http://localhost:11434
# Should return: "Ollama is running"

2. Check firewall:

  • Windows: Allow port 11434 in Windows Firewall
  • Mac: System Preferences → Security → Firewall
  • Linux: sudo ufw allow 11434

3. Test hostname resolution:

ping your-hostname.local
# If fails, use IP address instead

4. Check Ollama binding:

# Ensure Ollama is listening on all interfaces
OLLAMA_HOST=0.0.0.0 ollama serve
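
You can also probe the server from the client machine itself. A quick sketch using Ollama's lightweight /api/version endpoint (substitute your own server URL):

import requests

host = "http://192.168.1.100:11434"  # your OLLAMA_HOST value
try:
    r = requests.get(f"{host}/api/version", timeout=3)
    print("Reachable, Ollama version:", r.json()["version"])
except requests.RequestException as err:
    print("Not reachable:", err)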

"Model not found"

# List available models on server
ollama list

# Pull a model
ollama pull llama2

"Module not found" or Import Errors

# Reinstall the package
pip install --upgrade --force-reinstall ollama-remote-chat-cli

"Command not found: orc"

Windows:

# Add Python Scripts to PATH
# Location: C:\Users\YourName\AppData\Local\Programs\Python\Python3XX\Scripts

Mac/Linux:

# Ensure pip install location is in PATH
export PATH="$HOME/.local/bin:$PATH"

# Add to ~/.bashrc or ~/.zshrc to make permanent

Configuration Files

Config location: ~/.ollama_chat_config.json

History location: ~/.ollama_chat_history.json

These are created automatically on first run.

Manual configuration: Create a .env file in your home directory:

OLLAMA_HOST=http://localhost:11434
OLLAMA_MODEL=llama2
TEMPERATURE=0.8
TOP_P=0.9
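
These variables are read with python-dotenv (one of the package's dependencies). Roughly how such a .env is consumed (a sketch, not the client's exact code):

import os
from dotenv import load_dotenv

load_dotenv()  # loads variables from a .env file into the environment
host = os.getenv("OLLAMA_HOST", "http://localhost:11434")
model = os.getenv("OLLAMA_MODEL", "llama2")
temperature = float(os.getenv("TEMPERATURE", "0.8"))
print(host, model, temperature)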

Alternative Installation Methods

From Source (Development)

# Clone the repository
git clone https://github.com/Avaxerrr/ollama-remote-chat-cli.git
cd ollama-remote-chat-cli

# Install in editable mode
pip install -e .

# Run it
orc

Building Standalone Executable

# Using Nuitka build script
python nuitka_build.py

# Follow the interactive menu
# Creates portable .exe or binary

Updating

To get the latest version:

pip install --upgrade ollama-remote-chat-cli

Check your current version:

pip show ollama-remote-chat-cli

Or inside the app:

orc
# Then type: /version

Development

Running from source:

git clone https://github.com/Avaxerrr/ollama-remote-chat-cli.git
cd ollama-remote-chat-cli
pip install -e .
orc

Running tests:

# Install dev dependencies
pip install -e ".[dev]"

# Run tests (when available)
pytest

License

MIT License - see LICENSE file for details.

Copyright © 2026 Avaxerrr

Permission is hereby granted, free of charge, to use, modify, and distribute this software.


Acknowledgments

  • Built for Ollama
  • Inspired by modern CLI tools
  • Community contributions welcome

Support

Having issues? Here's how to get help:

  1. Check the Troubleshooting section
  2. Search existing issues
  3. Open a new issue with:
    • Your OS and Python version
    • Error messages
    • Steps to reproduce
