# Raw LLM

The simplest way to context engineer. Minimal, streaming CLI clients for Claude and Gemini that keep your conversations in plain JSON files.

## What is this?
Raw LLM is a pair of thin Python scripts that talk to the Anthropic and Google GenAI APIs. No frameworks, no agents, no abstractions you don't need. Just a prompt, a streaming response, and a JSON file you can version, diff, edit, and pipe.
The entire idea: your conversation is a file. You build context by editing that file. That's it. That's the context engineering.
## Features
- Streaming output — responses print token-by-token as they arrive
- Conversation persistence — every exchange is saved to a plain JSON file you own
- Resume any conversation — pass the JSON file back in to continue where you left off
- Pipe-friendly — reads from stdin, writes content to stdout, writes diagnostics to stderr
- Colored output — reasoning in gray (stderr), content in cyan (stdout), auto-disabled when piped
- Conflict detection — refuses to overwrite a conversation file modified by another process
- Symlink to switch models — symlink `claude.py` as `opus` or `haiku` to change the default model
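The conflict-detection feature can be sketched as a simple modification-time check. This is an illustrative guess at the mechanism, not Raw LLM's actual code, and the function names here are made up:

```python
import json
import os

def load_conversation(path):
    """Return (messages, mtime) so a later save can detect outside edits."""
    with open(path) as f:
        messages = json.load(f)
    return messages, os.path.getmtime(path)

def save_conversation(path, messages, loaded_mtime):
    """Refuse to overwrite if the file changed since it was loaded."""
    if os.path.exists(path) and os.path.getmtime(path) != loaded_mtime:
        raise RuntimeError(f"{path} changed on disk; refusing to overwrite")
    with open(path, "w") as f:
        json.dump(messages, f, indent=2)
```

This mirrors the documented behavior: if another process edits the conversation file between load and save, the save is rejected rather than silently clobbering the edit.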
## Installation

### From PyPI

```
pip install raw-llm
```

This installs the `claude`, `sonnet`, `opus`, `haiku`, and `gemini` commands globally.
### From source

```
git clone https://github.com/rodolfovillaruz/raw-llm.git
cd raw-llm
pip install .
```
### Development install

```
git clone https://github.com/rodolfovillaruz/raw-llm.git
cd raw-llm
pip install -e ".[dev]"
```

Set your API keys:

```
export ANTHROPIC_API_KEY="sk-ant-..."
export GEMINI_API_KEY="..."  # or GOOGLE_API_KEY, per google-genai docs
```
## Usage

### Start a new conversation

```
claude
# Type your prompt, then press Ctrl+D to submit

echo "Explain monads in one paragraph" | claude

gemini
```
### Resume an existing conversation

```
claude .prompt/some-conversation.json
```
The JSON file contains the full message history. Edit it with any text editor to reshape context before your next turn.
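For example, a small script can trim a conversation to its final exchange before resuming, shrinking the context you send on the next turn. The function name and trimming policy here are just illustrative:

```python
import json

def keep_last_exchange(path):
    """Trim a conversation file to its final user/assistant pair."""
    with open(path) as f:
        messages = json.load(f)
    with open(path, "w") as f:
        json.dump(messages[-2:], f, indent=2)
```

Because the file is plain JSON, the same edit could just as easily be done with `jq` or a text editor.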
### Pipe a file as context

```
cat code.py | claude conversation.json
```
### Switch models

```
# By flag
claude -m claude-opus-4-6

# By command name
opus
haiku
sonnet
```
| Command | Default model |
|---|---|
| `claude` / `sonnet` | `claude-sonnet-4-6` |
| `opus` | `claude-opus-4-6` |
| `haiku` | `claude-haiku-4-5` |
| `gemini` | `gemini-3.1-pro-preview` |
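The command-name trick presumably keys off `argv[0]`. A minimal sketch of that dispatch, using the defaults from the table above, might look like this (this is a guess at the mechanism, not the actual `claude.py` source):

```python
import os

# Default models, per the table above.
DEFAULT_MODELS = {
    "claude": "claude-sonnet-4-6",
    "sonnet": "claude-sonnet-4-6",
    "opus": "claude-opus-4-6",
    "haiku": "claude-haiku-4-5",
}

def default_model(argv0):
    """Pick a default model from the name the script was invoked as."""
    name = os.path.basename(argv0)
    if name.endswith(".py"):
        name = name[: -len(".py")]
    return DEFAULT_MODELS.get(name, DEFAULT_MODELS["claude"])
```

This is why a symlink is enough: the same script sees a different invocation name and picks a different default.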
## Options

```
usage: claude [-h] [-n] [-v] [-m MODEL] [-t MAX_TOKENS] [-i] [conversation_file]

positional arguments:
  conversation_file        JSON file to resume (omit to start fresh)

options:
  -n, --dry-run            Build the prompt but don't send it
  -v, --verbose            Show model name and prompt preview
  -m, --model MODEL        Override the default model
  -t, --max-tokens TOKENS  Cap the response length
  -i, --interactive        Interactive REPL mode
```
## Conversation format

Conversations are stored as a JSON array of message objects, the same shape both APIs understand:

```json
[
  {
    "role": "user",
    "content": "What is context engineering?"
  },
  {
    "role": "assistant",
    "content": "Context engineering is the practice of ..."
  }
]
```
You can create these files by hand, merge them, truncate them, or generate them with other tools. Raw LLM doesn't care. It reads the array, appends your new message, streams the response, and appends that too.
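As a sketch of generating a conversation with another tool, here is a hypothetical helper that seeds a fresh conversation file with a source file as context (the helper is illustrative and not part of Raw LLM):

```python
import json
import pathlib

def seed_conversation(source_path, out_path):
    """Write a conversation file pre-loaded with a source file as context."""
    code = pathlib.Path(source_path).read_text()
    conversation = [
        {"role": "user", "content": f"Here is the file under review:\n\n{code}"},
        {"role": "assistant", "content": "Got it. What would you like to know?"},
    ]
    pathlib.Path(out_path).write_text(json.dumps(conversation, indent=2))
```

Resuming with `claude conversation.json` then places your next stdin prompt after that seeded context.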
## Project structure

```
.
├── src/
│   └── raw_llm/
│       ├── claude.py     # Claude CLI client
│       ├── gemini.py     # Gemini CLI client
│       └── common.py     # Shared utilities (streaming, I/O, conversation management)
├── pyproject.toml        # Package configuration and entry points
├── Makefile              # Formatting, linting, typing
└── .prompt/              # Default directory for conversation files (auto-used if present)
```
## Development

```
make fmt   # Format with black/isort
make lint  # Lint with pylint/flake8
make type  # Type-check with mypy
make all   # All of the above
```
## Why?
Most LLM tools add layers between you and the model. Raw LLM removes them. The conversation is a file. The prompt is stdin. The response is stdout. Everything else is up to you.
## License
MIT
## File details

### raw_llm-1.0.6.tar.gz

- Download URL: raw_llm-1.0.6.tar.gz
- Upload date:
- Size: 12.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3

| Algorithm | Hash digest |
|---|---|
| SHA256 | `1537eadeb7cb828d42d79b7ab743669b786349f2a46ff859f0e90b29a41181e8` |
| MD5 | `04100f2f28e02884d85dc525f3a517fe` |
| BLAKE2b-256 | `c7ad4bedfb2c36c8c6d60fac54842ac9ad4cbd871ce8aa066c0c164474a4d01e` |
### raw_llm-1.0.6-py3-none-any.whl

- Download URL: raw_llm-1.0.6-py3-none-any.whl
- Upload date:
- Size: 11.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3

| Algorithm | Hash digest |
|---|---|
| SHA256 | `708748434755e1a539ee6f4d4da79835fa6945dd268333a009ae03c98e8e7a2a` |
| MD5 | `9c1a8551babf6f4241e63b3cdb87831c` |
| BLAKE2b-256 | `069ef973eb5198558a5b6da79f0f4908538efc3ceff8a15eb639c684c4a81d63` |