Offline local-LLM terminal app for Jetson and edge Linux: chat with on-device models, run agent tools, and manage context safely.

open-jet

open-jet is an offline-first terminal app for running local LLM workflows on edge Linux devices (including Jetson-class hardware).

It provides:

  • local chat with your on-device model
  • safe file-context loading with token/memory guards
  • slash commands for session control
  • first-run setup for model/runtime configuration
  • optional session logging and resume

Requirements

  • llama-server from llama.cpp built for your device (see below)
  • a local .gguf model file, or ollama installed for model download

Building llama-server

open-jet uses llama-server from llama.cpp as its inference backend. Pre-built binaries are available for x86, but on Jetson/ARM64 you need to build from source with the right flags.

Jetson (Orin Nano, Orin NX, AGX Orin)

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
mkdir build && cd build

cmake .. \
  -DGGML_CUDA=ON \
  -DCMAKE_CUDA_ARCHITECTURES=87 \
  -DGGML_CUDA_FA_ALL_QUANTS=ON

cmake --build . --target llama-server -j$(nproc)

Key flags:

  • GGML_CUDA=ON: enable the CUDA backend
  • CMAKE_CUDA_ARCHITECTURES=87: target SM 8.7 (Orin); use 72 for Xavier
  • GGML_CUDA_FA_ALL_QUANTS=ON: enable flash attention for all KV cache quantizations (q8_0, q4_0), not just f16. Required for fast inference with a quantized KV cache.

The built binary will be at build/bin/llama-server. Either add it to your PATH or leave it at ~/llama.cpp/build/bin/ where open-jet will find it automatically.
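
The lookup order described above can be sketched roughly like this (a minimal illustration; `find_llama_server` is a hypothetical helper, not open-jet's actual API):

```python
import os
import shutil

# Default build location checked when llama-server is not on PATH.
DEFAULT_BUILD_PATH = os.path.expanduser("~/llama.cpp/build/bin/llama-server")

def find_llama_server(default_path=DEFAULT_BUILD_PATH):
    """Return a usable llama-server binary path, or None if not found."""
    # 1. Anything on PATH wins.
    on_path = shutil.which("llama-server")
    if on_path:
        return on_path
    # 2. Fall back to the default llama.cpp build output.
    if os.path.isfile(default_path) and os.access(default_path, os.X_OK):
        return default_path
    return None
```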

Other Linux (x86_64 with NVIDIA GPU)

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
mkdir build && cd build

cmake .. -DGGML_CUDA=ON -DGGML_CUDA_FA_ALL_QUANTS=ON
cmake --build . --target llama-server -j$(nproc)

CPU only

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
mkdir build && cd build

cmake ..
cmake --build . --target llama-server -j$(nproc)

Install

pip install open-jet

Start

open-jet

Optional setup screen on launch:

open-jet --setup

First-Run Setup

On first run, open-jet guides you through:

  1. hardware detection/profile
  2. model source selection
  3. model path or download choice
  4. context window size
  5. GPU offload configuration

It then saves your configuration and starts the runtime.

Basic Use

  • Type normally and press Enter to chat
  • Use @file or @[path with spaces] to add file content to context
  • Type / to open slash-command suggestions
  • Tab/Enter can autocomplete slash commands and file mentions
  • Ctrl+C or /exit quits
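
The @-mention syntax above could be parsed along these lines (an illustrative sketch; `extract_mentions` and the regex are assumptions, not open-jet's real parser):

```python
import re

# @[path with spaces] is tried first, then bare @path (no spaces).
MENTION_RE = re.compile(r"@\[([^\]]+)\]|@(\S+)")

def extract_mentions(line):
    """Return file paths referenced with @path or @[path with spaces]."""
    return [bracketed or bare for bracketed, bare in MENTION_RE.findall(line)]
```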

Slash Commands

  • /help: show commands
  • /exit: quit the app
  • /clear: clear chat and restart the runtime (flushes the KV cache)
  • /clear-chat: clear chat only
  • /status: show context/RAM status
  • /condense: condense older context
  • /load <path>: load a file into context
  • /resume: load the previous saved session
  • /setup: reopen the setup wizard
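
A dispatch table is one natural way to wire up commands like these (a hypothetical sketch; open-jet's internals may differ):

```python
# Each handler takes the argument list that followed the command name.
def cmd_help(args):
    return "commands: /help /exit /clear /clear-chat /status /condense /load /resume /setup"

def cmd_load(args):
    return f"loading {args[0]}" if args else "usage: /load <path>"

COMMANDS = {"/help": cmd_help, "/load": cmd_load}

def dispatch(line):
    """Route a slash-command line to its handler."""
    name, *args = line.split()
    handler = COMMANDS.get(name)
    return handler(args) if handler else f"unknown command: {name}"
```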

Configuration

Main settings are stored in config.yaml, including:

  • context window size
  • memory guard limits
  • logging settings
  • session state/resume settings

TensorRT-LLM (PyTorch runtime) with Qwen

open-jet can run against trtllm-serve instead of llama-server.

  1. Install TensorRT-LLM so trtllm-serve is on your PATH.
  2. Set your config to use the TensorRT-LLM runtime.

Example config.yaml values:

runtime: trtllm_pytorch
model: Qwen/Qwen2.5-7B-Instruct
trtllm_backend: pytorch
trtllm_trust_remote_code: true
# optional: pass a trtllm-serve YAML file
# trtllm_config_path: /home/you/qwen-fast.yml
context_window_tokens: 4096
gpu_layers: 0

When runtime is trtllm_pytorch, open-jet launches:

trtllm-serve <model> --backend pytorch --host 127.0.0.1 --port 8080

and then connects through the same OpenAI-compatible chat API path.
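
Assembling that command line from the config values might look like this (illustrative only; `build_trtllm_command` is a made-up helper, not open-jet's API):

```python
def build_trtllm_command(config, host="127.0.0.1", port=8080):
    """Build the trtllm-serve argv list from config.yaml values."""
    return [
        "trtllm-serve", config["model"],
        "--backend", config.get("trtllm_backend", "pytorch"),
        "--host", host,
        "--port", str(port),
    ]
```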

Logging and Session State

When enabled:

  • session events are written to session_logs/*.events.jsonl
  • system metrics are written to session_logs/*.metrics.jsonl
  • conversation state is saved to session_state.json
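
Each *.jsonl file holds one JSON object per line, so a session log can be replayed with a few lines of Python (a sketch assuming the standard JSONL convention):

```python
import json

def read_events(path):
    """Parse a *.events.jsonl file into a list of event dicts."""
    events = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                events.append(json.loads(line))
    return events
```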

Download files

Built distribution: open_jet-0.1.5-cp310-cp310-manylinux2014_aarch64.manylinux_2_17_aarch64.whl (6.1 MB), CPython 3.10, manylinux (glibc 2.17+) ARM64. No source distribution is available for this release.

