
Offline local-LLM terminal app for Jetson and edge Linux: chat with on-device models, run agent tools, and manage context safely.


open-jet

open-jet is an offline-first terminal app for running local LLM workflows on edge Linux devices (including Jetson-class hardware).

It provides:

  • local chat with your on-device model
  • safe file-context loading with token/memory guards
  • slash commands for session control
  • first-run setup for model/runtime configuration
  • optional session logging and resume

Requirements

  • llama-server from llama.cpp built for your device (see below)
  • a local .gguf model file, or ollama installed for model download
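If you already have a model file and want a quick sanity check that it is really GGUF, the format starts with the four ASCII bytes `GGUF`. The snippet below demonstrates the check on a synthetic file; on a real model, point it at your `.gguf` path instead:

```shell
# Every GGUF file begins with the 4-byte ASCII magic "GGUF".
# Create a synthetic header just to demonstrate the technique;
# for a real check, replace /tmp/demo.gguf with your model path.
printf 'GGUF' > /tmp/demo.gguf
if [ "$(head -c 4 /tmp/demo.gguf)" = "GGUF" ]; then
  echo "looks like a GGUF file"
fi
```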

Building llama-server

open-jet uses llama-server from llama.cpp as its inference backend. Pre-built binaries are available for x86_64, but on Jetson/ARM64 you need to build from source with the right flags.

Jetson (Orin Nano, Orin NX, AGX Orin)

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
mkdir build && cd build

cmake .. \
  -DGGML_CUDA=ON \
  -DCMAKE_CUDA_ARCHITECTURES=87 \
  -DGGML_CUDA_FA_ALL_QUANTS=ON

cmake --build . --target llama-server -j$(nproc)

Key flags:

  • GGML_CUDA=ON: enable the CUDA backend
  • CMAKE_CUDA_ARCHITECTURES=87: target SM 8.7 (Orin). Use 72 for Xavier.
  • GGML_CUDA_FA_ALL_QUANTS=ON: enable flash attention for all KV-cache quantizations (q8_0, q4_0), not just f16. Required for fast inference with a quantized KV cache.
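As an illustration of why that last flag matters: with FA_ALL_QUANTS built in, llama-server can combine flash attention with a quantized KV cache. Flag spellings can shift between llama.cpp releases, so verify against `llama-server --help`; the model path below is a placeholder.

```shell
# -ngl 99         offload all layers to the GPU
# -fa             enable flash attention
# -ctk/-ctv q8_0  store the KV cache in 8-bit instead of f16
# Without GGML_CUDA_FA_ALL_QUANTS=ON in the build, the -fa + q8_0
# combination is not compiled in and inference falls back to a slower path.
llama-server -m ~/models/your-model.gguf -ngl 99 -fa -ctk q8_0 -ctv q8_0
```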

The built binary will be at build/bin/llama-server. Either add it to your PATH or leave it at ~/llama.cpp/build/bin/ where open-jet will find it automatically.
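For the PATH route, a minimal sketch (bash syntax; how you persist it depends on your shell setup):

```shell
# Put the build output at the front of PATH for the current shell session
export PATH="$HOME/llama.cpp/build/bin:$PATH"

# To persist across sessions, append the same export to your shell rc file:
# echo 'export PATH="$HOME/llama.cpp/build/bin:$PATH"' >> ~/.bashrc
```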

Other Linux (x86_64 with NVIDIA GPU)

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
mkdir build && cd build

cmake .. -DGGML_CUDA=ON -DGGML_CUDA_FA_ALL_QUANTS=ON
cmake --build . --target llama-server -j$(nproc)

CPU only

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
mkdir build && cd build

cmake ..
cmake --build . --target llama-server -j$(nproc)

Install

pip install open-jet

Start

open-jet

Optional setup screen on launch:

open-jet --setup

First-Run Setup

On first run, open-jet guides you through:

  1. hardware detection/profile
  2. model source selection
  3. model path or download choice
  4. context window size
  5. GPU offload configuration

It then saves your configuration and starts the runtime.

Basic Use

  • Type normally and press Enter to chat
  • Use @file or @[path with spaces] to add file content to context
  • Type / to open slash-command suggestions
  • Tab/Enter can autocomplete slash commands and file mentions
  • Ctrl+C or /exit quits

Slash Commands

  • /help show commands
  • /exit quit app
  • /clear clear chat and restart llama-server
  • /clear-chat clear chat only
  • /status show context/RAM status
  • /condense condense older context
  • /load <path> load a file into context
  • /resume load previous saved session
  • /setup reopen setup wizard

Configuration

Main settings are stored in config.yaml, including:

  • context window size
  • memory guard limits
  • logging settings
  • session state/resume settings
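The exact schema is not documented here, so the following is only a hypothetical sketch of how those settings might look in config.yaml; every key name is invented for illustration, not open-jet's actual schema:

```yaml
# Hypothetical config.yaml sketch -- key names are illustrative only
context:
  window_size: 8192        # tokens available to the model
memory_guard:
  max_file_tokens: 4096    # refuse @file loads larger than this
  max_ram_percent: 85      # warn before exhausting device RAM
logging:
  enabled: true
session:
  resume: true             # restore saved state on startup
```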

Logging and Session State

When enabled:

  • session events are written to session_logs/*.events.jsonl
  • system metrics are written to session_logs/*.metrics.jsonl
  • conversation state is saved to session_state.json
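Because JSONL stores one JSON object per line, ordinary line tools work on the logs. The demo below uses a throwaway file with made-up event names just to show the technique; real logs live under session_logs/:

```shell
# JSONL = one JSON object per line, so counting events is just counting lines.
# The directory, filename, and event fields here are illustrative.
mkdir -p /tmp/openjet-demo
printf '%s\n' '{"event":"start"}' '{"event":"prompt"}' '{"event":"exit"}' \
  > /tmp/openjet-demo/demo.events.jsonl
wc -l < /tmp/openjet-demo/demo.events.jsonl   # one line per event
```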
