Lightweight local-first job queue manager and Ollama wrapper for resource-constrained environments.

Hoglah

Hoglah is a lightweight, local-first job queue manager and Ollama wrapper designed for resource-constrained environments.

It lets applications submit LLM inference requests (generate or chat) asynchronously, receive a job ID immediately, monitor progress, retrieve full results, and receive completion callbacks — even when the underlying hardware can only run one (or very few) model inferences at a time.

Named after one of the daughters of Zelophehad (Numbers 26/27/36, Joshua 17), continuing the Old Testament women's names pattern used by sister projects in the domains family (Mahalath, Tirzah, etc.).

Core Value Proposition

  • Simple Python-native interface for internal/library use
  • Reliable queuing with durable persistence (survives restarts)
  • Smart handling of context windows and model capabilities
  • Fire-and-forget + callback patterns for workflow orchestration
  • Fully local, privacy-focused, zero-cloud dependency
  • Extensible foundation (web API / webhooks / distributed backends planned for later versions)

Target users: Developers building multi-agent systems, background task processors, or local AI tooling that needs to safely queue and manage LLM calls.

Goals (V1)

  • Clean, reliable abstraction over Ollama for queuing
  • Configurable concurrency (default: 1 for low-resource setups)
  • Model discovery, context calibration, and basic resource awareness
  • Easy integration into existing Python applications
  • Persistent job state across process restarts
  • Keep V1 simple, focused, and production-ready for local use
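
The "configurable concurrency (default: 1)" goal can be illustrated with a small asyncio sketch. This is not Hoglah's actual worker: `run_jobs` and `fake_infer` are hypothetical names, and the sketch only shows how a semaphore caps the number of inferences in flight.

```python
import asyncio


async def run_jobs(prompts, infer, concurrency=1):
    """Run `infer` over `prompts`, never exceeding `concurrency` at once."""
    gate = asyncio.Semaphore(concurrency)
    active = 0
    peak = 0  # highest number of simultaneously running jobs observed

    async def run_one(prompt):
        nonlocal active, peak
        async with gate:  # waits while `concurrency` jobs are already running
            active += 1
            peak = max(peak, active)
            try:
                return await infer(prompt)
            finally:
                active -= 1

    results = await asyncio.gather(*(run_one(p) for p in prompts))
    return results, peak


async def fake_infer(prompt):
    """Hypothetical stand-in for a model call; no LLM is contacted."""
    await asyncio.sleep(0.01)
    return prompt.upper()


results, peak = asyncio.run(run_jobs(["a", "b", "c"], fake_infer, concurrency=1))
print(results, peak)
```

With `concurrency=1` the jobs run strictly one at a time, which matches the low-resource default described above. Raising the value lets more jobs overlap.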

Non-Goals (V1)

  • Full distributed orchestration or high-availability clustering
  • Built-in web UI (deferred to V2)
  • Advanced authentication / multi-tenancy
  • Non-Ollama backends
  • Real-time streaming UI surfaces (file + callback sufficient)

Installation

From PyPI (recommended once published)

pip install hoglah
# With CLI
pip install "hoglah[cli]"

From GitHub Releases (immediate, no PyPI needed)

After a release is published (via git tag vX.Y.Z && git push --tags):

# Latest wheel (replace the * with the actual version; pip does not expand wildcards in URLs)
pip install "https://github.com/gellsmore-svg/hoglah/releases/latest/download/hoglah-*.whl"

# Or specific version
pip install "https://github.com/gellsmore-svg/hoglah/releases/download/v0.2.1/hoglah-0.2.1-py3-none-any.whl"

From source (for development)

git clone https://github.com/gellsmore-svg/hoglah
cd hoglah
python -m venv .venv
.venv/bin/pip install -e ".[dev,cli]"

Publishing to PyPI (so pip install hoglah just works)

The release workflow already builds the packages. To publish them to PyPI automatically on every v* tag:

  1. Go to https://pypi.org/manage/project/hoglah/ (create the project first if it doesn't exist by doing a manual upload once).
  2. Go to "Publishing" → "Add a trusted publisher".
  3. Choose GitHub.
  4. Fill in:
    • Repository: gellsmore-svg/hoglah
    • Workflow: release.yml
    • Environment: (optional but recommended — create a GitHub Environment called pypi and select it here)
  5. Save.

Then push a tag:

git tag v0.2.1
git push origin v0.2.1

The release workflow will then publish to PyPI through OIDC-based trusted publishing, so no long-lived API token has to be stored in the repository.

You can also do a one-time manual upload with twine if you prefer.

Quick Start (Planned)

Once implemented:

git clone https://github.com/gellsmore-svg/hoglah
cd hoglah
python -m venv .venv && .venv/bin/pip install -e ".[dev,cli]"

Library:

from hoglah import Hoglah

h = Hoglah()  # or Hoglah(config_path="...")

job_id = h.submit(
    prompt="Explain the significance of Hoglah in the biblical land allotment.",
    model="gemma3:1b",
    tags=["research", "bible"],
    callback=lambda result: print("Done:", result.job_id, result.output[:100]),
)

print("Submitted:", job_id)
print(h.status(job_id))

result = h.wait(job_id, timeout=120)
print(result.output)

# Recommended: context manager for auto cleanup of the background worker
with Hoglah() as h:
    job_id = h.submit(prompt="...", model="gemma3:1b")
    print(h.wait(job_id).output)
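
For callers that want to poll instead of blocking in `wait`, the pattern can be sketched as follows. This is a generic illustration, not Hoglah's implementation: `wait_for` and `get_status` are hypothetical names, and apart from `completed` (which appears in the CLI examples) the status strings are assumptions.

```python
import time


def wait_for(get_status, job_id, timeout=5.0, interval=0.01):
    """Poll get_status(job_id) until a terminal state or until timeout."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(job_id)
        # Assumed terminal states; only "completed" is confirmed by the CLI examples.
        if status in ("completed", "failed", "cancelled"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {status!r} after {timeout}s")
        time.sleep(interval)


# Demo with a scripted status sequence instead of a real queue.
_states = iter(["queued", "running", "completed"])
final = wait_for(lambda _job_id: next(_states), "demo-job")
print(final)
```

Using `time.monotonic()` for the deadline keeps the timeout correct even if the system clock is adjusted while waiting.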

CLI:

hoglah submit "Explain Hoglah" --model gemma3:1b --wait
hoglah list --status completed
hoglah ps --json                 # alias for list, machine-readable
hoglah stats --json              # queue overview (counts by status)
hoglah info --json               # config + adapter + log_level + stats snapshot
hoglah show gemma3:1b --json     # model details (context, template, etc.)
hoglah clear --status completed --older-than 7 --yes  # prune old jobs
hoglah rm <job-id> --yes  # remove specific job
hoglah wait <job-id> --timeout 60 --json  # block until done, machine readable
hoglah doctor --real  # diagnose setup and real Ollama/llama.cpp connectivity
hoglah status <job-id> --json

## V1 Scope

Hoglah 0.2.1 implements the full V1 specification from `docs/requirements-v1.0.md` and `docs/project-brief.md`.

**Included (V1):**
- Submit (prompt or messages/chat), immediate UUID.
- Status, get result (with output, usage, timings, metadata, parent, **truncated** reporting + effective_num_ctx).
- List (status, tags, **parent_job_id** filters; rich human + --json with preview).
- Cancel (best-effort).
- Wait (standalone or via submit --wait).
- rm / clear (per-job or bulk by status/age).
- info / stats (config, adapter, queue overview).
- Models: list + show (details, context size, template, family).
- pull (auto on real submit, or explicit).
- run (foreground worker).
- In-process callbacks (direct + named registry for restart re-delivery).
- Restart recovery (interrupted jobs + callback re-delivery).
- Pluggable adapters (safe Stub default + real Ollama with auto-pull, model-aware context, truncation via done_reason).
- Configurable concurrency (default 1), log_level, db, ollama host.
- Full submit surface (temperature, top_p/k, num_ctx, format, keep_alive, metadata, parent, etc.).
- Persistence (SQLite), context manager, --json everywhere.
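
Restart recovery on top of SQLite persistence might look roughly like the sketch below. The `jobs` table, its columns, and `recover_interrupted` are hypothetical, as is the policy of re-queuing jobs that were left in the `running` state. The real schema is not shown in this README.

```python
import sqlite3

# In-memory database for the demo; a durable store would use a file path.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id TEXT PRIMARY KEY, status TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?)",
    [("a", "completed"), ("b", "running"), ("c", "queued")],
)


def recover_interrupted(conn):
    """Re-queue jobs left 'running' by a previous process; return how many."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE jobs SET status = 'queued' WHERE status = 'running'"
        )
    return cur.rowcount


recovered = recover_interrupted(conn)
statuses = dict(conn.execute("SELECT id, status FROM jobs"))
print(recovered, statuses)
```

Running a step like this once at startup, before the worker begins taking jobs, would let interrupted work be picked up again without touching completed results.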

**Explicitly not in V1 (per non-goals):**
- Web UI / HTTP server (V2).
- Webhooks / callback_url.
- Distributed / multi-node.
- Non-Ollama backends.
- Complex dependency graph execution (parent_job_id is for traceability only; no automatic waiting/fan-out).
- Real-time streaming UI (polling wait + final callbacks sufficient).
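
Because `parent_job_id` is for traceability only, any chaining has to be driven by the caller. The sketch below shows one way to do that. `FakeClient` and `FakeResult` are hypothetical in-memory stand-ins that mimic only the `submit` / `wait` / `output` names used in the Quick Start, so the example runs without Hoglah or Ollama.

```python
import uuid
from dataclasses import dataclass


@dataclass
class FakeResult:
    job_id: str
    output: str


class FakeClient:
    """Hypothetical in-memory stand-in exposing submit() and wait()."""

    def __init__(self):
        self.jobs = {}

    def submit(self, prompt, model, parent_job_id=None):
        job_id = str(uuid.uuid4())
        self.jobs[job_id] = {"prompt": prompt, "parent": parent_job_id}
        return job_id

    def wait(self, job_id):
        return FakeResult(job_id, f"echo: {self.jobs[job_id]['prompt']}")


def run_chain(client, prompts, model):
    """Run prompts in order, feeding each output into the next prompt."""
    parent_id, previous = None, ""
    for prompt in prompts:
        job_id = client.submit(
            prompt=f"{previous}\n{prompt}".strip(),
            model=model,
            parent_job_id=parent_id,  # recorded for traceability only
        )
        previous = client.wait(job_id).output  # the caller does the waiting
        parent_id = job_id
    return parent_id, previous


client = FakeClient()
last_id, final_output = run_chain(client, ["first", "second"], model="gemma3:1b")
print(final_output)
```

The queue never waits on a parent by itself. The loop waits explicitly and only then submits the child with the parent's ID attached.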

See the full requirements review and V1 completeness note in `.restart.md`.
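
The "truncation via done_reason" item above can be sketched as a small helper. It assumes an Ollama-style response dictionary in which `done_reason` is `"length"` when generation stopped at the token limit and `"stop"` otherwise. `truncation_info` is a hypothetical name, and the metadata Hoglah actually stores may differ.

```python
def truncation_info(response):
    """Derive truncation metadata from an Ollama-style response dict."""
    reason = response.get("done_reason")
    # Assumption: "length" means the output hit the token limit.
    return {"truncated": reason == "length", "done_reason": reason}


ok = truncation_info({"response": "full answer", "done": True, "done_reason": "stop"})
cut = truncation_info({"response": "partial ans", "done": True, "done_reason": "length"})
print(ok, cut)
```

The helper only reports the condition and never raises, in line with the stated behaviour that truncation is surfaced as metadata and does not fail the job.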

You can also run the packaged install smoke test after installing the wheel:
```bash
python scripts/test_packaged_install.py
```

To validate against a working local Ollama (exercising the real adapter paths, including show, pull, and context auto-detect):

```bash
RUN_OLLAMA_TESTS=1 python scripts/test_packaged_install.py
# or
HOGLAH_USE_REAL_ADAPTER=1 python scripts/test_packaged_install.py
```

This is the recommended way to test the packaged v0.2.1 thoroughly with real Ollama.

Real Ollama / llama.cpp: opt in via `use_real=True`, `HOGLAH_USE_REAL_ADAPTER=1`, or `--real`. The "real" adapter talks to Ollama (which uses llama.cpp for inference).

Important: the test runs and packaged validation recorded for this release used the safe StubAdapter plus mocks. In the environment where they were run, localhost:11434 was not reachable, so real-Ollama calls and `RUN_OLLAMA_TESTS=1` failed there.

To validate the packaged v0.2.1 wheel against a working local Ollama/llama.cpp (strongly recommended before tagging):

```bash
python3 -m venv /tmp/hoglah-validate
/tmp/hoglah-validate/bin/pip install "dist/hoglah-0.2.1-py3-none-any.whl[cli]"
RUN_OLLAMA_TESTS=1 /tmp/hoglah-validate/bin/python scripts/test_packaged_install.py

# Also run the gated integration test
RUN_OLLAMA_TESTS=1 python -m pytest tests/test_worker_execution.py::test_real_ollama_adapter_end_to_end -q -s --tb=long
```

Run these commands on a machine where Ollama is reachable (for example, the WSL environment that hosts it). The smoke test script and the gated test were built for exactly this real-backend validation of the packaged artifact.

To make Ollama listen on all interfaces (if needed for cross-context access in WSL):

  • Windows cmd: `setx OLLAMA_HOST "0.0.0.0"` (then restart `ollama serve`)
  • Or in a WSL shell before starting Ollama: `export OLLAMA_HOST=0.0.0.0`

Additional CLI commands:

```bash
hoglah cancel
hoglah models
hoglah run --real  # foreground worker using real Ollama
```

By default `hoglah` and `Hoglah()` use the safe stub adapter (no LLM calls). Use `--real` (CLI) or pass `adapter=OllamaAdapter(...)` (library) when you want actual inference.

`hoglah --version` / `-V` and `hoglah version` are supported. Use `with Hoglah(...) as h:` for automatic cleanup.

CLI now also includes `hoglah ps` (list alias) and `--json` output on list/ps/status/models. `hoglah submit` supports `--metadata` (JSON) and `--parent-job-id`. Real integration tests are gated behind `RUN_OLLAMA_TESTS=1`.

See `docs/requirements-v1.0.md` for the full initial specification.

## Submit API (Initial Draft)

```python
from __future__ import annotations  # lets the sketch name JobResult without importing it

from typing import Callable


class Hoglah:
    def submit(
        self,
        *,  # keyword-only in this sketch, so the required `model` can follow optional parameters
        prompt: str | None = None,                    # or messages for chat
        messages: list[dict] | None = None,           # OpenAI-style chat history
        model: str,                                   # e.g. "gemma:7b", "mistral"
        system_prompt: str | None = None,
        num_ctx: int | None = None,                   # Context window size
        options: dict | None = None,                  # Passthrough for llama.cpp params
        callback: Callable[[JobResult], None] | None = None,  # Python callable
        callback_url: str | None = None,              # V2: HTTP webhook (not in V1)
        tags: list[str] | None = None,
        priority: int = 0,                            # Higher = earlier
        timeout_seconds: int | None = None,
        max_retries: int = 2,
        metadata: dict | None = None,                 # User-defined data
        parent_job_id: str | None = None,             # Traceability only in V1 (no automatic waiting)
        temperature: float | None = None,
        top_p: float | None = None,
        top_k: int | None = None,
        repeat_penalty: float | None = None,
        seed: int | None = None,                      # Reproducibility
        stop: list[str] | None = None,                # Stop sequences
        num_predict: int | None = None,               # Max output tokens
        format: str | None = None,                    # e.g. "json"
        keep_alive: str | int | None = None,
        # ... full options dict covers the rest
    ) -> str:                                         # returns the job ID (UUID)
        ...
```

Current Status

2026-06-12 (updated): Core implementation complete (Chunks 1-3 + follow-on polish).

  • Full durable queue + background asyncio worker (concurrency=1 default)
  • Pluggable adapters: StubAdapter (default, safe) + OllamaAdapter (real, opt-in via use_real=True or --real)
  • Hoglah(use_real=True) convenience + HOGLAH_USE_REAL_ADAPTER env var
  • Submit (prompt or messages/chat), rich generation params, status, get, list, cancel, wait, named+direct callbacks
  • Restart recovery (interrupted jobs + callback re-delivery)
  • Truncation metadata always surfaced (never fails the job)
  • CLI: list, status, cancel, submit (with --messages, --temperature, --num-ctx etc.), run, models, version
  • examples/basic_usage.py demonstrating the common patterns
  • 13 passing tests. No real Ollama required (everything exercises safely via stub).

See docs/requirements-v1.0.md, docs/architecture-decisions.md, and .restart.md for history and how to continue.

See the sister projects in the domains family (Mahalath, Tirzah, etc.) for style and quality references.

Architecture Sketch (Early)

  • Client library (Hoglah or similar) for submit / status / wait / list / cancel
  • SQLite-backed job store (jobs table + results / events)
  • Worker loop (thread or task) with concurrency semaphore
  • Ollama adapter (generate + chat paths, model info)
  • In-process callback dispatch after completion
  • CLI entrypoint for inspection and operations
  • Config via constructor + env + small config file

Full details will evolve in docs/architecture-decisions.md and implementation docs.

License

Apache 2.0 — see LICENSE.

Contributing

See CONTRIBUTING.md.
