djobs

A SQLite-backed durable task queue with first-class MCP integration, purpose-built for AI coding agents that need crash recovery, audit trails, and zero infrastructure.

License: MIT | Python 3.11+


Why djobs?

AI coding agents (GitHub Copilot, Cursor, Cline, etc.) often run multi-file tasks that can take several minutes. When the IDE crashes or the chat disconnects mid-way, in-flight progress is usually lost because the agent's state lives only in chat history.

djobs gives agents a small, durable checkpoint queue so they can resume exactly where they stopped:

Agent: "Add docstrings to 12 files"
  -> enqueue 12 tasks (crash-safe checkpoint)
  -> edit file -> complete_task
  -> edit file -> complete_task
  -> ... IDE crashes after file 7 ...

New chat: "hi"
  -> resume_session -> 5 incomplete tasks found
  -> auto-resume from file 8 — no questions asked
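
The flow above can be sketched with nothing but the standard library. This is not how djobs is implemented: the tasks table and the enqueue, complete, and resume helpers are hypothetical names chosen for illustration. The point is only that a row committed to SQLite outlives the process that wrote it, which chat history does not.

```python
import os
import sqlite3
import tempfile

# Hypothetical schema for illustration; djobs' real tables may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS tasks (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    correlation_id TEXT NOT NULL,
    payload TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending'
)
"""


def open_db(path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    conn.commit()
    return conn


def enqueue(conn: sqlite3.Connection, correlation_id: str, payloads: list) -> list:
    ids = []
    with conn:  # commits on success, rolls back on error
        for payload in payloads:
            cur = conn.execute(
                "INSERT INTO tasks (correlation_id, payload) VALUES (?, ?)",
                (correlation_id, payload),
            )
            ids.append(cur.lastrowid)
    return ids


def complete(conn: sqlite3.Connection, task_id: int) -> None:
    with conn:
        conn.execute("UPDATE tasks SET status = 'succeeded' WHERE id = ?", (task_id,))


def resume(conn: sqlite3.Connection, correlation_id: str) -> list:
    rows = conn.execute(
        "SELECT payload FROM tasks "
        "WHERE correlation_id = ? AND status = 'pending' ORDER BY id",
        (correlation_id,),
    ).fetchall()
    return [row[0] for row in rows]


def simulate_crash_and_resume() -> list:
    path = os.path.join(tempfile.mkdtemp(), "checkpoint.db")
    files = [f"file_{i}.py" for i in range(1, 13)]

    conn = open_db(path)
    ids = enqueue(conn, "/home/user/project", files)
    for task_id in ids[:7]:  # seven files done before the "crash"
        complete(conn, task_id)
    conn.close()  # all in-memory state is gone from here on

    conn = open_db(path)  # a brand-new session
    remaining = resume(conn, "/home/user/project")
    conn.close()
    return remaining


remaining = simulate_crash_and_resume()
print(len(remaining), remaining[0])
```

The new session learns what is left purely from the database file, so it needs no context from the earlier chat.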

Under the hood, djobs is a fairly conventional durable job queue: a state machine, a retry policy, leases, a scheduler, and an event log. The interesting part is how it is wired to AI agents. An MCP server exposes tools such as enqueue_task, complete_task, resume_session, and audit_log; an embedded background daemon runs alongside it; and a type_filter keeps daemon-managed jobs and agent-managed jobs from fighting over the same queue.
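
To make the type_filter idea concrete, here is a minimal in-memory sketch. The Job class, the claim_next function, and the ai_edit_file job type are hypothetical and do not reflect djobs' internals; the sketch only shows the rule that a daemon claims a pending job when, and only when, it has a handler for that job's type.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    id: int
    type: str
    status: str = "pending"


def claim_next(jobs: list, type_filter: set) -> Optional[Job]:
    """Claim the first pending job whose type is in type_filter."""
    for job in jobs:
        if job.status == "pending" and job.type in type_filter:
            job.status = "running"
            return job
    return None


# The daemon only knows how to send email; "ai_edit_file" has no handler.
handlers = {"send_email": lambda payload: None}
jobs = [Job(1, "ai_edit_file"), Job(2, "send_email"), Job(3, "ai_edit_file")]

claimed = claim_next(jobs, set(handlers))
print(claimed)
```

Jobs 1 and 3 stay pending, so the agent remains free to finish them itself and report the result.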

What is in the box

Area | What you get
MCP server | 8 tools exposed via FastMCP / stdio; works in VS Code, Claude Desktop, etc.
Crash recovery | resume_session returns incomplete tasks for a given workspace / correlation id
Audit trail | audit_log aggregates job_events so you can answer "what did the AI do yesterday?"
Type isolation | Built-in daemon only claims job types it has handlers for; AI-only types are left to the agent via complete_task / fail_task
SQLite first | No Redis, RabbitMQ, Docker, or Postgres required for local use
Postgres path | Same JobRepository protocol implemented on top of SELECT ... FOR UPDATE SKIP LOCKED for multi-worker setups
Test coverage | 214 passing tests (16 skipped without Postgres), strict ruff lint
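
As a rough illustration of what "aggregating job_events" could look like, the sketch below counts events per type for one day. Only the job_events name comes from the description above; the columns (job_id, event_type, created_at) and the summarize_events helper are assumptions made for the example.

```python
import sqlite3


def summarize_events(conn: sqlite3.Connection, day: str) -> dict:
    """Count events per type for one calendar day (YYYY-MM-DD)."""
    rows = conn.execute(
        "SELECT event_type, COUNT(*) FROM job_events "
        "WHERE date(created_at) = ? GROUP BY event_type ORDER BY event_type",
        (day,),
    ).fetchall()
    return dict(rows)


conn = sqlite3.connect(":memory:")
# Assumed column layout; the real event log may store more fields.
conn.execute("CREATE TABLE job_events (job_id INTEGER, event_type TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO job_events VALUES (?, ?, ?)",
    [
        (1, "enqueued", "2024-05-01T09:00:00"),
        (1, "succeeded", "2024-05-01T09:02:00"),
        (2, "enqueued", "2024-05-01T09:05:00"),
        (2, "failed", "2024-05-02T10:00:00"),
    ],
)

summary = summarize_events(conn, "2024-05-01")
print(summary)
```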

Quick Start

As a Python Library

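pip install djobs

import threading

from djobs import SQLiteJobRepository, QueueService, HandlerRegistry, WorkerPool


def send_email(to: str) -> None:
    # Placeholder handler body; replace with real delivery logic.
    print(f"sending email to {to}")


# 1. Set up
repo = SQLiteJobRepository.from_path("jobs.db")
queue = QueueService(repo)

# 2. Submit a job
job = queue.submit("send_email", {"to": "user@example.com"}, max_attempts=3)

# 3. Process jobs
registry = HandlerRegistry()
registry.register("send_email", lambda payload: send_email(**payload))

# Assumption: run_loop accepts a threading.Event and returns once it is set.
stop_event = threading.Event()
pool = WorkerPool(queue, registry, worker_id="worker-1", max_concurrent=4)
pool.run_loop(stop_event)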

As an MCP Server (for AI Agents)

pip install djobs
djobs install-mcp

That's it. Two commands, ready to go.

Options:

# Safe default (read-only tools auto-approved)
djobs install-mcp

# Or with full auto-approve (agent can enqueue/complete/fail without prompts)
djobs install-mcp --full-approve

Or add the server to .vscode/mcp.json manually:

Windows (venv)
{
  "servers": {
    "djobs": {
      "type": "stdio",
      "command": "${workspaceFolder}/.venv/Scripts/python",
      "args": ["-m", "djobs.mcp_server"],
      "autoApprove": [
        "health", "resume_session", "check_task", "list_tasks", "audit_log"
      ]
    }
  }
}

macOS / Linux (venv)
{
  "servers": {
    "djobs": {
      "type": "stdio",
      "command": "${workspaceFolder}/.venv/bin/python",
      "args": ["-m", "djobs.mcp_server"],
      "autoApprove": [
        "health", "resume_session", "check_task", "list_tasks", "audit_log"
      ]
    }
  }
}

System Python (any OS)
{
  "servers": {
    "djobs": {
      "type": "stdio",
      "command": "python",
      "args": ["-m", "djobs.mcp_server"],
      "autoApprove": [
        "health", "resume_session", "check_task", "list_tasks", "audit_log"
      ]
    }
  }
}

Security note: The default autoApprove list only includes read-only tools. If you want your agent to enqueue/complete/fail tasks without confirmation prompts, add "enqueue_task", "complete_task", and "fail_task" to the array — but understand that this allows the agent to mutate queue state without asking.
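
The three mcp.json variants differ only in the command field, and the security note comes down to which tool names appear in autoApprove. The sketch below builds the same structure programmatically. The build_mcp_config helper is hypothetical, and whether djobs install-mcp --full-approve writes exactly this output is an assumption; the keys and tool names are taken from the snippets above.

```python
import json

READ_ONLY_TOOLS = ["health", "resume_session", "check_task", "list_tasks", "audit_log"]
MUTATING_TOOLS = ["enqueue_task", "complete_task", "fail_task"]


def build_mcp_config(command: str, full_approve: bool = False) -> dict:
    """Return an mcp.json-style dict; mutating tools are approved only on request."""
    approved = READ_ONLY_TOOLS + (MUTATING_TOOLS if full_approve else [])
    return {
        "servers": {
            "djobs": {
                "type": "stdio",
                "command": command,
                "args": ["-m", "djobs.mcp_server"],
                "autoApprove": approved,
            }
        }
    }


config = build_mcp_config("${workspaceFolder}/.venv/bin/python", full_approve=True)
print(json.dumps(config, indent=2))
```

Keeping the two tool lists separate makes the trade-off explicit: the default stays read-only, and opting in to unattended mutation is a single visible flag.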

Then any AI agent can call these MCP tools:

Tool | Purpose
enqueue_task | Submit a durable task (survives crashes)
complete_task | Mark task succeeded after agent finishes work
fail_task | Mark task failed with error message
resume_session | Find incomplete tasks from previous sessions
check_task | Inspect task status, attempts, duration
list_tasks | List tasks by correlation_id
audit_log | Query event history ("what did the AI do?")
health | Queue depth by status

How is this different from X?

djobs is not the first project to expose a task queue to an AI agent over MCP. It targets a specific combination of properties: SQLite-first, MCP-driven, with crash recovery and audit-log style observability built in.

Project | Storage | Focus | Closest to djobs?
TadMSTR/task-queue-mcp | YAML files | Multi-agent task hand-off for Claude Code | Closest in spirit. Different storage model (YAML files + dispatcher), no resume_session / audit_log style observability.
midweste/mcp-cli-gateway | SQLite | Routing prompts to CLI agents (Gemini / Codex / Claude) with pacing | Overlaps on persistence + observability, but the unit of work is "dispatch a prompt to a CLI", not "durable user task with retry / lease".
j0j1j2/claude-tunnel | In-memory | Pub/sub + 1:1 request/reply + job queue between Claude Code sessions | Different problem: inter-session messaging, not durable work tracking.
Celery / RQ / Dramatiq / Hatchet | Redis / Postgres | General-purpose distributed task queues | Strictly more capable as general queues, but not designed to be driven directly by an AI agent over MCP.
Temporal / Inngest / DBOS | Server / SaaS | Durable workflow / execution engines | Much more powerful and much heavier; no MCP integration; not aimed at single-developer laptop use.

In one sentence: djobs is what you reach for when you want a small, Celery-shaped Python job queue, driven mostly by an AI agent through MCP, on a single developer machine, with SQLite.


Configuration

Environment Variables

Variable | Default | Description
DJOBS_DB_PATH | djobs.db | SQLite database file path
DJOBS_LOG_LEVEL | INFO | Logging level (DEBUG, INFO, WARNING, ERROR)
DJOBS_LOG_FORMAT | json | Log output format (json or text)
DJOBS_WORKER_ID | worker-1 | Identifier for this worker instance

These are read by Config.from_env() and used by the daemon / worker pool. The MCP server and CLI default to djobs_mcp.db via their own --db argument.

correlation_id convention

resume_session and list_tasks filter by correlation_id. The recommended convention:

  • VS Code agent: use the workspace folder path (e.g. c:\src\my\project or /home/user/project)
  • CI / automation: use the run ID or pipeline name
  • Multi-repo: use {workspace_path}:{repo_name} to avoid collision

The value is opaque — djobs does not interpret it. Pick any stable string that groups related tasks.
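
One way to keep the string stable is to build it in a single place. The helper below is a hypothetical sketch, not part of djobs: it strips trailing path separators so that "/home/user/project" and "/home/user/project/" group the same tasks, and appends the repo name for the multi-repo case.

```python
from typing import Optional


def make_correlation_id(workspace_path: str, repo_name: Optional[str] = None) -> str:
    """Build a stable grouping key following the convention above."""
    base = workspace_path.rstrip("/\\")  # ignore trailing separators
    return f"{base}:{repo_name}" if repo_name else base


print(make_correlation_id("/home/user/project/"))
print(make_correlation_id("/home/user/mono", "api"))
```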

SQLite concurrency notes

SQLite serializes writes with database-level locking: only one connection can write at a time, on any platform, not just Windows. The journal mode is WAL by default, which lets readers proceed while a write is in progress but does not allow concurrent writers. For single-developer laptop use this is fine. If you need multi-process writes, use the PostgreSQL backend (pip install "djobs[pg]").
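
The single-writer behaviour is easy to observe with the standard sqlite3 module. The sketch below is independent of djobs: it opens two connections to one WAL-mode database, holds a write transaction on the first, and shows that the second cannot start its own write but can still read.

```python
import os
import sqlite3
import tempfile


def demo_single_writer() -> tuple:
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    # isolation_level=None: transactions are controlled with explicit BEGIN/COMMIT.
    # timeout=0.1: give up after 100 ms instead of waiting for the lock.
    a = sqlite3.connect(path, timeout=0.1, isolation_level=None)
    b = sqlite3.connect(path, timeout=0.1, isolation_level=None)

    mode = a.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    a.execute("CREATE TABLE t (x INTEGER)")

    a.execute("BEGIN IMMEDIATE")  # connection a takes the write lock
    a.execute("INSERT INTO t VALUES (1)")

    blocked = False
    try:
        b.execute("BEGIN IMMEDIATE")  # a second writer has to wait, then fails
    except sqlite3.OperationalError:
        blocked = True

    # Reads still work in WAL mode; the uncommitted row is not visible yet.
    visible_rows = b.execute("SELECT COUNT(*) FROM t").fetchone()[0]

    a.execute("COMMIT")
    a.close()
    b.close()
    return mode, blocked, visible_rows


mode, blocked, visible_rows = demo_single_writer()
print(mode, blocked, visible_rows)
```

A longer timeout makes the second writer wait for the lock rather than fail, which is usually enough when writes are short.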

Dead-lettered tasks

After a job exhausts all max_attempts, it moves to dead_lettered status. These tasks stay in the database for audit purposes but are not retried automatically. To inspect and handle them:

from djobs import SQLiteJobRepository, QueueService

repo = SQLiteJobRepository.from_path("djobs_mcp.db")
queue = QueueService(repo)

# Find dead-lettered tasks
dead = queue.list_by_status("dead_lettered")
for job in dead:
    print(f"{job.id} | {job.type} | {job.last_error}")
    # Resubmit as a fresh job if needed:
    # queue.submit(job.type, job.payload, max_attempts=job.max_attempts,
    #              correlation_id=job.correlation_id)

See also: examples/dead_letter_example.py


Architecture

┌─────────────┐     MCP tools      ┌──────────────┐
│  AI Agent   │ ──────────────────> │  MCP Server  │
│  (Copilot)  │ <────────────────── │  (FastMCP)   │
└─────────────┘                     └──────┬───────┘
                                           │
                              ┌────────────┼────────────┐
                              │            │            │
                        ┌─────▼─────┐ ┌────▼────┐ ┌────▼─────┐
                        │  Queue    │ │ Daemon  │ │ Audit    │
                        │  Service  │ │ (Pool + │ │ Log      │
                        │           │ │ Sched)  │ │          │
                        └─────┬─────┘ └─────────┘ └──────────┘
                              │
                        ┌─────▼─────┐
                        │  SQLite   │
                        │  (or PG)  │
                        └───────────┘

Job State Machine

pending ──────► running ──────► succeeded
   │               │
   │               ├──────► failed
   │               │
   │               ├──────► retry_scheduled ──► pending (retry)
   │               │
   │               └──────► dead_lettered
   │
   ├──────► succeeded  (AI agent direct complete)
   └──────► failed     (AI agent direct fail)
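
The diagram can be read as a transition table. The sketch below encodes exactly the arrows drawn above and rejects everything else; the Status enum and the can_transition helper are illustrative names, not the definitions in djobs.core.

```python
from enum import Enum


class Status(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    RETRY_SCHEDULED = "retry_scheduled"
    DEAD_LETTERED = "dead_lettered"


# One entry per arrow in the diagram; statuses with no outgoing arrow map to an empty set.
ALLOWED = {
    Status.PENDING: {Status.RUNNING, Status.SUCCEEDED, Status.FAILED},
    Status.RUNNING: {
        Status.SUCCEEDED,
        Status.FAILED,
        Status.RETRY_SCHEDULED,
        Status.DEAD_LETTERED,
    },
    Status.RETRY_SCHEDULED: {Status.PENDING},
    Status.SUCCEEDED: set(),
    Status.FAILED: set(),
    Status.DEAD_LETTERED: set(),
}


def can_transition(current: Status, target: Status) -> bool:
    return target in ALLOWED[current]


# An agent may complete a pending task directly, without it ever running in a worker.
print(can_transition(Status.PENDING, Status.SUCCEEDED))
print(can_transition(Status.SUCCEEDED, Status.RUNNING))
```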

Module Map

Module | Responsibility
djobs.core | Job model, state machine, domain errors
djobs.queue | Submit, claim, complete, fail, retry logic
djobs.storage | SQLite & PostgreSQL repositories, event log
djobs.worker | Handler registry, WorkerPool, WorkerRunner
djobs.scheduler | Retry promotion, expired lease recovery
djobs.daemon | Composes WorkerPool + Scheduler into one process
djobs.observability | Metrics, structured logging, job inspection
djobs.mcp_server | MCP tool definitions, embedded daemon
djobs.cli | djobs serve CLI entry point
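
"Retry promotion" in the scheduler row presumably means moving jobs from retry_scheduled back to pending once their delay has elapsed. The sketch below shows that idea together with an exponential backoff schedule. Everything in it is an assumption made for illustration: the Job fields, the promote_due and backoff_delay helpers, and the base, factor, and cap values are not taken from djobs.

```python
from dataclasses import dataclass


def backoff_delay(attempt: int, base: float = 1.0, factor: float = 2.0,
                  cap: float = 60.0) -> float:
    """Seconds to wait before retry number `attempt` (1-based), capped at `cap`."""
    return min(cap, base * factor ** (attempt - 1))


@dataclass
class Job:
    id: int
    status: str
    run_at: float = 0.0  # earliest time (in seconds) the retry may run


def promote_due(jobs: list, now: float) -> list:
    """Move retry_scheduled jobs whose run_at has passed back to pending."""
    promoted = []
    for job in jobs:
        if job.status == "retry_scheduled" and job.run_at <= now:
            job.status = "pending"
            promoted.append(job.id)
    return promoted


jobs = [
    Job(1, "retry_scheduled", run_at=10.0),
    Job(2, "retry_scheduled", run_at=99.0),
    Job(3, "running"),
]
promoted = promote_due(jobs, now=50.0)
print(promoted, [backoff_delay(n) for n in (1, 2, 3)])
```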

Examples

# Basic job lifecycle
python examples/run_echo_job.py

# Retry with exponential backoff
python examples/run_retry_job.py

# Concurrent worker pool
python examples/run_pool_demo.py

# Scheduler loop (retry promotion + lease recovery)
python examples/run_scheduler_demo.py

# AI task platform (batch submit + cost tracking)
python examples/run_ai_demo.py

# Durable crash recovery demo
python examples/run_durable_demo.py

Development

git clone https://github.com/jhuang-tw/djobs.git
cd djobs
python -m venv .venv && source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -e ".[dev]"

pytest -q              # 214 tests (16 skipped without Postgres)
ruff check src/ tests/ # lint

See CONTRIBUTING.md for guidelines.


Roadmap

  • Durable job queue with retry, lease, heartbeat
  • SQLite + PostgreSQL backends
  • Worker pool with concurrency control
  • Scheduler (retry promotion + lease recovery)
  • Event sourcing & audit trail
  • MCP server with 8 tools
  • Embedded daemon (auto-start with MCP)
  • Type isolation (daemon vs. AI agent tasks)
  • Published on PyPI (pip install djobs)
  • djobs install-mcp — auto-generate mcp.json snippet
  • djobs audit — CLI access to the audit trail
  • Python 3.11+ support
  • Async worker support
  • Priority queues
  • Web dashboard for audit trail
  • Rate limiting per job type

License

MIT
