Keep humans in the loop. HITL control library for AI agent workflows with LangGraph.

hitloop

Human-in-the-Loop control library for AI agent workflows with LangGraph integration.

hitloop provides explicit control nodes for human oversight in AI agent workflows, with strong instrumentation for research experiments. Unlike passive monitoring, human approval is a first-class control signal and event in the execution trace.

Core Concept

LLM proposes action → HITL policy decides → Human approves/rejects → Tool executes → Telemetry logs all

Human approval is not a UI gimmick. It is:

A control signal that gates execution
A first-class event in the trace
A research artifact for measuring oversight effectiveness
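
The flow above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not hitloop's implementation: `gate_and_execute`, the callable `policy`/`approver`/`tool` arguments, and the `trace` list are hypothetical names chosen for this example.

```python
from typing import Any, Callable


def gate_and_execute(
    action: dict[str, Any],
    policy: Callable[[dict[str, Any]], bool],
    approver: Callable[[dict[str, Any]], bool],
    tool: Callable[..., Any],
    trace: list[dict[str, Any]],
) -> Any:
    """Run one proposed action through a policy gate, recording every step."""
    trace.append({"event": "proposed", "tool": action["tool_name"]})

    # The policy decides whether a human has to look at this action.
    needs_approval = policy(action)
    trace.append({"event": "policy", "needs_approval": needs_approval})

    approved = True
    if needs_approval:
        # The human decision is itself an event in the trace.
        approved = approver(action)
        trace.append({"event": "approval", "approved": approved})

    if not approved:
        # A rejection gates execution: the tool is never called.
        trace.append({"event": "skipped"})
        return None

    result = tool(**action["tool_args"])
    trace.append({"event": "executed", "result": result})
    return result


trace: list[dict[str, Any]] = []
action = {"tool_name": "send_email", "tool_args": {"recipient": "alice@example.com"}}
outcome = gate_and_execute(
    action,
    policy=lambda a: a["tool_name"] == "send_email",  # always gate emails
    approver=lambda a: False,                         # the reviewer rejects
    tool=lambda recipient: f"sent to {recipient}",
    trace=trace,
)
print(outcome)                      # None
print([e["event"] for e in trace])  # ['proposed', 'policy', 'approval', 'skipped']
```

Because the rejection is recorded before the function returns, the trace shows both that a human was asked and that the tool never ran.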

Quick Start

Installation

# Clone the repository
git clone https://github.com/ebaenamar/hitloop.git
cd hitloop

# Install with uv (recommended)
uv pip install -e .

# Or with pip
pip install -e .

# For development
pip install -e ".[dev]"

Run an Example

# Basic workflow (with CLI approval prompts)
python examples/basic_workflow.py

# Auto-approve mode (no prompts)
python examples/basic_workflow.py --auto

# Run a full experiment
python examples/run_experiment.py --n-trials 20

Architecture

┌─────────────────────────────────────────────────────────────┐
│                      LangGraph Workflow                      │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌──────────┐    ┌──────────┐    ┌──────────────────────┐  │
│  │   LLM    │───►│  HITL    │───►│   Tool Executor      │  │
│  │  Node    │    │  Gate    │    │                      │  │
│  └──────────┘    └────┬─────┘    └──────────────────────┘  │
│                       │                                     │
│                       ▼                                     │
│              ┌────────────────┐                             │
│              │  HITL Policy   │                             │
│              │  ┌──────────┐  │                             │
│              │  │ Approval │  │◄──► Human (CLI/Web/etc)    │
│              │  │ Backend  │  │                             │
│              │  └──────────┘  │                             │
│              └────────────────┘                             │
│                       │                                     │
│                       ▼                                     │
│              ┌────────────────┐                             │
│              │   Telemetry    │───► SQLite / Analysis      │
│              │    Logger      │                             │
│              └────────────────┘                             │
└─────────────────────────────────────────────────────────────┘

API Overview

Core Models

from hitloop import Action, Decision, RiskClass

# Define an action
action = Action(
    tool_name="send_email",
    tool_args={"recipient": "alice@example.com", "subject": "Hello"},
    risk_class=RiskClass.MEDIUM,
    side_effects=["email_sent"],
    rationale="Sending follow-up email to client",
)

# Decisions from human review
decision = Decision(
    action_id=action.id,
    approved=True,
    reason="Verified recipient is correct",
    decided_by="human:operator",
    latency_ms=1500.0,
)

Policies

Three built-in policies for different oversight tiers:

from hitloop import AlwaysApprovePolicy, RiskBasedPolicy, AuditPlusEscalatePolicy

# Tier 4: No human oversight (baseline)
policy = AlwaysApprovePolicy()

# Risk-based: require human approval for high-risk actions only
policy = RiskBasedPolicy(
    require_approval_for_high=True,
    require_approval_for_medium=False,
    high_risk_tools=["send_email", "delete_record"],
)

# Audit + Escalate: Random sampling + anomaly detection
policy = AuditPlusEscalatePolicy(
    audit_sample_rate=0.1,  # 10% random audit
    escalate_on_high_risk=True,
    anomaly_signals=["unusual_recipient", "large_amount"],
)
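
To make the audit-plus-escalate idea concrete, here is a rough sketch of how such a decision could be made. It assumes escalation is checked first and random sampling second; the function `should_audit` and its string risk classes are hypothetical and are not taken from hitloop's source.

```python
import random


def should_audit(
    risk_class: str,
    signals: set[str],
    rng: random.Random,
    audit_sample_rate: float = 0.1,
    anomaly_signals: frozenset[str] = frozenset({"unusual_recipient", "large_amount"}),
) -> tuple[bool, str]:
    """Return (needs_human_review, reason) for one action."""
    # Escalation: high-risk actions always go to a human.
    if risk_class == "high":
        return True, "escalated: high risk"
    # Escalation: any known anomaly signal also forces review.
    if signals & anomaly_signals:
        return True, "escalated: anomaly signal"
    # Audit: otherwise review a random fraction of actions.
    if rng.random() < audit_sample_rate:
        return True, "random audit"
    return False, "auto-approved"


rng = random.Random(42)  # seeded so an experiment can be reproduced
print(should_audit("high", set(), rng))            # (True, 'escalated: high risk')
print(should_audit("low", {"large_amount"}, rng))  # (True, 'escalated: anomaly signal')

# Over many low-risk actions the audited share approaches audit_sample_rate.
audited = sum(should_audit("low", set(), rng)[0] for _ in range(10_000))
print(audited / 10_000)
```

Passing the random generator in explicitly keeps the sampling reproducible, which matters when audit decisions are part of an experiment.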

Adding a New Policy

Create a single file in src/hitloop/policies/:

# src/hitloop/policies/my_policy.py
from hitloop.core.interfaces import HITLPolicy
from hitloop.core.models import Action

class MyCustomPolicy(HITLPolicy):
    # Hypothetical set of tools that always need a human decision
    critical_tools = {"delete_record"}

    @property
    def name(self) -> str:
        return "my_custom"

    def should_request_approval(
        self, action: Action, state: dict
    ) -> tuple[bool, str]:
        # Return (needs_approval, reason)
        if action.tool_name in self.critical_tools:
            return True, "Critical tool requires approval"
        return False, "Auto-approved"

LangGraph Integration

from langgraph.graph import END, StateGraph
from hitloop import (
    CLIBackend,
    RiskBasedPolicy,
    TelemetryLogger,
    execute_tool_node,
    hitl_gate_node,
)

# HITLState, llm_node, tools, and should_execute_condition are assumed to be
# defined or imported elsewhere; this snippet does not show where they come from.

# Create nodes
policy = RiskBasedPolicy()
backend = CLIBackend()
logger = TelemetryLogger("traces.db")

gate = hitl_gate_node(policy, backend, logger)
executor = execute_tool_node(tools, logger)

# Build graph
graph = StateGraph(HITLState)
graph.add_node("llm", llm_node)
graph.add_node("hitl_gate", gate)
graph.add_node("execute", executor)

graph.set_entry_point("llm")
graph.add_edge("llm", "hitl_gate")
graph.add_conditional_edges(
    "hitl_gate",
    should_execute_condition,
    {"execute": "execute", "skip": END},  # END is LangGraph's terminal node
)
graph.add_edge("execute", END)
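
The conditional edge needs a routing function that returns one of the keys in the mapping (`"execute"` or `"skip"`). As a hedged sketch only: the function below, `route_after_gate`, is a hypothetical stand-in, and the state layout (a `"decision"` entry with an `"approved"` flag) is an assumption rather than hitloop's actual `HITLState`.

```python
from typing import Any, Optional


def route_after_gate(state: dict[str, Any]) -> str:
    """Pick the next branch from the gate's recorded decision."""
    decision: Optional[dict[str, Any]] = state.get("decision")
    # Only an explicit approval lets the action through.
    if decision is not None and decision.get("approved") is True:
        return "execute"
    # A rejection or a missing decision both fail closed.
    return "skip"


print(route_after_gate({"decision": {"approved": True}}))   # execute
print(route_after_gate({"decision": {"approved": False}}))  # skip
print(route_after_gate({}))                                 # skip
```

Failing closed when no decision is present means a bug in the gate cannot silently turn into an unapproved tool call.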

Running Experiments

import asyncio

from hitloop import TelemetryLogger
from hitloop.eval import ExperimentRunner
from hitloop.eval.runner import create_standard_conditions
from hitloop.scenarios import EmailDraftScenario


async def main() -> None:
    # Setup
    logger = TelemetryLogger("experiment.db")
    scenario = EmailDraftScenario()

    # Create standard conditions (4 policies × scenarios)
    conditions = create_standard_conditions(
        scenario=scenario,
        n_trials=20,
        injection_rate=0.2,  # 20% error injection
    )

    # Run
    runner = ExperimentRunner(logger)
    for c in conditions:
        runner.add_condition(c)

    # run_all() is awaited, so it has to be called from inside a coroutine
    await runner.run_all()

    # Export
    runner.export_results("results.csv", "summary.json")


asyncio.run(main())

Output: results.csv

run_id	scenario_id	condition_id	policy_name	task_success	approval_requested	injected_error	error_caught
abc123	email_draft	risk_based	risk_based	1	1	0	0
def456	email_draft	risk_based	risk_based	0	1	1	1

Output: summary.json

{
  "risk_based": {
    "n_runs": 20,
    "success_rate": 0.85,
    "approval_rate": 0.65,
    "error_catch_rate": 0.75,
    "false_reject_rate": 0.05,
    "human_latency_mean_ms": 1200.5
  }
}
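
The relationship between the two outputs can be illustrated by recomputing a few summary fields from the per-run rows. This is a sketch, not hitloop's `MetricsCalculator`: the `summarize` function is hypothetical, and the metric definitions (plain means, and caught errors divided by injected errors) are assumptions. Only fields derivable from the columns shown above are computed.

```python
import csv
import io
from collections import defaultdict

HEADER = ["run_id", "scenario_id", "condition_id", "policy_name",
          "task_success", "approval_requested", "injected_error", "error_caught"]
ROWS = [
    ["abc123", "email_draft", "risk_based", "risk_based", "1", "1", "0", "0"],
    ["def456", "email_draft", "risk_based", "risk_based", "0", "1", "1", "1"],
]
tsv_text = "\n".join("\t".join(row) for row in [HEADER, *ROWS])


def summarize(text: str) -> dict[str, dict]:
    """Group runs by condition_id and compute simple rates per group."""
    groups: dict[str, list[dict[str, str]]] = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        groups[row["condition_id"]].append(row)

    summary = {}
    for condition, runs in groups.items():
        n = len(runs)
        injected = sum(int(r["injected_error"]) for r in runs)
        caught = sum(int(r["error_caught"]) for r in runs)
        summary[condition] = {
            "n_runs": n,
            "success_rate": sum(int(r["task_success"]) for r in runs) / n,
            "approval_rate": sum(int(r["approval_requested"]) for r in runs) / n,
            # Undefined when nothing was injected, so report None instead of 0.
            "error_catch_rate": caught / injected if injected else None,
        }
    return summary


summary = summarize(tsv_text)
print(summary["risk_based"])
# {'n_runs': 2, 'success_rate': 0.5, 'approval_rate': 1.0, 'error_catch_rate': 1.0}
```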

Research Alignment

Each hitloop metric maps to a research concept:

Metric	Research Concept	Description
`success_rate`	Quality	Task completion rate
`approval_rate`	Leverage proxy	Human involvement frequency
`error_catch_rate`	Appropriate reliance	Injected error detection
`false_reject_rate`	Appropriate reliance	Unnecessary rejections
`human_latency_ms`	Human burden	Time cost per decision
`cost_proxy`	Efficiency	Token/call overhead

Injected Errors for Ground Truth

The error injector provides ground truth for measuring oversight effectiveness:

# InjectionType is assumed to be exported from hitloop.eval alongside ErrorInjector
from hitloop.eval import ErrorInjector, InjectionConfig, InjectionType

injector = ErrorInjector(InjectionConfig(
    injection_rate=0.2,
    injection_types=[
        InjectionType.WRONG_RECIPIENT,
        InjectionType.WRONG_RECORD_ID,
    ]
))

# Every action has known correctness
# (`action` is an Action instance, e.g. the one built under Core Models)
result = injector.maybe_inject(action)
if result.injected:
    # This action is KNOWN to be wrong:
    #   human approves it -> false negative (error missed)
    #   human rejects it  -> true positive (error caught)
    pass
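
Since every action's correctness is known, each human decision can be scored against ground truth. The sketch below extends the two cases named above to the full four-way classification. The names `classify` and `rates` are hypothetical, and the rate definitions are assumptions, not necessarily the formulas hitloop uses.

```python
from collections import Counter


def classify(injected: bool, approved: bool) -> str:
    """Score one human decision against the known ground truth."""
    if injected:
        # The action was wrong: rejecting it is the correct call.
        return "false_negative" if approved else "true_positive"
    # The action was fine: approving it is the correct call.
    return "true_negative" if approved else "false_positive"


def rates(reviews: list[tuple[bool, bool]]) -> dict:
    """Compute catch and false-reject rates from (injected, approved) pairs."""
    counts = Counter(classify(injected, approved) for injected, approved in reviews)
    wrong = counts["true_positive"] + counts["false_negative"]
    clean = counts["true_negative"] + counts["false_positive"]
    return {
        "error_catch_rate": counts["true_positive"] / wrong if wrong else None,
        "false_reject_rate": counts["false_positive"] / clean if clean else None,
    }


reviews = [
    (True, False),   # injected error, rejected  -> caught
    (True, True),    # injected error, approved  -> missed
    (False, True),   # clean action, approved
    (False, True),   # clean action, approved
    (False, False),  # clean action, rejected    -> unnecessary rejection
]
result_rates = rates(reviews)
print(result_rates["error_catch_rate"])             # 0.5
print(round(result_rates["false_reject_rate"], 3))  # 0.333
```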

Project Structure

hitloop/
├── src/hitloop/
│   ├── core/
│   │   ├── models.py      # Action, Decision, TraceEvent
│   │   ├── interfaces.py  # ApprovalBackend, HITLPolicy
│   │   └── logger.py      # TelemetryLogger (SQLite)
│   ├── policies/
│   │   ├── always_approve.py
│   │   ├── risk_based.py
│   │   └── audit_plus_escalate.py
│   ├── backends/
│   │   ├── cli_backend.py
│   │   └── humanlayer_backend.py  # Optional
│   ├── langgraph/
│   │   └── nodes.py       # hitl_gate_node, execute_tool_node
│   ├── scenarios/
│   │   ├── email_draft.py
│   │   └── record_update.py
│   └── eval/
│       ├── runner.py      # ExperimentRunner
│       ├── injectors.py   # ErrorInjector
│       └── metrics.py     # MetricsCalculator
├── tests/
├── examples/
├── pyproject.toml
└── README.md

Instrumentation

Every run emits structured events:

# Per run
{
    "run_id": "abc123",
    "scenario_id": "email_draft",
    "condition_id": "risk_based",
    "seed": 42
}

# Per action
{
    "action_id": "xyz789",
    "tool_name": "send_email",
    "args_hash": "a1b2c3d4",
    "risk_class": "medium",
    "injected_error": false
}

# Per approval
{
    "channel": "cli",
    "latency_ms": 1250.0,
    "decision": true,
    "decided_by": "human"
}

# Per execution
{
    "success": true,
    "execution_time_ms": 45.2
}
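
As an illustration of how events like these could be stored, the sketch below writes them to SQLite using only the standard library. It is not hitloop's `TelemetryLogger`: the `events` table layout, the `log_event` helper, and the choice of a truncated SHA-256 digest for `args_hash` are all assumptions made for the example.

```python
import hashlib
import json
import sqlite3


def args_hash(tool_args: dict) -> str:
    """Short, order-independent fingerprint of the tool arguments."""
    canonical = json.dumps(tool_args, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:8]


conn = sqlite3.connect(":memory:")  # use a file path to persist traces
conn.execute("CREATE TABLE events (run_id TEXT, kind TEXT, payload TEXT)")


def log_event(run_id: str, kind: str, payload: dict) -> None:
    """Append one structured event as a JSON payload."""
    conn.execute(
        "INSERT INTO events VALUES (?, ?, ?)",
        (run_id, kind, json.dumps(payload)),
    )


tool_args = {"recipient": "alice@example.com", "subject": "Hello"}
log_event("abc123", "action", {
    "tool_name": "send_email",
    "args_hash": args_hash(tool_args),
    "risk_class": "medium",
    "injected_error": False,
})
log_event("abc123", "approval", {"channel": "cli", "latency_ms": 1250.0, "decision": True})
conn.commit()

rows = conn.execute(
    "SELECT kind, payload FROM events WHERE run_id = ? ORDER BY rowid",
    ("abc123",),
).fetchall()
for kind, payload in rows:
    print(kind, json.loads(payload))
```

Hashing the arguments rather than storing them lets traces be compared across runs without writing potentially sensitive values to the log.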

Development

# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Type checking
mypy src/hitloop

# Linting
ruff check src/hitloop
ruff format src/hitloop

License

MIT License - see LICENSE file.

Release history

0.5.1

Jan 11, 2026

0.5.0

Jan 11, 2026

0.4.2

Jan 11, 2026

0.4.1

Jan 10, 2026

0.4.0

Jan 10, 2026

0.3.0

Jan 10, 2026

0.2.0

Jan 10, 2026

This version

0.1.0

Jan 10, 2026

Download files

Source Distribution

hitloop-0.1.0.tar.gz (44.9 kB)

Uploaded Jan 10, 2026 Source

Built Distribution

hitloop-0.1.0-py3-none-any.whl (44.6 kB)

Uploaded Jan 10, 2026 Python 3

File details

Details for the file hitloop-0.1.0.tar.gz.

File metadata

Download URL: hitloop-0.1.0.tar.gz
Upload date: Jan 10, 2026
Size: 44.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.7

File hashes

Hashes for hitloop-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`e3e0c1511692bf0e54370ef27b99d0768d3ad3adc08f397a6e0878e3e0ebcfb4`
MD5	`2cd1771343e49ae876cd90104b05ed23`
BLAKE2b-256	`31e86b249701ebf0c16939a89cd04ca6fe921b41a7a02035ac7da8c1141838ec`

File details

Details for the file hitloop-0.1.0-py3-none-any.whl.

File metadata

Download URL: hitloop-0.1.0-py3-none-any.whl
Upload date: Jan 10, 2026
Size: 44.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.7

File hashes

Hashes for hitloop-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1ba197b16c66e134b65e0d58fc8a3a21904cb1f33381d1ec320dd75c875a4412`
MD5	`30fd7fb2577c5f7a9975c9e710d065e0`
BLAKE2b-256	`4bb8803a94801ed51c18baec7196dc4502447ecdb79fcce9cf8e384d91419eea`
