
Procedural memory + REPL context package with DSPy 3.0 integration for autonomous agents


rec-praxis-rlm

Procedural Memory + REPL Context for Autonomous AI Agents

A Python package that provides persistent procedural memory and safe code execution capabilities for DSPy 3.0 autonomous agents, enabling experience-based learning and programmatic document manipulation.


Features

Core Capabilities

  • Procedural Memory: Store and retrieve agent experiences with hybrid similarity scoring (environmental + goal embeddings)
  • FAISS Indexing: 10-100x faster retrieval at scale (>10k experiences)
  • RLM Context: Programmatic document inspection (grep, peek, head, tail) with ReDoS protection
  • Safe Code Execution: Sandboxed Python REPL with AST validation and restricted builtins
  • DSPy 3.0 Integration: Autonomous planning with ReAct agents and integrated tools
  • MLflow Observability: Automatic tracing and experiment tracking
  • Production Ready: 99.38% test coverage, comprehensive error handling, backward-compatible storage versioning
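
As an illustration of hybrid similarity scoring, a weighted blend of two cosine similarities can be sketched as follows. This is a minimal sketch, not the package's internal implementation; the 0.6/0.4 weights mirror the `env_weight`/`goal_weight` defaults shown in `MemoryConfig` below.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_score(env_a, env_b, goal_a, goal_b, env_weight=0.6, goal_weight=0.4):
    """Blend environment and goal similarity into one score (illustrative only)."""
    return env_weight * cosine(env_a, env_b) + goal_weight * cosine(goal_a, goal_b)
```

With identical environment and goal embeddings the score is 1.0; lowering either component pulls the blended score down in proportion to its weight.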

IDE Integrations & Developer Tools (v0.4.0+)

  • Pre-commit Hooks: Automated code review, security audit, and dependency scanning before git commits
  • VS Code Extension: Real-time inline diagnostics with procedural memory-powered suggestions
  • GitHub Actions: CI/CD workflows for automated security scanning on pull requests
  • CLI Tools: Command-line interface for integration into any development workflow
  • Learning from Fixes: Agents remember and apply successful code improvements across sessions

Requirements

Core Features (No API Key Required)

The following features work out-of-the-box without any API keys:

  • Procedural Memory: Uses local sentence-transformers for embeddings
  • RLM Context: Document inspection (grep, peek, head, tail) and safe code execution
  • FAISS Indexing: Optional performance optimization for large-scale retrieval

Optional Features (API Key Required)

  • DSPy Autonomous Planning: Requires an API key from one of these providers:
    • Groq (recommended - fast and free): export GROQ_API_KEY="gsk-..."
    • OpenAI: export OPENAI_API_KEY="sk-..."
    • OpenRouter (access to many models): export OPENROUTER_API_KEY="sk-or-..."
    • Any LiteLLM-supported provider

Quick Start

Installation

# Basic installation (works without API key)
pip install rec-praxis-rlm

# With all optional dependencies (FAISS, OpenAI, async support)
pip install rec-praxis-rlm[all]

# Development installation
pip install rec-praxis-rlm[dev]

Example 1: Procedural Memory

from rec_praxis_rlm.memory import ProceduralMemory, Experience
from rec_praxis_rlm.config import MemoryConfig

# Initialize memory
config = MemoryConfig(storage_path="./agent_memory.jsonl")
memory = ProceduralMemory(config)

# Store experiences
memory.store(Experience(
    env_features=["web_scraping", "python", "beautifulsoup"],
    goal="extract product prices from e-commerce site",
    action="Used BeautifulSoup with CSS selectors for price elements",
    result="Successfully extracted 1000 prices with 99% accuracy",
    success=True
))

# Recall similar experiences
experiences = memory.recall(
    env_features=["web_scraping", "python"],
    goal="extract data from website",
    top_k=5
)

for exp in experiences:
    print(f"Similarity: {exp.similarity_score:.2f}")
    print(f"Action: {exp.action}")
    print(f"Result: {exp.result}\n")

Example 2: RLM Context for Document Inspection

from rec_praxis_rlm.rlm import RLMContext
from rec_praxis_rlm.config import ReplConfig

# Initialize context
config = ReplConfig()
context = RLMContext(config)

# Add documents
with open("application.log", "r") as f:
    context.add_document("app_log", f.read())

# Search for patterns
matches = context.grep(r"ERROR.*database", doc_id="app_log")
for match in matches:
    print(f"Line {match.line_number}: {match.match_text}")
    print(f"Context: ...{match.context_before}{match.match_text}{match.context_after}...")

# Extract specific ranges
error_section = context.peek("app_log", start_char=1000, end_char=2000)

# Get first/last N lines
first_logs = context.head("app_log", n_lines=10)
recent_logs = context.tail("app_log", n_lines=50)

Example 3: Safe Code Execution

from rec_praxis_rlm.rlm import RLMContext

context = RLMContext()

# Execute safe code
result = context.safe_exec("""
total = 0
for i in range(10):
    total += i * 2
total
""")

if result.success:
    print(f"Output: {result.output}")
    print(f"Execution time: {result.execution_time_seconds:.3f}s")
else:
    print(f"Error: {result.error}")

# Prohibited operations are blocked
result = context.safe_exec("import os; os.system('rm -rf /')")
# Result: ExecutionError - Import statements not allowed

Example 4: Autonomous Planning with DSPy

from rec_praxis_rlm.dspy_agent import PraxisRLMPlanner
from rec_praxis_rlm.memory import ProceduralMemory
from rec_praxis_rlm.config import PlannerConfig, MemoryConfig

# Initialize memory and planner
memory = ProceduralMemory(MemoryConfig())

# Option 1: Programmatic API key (recommended for Groq)
planner = PraxisRLMPlanner(
    memory=memory,
    config=PlannerConfig(
        lm_model="groq/llama-3.3-70b-versatile",
        api_key="gsk-..."  # Pass key directly
    )
)

# Option 2: Environment variables (works for all providers)
# import os
# os.environ["GROQ_API_KEY"] = "gsk-..."
# planner = PraxisRLMPlanner(
#     memory=memory,
#     config=PlannerConfig(lm_model="groq/llama-3.3-70b-versatile")
# )

# Option 3: OpenAI with programmatic key
# planner = PraxisRLMPlanner(
#     memory=memory,
#     config=PlannerConfig(
#         lm_model="openai/gpt-4o-mini",
#         api_key="sk-..."
#     )
# )

# Option 4: OpenRouter with programmatic key
# planner = PraxisRLMPlanner(
#     memory=memory,
#     config=PlannerConfig(
#         lm_model="openrouter/meta-llama/llama-3.2-3b-instruct:free",
#         api_key="sk-or-..."
#     )
# )

# Add context for document inspection
from rec_praxis_rlm.rlm import RLMContext
context = RLMContext()
with open("server.log") as f:
    context.add_document("logs", f.read())
planner.add_context(context, "server_logs")

# Autonomous planning
answer = planner.plan(
    goal="Analyze server errors and suggest fixes",
    env_features=["production", "high_traffic", "database"]
)
print(answer)

Architecture

┌─────────────────────────────────────────┐
│     PraxisRLMPlanner (DSPy ReAct)       │
│   Autonomous decision-making layer      │
├─────────────────┬───────────────────────┤
│                 │                       │
│    Tools        │    Tools              │
│                 │                       │
▼                 ▼                       ▼
┌─────────────┐  ┌──────────────┐  ┌─────────────┐
│ Procedural  │  │  RLMContext  │  │   External  │
│   Memory    │  │   (Facade)   │  │    APIs     │
├─────────────┤  ├──────────────┤  └─────────────┘
│ • recall()  │  │ DocumentStore│
│ • store()   │  │ DocSearcher  │
│ • compact() │  │ CodeExecutor │
├─────────────┤  └──────────────┘
│ Embeddings  │
│ ┌─────────┐ │
│ │ Local   │ │  FAISS Index (optional)
│ │ API     │ │  ┌──────────────┐
│ │ Jaccard │◄─┼──┤ 10-100x      │
│ └─────────┘ │  │ faster search│
└─────────────┘  └──────────────┘
       │
       ▼
  Storage (JSONL)
  • Append-only
  • Versioned
  • Crash-safe
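
The storage layer above can be pictured in a few lines of Python. The field names, `fsync` strategy, and skip-on-parse-error recovery here are illustrative assumptions, not the package's actual schema:

```python
import json
import os
import tempfile

def append_experience(path, record, version=1):
    """Append one versioned experience as a single JSON line (append-only)."""
    record = {"v": version, **record}
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
        f.flush()
        os.fsync(f.fileno())  # crash-safety: flush the line to disk

def load_experiences(path):
    """Load all records, skipping a truncated final line left by a crash."""
    out = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            try:
                out.append(json.loads(line))
            except json.JSONDecodeError:
                continue  # partial trailing write; ignore
    return out
```

Because writes are append-only and each record is one line, a crash can corrupt at most the last line, which the loader detects and drops.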

Performance

| Operation | Without FAISS | With FAISS | Speedup |
|---|---|---|---|
| Recall (100 exp) | ~2ms | ~2ms | 1x |
| Recall (1,000 exp) | ~20ms | ~3ms | 6.7x |
| Recall (10,000 exp) | ~200ms | ~20ms | 10x |
| Recall (100,000 exp) | ~2000ms | ~20ms | 100x |

| Operation | Performance | Notes |
|---|---|---|
| Document grep (10MB) | <500ms | With ReDoS protection |
| Safe code execution | <100ms | Sandboxed environment |
| Memory loading (10k exp) | <1s | With lazy loading |

Supported LLM Providers

For DSPy autonomous planning, rec-praxis-rlm supports any LiteLLM-compatible provider:

Groq (Recommended)

Fast, free API with high rate limits.

import os
os.environ["GROQ_API_KEY"] = "gsk-..."

planner = PraxisRLMPlanner(
    memory=memory,
    config=PlannerConfig(lm_model="groq/llama-3.3-70b-versatile")
)

Available models: llama-3.3-70b-versatile, mixtral-8x7b-32768, gemma2-9b-it

OpenAI

Industry standard with highest quality models.

import os
os.environ["OPENAI_API_KEY"] = "sk-..."

planner = PraxisRLMPlanner(
    memory=memory,
    config=PlannerConfig(lm_model="openai/gpt-4o-mini")
)

Available models: gpt-4o-mini, gpt-4o, gpt-4-turbo, gpt-3.5-turbo

OpenRouter

Access to 200+ models from multiple providers.

import os
os.environ["OPENROUTER_API_KEY"] = "sk-or-..."

planner = PraxisRLMPlanner(
    memory=memory,
    config=PlannerConfig(lm_model="openrouter/meta-llama/llama-3.2-3b-instruct:free")
)

Available models: See OpenRouter models

Other Providers

Any LiteLLM-supported provider works: Anthropic, Cohere, Azure, AWS Bedrock, etc.

# Anthropic Claude
os.environ["ANTHROPIC_API_KEY"] = "sk-ant-..."
planner = PraxisRLMPlanner(
    memory=memory,
    config=PlannerConfig(lm_model="anthropic/claude-3-5-sonnet-20241022")
)

See LiteLLM providers for full list.

Configuration

Memory Configuration

from rec_praxis_rlm.config import MemoryConfig

config = MemoryConfig(
    storage_path="./memory.jsonl",
    top_k=6,                          # Number of experiences to retrieve
    similarity_threshold=0.5,         # Minimum similarity score
    env_weight=0.6,                   # Weight for environmental features
    goal_weight=0.4,                  # Weight for goal similarity
    require_success=False,            # If True, only retrieve successful experiences
    embedding_model="sentence-transformers/all-MiniLM-L6-v2",
    result_size_limit=50000           # Max result size in bytes
)

Configuration Presets (v0.4.3+):

Simplify configuration with task-optimized presets:

# Code review preset (precise, successful experiences only)
config = MemoryConfig.for_code_review()

# Security audit preset (broad, includes false positives for learning)
config = MemoryConfig.for_security_audit()

# Web scraping preset (prioritizes site structure)
config = MemoryConfig.for_web_scraping()

# Test generation preset (high precision for test patterns)
config = MemoryConfig.for_testing()

Preset Comparison:

| Preset | top_k | similarity_threshold | env_weight | goal_weight | require_success | Best For |
|---|---|---|---|---|---|---|
| for_code_review() | 4 | 0.7 (high) | 0.3 | 0.7 | True | Precise code quality patterns |
| for_security_audit() | 8 | 0.4 (low) | 0.5 | 0.5 | False | Diverse vulnerability detection |
| for_web_scraping() | 6 | 0.5 (medium) | 0.7 | 0.3 | True | Site structure similarity |
| for_testing() | 5 | 0.75 (very high) | 0.2 | 0.8 | True | Test coverage patterns |

REPL Configuration

from rec_praxis_rlm.config import ReplConfig

config = ReplConfig(
    max_output_chars=10000,           # Max output capture
    max_search_matches=100,           # Max grep results
    search_context_chars=200,         # Context before/after match
    execution_timeout_seconds=5.0,    # Code execution timeout
    enable_sandbox=True,              # Use sandboxed execution
    log_executions=True,              # Log for audit trail
    allowed_builtins=[                # Allowed built-in functions
        "len", "range", "sum", "max", "min", "sorted", ...
    ]
)

Planner Configuration

from rec_praxis_rlm.config import PlannerConfig

config = PlannerConfig(
    lm_model="openai/gpt-4o-mini",    # Language model
    api_key="sk-...",                  # Optional API key (or use env vars)
    temperature=0.0,                   # Sampling temperature
    max_iters=10,                      # Max ReAct iterations
    enable_mlflow_tracing=True,        # MLflow observability
    optimizer="miprov2",               # DSPy optimizer
    optimizer_auto_level="medium",     # Automation level
    use_toon_adapter=False             # Enable TOON format for 40% token reduction (experimental)
)

TOON Format Support (Experimental):

Enable TOON (Token-Oriented Object Notation) for ~40% token reduction in DSPy prompts:

# Install TOON support
# pip install rec-praxis-rlm[toon]

config = PlannerConfig(
    lm_model="openai/gpt-4o-mini",
    use_toon_adapter=True  # Enable TOON format
)

planner = PraxisRLMPlanner(memory, config)
# All DSPy interactions now use TOON format for efficiency

Benefits:

  • ~40% reduction in prompt tokens (saves API costs)
  • Faster inference (fewer tokens to process)
  • Same accuracy as JSON format

Compatibility: Requires dspy-toon>=0.1.0 (install with pip install rec-praxis-rlm[toon])

Note: TOON support is experimental in v0.4.1. Future versions (v0.6.0+) will integrate TOON into procedural memory storage for further efficiency gains. See Issue #1 for roadmap.

Testing

# Run all tests
pytest

# Run with coverage
pytest --cov=rec_praxis_rlm --cov-report=html

# Run specific test suites
pytest tests/unit/           # Unit tests
pytest tests/integration/    # Integration tests

# Run performance tests
pytest tests/unit/test_memory.py -k "performance"

Current test coverage: 99.38% (327 passing tests)

Security

Sandboxed Code Execution

The SafeExecutor provides multiple layers of security:

  1. AST Validation: Blocks imports, eval, exec, file I/O, network access
  2. Restricted Builtins: Only safe functions allowed (configurable)
  3. Execution Timeout: Prevents infinite loops
  4. Output Limiting: Prevents memory exhaustion
  5. Code Hashing: Audit trail for all executed code

Blocked operations:

  • All imports (import, from ... import)
  • Dangerous builtins (eval, exec, __import__, compile, open)
  • File system access
  • Network access
  • Privileged attributes (__class__, __globals__, __dict__)
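
As a sketch of the AST-validation technique (not the package's actual `SafeExecutor` code), rejecting these operations looks roughly like:

```python
import ast

BANNED_CALLS = {"eval", "exec", "__import__", "compile", "open"}
BANNED_ATTRS = {"__class__", "__globals__", "__dict__"}

def validate(source: str) -> list[str]:
    """Return a list of violations found in the code; empty means it passed."""
    problems = []
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            problems.append("import statements not allowed")
        elif (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id in BANNED_CALLS):
            problems.append(f"call to {node.func.id!r} not allowed")
        elif isinstance(node, ast.Attribute) and node.attr in BANNED_ATTRS:
            problems.append(f"access to {node.attr!r} not allowed")
    return problems
```

Because validation walks the parsed tree rather than scanning text, obfuscated spellings like `im\x70ort` in a string don't slip through, and only code that parses cleanly is ever considered for execution.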

ReDoS Protection

The DocumentSearcher validates regex patterns to prevent Regular Expression Denial of Service attacks:

  • Pattern length limits (<500 chars)
  • Nested quantifier detection ((a+)+)
  • Excessive wildcard detection (>3 instances of .* or .+)
  • Overlapping alternation warnings
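
These checks can be approximated with a small pre-validation pass; the exact thresholds and detection regexes below are illustrative, not the package's implementation:

```python
import re

MAX_PATTERN_LEN = 500

def check_pattern(pattern: str) -> list[str]:
    """Heuristic ReDoS pre-checks; empty list means the pattern looks safe."""
    issues = []
    if len(pattern) >= MAX_PATTERN_LEN:
        issues.append("pattern too long")
    # Nested quantifier such as (a+)+ or (a*)*
    if re.search(r"\([^)]*[+*]\)[+*]", pattern):
        issues.append("nested quantifier")
    # More than 3 unbounded wildcards (.* or .+)
    if len(re.findall(r"\.[+*]", pattern)) > 3:
        issues.append("excessive wildcards")
    return issues
```

A pattern like `(a+)+` is flagged before it ever reaches the regex engine, while a typical log search such as `ERROR.*database` passes untouched.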

Advanced Features

Async Support

import asyncio
from rec_praxis_rlm.memory import ProceduralMemory
from rec_praxis_rlm.rlm import RLMContext

async def main():
    memory = ProceduralMemory(config)
    context = RLMContext(config)

    # Async memory recall
    experiences = await memory.arecall(
        env_features=["python"],
        goal="debug error"
    )

    # Async code execution
    result = await context.asafe_exec("sum(range(1000000))")

asyncio.run(main())

Custom Embedding Providers

from rec_praxis_rlm.embeddings import APIEmbedding
from rec_praxis_rlm.memory import ProceduralMemory

# Use OpenAI embeddings
embedding_provider = APIEmbedding(
    api_provider="openai",
    api_key="sk-...",
    model_name="text-embedding-3-small"
)

memory = ProceduralMemory(
    config,
    embedding_provider=embedding_provider
)

Memory Maintenance

# Compact memory (remove old/low-value experiences)
memory.compact(max_size=1000, min_similarity=0.7)

# Recompute embeddings (after changing embedding model)
new_provider = SentenceTransformerEmbedding("new-model")
memory.recompute_embeddings(new_provider)

Custom Metrics

from rec_praxis_rlm.metrics import memory_retrieval_quality, SemanticF1Score

# Memory retrieval quality metric
score = memory_retrieval_quality(
    example={"env_features": [...], "goal": "...", "expected_success_rate": 0.8},
    prediction=retrieved_experiences
)

# Semantic F1 scoring for DSPy optimization
f1_metric = SemanticF1Score(relevance_threshold=0.7)
score = f1_metric(example, prediction)

MLflow Integration

from rec_praxis_rlm.telemetry import setup_mlflow_tracing

# Enable automatic MLflow tracing
setup_mlflow_tracing(experiment_name="my-agent-experiment")

# All DSPy operations are now traced automatically
planner = PraxisRLMPlanner(memory, config)
result = planner.plan(goal="...", env_features=[...])

# View traces in MLflow UI
# mlflow ui --port 5000

IDE Integrations & Developer Tools

Pre-commit Hooks

Automatically review code, audit security, and scan dependencies before every commit:

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/jmanhype/rec-praxis-rlm
    rev: v0.4.0
    hooks:
      - id: rec-praxis-review      # Code review (fail on HIGH+)
      - id: rec-praxis-audit        # Security audit (fail on CRITICAL)
      - id: rec-praxis-deps         # Dependency & secret scan

Install and run:

pip install pre-commit rec-praxis-rlm[all]
pre-commit install
git commit -m "feat: add new feature"  # Hooks run automatically

CLI Tools

Use rec-praxis-rlm from the command line:

# Code review (human-readable format)
rec-praxis-review src/**/*.py --severity=HIGH

# Code review (JSON for IDE integration)
rec-praxis-review src/**/*.py --severity=HIGH --format=json

# Code review (TOON format for 40% token reduction)
rec-praxis-review src/**/*.py --severity=HIGH --format=toon

# Code review (SARIF for GitHub Security tab)
rec-praxis-review src/**/*.py --format=sarif > code-review.sarif

# Code review (Interactive HTML report for stakeholders)
rec-praxis-review src/**/*.py --format=html --output=security-report.html

# Security audit
rec-praxis-audit app.py --fail-on=CRITICAL --format=sarif > security-audit.sarif

# Dependency & secret scan
rec-praxis-deps --requirements=requirements.txt --files src/config.py --format=sarif > deps.sarif

Output Formats:

  • human (default): Colorful, emoji-rich output for terminal viewing
  • json: Structured JSON for IDE integration and programmatic parsing
  • toon: Token-efficient format providing ~40% token reduction (experimental)
  • sarif: SARIF v2.1.0 format for GitHub Security tab integration (v0.4.3+)
  • html: Interactive HTML reports with charts and filtering (v0.4.4+)

Features:

  • Configurable severity thresholds
  • Persistent procedural memory (learns from past reviews)
  • Exit codes for CI/CD pipelines
  • TOON format support for cost-effective LLM integration

Interactive HTML Reports

Generate beautiful, shareable security reports for stakeholders (v0.4.4+):

Features:

  • Interactive Charts: Severity distribution (pie chart) and OWASP Top 10 breakdown (bar chart)
  • Filterable Tables: Click severity badges to filter findings instantly
  • Detailed Findings: Expandable remediation advice for each issue
  • Print-to-PDF: Built-in print stylesheet for professional PDF export
  • Standalone Files: No external dependencies - share HTML files directly
  • CVE Support: Displays dependency vulnerabilities with upgrade paths

Example Usage:

# Generate HTML report from code review
rec-praxis-review src/**/*.py --format=html --output=security-report.html

# Security audit HTML report
rec-praxis-audit app.py --format=html --output=audit-report.html

# Dependency scan with CVEs
rec-praxis-deps --requirements=requirements.txt --files src/*.py --format=html --output=deps-report.html

Iterative Improvement Mode

Autonomous security improvement mode similar to Qodo-Cover's coverage-driven iteration, but for security quality (v0.5.0+):

Features:

  • Run multiple LLM iterations until quality score target met
  • Each iteration learns from previous failures via procedural memory
  • Auto-suggest fixes for detected issues (prioritizes CRITICAL/HIGH)
  • Re-scan with fixes applied to validate improvement
  • Track quality score progression with visual progress bars
  • Stop when target reached or max iterations hit
  • MLflow integration for tracking improvement metrics

Example Usage:

# Iterative mode with default settings (target: 95%, max: 5 iterations)
rec-praxis-review src/app.py --mode=iterative

# Custom target and iteration limit
rec-praxis-review src/**/*.py \
  --mode=iterative \
  --max-iterations=10 \
  --target-score=98

# With auto-fix suggestions
rec-praxis-review src/**/*.py \
  --mode=iterative \
  --target-score=95 \
  --auto-fix

# With MLflow tracking
rec-praxis-review src/**/*.py \
  --mode=iterative \
  --max-iterations=7 \
  --target-score=90 \
  --auto-fix \
  --mlflow-experiment=iterative-improvement

Quality Score Calculation:

The quality score (0-100) is calculated based on:

  • CRITICAL findings: -10 points each
  • HIGH findings: -5 points each
  • MEDIUM findings: -2 points each
  • LOW findings: -0.5 points each
  • INFO findings: -0.1 points each
  • Normalized by code size (more lines → more tolerant)
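
Under those deductions, the calculation can be sketched as below. The code-size normalization factor (`baseline_lines`) is an assumption for illustration, since the exact scaling formula isn't specified:

```python
PENALTIES = {"CRITICAL": 10.0, "HIGH": 5.0, "MEDIUM": 2.0, "LOW": 0.5, "INFO": 0.1}

def quality_score(finding_severities, total_lines, baseline_lines=500):
    """Score 0-100: subtract per-finding penalties, scaled so that larger
    codebases are more tolerant of the same finding count (illustrative)."""
    deduction = sum(PENALTIES.get(sev, 0.0) for sev in finding_severities)
    tolerance = max(total_lines / baseline_lines, 1.0)
    return max(0.0, 100.0 - deduction / tolerance)
```

For example, one CRITICAL and two HIGH findings in a 500-line module would deduct 10 + 5 + 5 = 20 points, yielding a score of 80.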

Output Example:

🔄 Iterative Improvement Mode
Target: 95% quality score
Max iterations: 5

============================================================
Iteration 1/5
============================================================

📊 Results:
  Quality Score: 72.5%
  Total Findings: 12
  Blocking Findings: 3
  Severity Breakdown:
    CRITICAL: 1
    HIGH: 2
    MEDIUM: 5
    LOW: 4

💡 Suggested Fixes for Next Iteration:

1. Hardcoded Credentials (CRITICAL)
   File: src/app.py:45
   Fix: Use environment variables: os.getenv('API_KEY')

2. SQL Injection Risk (HIGH)
   File: src/db.py:78
   Fix: Use parameterized queries: cursor.execute('SELECT * FROM users WHERE id=?', (user_id,))

🔄 Continuing to iteration 2...
   Current: 72.5% | Target: 95% | Gap: 22.5%

... (iterations 2-4) ...

✅ Target score reached! (96.2% >= 95%)
   Completed in 4 iteration(s)

============================================================
📈 Improvement Summary
============================================================
Initial Score: 72.5%
Final Score: 96.2%
Improvement: +23.7%
Iterations: 4

Progression:
  Iter 1: ████████████████████████████████████ 72.5%
  Iter 2: ███████████████████████████████████████████ 85.0%
  Iter 3: █████████████████████████████████████████████ 91.3%
  Iter 4: ████████████████████████████████████████████████ 96.2%

JSON Output:

{
  "mode": "iterative",
  "iterations": 4,
  "final_score": 96.2,
  "target_score": 95,
  "target_reached": true,
  "total_findings": 2,
  "blocking_findings": 0,
  "iteration_history": [
    {"iteration": 1, "score": 72.5, "total_findings": 12, "blocking_findings": 3},
    {"iteration": 2, "score": 85.0, "total_findings": 6, "blocking_findings": 1},
    {"iteration": 3, "score": 91.3, "total_findings": 4, "blocking_findings": 0},
    {"iteration": 4, "score": 96.2, "total_findings": 2, "blocking_findings": 0}
  ],
  "findings": [...]
}

Use Cases:

  • Autonomous security improvement ("set it and forget it")
  • CI/CD pipelines with quality gates
  • Pre-release security hardening
  • Continuous security posture improvement
  • A/B testing security fix strategies
  • Demonstrating security improvement progress to stakeholders

Report Contents:

  • Summary cards (Total findings, Critical count, High count, Medium/Low count)
  • Severity distribution donut chart (powered by Chart.js)
  • OWASP Top 10 category breakdown bar chart
  • Sortable/filterable findings table with:
    • Severity badges (color-coded)
    • File paths and line numbers
    • Expandable remediation guidance
    • CWE and OWASP categorization
  • CVE vulnerability table (if applicable)
  • Print/Save to PDF button

Use Cases:

  • Share security reports with non-technical stakeholders
  • Archive security scan results for compliance audits
  • Present findings in management reviews
  • Embed in documentation or wikis

MLflow Metrics Tracking

Track security scan metrics over time with MLflow integration (v0.4.4+):

Features:

  • Automatic metrics logging for all scan types
  • Trend analysis and security posture dashboards
  • MTTR (Mean Time To Remediate) tracking
  • False positive rate monitoring
  • LLM cost tracking (tokens, USD estimates)
  • A/B testing for different prompt variants

Example Usage:

# Code review with MLflow tracking
rec-praxis-review src/**/*.py --mlflow-experiment=code-quality

# Security audit with metrics
rec-praxis-audit app.py --mlflow-experiment=security-posture

# Dependency scan with tracking
rec-praxis-deps --requirements=requirements.txt --mlflow-experiment=supply-chain

View MLflow Dashboard:

# Start MLflow UI
mlflow ui --port 5000

# Navigate to http://localhost:5000 to view:
# - Scan duration trends
# - Findings by severity over time
# - OWASP category distribution
# - Cost per scan (token usage)
# - Files scanned per second (performance)

Metrics Logged:

  • <scan_type>.total_findings: Total issues detected
  • <scan_type>.critical_count: Critical severity count
  • <scan_type>.high_count: High severity count
  • <scan_type>.medium_count: Medium severity count
  • <scan_type>.low_count: Low severity count
  • <scan_type>.files_scanned: Number of files analyzed
  • <scan_type>.scan_duration_seconds: Total scan time
  • <scan_type>.llm_tokens_used: LLM tokens consumed
  • <scan_type>.llm_cost_usd: Estimated cost in USD
  • <scan_type>.findings_per_file: Derived metric
  • <scan_type>.files_per_second: Performance metric

Programmatic Usage:

from rec_praxis_rlm.telemetry import (
    setup_mlflow_tracing,
    log_security_scan_metrics,
    log_remediation_metrics
)
import mlflow

# Setup experiment
setup_mlflow_tracing(experiment_name="security-scans")

# Run scan and log metrics
with mlflow.start_run(run_name="scan_2025_01_15"):
    # Your scan logic here...
    findings = agent.review_code(files)

    log_security_scan_metrics(
        findings=findings,
        scan_type="code_review",
        files_scanned=len(files),
        scan_duration_seconds=duration
    )

# Track remediation (MTTR)
log_remediation_metrics(
    issue_id="SEC-123",
    severity="CRITICAL",
    time_to_fix_hours=2.5,
    was_reintroduced=False
)

Use Cases:

  • Monitor security posture over sprints
  • Track mean time to remediation (MTTR)
  • Compare scan performance across code versions
  • Optimize LLM costs with token tracking
  • A/B test different review prompts
  • Generate compliance reports

VS Code Extension

Install the "rec-praxis-rlm Code Intelligence" extension from the VS Code Marketplace, or build from source:

Repository: github.com/jmanhype/rec-praxis-rlm-vscode

Features:

  • Inline Diagnostics: See code review and security findings as you type
  • Context Menu: Right-click to review/audit current file
  • Auto-review on Save: Real-time feedback (configurable)
  • Dependency Scanning: Right-click requirements.txt to scan for CVEs
  • Procedural Memory Integration: Learns from past fixes across sessions

Settings (F1 → "Preferences: Open Settings (JSON)"):

{
  "rec-praxis-rlm.pythonPath": "python",
  "rec-praxis-rlm.codeReview.severity": "HIGH",
  "rec-praxis-rlm.securityAudit.failOn": "CRITICAL",
  "rec-praxis-rlm.enableDiagnostics": true,
  "rec-praxis-rlm.autoReviewOnSave": false
}

Installation:

# From VS Code
# 1. Open Extensions (Ctrl+Shift+X / Cmd+Shift+X)
# 2. Search for "rec-praxis-rlm"
# 3. Click Install

# From source (for developers)
git clone https://github.com/jmanhype/rec-praxis-rlm-vscode.git
cd rec-praxis-rlm-vscode
npm install && npm run compile
npm run package  # Creates .vsix file
# Install .vsix via VS Code: Extensions → ... → Install from VSIX

See the VS Code extension repository for full documentation.

GitHub Actions

Automatically scan pull requests for security issues:

# .github/workflows/rec-praxis-scan.yml
name: Security Scan

on: [pull_request]

jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install rec-praxis-rlm[all]
      - run: rec-praxis-review $(git diff --name-only --diff-filter=ACMR origin/main...HEAD | grep '\.py$')
      - run: rec-praxis-audit $(git diff --name-only --diff-filter=ACMR origin/main...HEAD | grep '\.py$')
      - run: rec-praxis-deps --requirements=requirements.txt --fail-on=CRITICAL

Features:

  • Automatic PR comments with findings
  • Artifact uploads for review results
  • Configurable severity thresholds
  • Supports matrix builds (Python 3.10+)
  • Dogfooding: This repo uses its own tools to scan the examples/ directory on every push

Dogfooding Workflow:

The rec-praxis-rlm project dogfoods its own tools by scanning the examples/ directory on every push to main:

# .github/workflows/rec-praxis-scan.yml (dogfood-examples job)
dogfood-examples:
  name: Dogfood on Examples
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-python@v5
    - run: pip install -e .[all]  # Install from source
    - run: rec-praxis-review examples/*.py --severity=MEDIUM --json
    - run: rec-praxis-audit examples/*.py --fail-on=HIGH --json
    - run: rec-praxis-deps --requirements=requirements.txt --files examples/*.py

This demonstrates:

  • Self-validation: The tools scan themselves for quality issues
  • Real-world usage: Shows the tools working on production code
  • Continuous improvement: Catches regressions in example code
  • Non-blocking: Uses continue-on-error: true to show findings without failing CI

View dogfooding results in the GitHub Actions artifacts.

See .github/workflows/rec-praxis-scan.yml for the full workflow implementation.

PR-Agent Style Integration

Post security findings as inline GitHub PR comments, similar to PR-Agent (v0.5.0+):

Features:

  • Inline review comments on specific lines with security findings
  • Summary comment with severity breakdown and top issues
  • Automatic PR comment posting via GitHub API
  • Supports dry-run mode for testing
  • Up to 20 inline comments to avoid spam
  • Emoji-coded severity levels (🔴 CRITICAL, 🟠 HIGH, 🟡 MEDIUM, 🟢 LOW)

Installation:

pip install rec-praxis-rlm[github]

CLI Usage:

# Post findings to PR #123
rec-praxis-pr-review src/**/*.py \
  --pr-number=123 \
  --repo=owner/repo \
  --severity=HIGH

# Dry run (show what would be posted)
rec-praxis-pr-review src/**/*.py \
  --pr-number=123 \
  --repo=owner/repo \
  --severity=HIGH \
  --dry-run

# Custom commit SHA
rec-praxis-pr-review src/**/*.py \
  --pr-number=123 \
  --repo=owner/repo \
  --commit-sha=abc123def456

GitHub Actions Workflow:

# .github/workflows/pr-security-review.yml
name: PR Security Review

on:
  pull_request:
    types: [opened, synchronize]

jobs:
  security-review:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write  # Required for posting comments

    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # Full history for git diff

      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install rec-praxis-rlm
        run: pip install rec-praxis-rlm[github]

      - name: Find changed Python files
        id: changed-files
        run: |
          CHANGED_FILES=$(git diff --name-only --diff-filter=ACMR ${{ github.event.pull_request.base.sha }}...${{ github.event.pull_request.head.sha }} | grep '\.py$' || echo "")
          echo "files=$CHANGED_FILES" >> $GITHUB_OUTPUT

      - name: Post security findings
        if: steps.changed-files.outputs.files != ''
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          rec-praxis-pr-review ${{ steps.changed-files.outputs.files }} \
            --pr-number=${{ github.event.pull_request.number }} \
            --repo=${{ github.repository }} \
            --severity=HIGH

Example PR Comment Output:

Summary Comment:

## 🔒 rec-praxis-rlm Security Scan Results

**Found 5 issue(s) at HIGH+ severity**

### Severity Breakdown
- 🔴 **CRITICAL**: 1
- 🟠 **HIGH**: 2
- 🟡 **MEDIUM**: 2

### Top Issues

1. 🔴 **Hardcoded Credentials** (CRITICAL)
   - File: `src/app.py:45`
   - Hardcoded API key found in source code

2. 🟠 **SQL Injection Risk** (HIGH)
   - File: `src/db.py:78`
   - Potential SQL injection: String concatenation in SQL execute()

...

---
*Powered by [rec-praxis-rlm](https://github.com/jmanhype/rec-praxis-rlm)*

Inline Review Comments (posted on specific lines):

🔴 **CRITICAL: Hardcoded Credentials**

Hardcoded API key found in source code

**Remediation:**
Use environment variables: os.getenv('API_KEY') or configuration files excluded from version control

Use Cases:

  • Shift-left security: Catch issues before merge
  • Code review automation for security teams
  • Educational tool for developers (learn from inline comments)
  • Compliance enforcement (block PRs with CRITICAL findings)
  • Reduce manual security review workload

Rate Limiting:

  • Maximum 20 inline comments per scan (most critical issues prioritized)
  • Prevents GitHub API rate limit issues
  • All findings still shown in summary comment

Test Generation Agent

Automatically generate pytest tests for uncovered code paths, inspired by Qodo-Cover (v0.5.0+):

Features:

  • Analyzes coverage.py reports to identify uncovered code regions
  • Generates pytest tests targeting specific uncovered functions
  • Uses procedural memory to learn from successful test patterns
  • Validates generated tests execute and pass
  • Supports iterative test generation until coverage target met
  • Groups consecutive uncovered lines into meaningful test targets
  • Provides estimated coverage gain for each generated test

Installation:

pip install rec-praxis-rlm[test-generation]
# Or install the dev extras, which already include coverage
pip install rec-praxis-rlm[dev]

Prerequisites:

# First, run your tests with coverage to generate .coverage file
pytest --cov=your_package --cov-report=term tests/

CLI Usage:

# Generate tests for all uncovered code (default target: 90%)
rec-praxis-generate-tests

# Target specific files
rec-praxis-generate-tests src/app.py src/utils.py

# Custom coverage target
rec-praxis-generate-tests --target-coverage=95

# Limit number of tests generated
rec-praxis-generate-tests --max-tests=5

# Dry run (show what would be generated)
rec-praxis-generate-tests --dry-run

# Generate and validate tests
rec-praxis-generate-tests --validate

# Custom test directory
rec-praxis-generate-tests --test-dir=integration_tests

# JSON output for automation
rec-praxis-generate-tests --format=json

Example Output:

🧪 Test Generation Agent v0.5.0
============================================================

Current coverage: 78.3%
Target coverage: 90.0%
Found 12 uncovered regions

Generated test for calculate_discount at src/pricing.py:45
Generated test for validate_email at src/utils.py:89
Generated test for process_payment at src/payments.py:123

============================================================
📝 Generated 3 test(s)
============================================================

1. Test for calculate_discount at lines 45-52
   Target: calculate_discount in src/pricing.py
   Test file: tests/test_pricing.py
   Estimated coverage gain: 8.0 lines

   ✅ Created tests/test_pricing.py

2. Test for validate_email at lines 89-96
   Target: validate_email in src/utils.py
   Test file: tests/test_utils.py
   Estimated coverage gain: 8.0 lines

   ✅ Appended to tests/test_utils.py

3. Test for process_payment at lines 123-135
   Target: process_payment in src/payments.py
   Test file: tests/test_payments.py
   Estimated coverage gain: 13.0 lines

   ✅ Created tests/test_payments.py

Generated Test Example:

"""Auto-generated test for calculate_discount."""
import pytest
from pricing import calculate_discount


def test_calculate_discount_basic():
    """Test calculate_discount with basic inputs."""
    # TODO: Add appropriate test cases
    # Generated for uncovered lines 45-52
    pass

How It Works:

  1. Coverage Analysis: Parses .coverage file to identify uncovered lines
  2. Region Grouping: Groups consecutive uncovered lines into logical regions
  3. AST Parsing: Extracts function/class context for each uncovered region
  4. Test Generation: Creates pytest test stubs targeting specific functions
  5. Memory Learning: Stores successful test patterns in procedural memory
  6. Validation: Optionally runs pytest to verify tests execute
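Steps 1 and 2 above can be sketched with the public coverage.py API. The function names below (`uncovered_regions`, `group_consecutive`) are illustrative, not the package's actual internals; the sketch assumes a `.coverage` data file produced by a prior pytest run.

```python
def uncovered_regions(data_file=".coverage"):
    """Step 1: parse the coverage data file and map each measured
    source file to its grouped uncovered line ranges."""
    import coverage  # requires the coverage.py package

    cov = coverage.Coverage(data_file=data_file)
    cov.load()
    regions = {}
    for path in cov.get_data().measured_files():
        # analysis() returns (filename, statements, missing, missing_str);
        # "missing" is the executable lines never hit by the tests.
        _, _, missing, _ = cov.analysis(path)
        regions[path] = group_consecutive(sorted(missing))
    return regions


def group_consecutive(lines):
    """Step 2: group [45, 46, 47, 89, 90] into [(45, 47), (89, 90)]
    so each range becomes one logical test target."""
    groups = []
    for n in lines:
        if groups and n == groups[-1][1] + 1:
            groups[-1] = (groups[-1][0], n)
        else:
            groups.append((n, n))
    return groups
```

Grouping consecutive lines is what turns raw line numbers into meaningful targets like "lines 45-52 of calculate_discount", and the range width gives the estimated coverage gain reported per test.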

Procedural Memory Integration:

The agent learns from successful test patterns:

  • Stores test structures that pass validation
  • Recalls similar tests for similar code patterns
  • Improves test quality over time through experience

Use Cases:

  • Increase test coverage before releases
  • Generate test scaffolding for new code
  • Find untested code paths in legacy projects
  • Close the loop: detect issues → generate tests → prevent regression
  • Save developer time on boilerplate test writing

Iterative Test Generation:

# Generate tests iteratively until 95% coverage reached
# $NF picks the final "Cover" column, which stays last even when branch-coverage columns are added
while [ $(pytest --cov=src --cov-report=term | grep TOTAL | awk '{print $NF}' | sed 's/%//') -lt 95 ]; do
  rec-praxis-generate-tests --max-tests=3 --validate
  pytest --cov=src --cov-report=term
done

Programmatic API:

from rec_praxis_rlm.agents import TestGenerationAgent
from pathlib import Path

# Initialize agent
agent = TestGenerationAgent(
    memory_path=".rec-praxis-rlm/test_generation_memory.jsonl",
    coverage_data_file=".coverage",
    test_dir="tests"
)

# Analyze coverage
analysis = agent.analyze_coverage(source_files=["src/app.py"])
print(f"Current coverage: {analysis.total_coverage:.1f}%")
print(f"Uncovered regions: {len(analysis.uncovered_regions)}")

# Generate tests
generated_tests = agent.generate_tests_for_coverage_gap(
    target_coverage=90.0,
    max_tests=10,
    source_files=["src/app.py"]
)

for test in generated_tests:
    print(f"Generated: {test.description}")
    print(f"Test file: {test.test_file_path}")

    # Validate test
    success, message = agent.validate_test(test)
    if success:
        print(f"✅ {message}")
    else:
        print(f"❌ {message}")

Limitations (MVP):

The current implementation provides a foundation for test generation:

  • Template-based generation: Uses simple templates for test stubs (TODO: LLM-based generation with DSPy in future)
  • Manual refinement required: Generated tests need developer input for assertions
  • Function-level targeting: Focuses on uncovered functions (TODO: branch coverage in future)
  • Python/pytest only: Currently supports Python projects with pytest

Roadmap (Future Versions):

  • v0.6.0: DSPy-based intelligent test generation with assertions
  • v0.7.0: Branch coverage analysis and conditional test generation
  • v0.8.0: Property-based testing with Hypothesis integration
  • v0.9.0: Multi-language support (JavaScript/TypeScript, Go, Rust)

Integration with Qodo AI Workflow:

# 1. Run security scan
rec-praxis-audit src/**/*.py --severity=HIGH

# 2. Generate tests for untested security-critical code
rec-praxis-generate-tests src/auth.py src/crypto.py --validate

# 3. Run tests to verify coverage increase
pytest --cov=src --cov-report=term

# 4. Iterate until both security and coverage targets met
rec-praxis-review src/**/*.py --mode=iterative --target-score=95

Examples

See the examples/ directory for complete examples:

  • quickstart.py - Basic memory and context usage
  • log_analyzer.py - Log analysis with RLM context
  • web_agent.py - Web scraping agent with procedural memory
  • optimization.py - DSPy MIPROv2 optimizer usage
  • code_review_agent.py - Intelligent code review with procedural memory
  • security_audit_agent.py - OWASP-based security auditing
  • dependency_scan_agent.py - CVE detection and secret scanning

Community & Contributing

Contributing Guidelines

We welcome contributions from the community! Here's how you can help:

Ways to Contribute:

  • Report bugs via GitHub Issues
  • Propose features or improvements
  • Improve documentation
  • Submit bug fixes or new features via pull requests
  • Star the repository to show support
  • Join discussions in GitHub Discussions

Development Setup:

# Clone and install in development mode
git clone https://github.com/jmanhype/rec-praxis-rlm.git
cd rec-praxis-rlm
pip install -e .[dev]

# Run tests
pytest --cov=rec_praxis_rlm

# Run linters
ruff check .
black --check .
mypy rec_praxis_rlm

# Run security audit on your changes
bandit -r rec_praxis_rlm

Pull Request Process:

  1. Fork the repository and create a feature branch
  2. Write tests for new functionality
  3. Ensure all tests pass (pytest)
  4. Run linters (ruff, black, mypy)
  5. Update documentation as needed
  6. Submit PR with clear description of changes

See CONTRIBUTING.md for detailed guidelines.

Code of Conduct

We are committed to providing a welcoming and inclusive environment. Please:

  • Be respectful and considerate
  • Welcome newcomers and help them get started
  • Focus on constructive feedback
  • Report unacceptable behavior to the maintainers

Getting Help

Recognition

Contributors are recognized in our CONTRIBUTORS.md file. Thank you to all who have helped improve rec-praxis-rlm!

License

MIT License - see LICENSE for details.

Citation

If you use rec-praxis-rlm in your research, please cite:

@software{rec_praxis_rlm,
  title = {rec-praxis-rlm: Procedural Memory and REPL Context for Autonomous Agents},
  author = {jmanhype},
  year = {2025},
  url = {https://github.com/jmanhype/rec-praxis-rlm}
}

Acknowledgments

Support

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

rec_praxis_rlm-0.9.2.tar.gz (174.3 kB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

rec_praxis_rlm-0.9.2-py3-none-any.whl (135.7 kB)

Uploaded Python 3

File details

Details for the file rec_praxis_rlm-0.9.2.tar.gz.

File metadata

  • Download URL: rec_praxis_rlm-0.9.2.tar.gz
  • Upload date:
  • Size: 174.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.14

File hashes

Hashes for rec_praxis_rlm-0.9.2.tar.gz:

  • SHA256: 9c7190a3a5210178d92f34ff5a33d4c3340fc94ecbd81a30b3cef4079b598434
  • MD5: ca3da9969b8f47c46ae507a8bb64cfae
  • BLAKE2b-256: 4bcdee6f643e5702128b36ccccb28fc6abae244d5f0e9b12238942ff45e7fba5


File details

Details for the file rec_praxis_rlm-0.9.2-py3-none-any.whl.

File metadata

  • Download URL: rec_praxis_rlm-0.9.2-py3-none-any.whl
  • Upload date:
  • Size: 135.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.14

File hashes

Hashes for rec_praxis_rlm-0.9.2-py3-none-any.whl:

  • SHA256: 957ef3e86a41b75a2bdc0125df052502f8efa482b13548479a9410c79115c539
  • MD5: 261b5f94ba99ca840378f4de6469e53f
  • BLAKE2b-256: 117998eed9be68ec490068227163b988dfefd8d2f5260dfb22b7795642bea5a3

