
LUCID

Linguistic Understanding, Classification, Identification & Defense — See through the text.

Offline-first AI content detection and humanization engine for LaTeX, Markdown, and plain text documents. Runs entirely on local hardware via Ollama for LLM inference and ONNX-optimized models for detection/evaluation.

Features

  • AI Detection — RoBERTa classifier + statistical features + ensemble scoring
  • Humanization — Ollama-powered paraphrasing with adversarial refinement loop
  • Semantic Evaluation — MiniLM embedding similarity, DeBERTa NLI entailment, BERTScore quality
  • Format-Preserving — LaTeX byte-position reconstruction, Markdown line-range replacement, plain text paragraph segmentation
  • Checkpoint/Resume — JSON checkpoints after each chunk, resume interrupted runs
  • Batch Processing — Process entire directories of documents
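The "format-preserving" idea above can be sketched in a few lines: each detected chunk remembers the byte span it came from, and replacements are spliced back in so every untouched byte of the source survives verbatim. This is a minimal illustration, not LUCID's actual reconstructor (which lives in `src/lucid/reconstructor/`); the function name and tuple layout are hypothetical.

```python
# Hypothetical sketch of byte-position reconstruction. Replacements are
# applied in reverse offset order so earlier spans' offsets stay valid.

def reconstruct(source: bytes, replacements: list[tuple[int, int, bytes]]) -> bytes:
    """Apply (start, end, new_text) byte-span replacements to source."""
    out = source
    for start, end, new_text in sorted(replacements, reverse=True):
        out = out[:start] + new_text + out[end:]
    return out
```

Because only the listed spans change, LaTeX preambles, comments, and math environments outside the detected chunks come back byte-identical.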

Requirements

  • Python 3.12+
  • Ollama running locally (for humanization)
  • 16GB RAM minimum (32GB recommended for quality profile)
  • macOS (Apple Silicon optimized) or Linux x86-64

Installation

# Clone and install with uv
git clone https://github.com/AetherForge/lucid.git
cd lucid
uv sync

# Verify installation
uv run lucid --version

First-Run Setup

# Guided setup — checks Ollama, downloads models
uv run lucid setup

# Setup for a specific profile
uv run lucid setup --profile quality

Quick Start

# Check model availability
uv run lucid models

# Download missing models
uv run lucid models --download

# Detect AI content in a document
uv run lucid detect paper.tex
uv run lucid detect paper.tex --output-format json

# Run full pipeline (detect → humanize → evaluate → reconstruct)
uv run lucid pipeline paper.tex -o paper_humanized.tex

# Humanize a document directly
uv run lucid humanize paper.tex -o paper_humanized.tex

# Process a directory of documents
uv run lucid detect ./papers/

# View current configuration
uv run lucid config

CLI Reference

lucid [OPTIONS] COMMAND [ARGS]...

Global Options:
  --profile [fast|balanced|quality]  Quality profile
  --config PATH                      Custom config TOML file
  -v, --verbose                      Verbose output
  -q, --quiet                        Suppress all output
  --version                          Show version

Commands:
  detect     Detect AI-generated content in a document
  humanize   Humanize AI-detected content in a document
  pipeline   Full detect → humanize → evaluate pipeline
  config     View or modify configuration
  models     Check or download required models
  setup      First-run setup: check Ollama, download models

detect

lucid detect <INPUT> [OPTIONS]
  --output-format [json|text]   Report format (default: text)
  --threshold FLOAT             Detection threshold override
  -o, --output PATH             Write report to file

humanize

lucid humanize <INPUT> [OPTIONS]
  -o, --output PATH                  Output file path
  --model TEXT                       Override Ollama model tag
  --adversarial / --no-adversarial   Enable adversarial loop (default: on)
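The adversarial loop behind the `--adversarial` flag can be pictured as a simple fixed point: re-paraphrase until the detector score drops below a threshold or the iteration budget runs out. The sketch below is a guess at the control flow; `paraphrase` and `detect` stand in for the Ollama call and the ensemble detector, and the names are hypothetical.

```python
from typing import Callable

def adversarial_humanize(
    text: str,
    paraphrase: Callable[[str], str],  # stand-in for the Ollama paraphraser
    detect: Callable[[str], float],    # AI-likelihood score in [0, 1]
    threshold: float = 0.5,
    max_iters: int = 3,
) -> str:
    """Re-paraphrase until the detector score drops below threshold."""
    for _ in range(max_iters):
        if detect(text) < threshold:
            break
        text = paraphrase(text)
    return text
```

The `max_iters` budget corresponds to the per-profile "Adversarial iterations" setting shown in the profile comparison below.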

pipeline

lucid pipeline <INPUT> [OPTIONS]
  -o, --output PATH                  Output file path
  --report PATH                      Write report file
  --output-format [json|text|annotated]  Report format (default: json)
  --resume / --no-resume             Resume from checkpoint (default: on)
  --checkpoint-dir PATH              Checkpoint directory

setup

lucid setup [OPTIONS]
  --profile [fast|balanced|quality]   Profile to set up (default: balanced)

Configuration

LUCID uses TOML configuration with three built-in profiles:

Profile    Model Size  Speed     Quality  Use Case
fast       3B          Fastest   Good     Quick passes, drafts
balanced   7B          Moderate  Better   Default for most documents
quality    14B+        Slow      Best     Final submissions

# View config
uv run lucid config

# Override settings
uv run lucid config --set detection.use_binoculars true

Configuration files: config/default.toml, config/profiles/.

Model Recommendations

Profile    Default Model  Size    RAM Required  License
fast       phi3:3.8b      2.4GB   8GB           MIT
balanced   qwen2.5:7b     4.5GB   12GB          Apache 2.0
quality    llama3.1:8b    4.9GB   16GB          Meta Community

Profile Comparison

Feature                  fast   balanced  quality
Statistical detection    No     Yes       Yes
Binoculars (Tier 3)      No     No        Yes
Adversarial iterations   1      3         5
LaTeX validation         No     Yes       Yes
Embedding threshold      0.75   0.80      0.85
BERTScore threshold      0.82   0.88      0.90
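The two evaluation thresholds in the table act as an acceptance gate: a paraphrase is kept only if it clears both the embedding-similarity and BERTScore floors for the active profile. A sketch of that gate, using the table's values (the conjunctive both-must-pass logic is an assumption):

```python
def passes_evaluation(embedding_sim: float, bertscore_f1: float,
                      profile: str = "balanced") -> bool:
    """Accept a paraphrase only if both semantic scores clear the
    profile's floors (values from the profile-comparison table)."""
    thresholds = {
        "fast":     (0.75, 0.82),
        "balanced": (0.80, 0.88),
        "quality":  (0.85, 0.90),
    }
    emb_min, bert_min = thresholds[profile]
    return embedding_sim >= emb_min and bertscore_f1 >= bert_min
```

A rewrite that evades the detector but fails this gate would be discarded rather than written into the output.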

Web UI

LUCID includes an optional Gradio web interface for browser-based detection and humanization.

# Install web extras
uv sync --extra web

# Launch web UI
uv run lucid-web

The web UI provides two tabs: Detect (upload and analyze documents) and Full Pipeline (detect, humanize, and download results).

Architecture

Input Document
    │
    ▼
┌─────────┐     ┌──────────┐     ┌────────────┐     ┌───────────┐     ┌──────────────┐
│  Parser  │────▶│ Detector │────▶│ Humanizer  │────▶│ Evaluator │────▶│Reconstructor │
│          │     │          │     │            │     │           │     │              │
│ LaTeX    │     │ RoBERTa  │     │ Ollama LLM │     │ MiniLM    │     │ Position-    │
│ Markdown │     │ Stats    │     │ Adversarial│     │ DeBERTa   │     │ based        │
│ Plain    │     │ Ensemble │     │ Loop       │     │ BERTScore │     │ Replacement  │
└─────────┘     └──────────┘     └────────────┘     └───────────┘     └──────────────┘
    │                                                                         │
    └─────────────── Checkpoint after each chunk ─────────────────────────────┘

Project Structure

src/lucid/
├── cli.py              # Click CLI interface
├── pipeline.py         # Pipeline orchestrator
├── checkpoint.py       # Checkpoint/resume system
├── progress.py         # Rich progress reporting
├── output.py           # Output formatting (JSON, text, annotated)
├── config.py           # TOML config with profile merging
├── parser/             # Document parsers (LaTeX, Markdown, plain text)
├── detector/           # AI detection (RoBERTa, statistical, ensemble)
├── humanizer/          # Ollama paraphrasing with adversarial refinement
├── evaluator/          # Semantic evaluation (embedding, NLI, BERTScore)
├── reconstructor/      # Format-preserving document reconstruction
└── models/
    ├── manager.py      # Model lifecycle management
    ├── download.py     # Model availability checker and downloader
    └── results.py      # Result dataclasses

Benchmarks

Metric                      Target
Detection TPR (AI text)     >85% at 5% FPR
Evasion rate (single-pass)  >70%
Evasion rate (adversarial)  >85%
Semantic similarity         >0.85 embedding, >0.88 BERTScore
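"TPR at 5% FPR" means: pick the detection threshold so that at most 5% of human-written texts are flagged, then measure what fraction of AI texts are still caught. One way to compute that metric (a generic sketch, not necessarily how the benchmark suite does it):

```python
def tpr_at_fpr(ai_scores: list[float], human_scores: list[float],
               max_fpr: float = 0.05) -> float:
    """True-positive rate at the tightest threshold whose FPR <= max_fpr."""
    allowed_fp = int(max_fpr * len(human_scores))
    human_sorted = sorted(human_scores, reverse=True)
    # Scores strictly above this value flag at most allowed_fp humans.
    threshold = human_sorted[allowed_fp] if allowed_fp < len(human_sorted) else 0.0
    return sum(s > threshold for s in ai_scores) / len(ai_scores)
```

With 20 human samples and `max_fpr=0.05`, at most one human text may score above the chosen threshold.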

Run benchmarks: uv run pytest tests/benchmarks/ -m benchmark -v

Full results: docs/benchmarks/

Development

# Install with dev dependencies
uv sync --extra dev

# Run unit tests
uv run pytest

# Run integration tests
uv run pytest -m integration

# Run all tests
uv run pytest -m ""

# Lint
uv run ruff check src/ tests/

# Run example scripts
uv run python examples/detect_latex.py tests/corpus/latex/simple.tex
uv run python examples/full_pipeline.py tests/corpus/markdown/simple.md

# Type check
uv run mypy src/lucid/

License

MIT — See LICENSE for details.

See RESPONSIBLE_USE.md for the ethical framework and responsible use policy.
