
A programming language type-checker enforcing structural honesty (Phase 2 prototype).


Furqan


A programming-language type-checker that enforces structural honesty at compile time. Furqan rejects code shapes that promise more than the program can actually deliver, before the program ever runs.

What this repository is

A standalone Python type-checker that verifies a minimal subset of the Furqan surface syntax against the paper's structural-honesty primitives. Phase 2 (this repository) is the prototype type-checker: not a full compiler, not a runtime, not an LLM. It demonstrates that the Furqan thesis paper's compile-time rules are mechanically implementable.

The thesis claim under test: a meaningful fraction of code-level "AI hallucination" has the same shape as a long-known software defect, namely promising a complete answer about an input the program cannot fully process. That shape can be made structurally uncompilable by the type system, in milliseconds, with no model in the loop.

Why this matters

When a function declares it returns a complete answer (Integrity) but the input may be unreadable (an encrypted PDF, a partial scan, a missing field), the function should be required to either rule out the unreadability before promising completeness, or return a populated incomplete result with a reason and a confidence bound. Most languages do not enforce this. Furqan does.
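The shape described above can be sketched in plain Python. This is an illustration only, not Furqan syntax and not part of the furqan package: the names Complete, Incomplete, summarize, and max_confidence are hypothetical, and Python itself does not enforce the rule; Furqan's checker does so at compile time.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Complete:
    """A result that claims to cover the whole input."""
    summary: str


@dataclass(frozen=True)
class Incomplete:
    """A partial result that states why it is partial and how far to trust it."""
    partial_summary: str
    reason: str
    max_confidence: float  # hypothetical upper bound on the caller's trust


def summarize(text: str, encrypted: bool) -> Union[Complete, Incomplete]:
    # Guard first: rule out unreadability before promising completeness.
    if encrypted:
        return Incomplete(
            partial_summary="(no readable text)",
            reason="input is encrypted; contents could not be read",
            max_confidence=0.0,
        )
    # Only on the guarded path may the function claim a complete answer.
    return Complete(summary=text[:80])
```

The unguarded variant, a function that always returns Complete regardless of whether the input was readable, is the shape the scan-incomplete primitive is meant to reject.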

The discipline is a static, syntactic check. It runs in sub-millisecond time. It produces diagnostics that name the rule violated, the line it occurred on, and the minimal fix. It has zero runtime dependencies and no model in the loop.

Definition: "minimal fix". A diagnostic's minimal_fix field is the smallest source edit that satisfies the checker that fired the diagnostic. It is local to that one checker. Other checkers may fire on the result. Future-checker compatibility is not promised by minimal_fix and cannot be: checkers are pure, additive, and may detect violations the original code did not expose. A stronger definition ("minimal under AST node delta, with a proof that no other checker fires on the result") is the subject of QUESTIONS.md Q3 and is not yet operational. Treat minimal_fix as the suggested first edit to make the diagnostic go away, not as a guaranteed convergence path.

Status

| # | Primitive | Module | Status |
|---|---|---|---|
| 1 | Bismillah scope | checker/bismillah.py | Shipped (Session 1.0, v0.1.0) |
| 2 | Zahir / batin | checker/zahir_batin.py | Shipped (Session 1.3, v0.2.0) |
| 3 | Additive-only modules | checker/additive.py | Shipped (Session 1.4, v0.3.0) |
| 4 | Scan-incomplete | checker/incomplete.py | Shipped (Session 1.5, v0.4.0) |
| 5 | Mizan calibration | checker/mizan.py | Shipped (Session 1.6, v0.5.0) |
| 6 | Tanzil build ordering | checker/tanzil.py | Shipped (Session 1.7, v0.6.0) |
| 7 | Ring-close | checker/ring_close.py | Shipped (Session 1.8, v0.7.0) |
| + | Status-coverage (D11) | checker/status_coverage.py | Shipped (Session 1.10, v0.8.0) |
| + | Return-type match (D22) | checker/return_type_match.py | Shipped (Session 1.11, v0.8.1) |
| + | All-paths-return (D24) | checker/all_paths_return.py | Shipped (Session 1.12, v0.8.2) |
| + | CLI entry point | __main__.py | Shipped (Session 1.13, v0.8.3) |
| + | Multi-module graph (D9/D20) | project.py | Shipped (Session 1.14, v0.10.0) |
| + | Cross-module type resolution (D23) | checker/ring_close.py (Project) | Shipped (Session 1.15, v0.10.1) |

All seven primitives have shipped and the ring is closed; the multi-module graph (D9 / D20) and cross-module type resolution (D23) are shipped on top. Each row corresponds to a single closing HANDOFF.md block; each version corresponds to a CHANGELOG.md minor-version bump that registers the source-language additions.

Verified state

  • 539 tests passing in ~3 seconds on Python 3.10+
  • Zero runtime dependencies, Python standard library only
  • Public surface 42 / 38 / 4 (parser / checker / errors), plus furqan.Project for multi-module analysis (D9/D20). Additive-only invariant held since v0.1.0
  • Eight sessions, eight LLM-cross-collaborator closing audits, zero findings within that protocol (Anthropic Claude, xAI Grok, Perplexity Computer). See QUESTIONS.md for open interpretive questions registered from human audit, including one (Q5) on the limits of LLM cross-verification itself

Resource limits

The parser accepts up to MAX_NESTING_DEPTH = 200 levels of nested if blocks per function. Deeper inputs produce a structured ParseError with exit code 2 (PARSE ERROR), not a Python RecursionError traceback. The constant is importable from the package surface:

from furqan.parser import MAX_NESTING_DEPTH  # 200

This closes the first step of QUESTIONS.md Q10. The iterative-parser refactor that would make the limit memory-bound rather than stack-bound remains open.
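The idea of a depth guard in a recursive-descent parser can be sketched as follows. This is a toy parser over nested brace tokens, not Furqan's actual parser: only the value 200 comes from the text above, while parse_block, the token format, and this ParseError class are hypothetical stand-ins.

```python
MAX_NESTING_DEPTH = 200  # same value as stated above; the rest is illustrative


class ParseError(Exception):
    """Structured parse failure carrying a message and a token position."""

    def __init__(self, message: str, position: int) -> None:
        super().__init__(message)
        self.message = message
        self.position = position


def parse_block(tokens: list[str], pos: int = 0, depth: int = 0) -> int:
    """Parse zero or more nested '{ ... }' groups; return the position after them."""
    # Check the depth before recursing further, so a deep input yields a
    # structured error instead of exhausting the Python call stack.
    if depth > MAX_NESTING_DEPTH:
        raise ParseError(f"nesting exceeds {MAX_NESTING_DEPTH} levels", pos)
    while pos < len(tokens) and tokens[pos] == "{":
        pos = parse_block(tokens, pos + 1, depth + 1)
        if pos >= len(tokens) or tokens[pos] != "}":
            raise ParseError("expected '}'", pos)
        pos += 1
    return pos
```

Because the guard is still recursive, the limit remains bounded by the interpreter's stack; that is the gap an iterative parser would close.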

Quickstart

git clone https://github.com/BayyinahEnterprise/furqan-programming-language.git
cd furqan-programming-language
pip install -e .
python -m pytest        # 539 passing in ~3s

The library:

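from furqan.parser import parse
from furqan.checker import check_incomplete

# Read the Furqan source and parse it into a Module AST.
with open("scan_pdf.fqn") as f:
    source = f.read()
module = parse(source)

# Run the scan-incomplete checker over the parsed module.
diagnostics = check_incomplete(module)

for d in diagnostics:
    print(d.diagnosis)    # the rule that was violated
    print(d.location)     # where it occurred
    print(d.minimal_fix)  # smallest edit that satisfies this checker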

CLI usage

After pip install -e ., the furqan command is on your PATH.

# Check a single file (runs 9 checkers)
furqan check examples/clean_module.furqan

# Strict mode (exit 3 on any Marad)
furqan check examples/status_collapse.furqan --strict

# Show version
furqan version

Three example files demonstrate the contract:

$ furqan check examples/clean_module.furqan
PASS  examples/clean_module.furqan
  9 checkers ran. Zero diagnostics.

$ furqan check examples/status_collapse.furqan
MARAD  examples/status_collapse.furqan
  1 violation(s):
    [status_coverage] function 'summarize' calls 'deep_scan' ...

$ furqan check examples/missing_return_path.furqan
MARAD  examples/missing_return_path.furqan
  1 violation(s):
    [all_paths_return] function 'scan' declares a return type ...

Exit codes: 0 PASS, 1 MARAD, 2 PARSE ERROR, 3 STRICT MODE failure.
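The exit-code contract can be written down as a small mapping. The sketch below is an assumption about how the four codes relate, not the CLI's actual implementation: the ExitCode enum and exit_code function are hypothetical, and the choice that a parse error takes precedence over everything else is a guess.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    PASS = 0
    MARAD = 1
    PARSE_ERROR = 2
    STRICT_FAILURE = 3


def exit_code(parsed_ok: bool, violation_count: int, strict: bool = False) -> ExitCode:
    """Map a check outcome to one of the four documented exit codes."""
    if not parsed_ok:
        # Assumption: without a parsed module no checker can run.
        return ExitCode.PARSE_ERROR
    if violation_count == 0:
        return ExitCode.PASS
    # Any Marad fails the run; strict mode reports it with its own code.
    return ExitCode.STRICT_FAILURE if strict else ExitCode.MARAD
```

A CI script can then branch on the integer value of the code without parsing the textual output.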

The additive-only checker is NOT run in single-file mode; it requires a prior-version module for comparison. Cross-version checks live in the test suite via the additive sidecar protocol.
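Why a prior version is needed can be seen in a minimal sketch. It assumes that "additive-only" means the set of public names may grow but never shrink; that reading, the function additive_violations, and the sample names are all hypothetical, and the real checker and its sidecar protocol may compare more than names.

```python
def additive_violations(previous: frozenset[str], current: frozenset[str]) -> list[str]:
    """Return names present in the previous version but missing from the current one."""
    return sorted(previous - current)


# Hypothetical public surfaces of two versions of the same module.
v1 = frozenset({"scan", "summarize"})
v2 = frozenset({"scan", "summarize", "deep_scan"})

print(additive_violations(v1, v2))  # v1 -> v2 only adds a name
print(additive_violations(v2, v1))  # v2 -> v1 removes a name
```

With only one file there is no previous set to subtract from, which is why a single-file run cannot evaluate this rule.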

What "structural honesty" means in code

The repository ships a self-contained demonstration in demo/. Three frontier LLMs (ChatGPT, Claude, Gemini) were each handed the same encrypted PDF and asked to summarise its contents. Their behaviour diverged: one named the encryption explicitly; one blamed the file ("unsupported or corrupted", though it was neither); one implied user error ("check the file for any issues", though there were none).

The Furqan compile-time scan-incomplete primitive rejects the function shape that would promise a complete answer about such a file, in 0.162 ms, with a diagnostic that names the missing incompleteness guard, the offending line, and the minimal fix. The guarantee holds for every function that compiles, not just for the one file the demo tested.

The point is not that the LLMs failed. ChatGPT was honest. The point is that runtime behaviour, even when correct, is not a structural guarantee. The same model on a different file, account tier, or version may behave differently. A compile-time check cannot.

Run the demo on a fresh clone:

bash demo/runner.sh

Four assertions, all passing: the encrypted PDF is regenerated and verified to reject opening; the after-column checker accepts the honest shape and rejects the unguarded shape (both sub-millisecond); and all three captured before-column responses are classified.

Architecture

src/furqan/
├── parser/
│   ├── tokenizer.py       hand-written lexer; keyword-promotion discipline
│   ├── parser.py          strict recursive-descent; F1/F2 (no opaque eaters)
│   └── ast_nodes.py       frozen dataclasses for every parsed shape
├── checker/
│   ├── bismillah.py            Primitive 1, purpose-hierarchy / scope discipline
│   ├── zahir_batin.py          Primitive 2, surface vs depth layer separation
│   ├── additive.py             Primitive 3, module evolution; sidecar history
│   ├── incomplete.py           Primitive 4, scan-incomplete; the demo target
│   ├── mizan.py                Primitive 5, three-valued calibration blocks
│   ├── tanzil.py               Primitive 6, build-ordering discipline
│   ├── ring_close.py           Primitive 7, ring closure / type-resolution
│   ├── status_coverage.py      D-extension D11, status-coverage propagation
│   ├── return_type_match.py    D-extension D22, return-type match
│   └── all_paths_return.py     D-extension D24, all-paths-return
├── project.py                  D9 / D20 multi-module graph; cross-module analysis
├── __main__.py                 CLI entry point: furqan check / version
├── types/                      shared type model (used by ring_close, return_type_match)
└── errors/
    └── marad.py                diagnostic record: diagnosis, location, fix, regression check

Every checker is a pure function over a parsed Module AST. Every diagnostic is a Marad record with a structured payload (diagnosis text, source span, minimal fix, regression check). No mutation, no I/O, no exceptions on the success path.
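The pattern of a pure checker over an immutable AST can be sketched with frozen dataclasses. This is a toy model, not the package's real types: the field names diagnosis, location, and minimal_fix follow the text above, but the Function and Module shapes, the field types, and check_missing_return are hypothetical, and the real Marad record also carries a regression check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Function:
    name: str
    line: int
    declares_return_type: bool
    has_return: bool


@dataclass(frozen=True)
class Module:
    functions: tuple[Function, ...]


@dataclass(frozen=True)
class Marad:
    checker: str
    diagnosis: str
    location: int
    minimal_fix: str


def check_missing_return(module: Module) -> tuple[Marad, ...]:
    """Pure function: reads the module, returns diagnostics, mutates nothing."""
    return tuple(
        Marad(
            checker="missing_return",
            diagnosis=f"function '{fn.name}' declares a return type but never returns",
            location=fn.line,
            minimal_fix=f"add a return statement to '{fn.name}'",
        )
        for fn in module.functions
        if fn.declares_return_type and not fn.has_return
    )
```

Since the inputs are frozen and the output is a fresh tuple, checkers written this way can run in any order and be combined by concatenating their results.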

Documentation

  • HANDOFF.md: rolling session-close audit log; the most recent verified state is at the top, prior sessions are appended below as isnad.
  • CHANGELOG.md: every minor-version bump registers the source-language additions and breaking-change boundary.
  • docs/NAMING.md: naming-convention discipline; common-English-word reservation policy; additive-only invariant.
  • docs/CONTRIBUTING.md: session-close protocol; polish-patch protocol §8.
  • docs/internals/CHECKER.md: per-primitive checker semantics, scope, and limits.
  • docs/internals/LEXER.md: tokenizer extensions per phase.

The thesis paper

The compile-time primitives implemented here are derived from Furqan: A Programming Language for Structural Honesty, published on Zenodo:

Companion papers establishing the surrounding architecture (Bayyinah input-layer defense, Bilal honest-autonomous LLM architecture, the Munafiq Protocol for cross-verification) are linked from the thesis paper's references section.

What this repository is not

  • Not a full compiler. This is a type-checker over a minimal surface syntax. Code generation, runtime, and FFI are out of scope for Phase 2.
  • Not a static analyzer for an existing language. Furqan is a new source language with its own grammar; the checker operates on .fqn files, not on Python or any other host language.
  • Not an LLM, not an LLM wrapper, not a prompt-engineering toolkit. No model is invoked at any point in the parser or checker. The thesis claim is about language design, not model behaviour.
  • Not a finished system. The seven Phase 2 compile-time primitives plus the Phase 3 multi-module graph (D9 / D20 in v0.10.0, D23 cross-module type resolution in v0.10.1) are shipped, sub-millisecond, audit-clean, and demo-ready. What remains is the Phase 3 runtime evaluator and cross-module status-coverage propagation. The full thesis is not yet executable end-to-end at runtime.

Honest registers

  • Test count is N=539 paired fixtures + property tests + named-rule tests + a seven-primitive integration capstone + Phase 3 polish (else arm, string escapes, structured tokenize errors) + the D9 / D20 / D23 multi-module suites; not a formal proof. Falsifying a primitive requires a fixture that escapes the rule's intent. The known limitations for each checker are documented in docs/internals/CHECKER.md.
  • The cross-model audit has recorded zero findings across seven sessions and seven primitives. This is N=1 in the program's own hands; whether the methodology generalizes is a future-work question.
  • The demo's three-vendor capture is N=1 per vendor at fixed timestamps on free-tier UI. Vendor behaviour drifts across model versions and account tiers; the captures are pinned to the timestamps recorded in demo/before/responses/*.md.

Authors

  • Bilal Syed Arfeen, product, architecture, research lead
  • Fraz Ashraf, co-architect, governance protocol, first author on the Furqan thesis paper

With AI collaborators (acknowledged contributors, not co-founders): Anthropic Claude, xAI Grok, Perplexity Computer.

License

Apache License 2.0, see LICENSE.

Citation

If you use Furqan in academic work:

@software{furqan_2026,
  author    = {Arfeen, Bilal Syed and Ashraf, Fraz},
  title     = {Furqan: A Programming-Language Type-Checker for Structural Honesty},
  year      = {2026},
  version   = {0.11.1},
  publisher = {Zenodo},
  doi       = {10.5281/zenodo.19750529},
  url       = {https://github.com/BayyinahEnterprise/furqan-programming-language}
}
