
frontierlag

Audit the capability gap between frontier AI and the models tested in academic papers.

Paste a DOI. Get a report: what model the paper tested, where it sat relative to the frontier at evaluation date, what configuration the paper disclosed, and whether the paper fails all three audit dimensions at the pre-registered thresholds from the companion study.

$ pip install frontierlag
$ frontierlag check 10.1038/s41591-024-03425-5

This package is a companion to the paper Frontier Lag: A Bibliometric Audit of Capability Misrepresentation in Academic AI Evaluation (Gringras, 2026; arXiv:TBD). The audit dataset embedded here is the frozen snapshot used in that paper; quarterly refreshes are shipped as point releases.


What it does

frontierlag measures three dimensions of the gap between what published AI evaluations test and what the frontier can do at the same moment:

Dimension        What it captures
Capability gap   ECI points and calendar months between the tested model and the frontier at evaluation date.
Tier gap         Number of same-family siblings with higher ECI already available at evaluation date.
Configuration    Fraction of reasoning-mode, tools, scaffolding, and sampling items the paper discloses (items match the VERSIO-AI v1 checklist).

A paper that fails all three at the pre-registered thresholds is flagged as a compound failure. See frontierlag/config.yaml for the thresholds (they mirror the paper's pre-registration).
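As a rough illustration of the rule (not the package's internals), the compound-failure flag amounts to three independent threshold checks. The threshold values below are placeholders; the real, pre-registered ones live in frontierlag/config.yaml:

```python
# Hypothetical sketch of the compound-failure rule; the threshold values here
# are placeholders, not the pre-registered ones shipped in config.yaml.
THRESHOLDS = {
    "capability_gap_eci": 10.0,  # ECI points behind the frontier
    "tier_gap_siblings": 1,      # stronger same-family siblings available
    "config_disclosure": 0.5,    # fraction of applicable items disclosed
}

def is_compound_failure(capability_gap: float, tier_gap: int,
                        disclosure_fraction: float) -> bool:
    """A paper is flagged only if it fails all three dimensions at once."""
    return (
        capability_gap >= THRESHOLDS["capability_gap_eci"]
        and tier_gap >= THRESHOLDS["tier_gap_siblings"]
        and disclosure_fraction < THRESHOLDS["config_disclosure"]
    )

print(is_compound_failure(15.0, 2, 0.3))  # fails all three -> True
print(is_compound_failure(15.0, 2, 0.8))  # discloses enough -> False
```

Failing one or two dimensions is not enough; the conjunction is what makes the flag conservative.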

The package does not estimate counterfactual capability — it does not claim "the paper's conclusion would have been X if they had used Y." That move is absent from the companion paper by design, and it is absent here too.


Quick start

import frontierlag as fl

# By DOI (hits the frozen corpus if the paper is in the audit; otherwise
# resolves publication date via CrossRef and leaves you to supply the model).
report = fl.check("10.1038/s41591-024-03425-5")
print(report.to_text())

# Override / supply fields for a paper not in the frozen corpus.
report = fl.check(
    "10.1000/your-doi",
    primary_model="GPT-4",
    evaluation_date="2024-06-01",
    configuration_disclosures={
        "model_version_exact": True,
        "access_date": True,
        "reasoning_mode": None,  # not applicable to GPT-4
        "tool_use": False,
        # ... other items default to "not reported"
    },
)

# Or audit already-extracted metadata directly.
from frontierlag import audit, PaperMetadata
m = PaperMetadata(primary_model="GPT-3.5", publication_date="2024-07-01")
print(audit(m).to_text())

# Individual lookups.
fl.lookup_model("claude-3.5-sonnet")          # → ModelRecord
fl.get_frontier_at_date("2025-06-01")          # → FrontierSnapshot
fl.list_known_models()                         # → list[str]

CLI

frontierlag check <DOI>               audit a paper
frontierlag lookup <MODEL>            single-model metadata
frontierlag frontier <YYYY-MM-DD>     frontier at a date
frontierlag models                    list known canonical names
frontierlag info                      version + data-freeze date

Every command accepts --json for machine-readable output. frontierlag check accepts --model, --eval-date, and --config-file to override or supply fields a paper does not otherwise provide.


Example output

$ frontierlag check 10.1038/s41746-023-00961-1 --model GPT-4 --eval-date 2023-03-20
frontierlag audit (data freeze: 2026-04-01)
========================================================================
Paper:  ChatGPT performance on USMLE-style medical examinations
DOI:    10.1038/s41746-023-00961-1
Evaluation date: 2023-03-20

Primary model tested
  input  : 'GPT-4' → canonical: GPT-4 (Mar 2023)
  release: 2023-03-15     ECI: +126.2

Frontier at evaluation date
  GPT-4 (Mar 2023) (released 2023-03-15, ECI +126.2)

Audit dimensions
  Capability gap : +0.0 ECI pts   (+0 months)
  Tier gap       : 0 stronger same-family sibling(s) available
  Configuration  : —  of applicable items disclosed

Compound failure: undetermined (insufficient structured metadata).

(A fully-extracted audit with configuration disclosures returns a clean PASS/FAIL verdict.)


Data freeze

The embedded dataset is frozen at FREEZE_DATE = 2026-04-01. Every report prints this at the top so readers know how stale the comparison is. Quarterly updates ship as frontierlag >= 1.0.X; a banner on the static site tracks the current freeze.
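The "months" figures in reports (both capability gap and staleness relative to the freeze) are plain calendar-month differences. A minimal sketch of that arithmetic, assuming ISO date strings as in the API above:

```python
from datetime import date

def month_gap(earlier: str, later: str) -> int:
    """Whole calendar months between two ISO dates (day-of-month ignored)."""
    a = date.fromisoformat(earlier)
    b = date.fromisoformat(later)
    return (b.year - a.year) * 12 + (b.month - a.month)

# GPT-4 released 2023-03-15, evaluated 2023-03-20 -> 0 months, matching the
# example report above; a freeze of 2026-04-01 read in November 2026 -> 7.
print(month_gap("2023-03-15", "2023-03-20"))  # 0
print(month_gap("2026-04-01", "2026-11-15"))  # 7
```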

File                                   Source
data/eci_scores.csv                    Epoch AI Capabilities Index snapshot (Epoch AI, 2026)
data/monthly_frontier_trajectory.csv   Derived from ECI + model release dates
data/model_version_lookup.json         Maintainer-curated, cross-checked against the Epoch AI model tracker
data/frozen_audit.json                 The companion paper's extracted audit (empty until production extraction completes)

All dataset files are plain text and diffable; the freeze history is visible in git log.
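Because the trajectory file is plain CSV, a "frontier at date" lookup is a one-pass scan. A sketch under assumed column names (month, model, eci; check the actual header before relying on them) and with illustrative rows standing in for the real file (only the GPT-4 figure comes from this page):

```python
import csv, io

# Inline stand-in for data/monthly_frontier_trajectory.csv. Column names and
# the post-2023 ECI values are assumptions for illustration, not real data.
SAMPLE = """month,model,eci
2023-03,GPT-4 (Mar 2023),126.2
2024-06,GPT-4o,140.0
2025-02,o3,155.0
"""

def frontier_at(month: str, rows: list[dict]) -> dict:
    """Latest row whose month is <= the query month (rows sorted ascending)."""
    best = None
    for row in rows:
        if row["month"] <= month:  # ISO YYYY-MM strings sort chronologically
            best = row
    if best is None:
        raise ValueError(f"no frontier data at or before {month}")
    return best

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
print(frontier_at("2024-12", rows)["model"])  # GPT-4o
```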


Install

pip install frontierlag

From source:

git clone https://github.com/davidgringras/frontierlag.git
cd frontierlag
pip install -e '.[test]'
pytest

Requires Python ≥ 3.9. Runtime dependencies are requests and pyyaml; no heavy scientific stack.


Citation

@misc{gringras2026frontierlag,
  author       = {Gringras, David},
  title        = {Frontier Lag: A Bibliometric Audit of Capability Misrepresentation in Academic {AI} Evaluation},
  year         = {2026},
  eprint       = {TBD},
  archivePrefix= {arXiv},
  primaryClass = {cs.AI},
  note         = {Companion package: \url{https://github.com/davidgringras/frontierlag}}
}

Contributing

Two things the package needs from the community; pull requests for either are welcome:

  1. Model aliases. Every paper spells model names differently. config.yaml::aliases is the single file to extend. PRs that add an alias mapping without touching code are the fastest path to review.
  2. Frontier trajectory updates. When a new model ships, add a row to data/monthly_frontier_trajectory.csv and bump _version.py::FREEZE_DATE. The package has a quarterly release cadence; out-of-cycle PRs are welcome for newly released frontier models.
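Alias resolution is just a case-insensitive dictionary lookup, which is why alias PRs are cheap to review. A sketch of what a config.yaml::aliases entry buys you (the mapping below is illustrative, not the shipped table):

```python
# Illustrative alias table; the real one lives in config.yaml under `aliases`.
ALIASES = {
    "gpt-4": "GPT-4 (Mar 2023)",
    "gpt4": "GPT-4 (Mar 2023)",
    "chatgpt-4": "GPT-4 (Mar 2023)",
    "claude 3.5 sonnet": "claude-3.5-sonnet",
}

def canonicalize(name: str) -> str:
    """Map a paper's spelling of a model name to its canonical record key."""
    key = name.strip().lower()
    return ALIASES.get(key, name)

print(canonicalize("GPT4"))       # GPT-4 (Mar 2023)
print(canonicalize("UnknownLM"))  # unmatched names fall through unchanged
```

Adding one line to the table makes every spelling variant of a model name resolve without touching code.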

Code changes should include tests and run pytest. See tests/ for conventions.

License

MIT. See LICENSE.
