Transparent AI-slop writing-pattern analysis for essays, blog posts, Markdown, JSON, and websites.

slopscore

A transparent linter for AI-slop writing patterns in essays, blog posts, Markdown, JSON, and websites.

slopscore reads text and returns a 0 to 100 SlopScore measuring the density of formulaic, generic, low-specificity, over-polished writing patterns associated with low-effort LLM output. It reports per-dimension scores and evidence spans (the exact phrases that triggered each finding), so you can see and fix what it flags.

⚠️ What slopscore is NOT

It does not detect whether text was written by AI, and must never be used to accuse a writer. It flags writing patterns in text (not authorship, not authors): patterns common in low-effort or AI-like prose and in plenty of human writing. Use it as a prose linter to nudge toward clearer, more specific writing, not as an AI detector. Authorship detectors are unreliable and biased; slopscore deliberately is not one.

What it is, and what it is not

slopscore detects writing patterns, not authorship. It does not claim a text was written by AI, and it should never be used to accuse a writer. AI-authorship detectors are unreliable on short, edited, translated, and non-native-English text, so slopscore takes a more honest and more useful position:

"This text has a high concentration of generic, formulaic, low-evidence writing patterns."

not

"This was written by AI."

Think of it as a linter for slop, closer to Vale or ruff than to a black-box AI detector. Every point in the score comes from a visible rule with an evidence span.

Install

pip install slopscore-lint            # lean, rule-based core
pip install "slopscore-lint[web]"     # + website extraction (trafilatura)
pip install "slopscore-lint[nlp]"     # + spaCy NER and sentence-transformer embeddings
pip install "slopscore-lint[lang]"    # + non-English language detection
pip install "slopscore-lint[report]"  # + HTML report rendering (Jinja2)
pip install "slopscore-lint[all]"     # everything

Name note: the PyPI package is slopscore-lint (plain slopscore belongs to a different tool). The import stays import slopscore, and the command is slopscore-lint.

Usage

slopscore-lint scan post.md
slopscore-lint scan essay.txt --format json
slopscore-lint scan content.json --json-path "$.article.body"
slopscore-lint scan https://example.com/post        # requires slopscore-lint[web]
slopscore-lint scan src/app.py                       # lints docstring/comment prose, ignores code
slopscore-lint scan post.md --by-paragraph           # surfaces a sloppy section in a clean doc

Lint the prose inside code

scan reads the natural-language prose out of source files (Python docstrings and comments, JS/TS JSDoc) and ignores the code itself, so it catches slop in documentation that code linters skip:

slopscore-lint scan src/                  --recursive   # docstrings + comments across a package
slopscore-lint scan README.md CHANGELOG.md --fail-on high
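
To make the idea of reading prose out of source files concrete, here is a minimal sketch for Python input only, using the standard library. It is an illustration, not slopscore's extractor: the function name `extract_prose` is hypothetical, and the real tool may handle more cases, such as JSDoc.

```python
import ast
import io
import tokenize


def extract_prose(source: str) -> list[str]:
    """Return docstrings and comment text from Python source, leaving code out."""
    prose: list[str] = []

    # Docstrings live on module, class, and function nodes in the AST.
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
            doc = ast.get_docstring(node)
            if doc:
                prose.append(doc)

    # Comments are dropped by the AST, so read them from the token stream.
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            text = tok.string.lstrip("#").strip()
            if text:
                prose.append(text)
    return prose


sample = '''
def area(r):
    """Compute the area of a circle."""
    # pi is approximated here
    return 3.14159 * r * r
'''
print(extract_prose(sample))
# ['Compute the area of a circle.', 'pi is approximated here']
```

Only the docstring and the comment reach the linter in this sketch; the `return` expression is never scored.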

Audit fairness

slopscore reports how often each rule fires on competent plain and non-native English, the writing that pattern detectors are known to over-flag. No other slop linter publishes this:

slopscore-lint fairness        # per-rule false-positive rate on the plain/ESL benchmark slices
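
A per-rule false-positive rate can be sketched as follows, assuming every document in the slice is competent human writing, so any rule that fires on it counts as a false positive. The rule names and the `per_rule_fp_rate` helper are hypothetical; this shows the arithmetic, not the command's implementation.

```python
from collections import Counter


def per_rule_fp_rate(findings_per_doc: list[set[str]]) -> dict[str, float]:
    """Fraction of known-clean documents on which each rule fired at least once."""
    total = len(findings_per_doc)
    if total == 0:
        return {}
    fired = Counter(rule for rules in findings_per_doc for rule in rules)
    return {rule: count / total for rule, count in sorted(fired.items())}


# Each set holds the (hypothetical) rules that fired on one clean document.
clean_slice = [
    set(),
    {"rule_of_three"},
    set(),
    {"rule_of_three", "copula_avoidance"},
]
print(per_rule_fp_rate(clean_slice))
# {'copula_avoidance': 0.25, 'rule_of_three': 0.5}
```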

Calibrate against your own writing

Instead of asking "does this look like AI?", ask "does this deviate from my usual style in sloppy ways?". Build a baseline from a folder of your past writing, then compare new drafts to it:

slopscore-lint calibrate ./my-old-posts --name me
slopscore-lint scan new-post.md --baseline me     # reports per-dimension z-score deviations
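
A per-dimension z-score deviation works like this sketch, which assumes each past document has already been reduced to a dictionary of dimension scores. The dimension values and the `z_deviations` helper are made up for illustration; how slopscore stores and compares baselines may differ.

```python
from statistics import mean, stdev


def z_deviations(baseline: list[dict[str, float]], draft: dict[str, float]) -> dict[str, float]:
    """Per-dimension z-score of a draft against a baseline of past documents."""
    out: dict[str, float] = {}
    for dim, value in draft.items():
        history = [doc[dim] for doc in baseline if dim in doc]
        if len(history) < 2:
            continue  # too little history to estimate spread
        spread = stdev(history)
        if spread == 0:
            continue  # no variation in the baseline, so the z-score is undefined
        out[dim] = (value - mean(history)) / spread
    return out


old_posts = [
    {"genericity": 10.0, "redundancy": 4.0},
    {"genericity": 12.0, "redundancy": 6.0},
    {"genericity": 14.0, "redundancy": 5.0},
]
new_draft = {"genericity": 18.0, "redundancy": 5.0}
print(z_deviations(old_posts, new_draft))
# {'genericity': 3.0, 'redundancy': 0.0}
```

Here the draft sits three standard deviations above the writer's usual genericity while its redundancy is typical, which is the kind of signal a personal baseline is meant to surface.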

Higher-precision syntactic detection (optional)

The default install detects syntactic tells (trailing "-ing" analyses, and so on) with regex. Install the [nlp] extra and the spaCy English model for a higher-precision, lower-false-positive path:

pip install "slopscore-lint[nlp]"
python -m spacy download en_core_web_sm

slopscore auto-upgrades to the spaCy path when the model is present; nothing else changes.
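
To show why a regex path trades precision for a lean install, here is a rough sketch of a trailing "-ing" detector. The pattern is an assumption made for this example, not the rule slopscore ships: it matches a comma followed by an "-ing" word and runs to the end of the sentence.

```python
import re

# Hypothetical pattern: comma, whitespace, an "-ing" word, then the rest of the sentence.
TRAILING_ING = re.compile(r",\s+\w{3,}ing\b[^.!?]*[.!?]")


def find_trailing_ing(text: str) -> list[tuple[int, int, str]]:
    """Return (start, end, phrase) evidence spans for each match."""
    return [(m.start(), m.end(), m.group(0)) for m in TRAILING_ING.finditer(text)]


slop = "The bridge opened in 1932, highlighting the city's rapid growth. It spans 500 meters."
print(find_trailing_ing(slop))

# A false positive: place names ending in "-ing" are not participles.
plain = "She lived in Nanjing, Beijing and Kunming."
print(find_trailing_ing(plain))
```

The second call also returns a match, because the regex cannot tell a participle from a proper noun. A part-of-speech tagger can make that distinction, which is the motivation for the optional spaCy path.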

Use it as a linter in CI

slopscore-lint scan ./content --recursive --fail-on high          # exit 1 if any high finding
slopscore-lint scan ./content --recursive --format sarif -o out.sarif   # for GitHub code scanning
slopscore-lint scan post.md --format html -o report.html          # highlighted-span HTML (needs [report])
slopscore-lint scan . --diff origin/main --fail-on medium         # only files changed vs a ref

Exit codes: 0 means clean (or all findings below --fail-on), 1 means findings at or above the threshold, 2 means a usage error, and 3 means a needed extra is missing. A composite GitHub Action (action.yml) scans, uploads SARIF to code scanning, and fails by threshold; a pre-commit hook is also published (.pre-commit-hooks.yaml). For Markdown and code inputs, line numbers in SARIF and HTML output are relative to the extracted prose, not the raw source file; raw-source mapping is a later enhancement.
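
A CI wrapper may want to separate "the prose has findings" (exit 1) from "the tool is misconfigured" (exit 2 or 3). The sketch below does that with the exit codes listed above. The helper names `describe_exit` and `run_scan` are hypothetical, and the wrapper assumes slopscore-lint is on PATH.

```python
import subprocess
import sys

# Meanings taken from the exit-code list above.
EXIT_MEANINGS = {
    0: "clean, or all findings below --fail-on",
    1: "findings at or above the --fail-on threshold",
    2: "usage error",
    3: "a needed extra is missing",
}


def describe_exit(code: int) -> str:
    """Translate a slopscore-lint exit code into a readable message."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_scan(args: list[str]) -> int:
    """Run slopscore-lint with the given arguments and log what the exit code means."""
    result = subprocess.run(["slopscore-lint", *args])
    print(describe_exit(result.returncode), file=sys.stderr)
    return result.returncode
```

A pipeline step could call `run_scan(["scan", "./content", "--recursive", "--fail-on", "high"])` and treat 2 or 3 as a setup failure rather than a lint failure.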

The scorer can also be called from Python:

from slopscore import SlopScorer

scorer = SlopScorer(profile="blog", strictness="conservative")
# the argument below is an example of the slop the tool flags:
report = scorer.scan_text("In today's fast-paced digital landscape, it is crucial to leverage synergy.")
print(report.score.slop_score, report.score.label)
print(report.evidence[:3])

Status

v0.7: accuracy and robustness. Fixed a false "severe" on Markdown posts with code blocks (the code fences inflated prompt_residue when ingested as text). The [nlp] extra now genuinely upgrades two dimensions: spaCy named-entity density for genericity (benchmark AUROC 0.888 to 0.902) and sentence-transformer embeddings for rephrased redundancy, both validated to keep the fairness gate at 0% false positives on plain and non-native English. Added rhetorical question-and-answer scaffold detection and a slopscore-lint explain command. A sentence-length burstiness signal was tried and reverted for regressing the non-native slice.

v0.6: differentiation and reach. Added linting of the prose inside code (Python docstrings/comments, JS/TS JSDoc) to catch slop that code linters skip; a fairness command that reports per-rule false-positive rates on plain and non-native English (no other slop linter publishes this); and --by-paragraph to surface a sloppy section inside an otherwise-clean document. Interpretable feature work (spaCy NER, semantic redundancy, burstiness) is on the v0.7 roadmap. Settled by evaluation: no model retrain and no gradient boosting (XGBoost/LightGBM), since the held-out ceiling is set by features, not the model class, and trees break the numpy-only path and the fairness gate.

v0.5: a real slop-labeled benchmark (eval/datasets/benchmark.jsonl) with simple_english and non_native fairness slices, plus a held-out Wikipedia AI-Cleanup slice. Measured numbers in eval/RESULTS.md: strong on overt slop (PR-AUC 0.91), honestly weak on subtle real-world slop (held-out AUROC 0.69), which is why the accuracy claims stay modest.

v0.4: linter maturity. slopscore.toml / [tool.slopscore] config with per-rule toggles and severity overrides, inline <!-- slopscore-disable … --> suppression, a findings baseline (--fail-on-new), the implemented unsupported_claims dimension, opt-in --suggest rewrite suggestions (with SARIF fixes), an optional separate authorship-adapter interface (no detector bundled), PyPI packaging, and a docs site.

v0.3: an evaluation framework (slopscore-lint eval: TPR@FPR, PR-AUC, calibration, per-subgroup FPR) and a transparent learned scorer: a sign-constrained, calibrated logistic regression over the 13 dimensions, serialized as auditable JSON and run with pure numpy (--scorer ml). The rule scorer stays the default. Under a replace-if-wins gate, the learned model must beat it on held-out TPR@1%FPR without regressing subgroup false positives, and on the seed set it does not (it over-flags plain English). See MODEL_CARD.md and DATA_SOURCES.md.

v0.2.1: productionization. Console/JSON/Markdown/SARIF/HTML reports, recursive and changed-files (--diff) batch scanning with CI exit codes, a GitHub Action, and a pre-commit hook.

v0.2: detection expansion grounded in Wikipedia's "Signs of AI writing" field guide. Dimensions: lexical markers, formulaic structure, significance inflation, superficial "-ing" analyses, vague or over-attribution, negative parallelism and rule-of-three, copula avoidance, genericity, redundancy, cadence, formatting tells, prompt residue, and a negative human-writing signal. Scoring is conservative by default: a corroboration gate damps weak-alone tells, and scores abstain on short or non-English input. See MODEL_CARD.md for citations and limitations.

License

MIT
