A Drake Equation for English writing quality - CLI tool for prose analysis

Redpen

A "Drake Equation" for English writing quality — a CLI tool that produces a single quality score from multiple prose metrics.

Requires Python 3.10+. License: MIT.

Overview

Redpen combines multiple prose quality metrics into a single Writing Quality Index (WQI) between 0.0 and 1.0 using a geometric mean, much as MFCQI does for code quality.

Key Features:

  • Single composite score from multiple metrics
  • Git integration for analyzing only changed files
  • Configurable profiles for different writing styles
  • Optional AI-powered feedback via LiteLLM
  • CI/CD friendly with JSON output and exit codes
  • Rich terminal output with detailed issue reporting

Installation

# Using pip
pip install redpen

# Using uv (recommended)
uv pip install redpen

# From source
git clone https://github.com/youruser/redpen.git
cd redpen
uv pip install -e .

Optional Dependencies

For AI-powered feedback:

pip install "redpen[ai]"  # quoted so shells such as zsh do not expand the brackets
# or
pip install litellm

For enhanced grammar checking with LanguageTool:

pip install language-tool-python

Quick Start

# Analyze a single file
redpen analyze README.md

# Analyze multiple files or directories
redpen analyze README.md docs/ CONTRIBUTING.md

# Analyze only changed files (git diff)
redpen analyze --diff

# Analyze staged changes only
redpen analyze --diff --staged

# Use a specific writing profile
redpen analyze docs/ --profile technical

# Get AI-powered feedback
redpen analyze README.md --ai

# JSON output for CI/CD
redpen analyze . --format json

# Fail if score is below threshold
redpen analyze . --min-score 0.75

The Writing Quality Index (WQI)

The WQI is the geometric mean of the four individual metric scores, where each Mᵢ (between 0.0 and 1.0) is the score of one metric described under Metrics:

WQI = (M₁ × M₂ × M₃ × M₄)^(1/4)

This approach ensures that:

  • A low score in any single metric pulls the overall score down sharply, and a score of 0.0 in one metric makes the WQI 0.0
  • Balanced quality across all dimensions is rewarded
  • The score ranges from 0.0 (poor) to 1.0 (excellent)
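To make the effect of the geometric mean concrete, here is a minimal Python sketch of the formula above. The function name `wqi` is illustrative only and is not part of Redpen's API; the sketch also ignores the per-metric weights that the configuration file allows.

```python
import math


def wqi(scores):
    """Unweighted geometric mean of metric scores, each in [0.0, 1.0]."""
    if not scores:
        raise ValueError("at least one metric score is required")
    if any(s < 0.0 or s > 1.0 for s in scores):
        raise ValueError("scores must lie in [0.0, 1.0]")
    return math.prod(scores) ** (1.0 / len(scores))


# Both inputs have an arithmetic mean of 0.80.
balanced = wqi([0.8, 0.8, 0.8, 0.8])
lopsided = wqi([1.0, 1.0, 1.0, 0.2])
print(f"balanced: {balanced:.2f}, lopsided: {lopsided:.2f}")
# prints "balanced: 0.80, lopsided: 0.67"
```

The single weak metric costs the second document about 0.13 points, which an arithmetic mean would hide entirely.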

Score Interpretation

Score Range   Rating     Description
0.90 - 1.00   Excellent  Publication-ready prose
0.75 - 0.89   Good       Minor improvements possible
0.60 - 0.74   Fair       Several areas need attention
0.40 - 0.59   Poor       Significant revision needed
0.00 - 0.39   Very Poor  Major rewrite recommended
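The rating bands can be expressed as a small lookup. This is a sketch, not Redpen's implementation: `rating_for` is a hypothetical name, and it assumes each band's lower bound is inclusive, so a score such as 0.895 (which falls between two rows of the table) is rated "Good".

```python
def rating_for(score):
    """Map a WQI score to a rating name from the score table."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0.0, 1.0]")
    # (inclusive lower bound, rating), checked from best to worst.
    bands = [
        (0.90, "Excellent"),
        (0.75, "Good"),
        (0.60, "Fair"),
        (0.40, "Poor"),
    ]
    for lower_bound, name in bands:
        if score >= lower_bound:
            return name
    return "Very Poor"
```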

Metrics

Redpen evaluates text across four dimensions:

Readability

Measures how easy the text is to read using established formulas:

  • Flesch-Kincaid Grade Level: Estimated U.S. school grade needed to understand the text
  • Gunning Fog Index: Years of formal education needed
  • SMOG Index: Simple Measure of Gobbledygook, based on counts of polysyllabic words
  • Coleman-Liau Index: Character-based readability
  • Automated Readability Index: Characters per word and words per sentence

The metric compares the consensus grade level against a configurable target (default: grade 10).
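As an illustration of one of these formulas, the sketch below computes the Automated Readability Index, which needs only character, word, and sentence counts, and compares it with a target grade. The tokenisation is deliberately naive, the function names are hypothetical, and it uses a single formula rather than the consensus of all five.

```python
import re


def automated_readability_index(text):
    """ARI = 4.71 * (chars / words) + 0.5 * (words / sentences) - 21.43."""
    words = re.findall(r"[A-Za-z0-9']+", text)
    if not words:
        raise ValueError("text contains no words")
    # Crude sentence count: each run of terminal punctuation ends a sentence,
    # so abbreviations such as "e.g." are over-counted.
    sentences = re.findall(r"[.!?]+", text) or ["."]
    chars = sum(len(re.sub(r"[^A-Za-z0-9]", "", w)) for w in words)
    return 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43


def grade_gap(text, target_grade=10):
    """Positive values mean the text reads above the target grade."""
    return automated_readability_index(text) - target_grade
```

How Redpen converts the gap between the consensus grade and the target into a 0.0-1.0 score is not described here, so the sketch stops at the gap itself.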

Grammar

Checks for grammatical errors and issues:

  • LanguageTool (if installed): Comprehensive grammar, punctuation, and style checking
  • Proselint (fallback): Prose linting for common writing issues

Issues are weighted by severity to calculate the score.
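One way severity weighting could work is sketched below. Everything in it is an assumption made for illustration: the severity names, the weights, the normalisation per 100 words, and the 0.1 penalty factor are hypothetical and are not taken from Redpen.

```python
# Hypothetical severity weights; Redpen's actual values are not documented here.
SEVERITY_WEIGHTS = {"error": 1.0, "warning": 0.5, "suggestion": 0.25}


def grammar_score(severities, word_count):
    """Turn a list of issue severities into a score in [0.0, 1.0].

    Assumed scheme: sum the weights, normalise per 100 words, and
    subtract 0.1 per weighted issue per 100 words, clamped at 0.0.
    """
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    penalty = sum(SEVERITY_WEIGHTS[s] for s in severities)
    per_100_words = penalty * 100.0 / word_count
    return max(0.0, 1.0 - 0.1 * per_100_words)
```

Normalising by length keeps a long document from being punished merely for containing more text.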

Spelling

Identifies spelling errors with smart filtering:

  • Recognizes common technical terms (API, JSON, HTTP, etc.)
  • Ignores code blocks (fenced with ```)
  • Supports custom word lists
  • Provides correction suggestions
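The filtering steps above can be sketched as follows. This is an illustration under stated assumptions rather than Redpen's code: the function names are hypothetical, the allow-list is matched case-insensitively, and only fenced blocks (not inline code) are removed.

```python
import re

FENCE = "`" * 3  # Built this way so the example can itself sit inside a fence.
FENCED_BLOCK = re.compile(
    rf"^{FENCE}.*?^{FENCE}[^\n]*$", re.DOTALL | re.MULTILINE
)


def strip_fenced_code(text):
    """Remove fenced code blocks so their contents are not spell-checked."""
    return FENCED_BLOCK.sub("", text)


def words_to_check(text, allowed=("API", "JSON", "HTTP")):
    """Return words outside code fences that are not on the allow-list."""
    allowed_lower = {w.lower() for w in allowed}
    words = re.findall(r"[A-Za-z']+", strip_fenced_code(text))
    return [w for w in words if w.lower() not in allowed_lower]
```

The remaining words would then be passed to a spell checker; custom word lists simply extend `allowed`.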

Style

Analyzes writing style and clarity:

  • Passive Voice Detection: Identifies passive constructions
  • Sentence Length: Flags sentences exceeding the configured maximum
  • Weasel Words: Detects vague qualifiers (very, really, quite, etc.)
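Two of these checks are simple enough to sketch in a few lines. The code below is a rough illustration with hypothetical function names and a naive sentence splitter; the word list contains only the three examples named above, and passive voice detection is left out because it needs part-of-speech information.

```python
import re

WEASEL_WORDS = {"very", "really", "quite"}  # The three examples named above.


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def long_sentences(text, max_sentence_words=30):
    """Return sentences whose word count exceeds the configured maximum."""
    return [s for s in split_sentences(text) if len(s.split()) > max_sentence_words]


def weasel_words(text):
    """Return weasel words in the order they appear."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w in WEASEL_WORDS]
```

The `max_sentence_words` parameter mirrors the profile option of the same name in the configuration file.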

Configuration

Redpen looks for configuration in these locations (in order):

  1. .redpen.toml in current or parent directories
  2. redpen.toml in current or parent directories
  3. pyproject.toml under [tool.redpen]
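The lookup order can be sketched as a walk up the directory tree. Two assumptions are baked in and may not match Redpen's behaviour: each directory is checked for all three candidates before moving to its parent, and `pyproject.toml` is also searched for in parent directories. The name `find_config` is hypothetical.

```python
from pathlib import Path

CONFIG_NAMES = (".redpen.toml", "redpen.toml")


def find_config(start):
    """Return (path, kind) for the first configuration found, or None."""
    start = Path(start).resolve()
    for directory in (start, *start.parents):
        for name in CONFIG_NAMES:
            candidate = directory / name
            if candidate.is_file():
                return candidate, "standalone"
        pyproject = directory / "pyproject.toml"
        # A plain substring test keeps the sketch dependency-free; a real
        # implementation would parse the TOML instead.
        if pyproject.is_file() and "[tool.redpen]" in pyproject.read_text(encoding="utf-8"):
            return pyproject, "pyproject"
    return None
```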

Generating a Configuration File

Use the config init command to generate a comprehensive example configuration:

# Generate .redpen.toml in current directory
redpen config init

# Generate with a custom filename
redpen config init --output custom.toml

# Preview the config without writing to file
redpen config init --stdout

# Overwrite an existing config file
redpen config init --force

The generated file includes all available options with documentation comments.

Configuration File Format

# .redpen.toml

# Default profile to use
default_profile = "technical"

# File extensions to analyze
extensions = [".md", ".txt", ".rst"]

# Profile definitions
[profiles.technical]
name = "technical"
description = "Technical documentation"
target_grade = 12
max_sentence_words = 35
check_passive = false  # Technical docs often use passive voice
check_weasel = true
custom_words = ["kubernetes", "microservice", "async"]

[profiles.casual]
name = "casual"
description = "Blog posts and casual writing"
target_grade = 8
max_sentence_words = 25
check_passive = true
check_weasel = true

[profiles.academic]
name = "academic"
description = "Academic and formal writing"
target_grade = 14
max_sentence_words = 40
check_passive = true
check_weasel = false

# Metric-specific configuration
[metrics.readability]
enabled = true
weight = 1.0

[metrics.grammar]
enabled = true
weight = 1.5  # Weight grammar more heavily
options = { disabled_rules = ["WHITESPACE_RULE"] }

[metrics.spelling]
enabled = true
weight = 1.0
options = { custom_words = ["redpen", "WQI"] }

[metrics.style]
enabled = true
weight = 1.0

# LLM configuration for AI feedback
[llm]
enabled = false
provider = "openai"
model = "gpt-4o-mini"
temperature = 0.3
context = "Technical documentation for developers"
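The configuration above gives each metric a `weight`, but this README does not say how weights enter the WQI. One natural reading is a weighted geometric mean, sketched below; treat it as an assumption, not as Redpen's formula. The function name and the sample scores are made up for the example.

```python
import math


def weighted_wqi(scores, weights):
    """Weighted geometric mean: exp(sum(w * ln(s)) / sum(w))."""
    active = {name: w for name, w in weights.items() if w > 0}
    if not active:
        raise ValueError("at least one metric needs a positive weight")
    if any(scores[name] <= 0.0 for name in active):
        return 0.0  # A zero score in any weighted metric zeroes the product.
    total_weight = sum(active.values())
    log_sum = sum(w * math.log(scores[name]) for name, w in active.items())
    return math.exp(log_sum / total_weight)


# Weights as in the example configuration; scores are invented.
weights = {"readability": 1.0, "grammar": 1.5, "spelling": 1.0, "style": 1.0}
scores = {"readability": 0.9, "grammar": 0.6, "spelling": 0.9, "style": 0.9}
print(f"{weighted_wqi(scores, weights):.3f}")
```

With all weights equal to 1.0 this reduces to the unweighted formula; raising the grammar weight makes a weak grammar score cost more.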

Built-in Profiles

Profile    Target Grade  Max Sentence  Passive Check  Use Case
default    10            30 words      Yes            General writing
technical  12            35 words      No             Technical docs
casual     8             25 words      Yes            Blog posts
academic   14            40 words      Yes            Academic papers

Disabling Specific Rules

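You can disable specific grammar and style rules that don't apply to your project by adding them to the disabled_rules list in your grammar metric configuration. If your file already sets options inline under [metrics.grammar] (for example options = { disabled_rules = [...] }), extend that list instead, because TOML does not allow the same table to be defined twice: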

[metrics.grammar.options]
disabled_rules = [
    "typography.symbols.curly_quotes",  # Disable curly quote suggestions
    "typography.symbols.ellipsis",      # Disable ellipsis suggestions
    "misc.phrasal_adjectives",          # Hyphenation suggestions
]

Common Proselint rules to disable:

Rule ID                              Description
typography.symbols.curly_quotes      Suggests using curly quotes instead of straight quotes
typography.symbols.ellipsis          Suggests using the ellipsis character (…)
typography.symbols.sentence_spacing  Checks spacing after periods
misc.phrasal_adjectives              Hyphenation of phrasal adjectives
misc.preferred_forms                 Suggests preferred word forms
leonard.exclamation                  Warns about exclamation marks
redundancy.ras_syndrome              Redundant acronym syndrome (e.g., "ATM machine")
dates_times.am_pm                    AM/PM formatting suggestions

LanguageTool rules (if installed):

Rule ID                       Description
WHITESPACE_RULE               Whitespace issues
EN_QUOTES                     Quote style checking
COMMA_PARENTHESIS_WHITESPACE  Comma/parenthesis spacing
UPPERCASE_SENTENCE_START      Sentence capitalization

Run redpen config init to generate a configuration file with all common rules documented.

Git Integration

Redpen integrates with Git to analyze only changed content:

# Analyze all unstaged changes
redpen analyze --diff

# Analyze only staged changes (useful for pre-commit)
redpen analyze --diff --staged

This is particularly useful for:

  • Pre-commit hooks
  • CI/CD pipelines
  • Incremental documentation review
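For readers curious how such a changed-file list can be obtained, here is a minimal sketch built on `git diff --name-only`. It is not Redpen's implementation; the function names are hypothetical, and the default extensions are the ones from the example configuration.

```python
import subprocess
from pathlib import PurePosixPath


def filter_by_extension(paths, extensions=(".md", ".txt", ".rst")):
    """Keep only paths whose suffix is one of the configured extensions."""
    return [p for p in paths if PurePosixPath(p).suffix.lower() in extensions]


def changed_files(staged=False):
    """List files changed in the working tree, or in the index if staged."""
    # ACMR = added, copied, modified, renamed; deleted files are skipped.
    cmd = ["git", "diff", "--name-only", "--diff-filter=ACMR"]
    if staged:
        cmd.append("--cached")
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return filter_by_extension(result.stdout.splitlines())
```

Calling `changed_files(staged=True)` corresponds to the `--diff --staged` combination and must be run inside a Git repository.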

AI-Powered Feedback

With LiteLLM installed, Redpen can provide AI-powered writing suggestions:

# Get AI feedback on your writing
redpen analyze README.md --ai

# Use a specific model
redpen analyze README.md --ai --ai-model gpt-4

The AI advisor:

  • Analyzes the WQI results and identified issues
  • Provides context-aware suggestions
  • Offers specific rewrite recommendations
  • Considers your configured writing context

Supported LLM Providers

Redpen uses LiteLLM, supporting 100+ LLM providers:

  • OpenAI (GPT-4, GPT-3.5)
  • Anthropic (Claude)
  • Google (Gemini)
  • Azure OpenAI
  • Local models via Ollama
  • And many more

Set your API key via environment variable:

export OPENAI_API_KEY="your-key"
# or
export ANTHROPIC_API_KEY="your-key"

CI/CD Integration

GitHub Actions

name: Writing Quality Check

on: [push, pull_request]

jobs:
  redpen:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'

      - name: Install Redpen
        run: pip install redpen

      - name: Check writing quality
        run: redpen analyze docs/ --min-score 0.7 --format json

Pre-commit Hook

Add to .pre-commit-config.yaml:

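repos:
  - repo: local
    hooks:
      - id: redpen
        name: Check writing quality
        entry: redpen analyze --diff --staged --min-score 0.7
        language: python
        # Install redpen into the isolated environment pre-commit creates
        additional_dependencies: [redpen]
        # types_or matches any of the tags; types would require all of them
        types_or: [markdown, rst, plain-text]
        pass_filenames: false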

Exit Codes

Code  Meaning
0     Success (score meets threshold)
1     Score below --min-score threshold
2     Error (invalid input, missing files)
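A script that wraps the CLI can branch on these codes. The sketch below uses only the flags documented in this README; the function names are hypothetical, and the JSON output is returned unparsed because its schema is not described here.

```python
import subprocess

EXIT_MEANINGS = {
    0: "success: score meets threshold",
    1: "score below --min-score threshold",
    2: "error: invalid input or missing files",
}


def describe_exit(code):
    """Translate a redpen exit code into its documented meaning."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_check(paths, min_score=0.75):
    """Run `redpen analyze` and return (exit_code, meaning, stdout)."""
    cmd = ["redpen", "analyze", *paths, "--min-score", str(min_score), "--format", "json"]
    # No check=True: a non-zero exit is an expected outcome, not an exception.
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, describe_exit(result.returncode), result.stdout
```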

CLI Reference

redpen [OPTIONS] COMMAND [ARGS]...

Commands:
  analyze  Analyze text files for writing quality
  config   Manage Redpen configuration

Options:
  --version  Show version and exit
  --help     Show help message and exit

analyze Command

redpen analyze [OPTIONS] [PATHS]...

Arguments:
  PATHS  Files or directories to analyze

Options:
  --min-score FLOAT        Minimum WQI score (exit 1 if below)
  --format [rich|json]     Output format (default: rich)
  --diff                   Analyze only changed files (git diff)
  --staged                 With --diff, analyze only staged changes
  --profile TEXT           Writing profile to use
  --ai                     Enable AI-powered feedback
  --ai-model TEXT          LLM model for AI feedback
  --help                   Show help message and exit

config Command

redpen config [OPTIONS] COMMAND [ARGS]...

Commands:
  init  Generate an example configuration file

config init Command

redpen config init [OPTIONS]

Options:
  -o, --output PATH  Output file path (default: .redpen.toml)
  -s, --stdout       Print to stdout instead of writing to file
  -f, --force        Overwrite existing configuration file
  --help             Show help message and exit

Programmatic Usage

from redpen.calculator import Calculator
from redpen.metrics.readability import ReadabilityMetric
from redpen.metrics.grammar import GrammarMetric
from redpen.metrics.spelling import SpellingMetric
from redpen.metrics.style import StyleMetric

# Create calculator with metrics
calculator = Calculator([
    ReadabilityMetric({"target_grade": 10}),
    GrammarMetric(),
    SpellingMetric({"custom_words": ["API", "JSON"]}),
    StyleMetric({"check_passive": True, "max_sentence_words": 30}),
])

# Analyze text
text = """
Your document content here.
"""

result = calculator.analyze(text)

print(f"WQI Score: {result.wqi_score:.2f}")
print(f"Rating: {result.rating.value}")

for metric in result.metrics:
    print(f"  {metric.name}: {metric.score:.2f}")
    for issue in metric.issues[:3]:
        print(f"    - {issue.message}")

Development

# Clone the repository
git clone https://github.com/youruser/redpen.git
cd redpen

# Create virtual environment
uv venv
source .venv/bin/activate

# Install in development mode
uv pip install -e ".[dev]"

# Run tests
pytest tests/ -v

# Run with coverage
pytest tests/ --cov=redpen --cov-report=term-missing

# Run the CLI
redpen analyze README.md

License

MIT License - see LICENSE for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Write tests for your changes
  4. Ensure all tests pass (pytest tests/)
  5. Commit your changes (git commit -m 'Add amazing feature')
  6. Push to the branch (git push origin feature/amazing-feature)
  7. Open a Pull Request
