DocOctopy
A language-agnostic docstyle compliance & remediation tool that scans code for docstring/docblock presence and style, reports findings, and can auto-propose LLM-based fixes.
Features
Comprehensive Scanning
- Python-first with extensible architecture for other languages
- Google-style docstring validation with detailed compliance checking
- AST-based analysis for accurate symbol and signature detection
- Smart caching with incremental scanning for large codebases
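As a rough illustration of the AST-based analysis mentioned above, the following sketch uses Python's standard `ast` module to list functions and classes that have no docstring. The helper name `find_missing_docstrings` and the sample source are assumptions made for this example; DocOctopy's own implementation may work differently.

```python
import ast

# Sample module text used only for this illustration.
SOURCE = '''
def documented(x):
    """Return x unchanged."""
    return x

def undocumented(y):
    return y

class Widget:
    def method(self):
        pass
'''


def find_missing_docstrings(source: str) -> list[tuple[str, int]]:
    """Return (name, line) for each function or class without a docstring."""
    tree = ast.parse(source)
    missing = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            if ast.get_docstring(node) is None:
                missing.append((node.name, node.lineno))
    # ast.walk does not guarantee source order, so sort by line number.
    return sorted(missing, key=lambda item: item[1])


for name, line in find_missing_docstrings(SOURCE):
    print(f"{name} is missing a docstring at line {line}")
```

Working on the parsed tree rather than on raw text means that strings which merely look like definitions (for example inside comments) are not picked up.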
Multiple Output Formats
- Pretty console output with Rich formatting
- JSON reports for CI/CD integration
- SARIF format for GitHub Code Scanning
- Configurable exit codes based on severity levels
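To give a sense of what a SARIF report contains, here is a minimal sketch that converts findings into a SARIF 2.1.0 document. The shape of the `findings` records and the `to_sarif` helper are assumptions for illustration only; the fields DocOctopy actually emits may differ. SARIF has no "info" level, so the sketch maps it to "note".

```python
import json

# Hypothetical finding records; the tool's internal model may look different.
findings = [
    {
        "rule": "DG101",
        "severity": "error",
        "message": "Function 'process_data' is missing a docstring",
        "path": "src/main.py",
        "line": 15,
    },
]

# SARIF result levels are "error", "warning", "note", and "none".
SARIF_LEVELS = {"error": "error", "warning": "warning", "info": "note"}


def to_sarif(findings: list[dict]) -> dict:
    """Build a minimal SARIF 2.1.0 log with one run and one result per finding."""
    results = []
    for finding in findings:
        results.append({
            "ruleId": finding["rule"],
            "level": SARIF_LEVELS[finding["severity"]],
            "message": {"text": finding["message"]},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": finding["path"]},
                    "region": {"startLine": finding["line"]},
                }
            }],
        })
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{"tool": {"driver": {"name": "dococtopy"}}, "results": results}],
    }


print(json.dumps(to_sarif(findings), indent=2))
```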
LLM-Powered Remediation
- Automatic docstring generation for missing documentation
- Smart fixing of non-compliant docstrings
- Enhancement of existing docstrings with missing elements
- DSPy integration for reliable, structured LLM interactions
Flexible Configuration
- pyproject.toml integration with rule enable/disable switches
- Per-path overrides for different project sections
- Gitignore-style exclusions with pathspec support
- Rule severity customization (error, warning, info, off)
Installation
Basic Installation
pip install dococtopy
With LLM Support
pip install dococtopy[llm]
Development Installation
git clone https://github.com/yourusername/dococtopy.git
cd dococtopy
pip install -e .
Quick Start
1. Scan Your Code
# Scan current directory
dococtopy scan .
# Scan specific paths
dococtopy scan src/ tests/
# Get JSON output
dococtopy scan . --format json --output-file report.json
# Use SARIF for GitHub Code Scanning
dococtopy scan . --format sarif --output-file report.sarif
2. Fix Issues with LLM Assistance
# Dry-run mode (safe, shows what would be fixed)
dococtopy fix . --dry-run
# Fix specific rules only
dococtopy fix . --rule DG101,DG202 --dry-run
# Use different LLM provider
dococtopy fix . --llm-provider anthropic --llm-model claude-3-haiku-20240307
3. Configure Your Project
Create a pyproject.toml file:
[tool.docguard]
exclude = ["**/.venv/**", "**/build/**", "**/node_modules/**"]
[tool.docguard.rules]
DG101 = "error" # Missing docstrings
DG201 = "error" # Google style parse errors
DG202 = "error" # Missing parameters
DG203 = "error" # Extra parameters
DG204 = "warning" # Returns section issues
DG205 = "info" # Raises validation
DG301 = "warning" # Summary style
DG302 = "warning" # Blank line after summary
Rules Reference
Basic Compliance Rules
- DG101: Missing docstring (functions and classes)
- DG301: Summary line should end with a period
- DG302: Blank line required after summary
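The two summary rules can be pictured with a short sketch. `check_summary` below is a hypothetical helper written for this example, not the tool's code, and it only approximates what DG301 and DG302 describe.

```python
def check_summary(docstring: str) -> list[str]:
    """Return the rule IDs (DG301, DG302) that a docstring's summary violates."""
    lines = docstring.strip().splitlines()
    violations = []
    if not lines:
        return violations
    # DG301: the first line should end with a period.
    if not lines[0].rstrip().endswith("."):
        violations.append("DG301")
    # DG302: if anything follows the summary, the second line should be blank.
    if len(lines) > 1 and lines[1].strip() != "":
        violations.append("DG302")
    return violations


GOOD = """Scale a value by a factor.

Args:
    value: The number to scale.
    factor: The multiplier.

Returns:
    The scaled value.
"""

BAD = """Scale a value by a factor
Args:
    value: The number to scale.
"""

print(check_summary(GOOD))
print(check_summary(BAD))
```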
Google Style Validation Rules
- DG201: Google style docstring parse error
- DG202: Parameter missing from docstring
- DG203: Extra parameter in docstring
- DG204: Returns section missing or mismatched
- DG205: Raises section validation
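The parameter rules (DG202 and DG203) amount to comparing a function's signature with the names listed under `Args:`. The sketch below shows one simplified way to do that with `ast` and a regular expression. The helpers `documented_params` and `compare_params` are assumptions for illustration; the project itself credits docstring-parser for Google-style parsing, and this hand-rolled parser can be confused by description lines that happen to contain a colon.

```python
import ast
import re

SOURCE = '''
def scale(value, factor):
    """Scale a value.

    Args:
        value: The number to scale.
        ratio: The multiplier.
    """
    return value * factor
'''


def documented_params(docstring: str) -> set[str]:
    """Collect parameter names listed under a Google-style 'Args:' section."""
    names, in_args = set(), False
    for line in docstring.splitlines():
        stripped = line.strip()
        if stripped == "Args:":
            in_args = True
        elif in_args and re.fullmatch(r"[A-Z][A-Za-z ]*:", stripped):
            in_args = False  # another section header, such as 'Returns:'
        elif in_args:
            match = re.match(r"(\*{0,2}\w+)\s*(\(.*?\))?:", stripped)
            if match:
                names.add(match.group(1).lstrip("*"))
    return names


def compare_params(source: str) -> tuple[list[str], list[str]]:
    """Return (missing, extra) parameter names for the first function in source."""
    func = ast.parse(source).body[0]
    actual = {a.arg for a in func.args.args if a.arg not in ("self", "cls")}
    listed = documented_params(ast.get_docstring(func) or "")
    return sorted(actual - listed), sorted(listed - actual)


missing, extra = compare_params(SOURCE)
print("Missing from docstring (DG202):", missing)
print("Extra in docstring (DG203):", extra)
```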
Configuration
pyproject.toml Settings
[tool.docguard]
# Paths to scan (default: current directory)
paths = ["src", "tests"]
# Exclude patterns (gitignore-style)
exclude = ["**/.venv/**", "**/build/**", "**/node_modules/**"]
# Rule configuration
[tool.docguard.rules]
DG101 = "error" # error, warning, info, off
DG201 = "error"
DG202 = "warning"
DG203 = "warning"
DG204 = "info"
DG205 = "info"
DG301 = "warning"
DG302 = "warning"
# Per-path overrides
[[tool.docguard.overrides]]
patterns = ["tests/**"]
rules.DG101 = "off" # Disable missing docstrings in tests
Environment Variables
For LLM functionality:
# OpenAI
export OPENAI_API_KEY="your-api-key"
# Anthropic
export ANTHROPIC_API_KEY="your-api-key"
# Ollama (local)
# No API key needed, runs locally
CLI Reference
dococtopy scan
Scan paths for documentation compliance issues.
dococtopy scan [PATHS...] [OPTIONS]
Options:
--format {pretty,json,sarif,both} Output format [default: pretty]
--config PATH Config file path [default: pyproject.toml]
--fail-level {error,warning,info} Exit code threshold [default: error]
--no-cache Disable caching
--changed-only Only scan changed files
--stats Show cache statistics
--output-file PATH Write output to file
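The `--fail-level` threshold can be thought of as a severity comparison. The sketch below assumes that the command exits with code 1 when any finding is at or above the threshold and 0 otherwise; the exact exit codes are not stated here, so treat both the numbers and the `exit_code` helper as assumptions.

```python
# Severities ordered from least to most serious.
SEVERITY_RANK = {"info": 0, "warning": 1, "error": 2}


def exit_code(found_severities: list[str], fail_level: str = "error") -> int:
    """Return 1 if any finding meets or exceeds fail_level, else 0 (assumed codes)."""
    threshold = SEVERITY_RANK[fail_level]
    if any(SEVERITY_RANK[severity] >= threshold for severity in found_severities):
        return 1
    return 0


print(exit_code(["warning", "info"], fail_level="error"))
print(exit_code(["warning", "info"], fail_level="warning"))
```

Under this model, a stricter CI job would pass `--fail-level warning`, while the default only fails on errors.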
dococtopy fix
Fix documentation issues using LLM assistance.
dococtopy fix [PATHS...] [OPTIONS]
Options:
--dry-run Show changes without applying [default: True]
--interactive Accept/reject each fix interactively
--rule TEXT Comma-separated rule IDs to fix
--max-changes INTEGER Maximum number of changes
--llm-provider {openai,anthropic,ollama} LLM provider [default: openai]
--llm-model TEXT LLM model to use [default: gpt-4o-mini]
--config PATH Config file path
Examples
Example 1: Basic Project Scan
# Clone a project
git clone https://github.com/someuser/someproject.git
cd someproject
# Install DocOctopy
pip install dococtopy
# Scan for issues
dococtopy scan .
# Output:
# Scan Results
# Files scanned: 42
# Files compliant: 35
# Overall coverage: 83.3%
#
# src/main.py [NON_COMPLIANT] (Coverage: 60.0%)
# [ERROR] DG101: Function 'process_data' is missing a docstring at 15:0
# [WARNING] DG301: Docstring summary should end with a period. at 23:0
Example 2: LLM-Powered Fixes
# Install with LLM support
pip install dococtopy[llm]
# Set up API key
export OPENAI_API_KEY="your-key"
# Fix issues (dry-run)
dococtopy fix . --dry-run
# Output:
# Scanning for documentation issues...
# Processing src/main.py...
# Found 2 changes for src/main.py
#
# Change: process_data (function)
# Issues: DG101
# Dry run - no changes applied
#
# Change: validate_input (function)
# Issues: DG202, DG301
# Dry run - no changes applied
#
# Total changes: 2
# Run without --dry-run to apply changes
Example 3: CI/CD Integration
# .github/workflows/docstring-check.yml
name: Docstring Compliance
on: [push, pull_request]
jobs:
docstring-check:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: "3.11"
- run: pip install dococtopy
- run: dococtopy scan . --format json --output-file report.json --fail-level error
- name: Upload report
uses: actions/upload-artifact@v4
with:
name: docstring-report
path: report.json
Architecture
DocOctopy is built with a modular, extensible architecture:
dococtopy/
├── cli/          # Command-line interface
├── core/         # Core engine, discovery, caching
├── adapters/     # Language-specific adapters
├── rules/        # Compliance rules and registry
├── remediation/  # LLM-powered fixing
└── reporters/    # Output formatters
Key Components
- Discovery Engine: Finds files using gitignore-style patterns
- Language Adapters: Parse code and extract symbols/docstrings
- Rule Engine: Applies compliance rules with configurable severity
- Remediation Engine: Uses DSPy for structured LLM interactions
- Caching System: Incremental scanning with fingerprint-based invalidation
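Fingerprint-based invalidation can be sketched as "hash each file and rescan only when the hash changes". The example below uses SHA-256 over the file contents and an in-memory dictionary as the cache. Both choices, and the function names, are assumptions for illustration; the actual fingerprint and cache storage used by DocOctopy are not described here.

```python
import hashlib
import tempfile
from pathlib import Path


def fingerprint(path: Path) -> str:
    """Return a SHA-256 digest of the file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def files_to_rescan(paths: list[Path], cache: dict[str, str]) -> list[Path]:
    """Return files whose fingerprint differs from the cached one, updating the cache."""
    changed = []
    for path in paths:
        digest = fingerprint(path)
        if cache.get(str(path)) != digest:
            changed.append(path)
            cache[str(path)] = digest
    return changed


with tempfile.TemporaryDirectory() as tmp:
    module = Path(tmp) / "main.py"
    module.write_text("def f():\n    pass\n")
    cache: dict[str, str] = {}
    print(files_to_rescan([module], cache))  # first scan: file is new to the cache
    print(files_to_rescan([module], cache))  # unchanged: nothing to rescan
    module.write_text('def f():\n    """Do nothing."""\n')
    print(files_to_rescan([module], cache))  # content changed: rescan
```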
Publishing
DocOctopy is automatically published to PyPI via GitHub Actions when a release is created.
Manual Publishing (for maintainers)
- Update version in pyproject.toml
- Build and test the package:
./scripts/publish.sh
- Create a GitHub release with tag v0.1.0 (matching the version)
- GitHub Action will automatically publish to PyPI
PyPI Setup (one-time)
To enable automatic publishing, configure trusted publishing in PyPI:
- Go to PyPI Account Settings
- Navigate to "Publishing" → "Publishing tokens" → "Add a new pending publisher"
- Configure:
  - PyPI project name: dococtopy
  - Owner: yourusername (your GitHub username)
  - Repository name: dococtopy
  - Workflow filename: publish.yml
  - Environment name: (leave empty)
Contributing
We welcome contributions! Please see our Contributing Guide for details.
Development Setup
git clone https://github.com/yourusername/dococtopy.git
cd dococtopy
uv sync --dev
uv run pytest
Adding New Rules
- Create rule class in src/dococtopy/rules/
- Implement check() method
- Register with register() function
- Add tests in tests/unit/
Adding New Languages
- Implement LanguageAdapter interface
- Create symbol extraction logic
- Add language-specific rules
- Update discovery patterns
Roadmap
MVP (Current)
- Python docstring compliance checking
- Google-style validation rules
- LLM-powered remediation
- Multiple output formats
- Configuration system
- Caching and incremental scanning
V1 (Next)
- Interactive fix workflows
- File writing capabilities
- GitHub Action and pre-commit hooks
- Playground UI for prompt experimentation
- Additional Python rules (coverage thresholds, etc.)
Future
- JavaScript/TypeScript support
- Go documentation checking
- Rust documentation checking
- Language server integration
- Advanced prompt optimization
License
MIT License - see LICENSE file for details.
Acknowledgments
- Built with DSPy for reliable LLM interactions
- Uses docstring-parser for Google-style parsing
- Powered by Typer for CLI interface
- Styled with Rich for beautiful output