Dead link scanner for .md and .html docs
BrokeLink
A powerful CLI tool to scan Markdown and HTML files for broken links, missing images, and invalid references.
Features
- ✅ Broken local links - Find [text](broken-link.md) links whose target files don't exist
- ✅ Dead image references - Detect image references pointing to missing files
- ✅ Outdated internal references - Check heading anchors and fragments
- 🎨 Colored output - Easy-to-read results with syntax highlighting
- 📄 Multiple formats - Support for Markdown (.md) and HTML (.html/.htm)
- 🔧 Flexible scanning - Include/exclude patterns, recursive directory scanning
- 📊 JSON output - Machine-readable results for CI/CD integration
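To make the first two features concrete, here is a minimal sketch of how local link and image targets in a Markdown file could be checked against the filesystem. It is an illustration only, not BrokeLink's actual implementation: the regular expression, the function name find_broken_local_links, and the rule for skipping external URLs are all assumptions.

```python
import re
from pathlib import Path

# Matches inline Markdown links and images: [text](target) or ![alt](target).
# An optional title after the target, such as "Logo", is tolerated and ignored.
LINK_RE = re.compile(r"(!?)\[([^\]]*)\]\(([^)\s]+)[^)]*\)")

# Anything that starts with a URI scheme (http:, mailto:, ...) is treated as external.
SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def find_broken_local_links(markdown_file: Path) -> list[tuple[int, str, str]]:
    """Return (line_number, kind, target) for local targets that do not exist."""
    broken = []
    text = markdown_file.read_text(encoding="utf-8")
    for line_number, line in enumerate(text.splitlines(), start=1):
        for bang, _label, target in LINK_RE.findall(line):
            # Skip external URLs and pure in-page anchors.
            if SCHEME_RE.match(target) or target.startswith("#"):
                continue
            # Drop any #fragment before checking the file on disk.
            path_part = target.split("#", 1)[0]
            if not (markdown_file.parent / path_part).exists():
                kind = "image" if bang else "link"
                broken.append((line_number, kind, target))
    return broken
```

Targets are resolved relative to the directory of the file that contains them, which is how relative links behave in most Markdown renderers.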
Installation
From PyPI (when published)
pip install brokelink
From Source
git clone https://github.com/000xs/brokelink.git
cd brokelink
pip install -e .
Verify Installation
brokelink
Quick Start
# Scan current directory
brokelink
# Scan specific directory
brokelink ./docs
# Scan with verbose output
brokelink -v
# Check only Markdown files
brokelink --include="*.md"
# Output as JSON
brokelink --format=json
# Include anchor checking (experimental)
brokelink --check-anchors
Usage
Usage: brokelink [OPTIONS] [PATH]
🔗 BrokeLink - Scan for broken links in Markdown and HTML files.
Options:
  -i, --include TEXT        File patterns to include (default: *.md *.html)
  -e, --exclude TEXT        File patterns to exclude
  -img, --check-images      Check image references (default: enabled)
  -a, --check-anchors       Check heading anchors (experimental)
  -v, --verbose             Verbose output
  -q, --quiet               Only show errors
  -f, --format [text|json]  Output format
  --help                    Show this message and exit.
Examples
Basic Usage
# Scan all .md and .html files in current directory
brokelink
# Scan specific file
brokelink README.md
# Scan docs folder recursively
brokelink ./docs
Advanced Filtering
# Only check Markdown files
brokelink --include="*.md"
# Exclude certain directories
brokelink --exclude="node_modules/*" --exclude="build/*"
# Skip image checking
brokelink --no-check-images
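The include and exclude options above can be pictured with the following sketch, which uses Python's standard fnmatch module. The exact matching rules BrokeLink applies are not documented here, so the function select_files and its behavior (include patterns tested against the file name, exclude patterns tested against the path relative to the scan root) are assumptions for illustration.

```python
from fnmatch import fnmatch
from pathlib import Path


def select_files(root, include=("*.md", "*.html"), exclude=()):
    """Recursively collect files under root that match an include pattern
    and no exclude pattern. Returns POSIX-style paths relative to root."""
    root = Path(root)
    selected = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(root).as_posix()
        # Assumption: include patterns are tested against the file name only.
        if not any(fnmatch(path.name, pattern) for pattern in include):
            continue
        # Assumption: exclude patterns are tested against the relative path.
        # fnmatch's * also matches "/", so "node_modules/*" covers nested files.
        if any(fnmatch(relative, pattern) for pattern in exclude):
            continue
        selected.append(relative)
    return selected
```

For example, select_files("./docs", include=("*.md",), exclude=("build/*",)) would return the Markdown files under ./docs that are not inside build/.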
CI/CD Integration
# JSON output for parsing
brokelink --format=json --quiet > broken-links.json
# Exit code 1 if broken links found (perfect for CI)
brokelink || echo "Broken links detected!"
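If a CI job is driven from Python rather than a shell, the same exit-code check can be sketched as below. The helper names run_link_check and ci_gate are hypothetical. Because the structure of the JSON report is not described in this README, the sketch passes the report through unparsed instead of assuming any field names.

```python
import subprocess
import sys


def run_link_check(command):
    """Run a link-check command and return (exit_code, stdout)."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return completed.returncode, completed.stdout


def ci_gate(command=("brokelink", "--format=json", "--quiet")):
    """Return the scanner's exit code, echoing its report when it is non-zero."""
    exit_code, report = run_link_check(list(command))
    if exit_code != 0:
        # Per the README, a non-zero exit code signals that broken links were found.
        print("Broken links detected!", file=sys.stderr)
        print(report, file=sys.stderr)
    return exit_code
```

A CI step could then end with sys.exit(ci_gate()) so that the job fails whenever the scanner does.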
Output Example
🔗 BrokeLink v1.0.0 - Scanning for broken links...
💥 Found 4 broken link(s) in 2 file(s):
📄 docs/README.md:
🔗 Missing files (2):
Line 15: ./nonexistent.md ('Documentation Link')
Line 23: ../missing-guide.md ('Setup Guide')
🖼️ Missing images (1):
Line 8: ./images/logo.png ('BrokeLink Logo')
📄 index.html:
⚓ Invalid anchors (1):
Line 42: #missing-section ('Jump to Section')
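The invalid-anchor entry in the sample output can be illustrated with a small sketch built on Python's standard html.parser module. This is not BrokeLink's code: the class AnchorCollector, the function find_invalid_anchors, and the decision to treat both id attributes and a name attributes as valid targets are assumptions.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect anchor targets (id / a name) and in-page #fragment links."""

    def __init__(self):
        super().__init__()
        self.targets = set()
        self.fragment_links = []  # list of (line_number, fragment)

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if attributes.get("id"):
            self.targets.add(attributes["id"])
        if tag == "a" and attributes.get("name"):
            self.targets.add(attributes["name"])
        href = attributes.get("href") or ""
        if tag == "a" and href.startswith("#") and len(href) > 1:
            # getpos() reports the 1-based line on which the current tag starts.
            line_number, _column = self.getpos()
            self.fragment_links.append((line_number, href[1:]))


def find_invalid_anchors(html_text):
    """Return (line_number, '#fragment') for links with no matching target."""
    collector = AnchorCollector()
    collector.feed(html_text)
    collector.close()
    return [
        (line, "#" + fragment)
        for line, fragment in collector.fragment_links
        if fragment not in collector.targets
    ]
```

Targets are collected for the whole document before links are judged, so a link may legitimately point at an element that appears later in the file.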
Development
Setup Development Environment
git clone https://github.com/000xs/brokelink.git
cd brokelink
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -e .
Running Tests
python -m pytest tests/
Project Structure
brokelink/
├── brokelink/ # Main package
│ ├── __init__.py
│ ├── cli.py # CLI interface
│ ├── parser.py # Link extraction
│ └── utils.py # Link checking & reporting
├── demo/ # Sample files for testing
├── tests/ # Test suite
├── README.md
├── LICENSE
└── pyproject.toml # Modern Python packaging
Contributing
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Roadmap
- External URL checking (with timeout/retry)
- Whitelist/blacklist for URLs
- Integration with popular static site generators
- Performance optimizations for large repositories
- GitHub Actions integration
- VS Code extension
Made with ❤️ for documentation maintainers everywhere!
Download files
Source Distribution
brokelink-0.1.0.tar.gz (9.0 kB)
Built Distribution
File details
Details for the file brokelink-0.1.0.tar.gz.
File metadata
- Download URL: brokelink-0.1.0.tar.gz
- Upload date:
- Size: 9.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e7a14958b88299fabae39e8e1c16cb20776f7f8f8af1952e39bb1d7f033501a3 |
| MD5 | 690a72d437a418c211b443a306383ad3 |
| BLAKE2b-256 | 5d1fcb5cdf50c10b35240dd20c80b7f11658de394591635c80bf8da432122938 |
File details
Details for the file brokelink-0.1.0-py3-none-any.whl.
File metadata
- Download URL: brokelink-0.1.0-py3-none-any.whl
- Upload date:
- Size: 9.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 87440e0e8ea605312a1f3a78710fc5c24eb4c528bf320661a99cfbc6f9fe4c57 |
| MD5 | df2baf8d44f9d25a5dc01bdacf93c704 |
| BLAKE2b-256 | 3b5589fedc3660771fafd41de6248b49c4b3a587a9a79c60e1ecc6c4a67191e2 |