pylint-cache
A smart caching wrapper for pylint that avoids re-running checks on unchanged files.
Why Bother?
Pylint has a built-in caching mechanism, but it does not skip work on subsequent runs. Even with caching enabled, Pylint will:
- re-open every file
- re-parse the AST
- re-run its full suite of checks
- re-evaluate imports and module relationships
As a result, Pylint performance remains largely proportional to the number of files being analyzed—no matter how often you run it.
This project provides a pragmatic alternative: content-based caching of entire Pylint results. If a file's contents have not changed since the previous run, its prior lint output is reused immediately, and Pylint is never invoked for that file.
The impact is significant:
- First run: Pylint performs full analysis.
- Subsequent runs: Unchanged files are resolved directly from cache.
This produces dramatic, measurable speedups—often reducing multi-second runs to just tens of milliseconds—without altering lint results or behavior.
In short:
- Stock Pylint caches internals, not results.
- This tool caches results, not internals.
- Only this approach eliminates unnecessary work.
It's a simple optimization that makes repeated linting practical, fast, and pleasant—especially in large codebases or workflows where rapid iteration matters.
Demo
When we run pylint-cache on files that have not been processed before, we get the same experience as running pylint on its own, except that each file is shown with a [RUNNING] prefix:
admin baconator (527) >> time pylint-cache . --args="-E"
Found 168 Python file(s) to check
Pylint args: -E
--------------------------------------------------------------------------------
[RUNNING] test_cluster_routing.py
[RUNNING] walk_sessions.py
************* Module walk_sessions
walk_sessions.py:799:35: E0601: Using variable 'json' before assignment (used-before-assignment)
[RUNNING] test_density_weighted_embedding.py
[RUNNING] repair_all_sessions_batch.py
[RUNNING] analyze_geometric_compression.py
[RUNNING] visualize_results.py
[RUNNING] system_monitor.py
************* Module system_monitor
system_monitor.py:506:12: E1123: Unexpected keyword argument 'throttle' in function call (unexpected-keyword-arg)
system_monitor.py:506:12: E1123: Unexpected keyword argument 'skip_hostname_prefix' in function call (unexpected-keyword-arg)
system_monitor.py:971:8: E0401: Unable to import 'lib.featrix_debug' (import-error)
system_monitor.py:971:8: E0611: No name 'featrix_debug' in module 'lib' (no-name-in-module)
[RUNNING] test_list_permutations.py
[RUNNING] debug_masking.py
[RUNNING] demo_cluster_routing.py
[RUNNING] test_credit_full_dataset.py
--------------------------------------------------------------------------------
Summary: 168 files total, 0 cached, 168 ran
real 5m31.171s
user 5m32.309s
sys 4m32.291s
This run took over 5 minutes on my Mac Studio M2 Ultra. On the next run, we see all the same errors (if we haven't fixed them), with [CACHED] before each file and the time savings reported at the end.
[taco-fixes] ~/Desktop/tetra-ws/featrix/taco-fixes
admin baconator (528) >> time pylint-cache . --args="-E"
Found 168 Python file(s) to check
Pylint args: -E
--------------------------------------------------------------------------------
[CACHED] test_cluster_routing.py
[CACHED] walk_sessions.py
************* Module walk_sessions
walk_sessions.py:799:35: E0601: Using variable 'json' before assignment (used-before-assignment)
[CACHED] test_density_weighted_embedding.py
[CACHED] repair_all_sessions_batch.py
[CACHED] analyze_geometric_compression.py
[CACHED] visualize_results.py
[CACHED] system_monitor.py
************* Module system_monitor
system_monitor.py:506:12: E1123: Unexpected keyword argument 'throttle' in function call (unexpected-keyword-arg)
system_monitor.py:506:12: E1123: Unexpected keyword argument 'skip_hostname_prefix' in function call (unexpected-keyword-arg)
system_monitor.py:971:8: E0401: Unable to import 'lib.featrix_debug' (import-error)
system_monitor.py:971:8: E0611: No name 'featrix_debug' in module 'lib' (no-name-in-module)
--------------------------------------------------------------------------------
📊 Summary:
Total files checked: 168
✅ Cached (skipped): 168
🔄 Newly analyzed: 0
⚡ Time saved this run: 331.17s
🎯 Cumulative time saved: 331.17s (5.5 min)
[STATS] files=168 cached=168 ran=0 saved=331.17s cumulative=331.17s
real 0m0.199s
user 0m0.047s
sys 0m0.096s
Installation
Option 1: Install with pip (recommended)
# Install from local directory
pip install .
# Or install in development/editable mode
pip install -e .
# Uninstall
pip uninstall pylint-cache
Option 2: System-wide installation
# Install system-wide (requires sudo)
sudo ./install.sh
# This will:
# - Copy pylint_cache.py to /opt/pylint-cache/
# - Create a symlink at /usr/local/bin/pylint-cache
# - Make it available in your PATH
# Uninstall
sudo ./install.sh uninstall
Features
- Intelligent Caching: Tracks file MD5 hash, modification time, and size
- SQLite Backend: Stores results in a local .pylint-cache.db database
- Argument Tracking: Caches results per unique set of pylint arguments
- Fast: Only re-runs pylint when files actually change
- Easy to Use: Drop-in replacement for pylint with the same arguments
Usage
After installation:
# Check a single file
pylint-cache myfile.py
# Check multiple files
pylint-cache file1.py file2.py file3.py
# Check with pylint arguments (using --)
pylint-cache src/*.py -- --disable=C0111 --max-line-length=100
# Check with pylint arguments (using --args=)
pylint-cache src/ --args='--disable=C0111 --max-line-length=100'
# Check entire directory (recursively finds .py files)
pylint-cache src/
# Force rebuild - ignore cache and re-run pylint on everything
pylint-cache src/ --force
pylint-cache src/ -f # Short form
Or run directly without installation:
./pylint_cache.py myfile.py
When to Use --force
The --force (or -f) flag bypasses the cache and re-runs pylint on all files. Use it when:
- Testing changes to pylint configuration (e.g., modified .pylintrc)
- After upgrading pylint to ensure rules are applied with new version
- Cache corruption suspected - rebuild from scratch
- Changed pylint arguments significantly (though different args get separate cache entries)
- Debugging - verify cached results match fresh analysis
# Example: After updating .pylintrc
pylint-cache src/ --force --args="-E"
# Example: After upgrading pylint
pip install --upgrade pylint
pylint-cache . -f
Directory Recursion
When given a directory, pylint-cache recursively finds all .py files while automatically ignoring common non-code directories:
Ignored directories:
- Virtual environments: venv/, env/, .venv/, virtualenv/
- Version control: .git/, .svn/, .hg/
- Build artifacts: build/, dist/, *.egg-info/
- Cache directories: __pycache__/, .mypy_cache/, .pytest_cache/
- Dependencies: node_modules/, site-packages/
- IDE: .idea/, .vscode/
This matches typical pylint behavior and prevents scanning 57,000+ files in large projects!
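The pruning described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the names `IGNORED_DIRS`, `is_ignored`, and `find_python_files` are hypothetical, and the ignore list is simply copied from the bullets above.

```python
import os
from fnmatch import fnmatch

# Assumed ignore list, copied from the bullets above; the tool's real list may differ.
IGNORED_DIRS = {
    "venv", "env", ".venv", "virtualenv",
    ".git", ".svn", ".hg",
    "build", "dist",
    "__pycache__", ".mypy_cache", ".pytest_cache",
    "node_modules", "site-packages",
    ".idea", ".vscode",
}
IGNORED_PATTERNS = ("*.egg-info",)


def is_ignored(dirname):
    """Return True if a directory name should be skipped during recursion."""
    return dirname in IGNORED_DIRS or any(
        fnmatch(dirname, pattern) for pattern in IGNORED_PATTERNS
    )


def find_python_files(root):
    """Yield .py files under root, pruning ignored directories in place."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Assigning to dirnames[:] stops os.walk from descending into pruned dirs.
        dirnames[:] = sorted(d for d in dirnames if not is_ignored(d))
        for name in sorted(filenames):
            if name.endswith(".py"):
                yield os.path.join(dirpath, name)
```

Pruning `dirnames` in place matters for speed: a virtual environment is never entered at all, rather than being walked and filtered afterwards.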
Time Savings Tracking
Every time you run pylint-cache, it tracks:
- How long each pylint invocation took
- How much time was saved by using cached results
- Cumulative time saved across all runs
Example output:
--------------------------------------------------------------------------------
📊 Summary:
Total files checked: 247
✅ Cached (skipped): 245
🔄 Newly analyzed: 2
⚡ Time saved this run: 45.23s
🎯 Cumulative time saved: 1847.56s (30.8 min)
[STATS] files=247 cached=245 ran=2 saved=45.23s cumulative=1847.56s
This shows you the real-world impact of caching - how many minutes/hours you've saved by not re-running pylint on unchanged files!
The [STATS] line is machine-parseable for scripts and CI integration.
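One plausible way to arrive at these numbers is to sum the stored durations of every cache hit. The sketch below assumes exactly that; `FileResult` and `summarize` are hypothetical names, and the real tool may compute its totals differently.

```python
from dataclasses import dataclass


@dataclass
class FileResult:
    path: str
    cached: bool
    duration: float  # seconds the original pylint run took for this content


def summarize(results, previous_cumulative=0.0):
    """Return (saved, cumulative, stats_line) for one run.

    Assumption: time saved is the sum of the recorded durations of cache hits.
    """
    cached = [r for r in results if r.cached]
    saved = sum(r.duration for r in cached)
    cumulative = previous_cumulative + saved
    line = "[STATS] files={} cached={} ran={} saved={:.2f}s cumulative={:.2f}s".format(
        len(results), len(cached), len(results) - len(cached), saved, cumulative
    )
    return saved, cumulative, line
```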
Use in Makefiles
.PHONY: lint
lint:
	@echo "🔍 Running pylint error checks..."
	@pylint-cache src/ --args="-E" || exit 1
	@echo "✅ Pylint check completed"

.PHONY: test
test: lint
	pytest

.PHONY: build
build: lint
	python setup.py build
The tool exits with the highest pylint exit code from all files, so make will properly fail if any file has issues.
Parsing Output in Scripts
The [STATS] line provides machine-parseable output:
#!/bin/bash
output=$(pylint-cache src/ --args="-E" 2>&1)
stats=$(echo "$output" | grep "^\[STATS\]")
# Extract values
files=$(echo "$stats" | grep -o 'files=[0-9]*' | cut -d= -f2)
cached=$(echo "$stats" | grep -o 'cached=[0-9]*' | cut -d= -f2)
ran=$(echo "$stats" | grep -o 'ran=[0-9]*' | cut -d= -f2)
saved=$(echo "$stats" | grep -o 'saved=[0-9.]*s' | cut -d= -f2 | tr -d 's')
echo "Checked $files files, $cached from cache, $ran newly analyzed"
echo "Saved ${saved}s this run"
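The same extraction can be done in Python. This sketch assumes the [STATS] line always has the field order shown in the example output; `parse_stats` is a hypothetical helper, not part of the tool.

```python
import re

# Field order assumed from the example [STATS] line in this README.
STATS_RE = re.compile(
    r"^\[STATS\] files=(\d+) cached=(\d+) ran=(\d+) "
    r"saved=([\d.]+)s cumulative=([\d.]+)s$"
)


def parse_stats(output):
    """Return a dict of the [STATS] fields, or None if no such line is present."""
    for line in output.splitlines():
        match = STATS_RE.match(line.strip())
        if match:
            files, cached, ran, saved, cumulative = match.groups()
            return {
                "files": int(files),
                "cached": int(cached),
                "ran": int(ran),
                "saved": float(saved),
                "cumulative": float(cumulative),
            }
    return None
```

In a script, `output` would typically come from `subprocess.run([...], capture_output=True, text=True).stdout`.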
Force Rebuild in CI/CD
For CI/CD pipelines, you might want to force a full rebuild periodically:
# .gitlab-ci.yml example
lint:
  script:
    # Use cache for speed
    - pylint-cache src/ --args="-E"

lint-weekly-full:
  script:
    # Full rebuild once a week to ensure accuracy
    - pylint-cache src/ --force --args="-E"
  only:
    - schedules
Background Monitoring (Recommended)
Problem: Caching per-file is fast but might miss cross-file dependency issues.
Solution: Run a background monitor that detects changes and triggers full re-analysis.
# 1. Register your project(s)
pylint-cache-monitor add /path/to/project --dirs src,lib --args "-E"
# 2. Test it
pylint-cache-monitor run -v
# 3. Add to crontab
crontab -e
# Add: */15 * * * * pylint-cache-monitor run
See MONITOR_SETUP.md for detailed instructions.
How it works:
- Monitor wakes up every 15-30 minutes
- Checks if ANY Python file changed since last run
- If changes detected → runs pylint on ENTIRE tree
- Results are cached → developers get instant feedback with cross-file analysis
Benefits:
- 🔍 Catches import errors and cross-file issues
- ⚡ Developers still get instant cache hits
- 🔄 Automatic full re-analysis when needed
- 🎯 Best of both worlds: speed + accuracy
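The "did anything change since last run?" check could look roughly like the following. This is a sketch under the assumption that the monitor compares modification times against the timestamp of its previous run; the real monitor may use content hashes or another signal, and `any_python_file_changed` is a hypothetical name.

```python
import os


def any_python_file_changed(root, last_run_time):
    """Return True if any .py file under root was modified after last_run_time.

    last_run_time is a POSIX timestamp (seconds), e.g. a value saved from time.time().
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".py"):
                continue
            if os.path.getmtime(os.path.join(dirpath, name)) > last_run_time:
                return True  # one change is enough to trigger a full re-analysis
    return False
```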
Automated Cache Pre-warming (Optional)
Pre-populate the cache for multiple projects:
# Run every night at 2 AM
0 2 * * * /path/to/pylint-cache-cron.sh
See CRON_SETUP.md for detailed instructions.
How It Works
- For each Python file, computes MD5 hash and gets modification time
- Checks SQLite database for cached results using MD5 hash as the primary key:
  - If we've ever seen this exact file content before (even at a different path or time), reuse that result!
  - Cache lookup is based on: MD5 hash + pylint arguments
- If cache hit: displays cached output (marked as [CACHED] or [CACHED from other/path.py])
- If cache miss: runs pylint and stores result (marked as [RUNNING])
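The steps above can be sketched as a small lookup-or-run function. This is an illustration only: `md5_of_file`, `open_cache`, and `check_file` are hypothetical names, the table is a trimmed-down stand-in for the real schema, and `run_pylint` is a caller-supplied callable returning `(output, exit_code)`.

```python
import hashlib
import sqlite3


def md5_of_file(path):
    """Hash file contents in chunks so large files need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def open_cache(db_path=":memory:"):
    """Open (or create) a cache keyed by content hash plus pylint arguments."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pylint_results ("
        " md5_hash TEXT, pylint_args TEXT, pylint_output TEXT, exit_code INTEGER,"
        " PRIMARY KEY (md5_hash, pylint_args))"
    )
    return conn


def check_file(conn, path, pylint_args, run_pylint):
    """Return (status, output, exit_code); run_pylint is only called on a miss."""
    key = (md5_of_file(path), pylint_args)
    row = conn.execute(
        "SELECT pylint_output, exit_code FROM pylint_results"
        " WHERE md5_hash = ? AND pylint_args = ?",
        key,
    ).fetchone()
    if row is not None:
        return ("CACHED", row[0], row[1])
    output, exit_code = run_pylint(path, pylint_args)
    conn.execute(
        "INSERT INTO pylint_results VALUES (?, ?, ?, ?)", key + (output, exit_code)
    )
    conn.commit()
    return ("RUNNING", output, exit_code)
```

Because the path is not part of the key, a second file with identical contents hits the cache even though pylint never ran on it.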
Smart Content-Based Caching
The cache uses MD5 as the primary lookup key, which means:
- ✅ Moving a file to a different location? Still cached!
- ✅ Copying a file? Reuses the existing result!
- ✅ Touching a file (updating mtime) without changing content? Still cached!
- ✅ Same file analyzed in different projects? Reuses results across projects!
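A short experiment makes these properties concrete. The sketch assumes the key is an MD5 digest of the raw file bytes; `content_key` and `demo` are hypothetical names used only for illustration.

```python
import hashlib
import os
import shutil
import tempfile


def content_key(path):
    """Return the MD5 hex digest of a file's bytes (path and mtime play no part)."""
    with open(path, "rb") as handle:
        return hashlib.md5(handle.read()).hexdigest()


def demo():
    """Report, per operation, whether the content key stayed the same."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "module.py")
        with open(original, "w") as handle:
            handle.write("print('hello')\n")
        key = content_key(original)

        copy = os.path.join(tmp, "copy.py")
        shutil.copy(original, copy)
        copied_same = content_key(copy) == key

        os.utime(original, (0, 0))  # "touch": only access/modification times change
        touched_same = content_key(original) == key

        moved = os.path.join(tmp, "moved.py")
        os.rename(original, moved)
        moved_same = content_key(moved) == key

        with open(copy, "a") as handle:
            handle.write("# edited\n")  # a real content change
        edited_same = content_key(copy) == key

    return {
        "copied": copied_same,
        "touched": touched_same,
        "moved": moved_same,
        "edited": edited_same,
    }
```

Only the edit changes the key, so only the edit would cause a cache miss.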
Cache Location
The cache is stored in ~/.pylint-cache.db in your home directory by default.
This means:
- ✅ Single shared cache across all your projects
- ✅ If you've linted a file in project A, the same file in project B reuses the result
- ✅ No .pylint-cache.db files cluttering your project directories
- ✅ Easy to back up or clear: just delete ~/.pylint-cache.db
You can override the location by setting the PYLINT_CACHE_DB environment variable:
export PYLINT_CACHE_DB=/path/to/custom.db
pylint-cache src/
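Resolving the database path presumably follows the usual "environment variable, else default" pattern. A minimal sketch, with `resolve_cache_db` as a hypothetical name and the treatment of an empty variable as "unset" being an assumption:

```python
import os

# Default location as documented above.
DEFAULT_DB = os.path.join("~", ".pylint-cache.db")


def resolve_cache_db(environ=None):
    """Return the cache database path, honoring PYLINT_CACHE_DB when set."""
    environ = os.environ if environ is None else environ
    override = environ.get("PYLINT_CACHE_DB")
    # Assumption: an empty value falls back to the default.
    return os.path.expanduser(override if override else DEFAULT_DB)
```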
Database Schema
The cache uses a normalized four-table design:
Table 1: file_content
Tracks unique file content by MD5 hash:
- md5_hash (PRIMARY KEY) - Content hash
- file_size - Size in bytes
- first_seen - Timestamp when first encountered
Table 2: file_paths
Maps file paths to their content:
- file_path (PRIMARY KEY) - Full file path
- md5_hash (FOREIGN KEY) - Links to file_content
- mod_time - Last modification time
- last_checked - When we last checked this path
Table 3: pylint_results
Stores pylint results per content + args:
- md5_hash (PRIMARY KEY part 1) - Links to file_content
- pylint_args (PRIMARY KEY part 2) - Pylint arguments used
- pylint_output - Full output from pylint
- exit_code - Return code from pylint
- duration - How long pylint took to run (seconds)
- timestamp - When this result was generated
Table 4: cache_stats
Tracks cumulative time savings:
- id - Auto-increment ID
- run_timestamp - When this run occurred
- files_checked - Total files in this run
- files_cached - Files that used cache
- files_ran - Files that ran pylint
- time_saved - Time saved this run (seconds)
- cumulative_time_saved - Total time saved ever (seconds)
This design allows multiple file paths to reference the same content, efficiently tracks which files we've seen, and shows you exactly how much time the cache has saved you.
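Expressed as SQLite DDL, the tables described above might look like the following. Table and column names come from this README; the column types, the `REFERENCES` clauses, and the `create_schema` helper are assumptions made for illustration.

```python
import sqlite3

# Column types are guesses; only the names are taken from the README.
SCHEMA = """
CREATE TABLE IF NOT EXISTS file_content (
    md5_hash   TEXT PRIMARY KEY,
    file_size  INTEGER,
    first_seen REAL
);
CREATE TABLE IF NOT EXISTS file_paths (
    file_path    TEXT PRIMARY KEY,
    md5_hash     TEXT REFERENCES file_content(md5_hash),
    mod_time     REAL,
    last_checked REAL
);
CREATE TABLE IF NOT EXISTS pylint_results (
    md5_hash      TEXT REFERENCES file_content(md5_hash),
    pylint_args   TEXT,
    pylint_output TEXT,
    exit_code     INTEGER,
    duration      REAL,
    timestamp     REAL,
    PRIMARY KEY (md5_hash, pylint_args)
);
CREATE TABLE IF NOT EXISTS cache_stats (
    id                    INTEGER PRIMARY KEY AUTOINCREMENT,
    run_timestamp         REAL,
    files_checked         INTEGER,
    files_cached          INTEGER,
    files_ran             INTEGER,
    time_saved            REAL,
    cumulative_time_saved REAL
);
"""


def create_schema(conn):
    """Create all four tables on an open sqlite3 connection."""
    conn.executescript(SCHEMA)
```

The composite primary key on `pylint_results` is what lets the same content carry one result per distinct argument string, while `file_paths` lets many paths point at a single `file_content` row.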
Exit Codes
The tool exits with the highest exit code from all pylint runs (cached or fresh).
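As a sketch, "highest exit code" amounts to taking the maximum over all per-file codes. Pylint's own exit codes are bit flags (1 fatal, 2 error, 4 warning, 8 refactor, 16 convention, 32 usage error), so a small decoder is included for reading a single code. `combined_exit_code` and `describe` are hypothetical helper names, not part of the tool.

```python
# Pylint's documented exit-code bits.
PYLINT_BITS = {
    1: "fatal",
    2: "error",
    4: "warning",
    8: "refactor",
    16: "convention",
    32: "usage error",
}


def combined_exit_code(exit_codes):
    """Return the highest exit code, or 0 when no files were checked."""
    return max(exit_codes, default=0)


def describe(exit_code):
    """List the message categories encoded in one pylint exit code."""
    return [name for bit, name in PYLINT_BITS.items() if exit_code & bit]
```

Since every problem category sets a nonzero bit, the maximum is nonzero whenever any file has issues, which is all that make or a CI job needs to fail the build.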
Limitations & Future Ideas
Current Limitations
- No automatic cross-file dependency tracking: If file_a.py imports file_b.py and file_b.py changes, we won't automatically re-check file_a.py unless you use the monitor script.
  - Solution: Use pylint-cache-monitor.sh to periodically trigger full re-analysis
- Single-threaded: Files are checked sequentially (though this is still faster than pylint due to caching)
Potential Future Enhancements
Want to help extend this? Here are some ideas:
- 🔗 Detect changed transitive imports - Track import graphs and invalidate cache when dependencies change
- ⚡ Parallel execution - Check multiple files simultaneously
- 📊 Track errors over time - Historical tracking of what errors changed
- 📄 HTML reports - Generate browsable reports of issues
- 🔧 Multi-tool caching - Unified cache for ruff + pylint + mypy
- 🌐 Shared team cache - Central cache server for CI/CD
Pull requests welcome! Cache pylint results.