A Python tool to scan directories for DLL files and extract metadata
DLL Scanner
A powerful Python tool for scanning directories to find DLL files, extracting comprehensive metadata, and performing static code analysis to confirm dependencies.
Features
- 🔍 Recursive Directory Scanning: Scan entire directory trees for DLL files
- 📊 Comprehensive Metadata Extraction: Extract detailed information from PE headers, including:
  - Architecture and machine type
  - Version information (product, file, company) with enhanced Microsoft DLL support
  - Native win32api integration on Windows for more reliable version extraction
  - Import/export tables
  - Security characteristics
  - Digital signature status
- 🧠 Static Code Analysis: Analyze source code to confirm DLL dependencies with:
  - Support for multiple programming languages (C/C++, C#, Python, Java, etc.)
  - Pattern matching for LoadLibrary calls, DllImport attributes, and function references
  - Confidence scoring for dependency matches
- ⚡ Parallel Processing: Multi-threaded scanning for improved performance
- 🎨 Rich CLI Interface: Beautiful command-line interface with progress bars and formatted output
- 📄 Multiple Output Formats: JSON export and CycloneDX SBOM format for integration with other tools
- 🔒 Security & Compliance: CycloneDX SBOM generation for supply chain security analysis
- 🐍 Python API: Use as a library in your own projects
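As a rough illustration of what reading "architecture and machine type" from a PE header involves, the sketch below parses the COFF file header using only the standard library. It is not DLL Scanner's implementation (the tool depends on pefile for this); the function name `read_machine_type` and the small lookup table are assumptions made for the example.

```python
import struct

# A few well-known IMAGE_FILE_MACHINE_* values from the PE/COFF format.
MACHINE_NAMES = {0x014C: "x86", 0x8664: "x64", 0xAA64: "arm64"}
IMAGE_FILE_DLL = 0x2000  # Characteristics flag set on DLL images


def read_machine_type(data: bytes) -> tuple[str, bool]:
    """Return (architecture name, is_dll) for the raw bytes of a PE file."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ header")
    # The 4-byte field at offset 0x3C (e_lfanew) points to the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("not a PE file: missing PE signature")
    # The 20-byte COFF header follows the signature:
    # Machine, NumberOfSections, TimeDateStamp, PointerToSymbolTable,
    # NumberOfSymbols, SizeOfOptionalHeader, Characteristics.
    machine, _, _, _, _, _, characteristics = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4
    )
    name = MACHINE_NAMES.get(machine, f"unknown (0x{machine:04X})")
    return name, bool(characteristics & IMAGE_FILE_DLL)
```

To try it on a real file, pass the file contents, for example `read_machine_type(Path("example.dll").read_bytes())` with `Path` imported from `pathlib`.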
Installation
From PyPI
pip install dll-scanner
From Source
git clone https://github.com/FlaccidFacade/dll-scanner.git
cd dll-scanner
# For basic usage
pip install -e .
# For development with all dev tools (recommended for contributors)
pip install -e ".[dev]"
The dev extra (installed with ".[dev]") pulls in the development dependencies defined in pyproject.toml: testing tools (pytest, pytest-cov), code formatting (black), linting (flake8), type checking (mypy), and pre-commit hooks.
Quick Start
Scan a Directory
# Basic directory scan
dll-scanner scan /path/to/project
# Recursive scan with dependency analysis
dll-scanner scan /path/to/project --analyze-dependencies --source-dir /path/to/source
# Save results to JSON
dll-scanner scan /path/to/project --output results.json
Inspect a Single DLL
dll-scanner inspect path/to/file.dll
Analyze Dependencies
dll-scanner analyze /path/to/source file1.dll file2.dll --output analysis.json
Usage Examples
Command Line Interface
Basic Directory Scan
# Scan current directory recursively
dll-scanner scan .
# Scan specific directory without recursion
dll-scanner scan /path/to/dlls --no-recursive
# Use custom number of worker threads
dll-scanner scan /path/to/dlls --max-workers 8
Dependency Analysis
# Analyze DLL dependencies in source code
dll-scanner scan /path/to/project \
    --analyze-dependencies \
    --source-dir /path/to/source \
    --output full_analysis.json
CycloneDX SBOM Export
# Export scan results as CycloneDX SBOM
dll-scanner scan /path/to/project \
    --cyclonedx \
    --project-name "MyProject" \
    --project-version "2.1.0" \
    --output project_sbom.json
# Combine with dependency analysis
dll-scanner scan /path/to/project \
    --analyze-dependencies \
    --source-dir /path/to/source \
    --cyclonedx \
    --project-name "MyProject" \
    --project-version "2.1.0" \
    --output project_sbom.json
# Export single DLL as CycloneDX SBOM
dll-scanner inspect mylib.dll --cyclonedx --output mylib_sbom.json
Page Generation
# Generate GitHub Pages content
dll-scanner generate-pages --output ./pages-output --generate-data
# Generate pages with specific scan results
dll-scanner generate-pages \
    --input scan_results.json \
    --project-name "My Project" \
    --output ./pages-output
Single File Inspection
# Inspect a specific DLL file
dll-scanner inspect kernel32.dll --output kernel32_metadata.json
Python API
from dll_scanner import DLLScanner, DependencyAnalyzer
from pathlib import Path
# Initialize scanner
scanner = DLLScanner(max_workers=4)
# Scan directory
result = scanner.scan_directory(Path("/path/to/project"))
print(f"Found {result.total_dlls_found} DLL files")
for dll in result.dll_files:
    print(f"- {dll.file_name}: {dll.architecture}, {dll.company_name}")
# Analyze dependencies
analyzer = DependencyAnalyzer()
for dll_metadata in result.dll_files:
    analysis = analyzer.analyze_dll_dependencies(
        dll_metadata,
        Path("/path/to/source")
    )
    print(f"{dll_metadata.file_name}: {len(analysis.confirmed_dependencies)} confirmed")
CycloneDX SBOM Export
from dll_scanner import DLLScanner, CycloneDXExporter
from pathlib import Path
# Scan directory
scanner = DLLScanner()
result = scanner.scan_directory(Path("/path/to/project"))
# Export to CycloneDX SBOM
exporter = CycloneDXExporter()
cyclonedx_json = exporter.export_to_json(
    result,
    project_name="MyProject",
    project_version="1.0.0",
    output_file=Path("project_sbom.json")
)
# Get SBOM summary
bom = exporter.export_to_cyclonedx(result, project_name="MyProject")
summary = exporter.get_component_summary(bom)
print(f"SBOM contains {summary['total_components']} components")
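For orientation, a CycloneDX JSON document is essentially a short header plus a list of components. The sketch below builds a minimal one by hand from DLL metadata dictionaries. It only illustrates the format and is not the output of CycloneDXExporter; the function name `minimal_bom`, the input field names (borrowed from the scan results JSON), and the choice of spec version 1.5 are assumptions.

```python
import json


def minimal_bom(dlls: list[dict], project_name: str, project_version: str) -> dict:
    """Build a bare-bones CycloneDX-style document with one component per DLL."""
    components = [
        {
            "type": "library",
            "name": dll["file_name"],
            "version": dll.get("file_version", "unknown"),
        }
        for dll in dlls
    ]
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",  # assumed; pick the version your tooling expects
        "version": 1,
        "metadata": {
            "component": {
                "type": "application",
                "name": project_name,
                "version": project_version,
            }
        },
        "components": components,
    }


bom = minimal_bom(
    [{"file_name": "example.dll", "file_version": "2.1.0.123"}],
    project_name="MyProject",
    project_version="1.0.0",
)
print(json.dumps(bom, indent=2))
```

A real SBOM normally carries more detail (hashes, suppliers, a serial number), which is the reason to prefer the exporter over hand-built documents.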
Advanced Usage
Custom Progress Callback
from pathlib import Path
from rich.console import Console
from dll_scanner import DLLScanner
console = Console()
def progress_callback(message):
    # Print each scanner status message in a dim style
    console.print(f"[dim]{message}[/dim]")
scanner = DLLScanner(progress_callback=progress_callback)
result = scanner.scan_directory(Path("/path/to/project"))
Filtering and Analysis
# Get summary statistics
stats = scanner.get_summary_stats(result)
print(f"Architectures found: {stats['architectures']}")
print(f"Most common imports: {stats['most_common_imports']}")
# Filter DLLs by criteria
x64_dlls = [dll for dll in result.dll_files if dll.architecture == 'x64']
unsigned_dlls = [dll for dll in result.dll_files if not dll.is_signed]
GitHub Pages
DLL Scanner includes a comprehensive GitHub Pages integration for hosting interactive web tools and documentation.
Features
- 📊 Interactive Scan Results Viewer: Upload and analyze DLL scan results with charts, filtering, and export capabilities
- 📋 Dynamic Changelog: Auto-generated changelog viewer with search and filtering
- 🏠 Project Documentation: Complete project overview and getting started guide
- 🔗 URL Integration: Direct linking to scan results and specific changelog versions
Accessing the Pages
Visit the GitHub Pages site at: https://flaccidfacade.github.io/dll-scanner
Page Generation
Generate static pages for your own results:
# Generate basic pages
dll-scanner generate-pages --output ./my-pages --generate-data
# Generate pages with specific scan results
dll-scanner generate-pages \
    --input my_scan_results.json \
    --project-name "My Project Analysis" \
    --output ./my-pages
# Serve locally for testing
python -m http.server 8000 -d ./my-pages
Interactive Tools
Scan Results Viewer
- File Upload: Drag and drop JSON scan results
- URL Loading: Load results from any accessible URL
- Live Filtering: Search and filter DLLs by name, architecture, or signing status
- Visualizations: Architecture distribution charts and company breakdowns
- Export Options: JSON, CSV, and summary report exports
Changelog Browser
- GitHub Integration: Automatically loads latest changelog from repository
- Search & Filter: Find specific versions or change types
- Timeline View: Visual timeline with version badges
URL Parameters
Direct link to specific data:
# Load specific scan results
https://your-pages-url/pages/scan-results.html?url=data/my_scan.json
# Filter changelog to specific version
https://your-pages-url/pages/changelog.html?version=1.0.0
For more details, see the Pages Documentation.
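If you generate these links from a script, building the query string with the standard library avoids escaping mistakes. The sketch below assumes the /pages/ layout and the url and version parameters shown in the URL Parameters examples; the helper name `viewer_link` is hypothetical and not part of the package.

```python
from urllib.parse import urlencode


def viewer_link(base_url: str, page: str, **params: str) -> str:
    """Build a link such as <base>/pages/scan-results.html?url=data/my_scan.json."""
    # safe="/" keeps slashes in relative paths readable instead of "%2F".
    query = urlencode(params, safe="/")
    return f"{base_url.rstrip('/')}/pages/{page}?{query}"


print(viewer_link("https://your-pages-url", "scan-results.html", url="data/my_scan.json"))
print(viewer_link("https://your-pages-url", "changelog.html", version="1.0.0"))
```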
Output Format
Scan Results (JSON)
{
  "scan_path": "/path/to/project",
  "recursive": true,
  "total_files_scanned": 42,
  "total_dlls_found": 15,
  "scan_duration_seconds": 2.34,
  "errors": [],
  "dll_files": [
    {
      "file_name": "kernel32.dll",
      "file_path": "/path/to/kernel32.dll",
      "file_size": 663552,
      "architecture": "x64",
      "machine_type": "amd64",
      "company_name": "Microsoft Corporation",
      "product_name": "Microsoft® Windows® Operating System",
      "product_version": "10.0.19041.1901",
      "file_version": "10.0.19041.1901 (WinBuild.160101.0800)",
      "file_description": "Windows NT Base API Client DLL",
      "internal_name": "kernel32",
      "legal_copyright": "© Microsoft Corporation. All rights reserved.",
      "original_filename": "KERNEL32.DLL",
      "imported_dlls": ["ntdll.dll", "KERNELBASE.dll"],
      "exported_functions": ["CreateFileA", "CreateFileW", "ReadFile", "WriteFile"],
      "is_signed": true,
      "checksum": "0x5B2D1E8F"
    },
    {
      "file_name": "example.dll",
      "file_path": "/path/to/example.dll",
      "file_size": 65536,
      "architecture": "x64",
      "machine_type": "amd64",
      "company_name": "Example Corporation",
      "product_version": "2.1.0",
      "file_version": "2.1.0.123",
      "imported_dlls": ["kernel32.dll", "user32.dll"],
      "exported_functions": ["ExampleFunction", "AnotherFunction"],
      "is_signed": false
    }
  ]
}
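A saved results file can be post-processed with the standard library alone. The sketch below assumes the JSON layout shown above; the helper name `summarize_scan` is hypothetical and not part of the package, and the inline sample stands in for a real results.json.

```python
import json
from collections import Counter


def summarize_scan(results: dict) -> dict:
    """Count DLLs per architecture and list the unsigned ones."""
    dlls = results.get("dll_files", [])
    architectures = Counter(dll.get("architecture", "unknown") for dll in dlls)
    return {
        "by_architecture": dict(architectures),
        "unsigned": [dll["file_name"] for dll in dlls if not dll.get("is_signed", False)],
    }


# Inline sample mirroring a subset of the documented fields.
sample = json.loads("""
{"dll_files": [
  {"file_name": "kernel32.dll", "architecture": "x64", "is_signed": true},
  {"file_name": "example.dll", "architecture": "x64", "is_signed": false}
]}
""")
summary = summarize_scan(sample)
print(summary)
```

For a file on disk, replace the sample with `json.loads(Path("results.json").read_text(encoding="utf-8"))`, importing `Path` from `pathlib`.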
Dependency Analysis
{
  "summary": {
    "total_dlls_analyzed": 15,
    "dlls_with_confirmed_usage": 12,
    "dlls_potentially_unused": 3,
    "total_confirmed_dependencies": 28,
    "total_potential_dependencies": 5
  },
  "confirmed_dlls": [
    {
      "dll_name": "custom.dll",
      "confirmed_references": 3,
      "analysis_confidence": 0.95
    }
  ],
  "potentially_unused_dlls": [
    {
      "dll_name": "unused.dll",
      "file_size": 32768,
      "company": "Unknown"
    }
  ]
}
Supported Languages for Dependency Analysis
The static code analyzer can detect DLL dependencies in the following languages:
- C/C++: LoadLibrary calls, #pragma comment(lib, ...) directives, function references
- C#: DllImport attributes, P/Invoke declarations
- Python: ctypes library usage, LoadLibrary calls
- Java: JNI library loading
- JavaScript/TypeScript: Node.js native module references
- Go: CGO library references
- Rust: FFI declarations
- PHP: dl() function calls
- Ruby: DL library usage
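To make the idea of pattern matching with confidence scoring concrete, here is a minimal sketch covering three of the languages listed above. The regular expressions, the confidence weights, and the function name `find_dll_references` are all assumptions chosen for illustration; the analyzer shipped with DLL Scanner may use different patterns and scoring.

```python
import re

# Illustrative (pattern, confidence) pairs; the weights are arbitrary.
PATTERNS = [
    # C/C++: LoadLibrary("foo.dll"), including A/W/Ex variants and L"" or TEXT("")
    (re.compile(r'LoadLibrary(?:Ex)?[AW]?\s*\(\s*(?:L|TEXT\()?\s*"([^"]+\.dll)"', re.I), 0.9),
    # C#: [DllImport("foo.dll")] or [DllImport("foo")]
    (re.compile(r'\[\s*DllImport\s*\(\s*"([^"]+)"', re.I), 0.95),
    # Python: ctypes.CDLL("foo.dll"), ctypes.WinDLL("foo.dll")
    (re.compile(r'ctypes\.(?:CDLL|WinDLL|OleDLL)\s*\(\s*r?["\']([^"\']+\.dll)["\']', re.I), 0.9),
]


def find_dll_references(source: str) -> dict[str, float]:
    """Map each referenced DLL name (lowercased) to its highest confidence."""
    found: dict[str, float] = {}
    for pattern, confidence in PATTERNS:
        for match in pattern.finditer(source):
            # Keep only the file name if the reference includes a path.
            name = match.group(1).replace("\\", "/").rsplit("/", 1)[-1].lower()
            if not name.endswith(".dll"):
                name += ".dll"  # DllImport("user32") may omit the extension
            found[name] = max(found.get(name, 0.0), confidence)
    return found


print(find_dll_references('[DllImport("user32")] static extern int MessageBox();'))
print(find_dll_references('HMODULE h = LoadLibraryW(L"Custom.dll");'))
```

Regex matching of this kind cannot see names built at runtime, so a missing match does not prove that a DLL is unused.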
Requirements
Runtime Requirements
- Windows operating system (for DLL scanning functionality)
- Python 3.9+
- pefile >= 2023.2.7
- click >= 8.0.0
- rich >= 13.0.0
- pathlib-mate >= 1.0.0
- cyclonedx-bom >= 4.0.0 (for CycloneDX SBOM export)
- pywin32 >= 306 (Windows only) - For enhanced version extraction using native Windows APIs
Development and Testing
- Cross-platform support: Tests run on Windows, Ubuntu, and Debian
- While the primary functionality requires Windows DLLs, the codebase is designed to be maintainable across platforms
- Static code analysis features work on any platform
Development
Setting up Development Environment
git clone https://github.com/FlaccidFacade/dll-scanner.git
cd dll-scanner
# Install in development mode with dev dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Run tests with coverage
pytest --cov=dll_scanner
# Format code
black src/ tests/
# Set up pre-commit hooks (automatically run Black before each commit)
pip install pre-commit
pre-commit install
# Type checking
mypy src/
# Linting
flake8 src/
Cross-Platform Testing
This project uses GitHub Actions to test on multiple platforms:
- Windows: Full functionality testing with actual DLL files
- Ubuntu: Core functionality and code quality testing
- Debian: Additional Linux distribution testing using Docker containers
While the primary DLL scanning functionality requires Windows, the test suite ensures code quality and maintainability across platforms.
Coverage Reporting
The project uses Codecov for coverage reporting and analysis:
- Coverage reports are automatically generated on every CI run
- Coverage badges show current test coverage status
- Detailed reports available at https://app.codecov.io/gh/FlaccidFacade/dll-scanner
# Generate coverage report locally
pytest --cov=dll_scanner --cov-report=html --cov-report=term-missing
# View HTML coverage report
open htmlcov/index.html # macOS
start htmlcov/index.html # Windows
xdg-open htmlcov/index.html # Linux
Building and Publishing
Automated Publishing (Recommended)
The project uses GitHub Actions for automated publishing to PyPI with OIDC trusted publishing:
- Automatic: Publishing happens automatically when a new release is created on GitHub
- Manual: You can manually trigger publishing using the "Publish to PyPI" workflow in the Actions tab
The workflow file .github/workflows/publish.yml handles:
- Building the package
- Running quality checks
- OIDC token minting for secure authentication
- Publishing to PyPI or Test PyPI with proper audience configuration
- Environment protection with pypi and test-pypi environments
Security: Uses OIDC trusted publishing - no API tokens required!
Manual Publishing
# Build package
python -m build
# Check package quality
twine check dist/*
# Upload to PyPI
twine upload dist/*
Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Troubleshooting
Common Issues
"pefile library is required"
pip install pefile
Permission denied errors
- Run as administrator or ensure you have read permissions for the target directory
ImportError with optional dependencies
pip install "dll-scanner[dev]"
Performance Tips
- Use --max-workers to control memory usage vs. speed
- Disable --parallel for very large numbers of small files
- Use --no-recursive when you only need files in the target directory
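The trade-off behind these tips can be seen in a small thread-pool sketch. This is not the scanner's internal code: the function name `scan_sizes` is hypothetical, and reading the file size stands in for the real per-file work such as parsing PE headers.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def scan_sizes(root: Path, recursive: bool = True, max_workers: int = 4) -> dict[Path, int]:
    """Collect the size of every *.dll file under root using a thread pool."""
    pattern = "**/*.dll" if recursive else "*.dll"
    # Note: glob matching is case-sensitive on most non-Windows filesystems.
    paths = [p for p in root.glob(pattern) if p.is_file()]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each worker handles one file at a time; more workers means more
        # files in flight at once, and therefore more memory in use.
        sizes = pool.map(lambda p: p.stat().st_size, paths)
        return dict(zip(paths, sizes))
```

With many tiny files, the cost of handing work to the pool can outweigh the gain, which is why a sequential pass is sometimes faster.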