RE-cue: Universal reverse engineering toolkit for multi-framework codebases
RE-cue - Python Implementation
A Python command-line tool for reverse-engineering specifications from existing codebases with multi-framework support. This implementation extends the original bash script with enhanced templating, cross-platform compatibility, and support for multiple technology stacks.
Features
- Multi-Framework Support: Java (Spring Boot), Node.js (Express, NestJS), Python (Django, Flask, FastAPI), .NET (ASP.NET Core)
- Automatic Discovery: Finds API endpoints, data models, views, and services
- Multiple Formats: Generates Markdown and JSON specifications
- OpenAPI Support: Creates OpenAPI 3.0 API contracts
- Advanced Templating: Jinja2-powered templates with conditionals, loops, and filters
- Interactive Use Case Refinement: Edit and improve generated use cases through a text-based interface
- Comprehensive Testing: 90+ test cases for quality assurance
- Minimal Dependencies: Only PyYAML and Jinja2 required
- Cross-Platform: Works on macOS, Linux, and Windows
- Interactive Progress: Real-time feedback with analysis stages
- Performance Optimizations: Caching, parallel processing, and incremental analysis for large codebases
Installation
From Source
# Clone the repository
git clone https://github.com/cue-3/re-cue.git
cd re-cue/reverse-engineer-python
# Install the package
pip install -e .
# Or install directly
python setup.py install
Using pip (if published)
pip install re-cue
Usage
The Python CLI tool has the same interface as the bash script:
Basic Usage
# Generate specification document
recue --spec --description "forecast sprint delivery"
# Generate implementation plan
recue --plan
# Generate data model documentation
recue --data-model
# Generate OpenAPI contract
recue --api-contract
# Analyze a project at a specific path
recue --spec --path /path/to/project --description "external project"
# Generate everything
recue --spec --plan --data-model --api-contract --description "project description"
Options
--spec Generate specification document (spec.md)
--plan Generate implementation plan (plan.md)
--data-model Generate data model documentation (data-model.md)
--api-contract Generate API contract (api-spec.json)
--use-cases Generate use case analysis (phase1-4 documents)
--refine-use-cases FILE Interactively refine existing use cases from FILE
-d, --description TEXT Project description (required for --spec)
-o, --output PATH Output file path (default: specs/001-reverse/spec.md)
-f, --format FORMAT Output format: markdown or json (default: markdown)
-p, --path PATH Path to project directory (default: auto-detect)
-v, --verbose Show detailed analysis progress
--help Show help message
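The option list above maps naturally onto an argparse parser. The following is only a sketch of how such an interface could be declared, not RE-cue's actual cli.py: the function names build_parser and parse_args are assumptions made for illustration, and only a subset of the flags is shown.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring a subset of the documented options (hypothetical)."""
    parser = argparse.ArgumentParser(prog="recue")
    parser.add_argument("--spec", action="store_true", help="Generate spec.md")
    parser.add_argument("--plan", action="store_true", help="Generate plan.md")
    parser.add_argument("--data-model", action="store_true", help="Generate data-model.md")
    parser.add_argument("--api-contract", action="store_true", help="Generate api-spec.json")
    parser.add_argument("-d", "--description", help="Project description (required for --spec)")
    parser.add_argument("-o", "--output", default="specs/001-reverse/spec.md")
    parser.add_argument("-f", "--format", choices=["markdown", "json"], default="markdown")
    parser.add_argument("-p", "--path", default=None, help="Project directory (auto-detect if omitted)")
    parser.add_argument("-v", "--verbose", action="store_true")
    return parser


def parse_args(argv: list[str]) -> argparse.Namespace:
    """Parse argv and enforce that --spec is accompanied by --description."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.spec and not args.description:
        # parser.error prints a usage message and exits with status 2
        parser.error("--description parameter is required")
    return args
```

Calling parse_args(["--spec", "--description", "forecast sprint delivery"]) would return a namespace with spec set to True and the default output path, while parse_args(["--spec"]) exits with an error, matching the rule that --spec needs a description.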
Interactive Use Case Refinement
After generating use cases, you can interactively refine them:
# Generate initial use cases
recue --use-cases --path /path/to/project
# Refine use cases interactively
recue --refine-use-cases re-myproject/phase4-use-cases.md
Features:
- Edit use case names and descriptions
- Modify actors (primary and secondary)
- Add, edit, or remove preconditions and postconditions
- Refine main scenario steps
- Add extension scenarios for error handling
- Automatic backup before saving changes
- Preserves document structure and metadata
See docs/INTERACTIVE-USE-CASE-REFINEMENT.md for a detailed guide.
Performance Optimization Options (for Large Codebases)
For projects with 1000+ files, RE-cue offers several performance optimizations:
--cache Enable result caching for faster re-runs (default: enabled)
--no-cache Disable result caching
--clear-cache Clear all cached results before analysis
--cache-stats Display cache statistics and exit
--cleanup-cache Clean up expired and invalid cache entries
--parallel Enable parallel file processing (default: enabled)
--no-parallel Disable parallel processing
--incremental Enable incremental analysis - skip unchanged files (default: enabled)
--no-incremental Disable incremental analysis - analyze all files
--max-workers N Maximum number of worker processes (default: CPU count)
Performance Features:
- Caching System: Stores analysis results for unchanged files (new)
  - File-level caching based on SHA-256 content hash
  - 5-10x speedup on re-runs for unchanged codebases
  - Persistent cache storage survives restarts
  - Automatic cache invalidation when files change
  - Support for multiple analysis types per file
  - Cache statistics tracking (hit rate, size, entries)
  - See Caching Documentation for details
- Parallel Processing: Analyzes multiple files concurrently using multiprocessing
  - Automatically uses optimal worker count based on CPU cores
  - Graceful error handling with configurable thresholds
  - Clean shutdown on interruption (Ctrl+C)
- Incremental Analysis: Skips unchanged files on re-analysis
  - Tracks file metadata (size, modification time)
  - In a benchmark on a 1200-file Python project, incremental analysis provided a 5.96x speedup on repeated runs. Actual speedup may vary depending on project size and file change frequency.
  - JSON-based state persistence across runs
  - Automatic change detection for modified files
  - Works alongside caching for maximum performance
- Memory Efficient: Handles large files safely
  - Configurable file size limits (default: 10MB per file)
  - Stream-based reading with error recovery
  - Prevents memory exhaustion on huge files
- Progress Reporting: Live updates during analysis
  - Real-time progress bars with percentage and ETA
  - Error tracking and summary reporting
  - Verbose mode for detailed diagnostics
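To make the caching idea above concrete, here is a minimal sketch of a file-level cache keyed by SHA-256 content hash and analysis type. It is an illustration under stated assumptions, not RE-cue's implementation: the class name ContentHashCache, the JSON cache file, and the analyze callback are all hypothetical.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


class ContentHashCache:
    """Hypothetical cache keyed by (analysis type, content hash)."""

    def __init__(self, cache_file: Path) -> None:
        self.cache_file = cache_file
        self.entries: dict[str, object] = {}
        if cache_file.exists():
            # Persistent storage: reload entries written by an earlier run
            self.entries = json.loads(cache_file.read_text(encoding="utf-8"))
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, path: Path, analysis_type: str,
                       analyze: Callable[[Path], object]) -> object:
        """Return the cached result, or run analyze(path) and store its result."""
        # A changed file produces a new hash, so stale entries are never matched
        key = f"{analysis_type}:{sha256_of(path)}"
        if key in self.entries:
            self.hits += 1
            return self.entries[key]
        self.misses += 1
        result = analyze(path)  # must be JSON-serializable in this sketch
        self.entries[key] = result
        return result

    def save(self) -> None:
        """Write all entries to the cache file as JSON."""
        self.cache_file.write_text(json.dumps(self.entries), encoding="utf-8")
```

Because the key is derived from file content rather than from the path, editing a file invalidates its entry automatically, and the hits and misses counters give the raw numbers for a hit-rate statistic.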
Example Usage:
# Analyze large codebase with all optimizations (default)
recue --spec --path ~/large-project
# View cache statistics
recue --cache-stats --path ~/large-project
# Clear cache before analysis
recue --clear-cache --spec --path ~/large-project
# Clean up invalid cache entries
recue --cleanup-cache --path ~/large-project
# Use 8 worker processes for faster analysis
recue --spec --max-workers 8 --path ~/large-project
# Force full re-analysis (disable caching and incremental)
recue --no-cache --no-incremental --spec --path ~/large-project
# Sequential processing for debugging
recue --spec --no-parallel --verbose --path ~/large-project
# Optimal for very large projects (1000+ files)
recue --spec --verbose --max-workers 16 --path ~/enterprise-app
Performance Benchmarks:
- Test project: 225 files (50 controllers, 100 models, 75 services)
- First analysis: ~0.023s
- Re-analysis with caching (unchanged): ~0.002s (11x speedup) (new)
- Re-analysis with incremental (unchanged): ~0.004s (5.96x speedup)
- Memory usage: Minimal, scales linearly with worker count
- Cache overhead: ~2MB for 1000 files
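The incremental re-analysis figure above depends on skipping files whose metadata has not changed since the last run. A rough sketch of that check follows, assuming a JSON state file that maps each path to its size and modification time; the function names and the state layout are assumptions for illustration, not the tool's actual format.

```python
import json
import os
from pathlib import Path


def load_state(state_file: Path) -> dict[str, list[int]]:
    """Load metadata saved by the previous run, or an empty state on the first run."""
    if state_file.exists():
        return json.loads(state_file.read_text(encoding="utf-8"))
    return {}


def changed_files(files: list[Path], state: dict[str, list[int]]) -> list[Path]:
    """Return files whose size or mtime differs from the state, updating it in place."""
    changed = []
    for path in files:
        stat = os.stat(path)
        fingerprint = [stat.st_size, stat.st_mtime_ns]
        if state.get(str(path)) != fingerprint:
            changed.append(path)
            state[str(path)] = fingerprint
    return changed


def save_state(state_file: Path, state: dict[str, list[int]]) -> None:
    """Persist the metadata so the next run can skip unchanged files."""
    state_file.write_text(json.dumps(state), encoding="utf-8")
```

On a second run over an untouched tree, changed_files returns an empty list, so only the metadata lookups are paid for. A metadata check is cheaper than hashing but can miss an edit that preserves both size and timestamp, which is one reason to combine it with content-hash caching.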
Examples
# Generate spec with custom output location
recue --spec --description "manage orders" --output docs/api-spec.md
# Generate JSON format specification
recue --spec --description "track inventory" --format json
# Analyze a project in a different directory
recue --spec --path ~/projects/my-app --description "external codebase"
# Verbose mode for debugging
recue --spec --plan --verbose --description "process payments"
# Generate API contract for documentation
recue --api-contract --output api-docs/openapi.json
What Gets Generated
Specification (spec.md)
- User stories with acceptance criteria
- Functional requirements
- Success criteria
- Technical implementation details
- Discovered endpoints, models, and services
Implementation Plan (plan.md)
- Technical context and architecture
- Complexity tracking
- API contract documentation
- Key decisions and rationale
- Testing strategy
Data Model Documentation (data-model.md)
- Detailed field information for each model
- Relationships between models
- Usage patterns
- MongoDB/JPA annotations
API Contract (api-spec.json)
- OpenAPI 3.0 specification
- Complete endpoint documentation
- Request/response schemas
- Authentication requirements
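As an illustration of what an OpenAPI 3.0 contract contains, the sketch below assembles a minimal document from a list of discovered endpoints. The endpoint dictionaries (path, method, summary, authenticated) and the bearerAuth scheme name are assumptions for this example; the tool's real output will be richer and may be structured differently.

```python
import json


def build_openapi(title: str, version: str, endpoints: list[dict]) -> dict:
    """Assemble a minimal OpenAPI 3.0 document from hypothetical endpoint records."""
    paths: dict[str, dict] = {}
    for endpoint in endpoints:
        operation = {
            "summary": endpoint.get("summary", ""),
            # OpenAPI requires at least one response with a description
            "responses": {"200": {"description": "Successful response"}},
        }
        if endpoint.get("authenticated"):
            operation["security"] = [{"bearerAuth": []}]
        # Group operations by path, keyed by lowercase HTTP method
        paths.setdefault(endpoint["path"], {})[endpoint["method"].lower()] = operation
    return {
        "openapi": "3.0.3",
        "info": {"title": title, "version": version},
        "paths": paths,
        "components": {
            "securitySchemes": {"bearerAuth": {"type": "http", "scheme": "bearer"}}
        },
    }


def to_json(document: dict) -> str:
    """Serialize the document the way an api-spec.json file might be written."""
    return json.dumps(document, indent=2)
```

Two endpoints that share a path, such as GET and POST on /orders, end up under a single paths entry with one operation per method.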
Advanced Templating with Jinja2
The Python version now uses Jinja2 as its template engine, enabling sophisticated template features:
Key Capabilities
Conditional Sections: Show content only when relevant
{% if actor_count > 0 %}
## Actors ({{actor_count}})
{% for actor in actors %}
- {{actor.name}} ({{actor.type}})
{% endfor %}
{% endif %}
Loops: Iterate over collections
{% for endpoint in endpoints %}
{{loop.index}}. {{endpoint.method}} {{endpoint.path}}
{% endfor %}
Filters: Transform data during rendering
{{project_name | upper}} {# MY PROJECT #}
{{text | replace('_', ' ') | title}} {# Hello World #}
{{items | length}} {# 5 #}
Complex Logic: Multi-level conditionals
{% if score >= 90 %}A
{% elif score >= 80 %}B
{% else %}F
{% endif %}
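The snippets above can be tried directly from Python with the jinja2 package. This is a small standalone sketch, not the project's template loader: the render helper and the sample data are assumptions, and trim_blocks/lstrip_blocks are enabled here only to keep the block tags from leaving blank lines in the output.

```python
from jinja2 import Environment

TEMPLATE = """\
{% if actor_count > 0 %}
## Actors ({{ actor_count }})
{% for actor in actors %}
- {{ actor.name }} ({{ actor.type }})
{% endfor %}
{% endif %}
Title: {{ text | replace('_', ' ') | title }}
"""

# trim_blocks removes the newline after a block tag; lstrip_blocks strips
# whitespace before it, so control lines do not show up in the output.
env = Environment(trim_blocks=True, lstrip_blocks=True)
template = env.from_string(TEMPLATE)


def render(actors: list[dict], text: str) -> str:
    """Render the sample template for the given actors and title text."""
    return template.render(actors=actors, actor_count=len(actors), text=text)


if __name__ == "__main__":
    print(render(
        [{"name": "Admin", "type": "internal"}, {"name": "Customer", "type": "external"}],
        "hello_world",
    ))
```

With an empty actor list the conditional section is omitted and only the title line is rendered.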
Documentation
For complete templating documentation and examples:
- See docs/JINJA2-TEMPLATE-GUIDE.md for the full guide
- See docs/JINJA2-GENERATOR-EXAMPLES.md for practical examples
- Check templates/common/example-jinja2-features.md for template examples
All existing templates remain compatible - Jinja2 adds capabilities without breaking changes.
Analysis Stages
The tool performs discovery in 5 interactive stages with real-time progress feedback:
Starting project analysis...
Stage 1/5: Discovering API endpoints... ✓ Found X endpoints
Stage 2/5: Analyzing data models... ✓ Found X models
Stage 3/5: Discovering UI views... ✓ Found X views
Stage 4/5: Detecting backend services... ✓ Found X services
Stage 5/5: Extracting features... ✓ Identified X features
✓ Analysis complete!
Each stage completes independently, providing immediate feedback on discovery progress. Use --verbose for detailed logging within each stage.
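Staged progress output of this kind can be produced with a simple loop over stage definitions. The sketch below is illustrative only: run_stages and the (action, noun, callable) tuple layout are assumptions, and the discovery functions are stand-ins for real analysis code.

```python
from typing import Callable

Stage = tuple[str, str, Callable[[], int]]


def run_stages(stages: list[Stage]) -> dict[str, int]:
    """Run each stage in order, printing progress as soon as it finishes."""
    print("Starting project analysis...")
    counts: dict[str, int] = {}
    total = len(stages)
    for index, (action, noun, discover) in enumerate(stages, start=1):
        # Print the stage header first so the user sees it while discovery runs
        print(f"Stage {index}/{total}: {action}...", end=" ", flush=True)
        found = discover()
        counts[noun] = found
        print(f"Found {found} {noun}")
    print("Analysis complete!")
    return counts


if __name__ == "__main__":
    run_stages([
        ("Discovering API endpoints", "endpoints", lambda: 12),
        ("Analyzing data models", "models", lambda: 7),
    ])
```

Flushing after the stage header is what makes the feedback feel immediate, since the result is appended to the same line only once that stage's discovery has returned.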
Project Structure
reverse_engineer/
├── __init__.py # Package initialization
├── __main__.py # Module entry point
├── cli.py # Command-line interface
├── analyzer.py # Project analysis logic
├── generators.py # Documentation generators
├── phase_manager.py # Phase execution management
├── utils.py # Utility functions
└── templates/ # Jinja2 template system
    ├── template_loader.py # Template loading logic
    ├── template_validator.py # Template validation
    ├── common/ # Common templates
    └── frameworks/ # Framework-specific templates
setup.py # Package setup
requirements.txt # Dependencies
README-PYTHON.md # This file
tests/ # Test suite (90+ tests)
Supported Project Types
The tool can analyze multiple technology stacks:
- Java: Spring Boot applications (2.x, 3.x) with Maven/Gradle
- Node.js: Express and NestJS applications
- Python: Django, Flask, and FastAPI applications
- .NET: ASP.NET Core applications (6.0+)
- Frontend: Vue.js, React, Angular applications
- Multiple frameworks in the same project
For framework-specific details, see the Framework Guides.
Comparison with Bash Version
| Feature | Bash Script | Python CLI |
|---|---|---|
| Endpoint Discovery | Yes | Yes |
| Model Analysis | Yes | Yes |
| View Discovery | Yes | Yes |
| Service Detection | Yes | Yes |
| OpenAPI Generation | Yes | Yes |
| Cross-Platform | Limited (Unix only) | Yes (All platforms) |
| Installation | Copy script | pip install |
| Speed | Fast | Fast |
| Extensibility | Limited | Easy to extend |
Development
Running Tests
# Install development dependencies
pip install pytest pytest-cov
# Run tests
pytest tests/
# Run with coverage
pytest --cov=reverse_engineer tests/
Code Style
# Install development tools
pip install black flake8 mypy
# Format code
black reverse_engineer/
# Lint
flake8 reverse_engineer/
# Type check
mypy reverse_engineer/
Troubleshooting
Common Issues
"Error: Could not determine repository root"
- Make sure you're running the command from within a git repository
- Or ensure a .specify directory exists in your project
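The two hints above suggest that the root is located by looking for a .git entry or a .specify directory. A minimal sketch of such a lookup, walking upward from a starting directory, is shown here; find_repository_root is a hypothetical helper written for illustration, not the function RE-cue uses.

```python
from pathlib import Path
from typing import Optional


def find_repository_root(start: Path) -> Optional[Path]:
    """Walk upward from start until a directory containing .git or .specify is found."""
    current = start.resolve()
    for candidate in [current, *current.parents]:
        # .git may be a directory or, in worktrees and submodules, a file
        if (candidate / ".git").exists() or (candidate / ".specify").is_dir():
            return candidate
    return None
```

If this returns None for your working directory, running git init or creating an empty .specify directory at the project root should give the search something to find.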
"Error: --description parameter is required"
- The --spec flag requires a description parameter
- Use: recue --spec --description "your project description"
No endpoints found
- Check that your project follows standard naming conventions
- Controllers should be in src/main/java/.../controller/ or similar
- Files should end with Controller.java
Models not detected
- Ensure model files are in standard locations
- src/main/java/.../model/ or entity/ or domain/
- Files should be plain Java POJOs with private fields
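To check whether a Java project matches the conventions described above, a quick script like the following can list the files that a name-and-location heuristic would pick up. It is an approximation written for this guide, with assumed function names, and does not reproduce the analyzer's actual detection rules.

```python
from pathlib import Path

MODEL_DIR_NAMES = {"model", "entity", "domain"}


def _parent_dirs(path: Path, root: Path) -> set[str]:
    """Return the names of the directories between root and the file."""
    return set(path.relative_to(root).parts[:-1])


def find_controllers(project_root: Path) -> list[Path]:
    """Files named *Controller.java that live under a controller/ directory."""
    return sorted(
        path for path in project_root.rglob("*Controller.java")
        if "controller" in _parent_dirs(path, project_root)
    )


def find_models(project_root: Path) -> list[Path]:
    """Java files located under a model/, entity/, or domain/ directory."""
    return sorted(
        path for path in project_root.rglob("*.java")
        if MODEL_DIR_NAMES & _parent_dirs(path, project_root)
    )
```

If both functions return empty lists for your project, the layout probably differs from the conventions listed in this section.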
Contributing
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
License
MIT License - see LICENSE file for details
Links
- GitHub Repository: https://github.com/cue-3/re-cue
- Documentation Website: https://cue-3.github.io/re-cue/
- Framework Guides: docs/frameworks/
- Original Bash Script: reverse-engineer-bash/reverse-engineer.sh
- Main Documentation: See README.md
RE-cue: Universal Reverse Engineering Toolkit