RE-cue: Universal reverse engineering toolkit for multi-framework codebases

These details have not been verified by PyPI

RE-cue - Python Implementation

A Python command-line tool for reverse-engineering specifications from existing codebases with multi-framework support. This implementation extends the original bash script with enhanced templating, cross-platform compatibility, and support for multiple technology stacks.

Features

🌐 Multi-Framework Support: Java (Spring Boot), Node.js (Express, NestJS), Python (Django, Flask, FastAPI), .NET (ASP.NET Core)
🔍 Automatic Discovery: Finds API endpoints, data models, views, and services
📝 Multiple Formats: Generates Markdown and JSON specifications
🎯 OpenAPI Support: Creates OpenAPI 3.0 API contracts
✨ Advanced Templating: Jinja2-powered templates with conditionals, loops, and filters
🎭 Interactive Use Case Refinement: Edit and improve generated use cases through a text-based interface
🧪 Comprehensive Testing: 90+ test cases for quality assurance
🚀 Minimal Dependencies: Only PyYAML and Jinja2 required
💻 Cross-Platform: Works on macOS, Linux, and Windows
📊 Interactive Progress: Real-time feedback with analysis stages
⚡ Performance Optimizations: Caching, parallel processing, and incremental analysis for large codebases

Installation

From Source

# Clone the repository
git clone https://github.com/cue-3/re-cue.git
cd re-cue/reverse-engineer-python

# Install the package
pip install -e .

# Or install a regular (non-editable) copy
pip install .

Using pip

pip install re-cue

Usage

The Python CLI tool has the same interface as the bash script:

Basic Usage

# Generate specification document
recue --spec --description "forecast sprint delivery"

# Generate implementation plan
recue --plan

# Generate data model documentation
recue --data-model

# Generate OpenAPI contract
recue --api-contract

# Analyze a project at a specific path
recue --spec --path /path/to/project --description "external project"

# Generate everything
recue --spec --plan --data-model --api-contract --description "project description"

Options

--spec                 Generate specification document (spec.md)
--plan                 Generate implementation plan (plan.md)
--data-model           Generate data model documentation (data-model.md)
--api-contract         Generate API contract (api-spec.json)
--use-cases            Generate use case analysis (phase1-4 documents)
--refine-use-cases FILE Interactively refine existing use cases from FILE

-d, --description TEXT Project description (required for --spec)
-o, --output PATH      Output file path (default: specs/001-reverse/spec.md)
-f, --format FORMAT    Output format: markdown or json (default: markdown)
-p, --path PATH        Path to project directory (default: auto-detect)
-v, --verbose          Show detailed analysis progress
--help                 Show help message

Interactive Use Case Refinement

After generating use cases, you can interactively refine them:

# Generate initial use cases
recue --use-cases /path/to/project

# Refine use cases interactively
recue --refine-use-cases re-myproject/phase4-use-cases.md

Features:

Edit use case names and descriptions
Modify actors (primary and secondary)
Add, edit, or remove preconditions and postconditions
Refine main scenario steps
Add extension scenarios for error handling
Automatic backup before saving changes
Preserves document structure and metadata

See docs/INTERACTIVE-USE-CASE-REFINEMENT.md for a detailed guide.

Performance Optimization Options (for Large Codebases)

For projects with 1000+ files, RE-cue offers several performance optimizations:

--cache               Enable result caching for faster re-runs (default: enabled)
--no-cache            Disable result caching
--clear-cache         Clear all cached results before analysis
--cache-stats         Display cache statistics and exit
--cleanup-cache       Clean up expired and invalid cache entries
--parallel            Enable parallel file processing (default: enabled)
--no-parallel         Disable parallel processing
--incremental         Enable incremental analysis - skip unchanged files (default: enabled)
--no-incremental      Disable incremental analysis - analyze all files
--max-workers N       Maximum number of worker processes (default: CPU count)

Performance Features:

Caching System: Stores analysis results for unchanged files ✨ NEW
- File-level caching based on SHA-256 content hash
- 5-10x speedup on re-runs for unchanged codebases
- Persistent cache storage survives restarts
- Automatic cache invalidation when files change
- Support for multiple analysis types per file
- Cache statistics tracking (hit rate, size, entries)
- See Caching Documentation for details
Parallel Processing: Analyzes multiple files concurrently using multiprocessing
- Automatically uses optimal worker count based on CPU cores
- Graceful error handling with configurable thresholds
- Clean shutdown on interruption (Ctrl+C)
Incremental Analysis: Skips unchanged files on re-analysis
- Tracks file metadata (size, modification time)
- In the benchmark on the 225-file test project (see Performance Benchmarks), incremental analysis gave a 5.96x speedup on unchanged re-runs. Actual speedup varies with project size and how often files change.
- JSON-based state persistence across runs
- Automatic change detection for modified files
- Works alongside caching for maximum performance
Memory Efficient: Handles large files safely
- Configurable file size limits (default: 10MB per file)
- Stream-based reading with error recovery
- Prevents memory exhaustion on huge files
Progress Reporting: Live updates during analysis
- Real-time progress bars with percentage and ETA
- Error tracking and summary reporting
- Verbose mode for detailed diagnostics
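
The caching and incremental-analysis ideas above can be illustrated with a minimal sketch. This is not RE-cue's actual implementation: the names AnalysisCache, content_hash, file_fingerprint, and changed_files are hypothetical, and the JSON file layout is an assumption made for illustration.

```python
import hashlib
import json
from pathlib import Path


def content_hash(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


class AnalysisCache:
    """Hypothetical JSON-backed cache keyed by analysis type and content hash."""

    def __init__(self, cache_file: Path) -> None:
        self.cache_file = cache_file
        self.entries = json.loads(cache_file.read_text()) if cache_file.exists() else {}
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, path: Path, analysis_type: str, analyze):
        # A changed file produces a new hash, so stale entries are never returned.
        key = f"{analysis_type}:{content_hash(path)}"
        if key in self.entries:
            self.hits += 1
            return self.entries[key]
        self.misses += 1
        result = analyze(path)  # assumed to return JSON-serializable data
        self.entries[key] = result
        return result

    def save(self) -> None:
        """Persist entries so they survive across runs."""
        self.cache_file.write_text(json.dumps(self.entries))


def file_fingerprint(path: Path) -> dict:
    """Cheap change marker based on file metadata (size and modification time)."""
    stat = path.stat()
    return {"size": stat.st_size, "mtime": stat.st_mtime}


def changed_files(paths, state: dict) -> list:
    """Return the paths whose fingerprint differs from the saved state."""
    return [p for p in paths if state.get(str(p)) != file_fingerprint(p)]
```

In this sketch the metadata check is the cheap first pass (no file reads), while the content hash guards against serving results for a file whose bytes actually changed.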

Example Usage:

# Analyze large codebase with all optimizations (default)
recue --spec --path ~/large-project

# View cache statistics
recue --cache-stats --path ~/large-project

# Clear cache before analysis
recue --clear-cache --spec --path ~/large-project

# Clean up invalid cache entries
recue --cleanup-cache --path ~/large-project

# Use 8 worker processes for faster analysis
recue --spec --max-workers 8 --path ~/large-project

# Force full re-analysis (disable caching and incremental)
recue --no-cache --no-incremental --spec --path ~/large-project

# Sequential processing for debugging
recue --spec --no-parallel --verbose --path ~/large-project

# Optimal for very large projects (1000+ files)
recue --spec --verbose --max-workers 16 --path ~/enterprise-app

Performance Benchmarks:

Test project: 225 files (50 controllers, 100 models, 75 services)
First analysis: ~0.023s
Re-analysis with caching (unchanged): ~0.002s (11x speedup) ✨ NEW
Re-analysis with incremental (unchanged): ~0.004s (5.96x speedup)
Memory usage: Minimal, scales linearly with worker count
Cache overhead: ~2MB for 1000 files

Examples

# Generate spec with custom output location
recue --spec --description "manage orders" --output docs/api-spec.md

# Generate JSON format specification
recue --spec --description "track inventory" --format json

# Analyze a project in a different directory
recue --spec --path ~/projects/my-app --description "external codebase"

# Verbose mode for debugging
recue --spec --plan --verbose --description "process payments"

# Generate API contract for documentation
recue --api-contract --output api-docs/openapi.json

What Gets Generated

Specification (spec.md)

User stories with acceptance criteria
Functional requirements
Success criteria
Technical implementation details
Discovered endpoints, models, and services

Implementation Plan (plan.md)

Technical context and architecture
Complexity tracking
API contract documentation
Key decisions and rationale
Testing strategy

Data Model Documentation (data-model.md)

Detailed field information for each model
Relationships between models
Usage patterns
MongoDB/JPA annotations

API Contract (api-spec.json)

OpenAPI 3.0 specification
Complete endpoint documentation
Request/response schemas
Authentication requirements
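
For orientation, the overall shape of an OpenAPI 3.0 document can be sketched as follows. This is a generic minimal skeleton, not the exact output of recue --api-contract; the function name build_openapi_skeleton and the endpoint dictionaries are hypothetical.

```python
import json


def build_openapi_skeleton(title: str, version: str, endpoints: list) -> dict:
    """Build a minimal OpenAPI 3.0 document from a list of endpoint dicts."""
    paths = {}
    for endpoint in endpoints:
        operation = {
            "summary": endpoint.get("summary", ""),
            # Every OpenAPI operation needs a responses object.
            "responses": {"200": {"description": "Successful response"}},
        }
        # Several HTTP methods can share one path entry.
        paths.setdefault(endpoint["path"], {})[endpoint["method"].lower()] = operation
    return {
        "openapi": "3.0.3",
        "info": {"title": title, "version": version},
        "paths": paths,
    }


if __name__ == "__main__":
    spec = build_openapi_skeleton(
        "Orders API",
        "1.0.0",
        [
            {"method": "GET", "path": "/orders", "summary": "List orders"},
            {"method": "POST", "path": "/orders", "summary": "Create an order"},
        ],
    )
    print(json.dumps(spec, indent=2))
```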

Advanced Templating with Jinja2

The Python version now uses Jinja2 as its template engine, enabling sophisticated template features:

Key Capabilities

Conditional Sections: Show content only when relevant

{% if actor_count > 0 %}
## Actors ({{actor_count}})
{% for actor in actors %}
- {{actor.name}} ({{actor.type}})
{% endfor %}
{% endif %}

Loops: Iterate over collections

{% for endpoint in endpoints %}
{{loop.index}}. {{endpoint.method}} {{endpoint.path}}
{% endfor %}

Filters: Transform data during rendering

{{project_name | upper}}           {# MY PROJECT #}
{{text | replace('_', ' ') | title}} {# Hello World #}
{{items | length}}                  {# 5 #}

Complex Logic: Multi-level conditionals

{% if score >= 90 %}A
{% elif score >= 80 %}B
{% else %}F
{% endif %}
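
To try these features outside the tool, a template like the ones above can be rendered directly with the Jinja2 library. This is a minimal sketch, assuming Jinja2 is installed; the render_actors helper and the sample actor data are hypothetical, and trim_blocks is enabled here only to keep the output free of blank lines.

```python
from jinja2 import Environment

# trim_blocks drops the newline that follows each {% ... %} tag.
env = Environment(trim_blocks=True)

ACTORS_TEMPLATE = """\
{% if actor_count > 0 %}
## Actors ({{ actor_count }})
{% for actor in actors %}
- {{ actor.name }} ({{ actor.type | replace('_', ' ') | title }})
{% endfor %}
{% endif %}
"""


def render_actors(actors: list) -> str:
    """Render the actors section; an empty list produces no section at all."""
    template = env.from_string(ACTORS_TEMPLATE)
    return template.render(actors=actors, actor_count=len(actors))


if __name__ == "__main__":
    print(render_actors([
        {"name": "Customer", "type": "end_user"},
        {"name": "Billing Service", "type": "external_system"},
    ]))
```

The example combines a conditional section, a loop, and a filter chain in one template.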

Documentation

For complete templating documentation and examples:

See docs/JINJA2-TEMPLATE-GUIDE.md for the full guide
See docs/JINJA2-GENERATOR-EXAMPLES.md for practical examples
Check templates/common/example-jinja2-features.md for template examples

All existing templates remain compatible - Jinja2 adds capabilities without breaking changes.

Analysis Stages

The tool performs discovery in 5 interactive stages with real-time progress feedback:

🔍 Starting project analysis...

📍 Stage 1/5: Discovering API endpoints... ✓ Found X endpoints
📦 Stage 2/5: Analyzing data models... ✓ Found X models
🎨 Stage 3/5: Discovering UI views... ✓ Found X views
⚙️  Stage 4/5: Detecting backend services... ✓ Found X services
✨ Stage 5/5: Extracting features... ✓ Identified X features

✅ Analysis complete!

Each stage completes independently, providing immediate feedback on discovery progress. Use --verbose for detailed logging within each stage.

Project Structure

reverse_engineer/
├── __init__.py              # Package initialization
├── __main__.py              # Module entry point
├── cli.py                   # Command-line interface
├── analyzer.py              # Project analysis logic
├── generators.py            # Documentation generators
├── phase_manager.py         # Phase execution management
├── utils.py                 # Utility functions
└── templates/               # Jinja2 template system
    ├── template_loader.py   # Template loading logic
    ├── template_validator.py # Template validation
    ├── common/              # Common templates
    └── frameworks/          # Framework-specific templates

setup.py                     # Package setup
requirements.txt             # Dependencies
README-PYTHON.md             # This file
tests/                       # Test suite (90+ tests)

Supported Project Types

The tool can analyze multiple technology stacks:

Java: Spring Boot applications (2.x, 3.x) with Maven/Gradle
Node.js: Express and NestJS applications
Python: Django, Flask, and FastAPI applications
.NET: ASP.NET Core applications (6.0+)
Frontend: Vue.js, React, Angular applications
Multiple frameworks in the same project

For framework-specific details, see the Framework Guides.

Comparison with Bash Version

Feature	Bash Script	Python CLI
Endpoint Discovery	✅	✅
Model Analysis	✅	✅
View Discovery	✅	✅
Service Detection	✅	✅
OpenAPI Generation	✅	✅
Cross-Platform	⚠️ (Unix only)	✅ (All platforms)
Installation	Copy script	pip install
Speed	Fast	Fast
Extensibility	Limited	Easy to extend

Development

Running Tests

# Install development dependencies
pip install pytest pytest-cov

# Run tests
pytest tests/

# Run with coverage
pytest --cov=reverse_engineer tests/

Code Style

# Install development tools
pip install black flake8 mypy

# Format code
black reverse_engineer/

# Lint
flake8 reverse_engineer/

# Type check
mypy reverse_engineer/

Troubleshooting

Common Issues

"Error: Could not determine repository root"

Make sure you're running the command from within a git repository
Or ensure a .specify directory exists in your project
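
The two hints above suggest that the root is located by looking for a .git or .specify marker. A minimal sketch of that kind of lookup, assuming the search walks upward from the current directory (the function name find_project_root is hypothetical and the tool's real logic may differ):

```python
from pathlib import Path
from typing import Optional


def find_project_root(start: Path) -> Optional[Path]:
    """Walk upward from start until a directory with .git or .specify is found."""
    current = start.resolve()
    for candidate in (current, *current.parents):
        if (candidate / ".git").exists() or (candidate / ".specify").is_dir():
            return candidate
    return None  # no marker found in any parent directory


if __name__ == "__main__":
    root = find_project_root(Path.cwd())
    print(root if root else "Could not determine repository root")
```

Running it from the directory where recue fails shows whether either marker is reachable from there.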

"Error: --description parameter is required"

The --spec flag requires a description parameter
Use: recue --spec --description "your project description"
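
The rule behind this error can be sketched with argparse. This is an illustration of the validation only, assuming a parser limited to the two options involved; it is not the tool's actual cli.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering only --spec and --description."""
    parser = argparse.ArgumentParser(prog="recue")
    parser.add_argument("--spec", action="store_true",
                        help="Generate specification document")
    parser.add_argument("-d", "--description",
                        help="Project description (required for --spec)")
    return parser


def parse_args(argv: list) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # --description is optional in general but mandatory together with --spec.
    if args.spec and not args.description:
        parser.error("--description parameter is required")
    return args
```

Because parser.error prints the usage line and exits with status 2, a missing description stops the run before any analysis starts.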

No endpoints found

Check that your project follows standard naming conventions
For Java (Spring Boot) projects, controllers should be in src/main/java/.../controller/ or similar
Java controller files should end with Controller.java
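
A quick way to check a Java project against these conventions is to list the files that match them. The sketch below is a standalone diagnostic, not part of RE-cue; the function name and the set of accepted directory names are assumptions.

```python
from pathlib import Path
from typing import List, Tuple

# Directory names treated as "controller/ or similar" (an assumption).
CONTROLLER_DIR_NAMES = {"controller", "controllers"}


def find_controller_candidates(project_root: Path) -> List[Tuple[Path, bool]]:
    """List *Controller.java files and whether each sits under a controller directory."""
    candidates = []
    for path in sorted(project_root.rglob("*Controller.java")):
        parent_parts = path.relative_to(project_root).parent.parts
        in_controller_dir = any(
            part.lower() in CONTROLLER_DIR_NAMES for part in parent_parts
        )
        candidates.append((path, in_controller_dir))
    return candidates


if __name__ == "__main__":
    for path, conventional in find_controller_candidates(Path.cwd()):
        marker = "ok " if conventional else "?? "
        print(marker + str(path))
```

An empty listing means no file name ends with Controller.java; entries marked ?? match by name but live outside a controller directory.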

Models not detected

Ensure model files are in standard locations
For Java projects, expected locations are src/main/java/.../model/ or entity/ or domain/
Java model files should be plain POJOs with private fields

Contributing

Contributions are welcome! Please:

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

License

MIT License - see LICENSE file for details

Project details

Release history

1.0.1 (this version): Nov 25, 2025
0.3.4: Dec 15, 2025

Download files

Source Distribution

re_cue-1.0.1.tar.gz (199.1 kB view details)

Uploaded Nov 25, 2025 Source

Built Distribution

re_cue-1.0.1-py3-none-any.whl (177.5 kB view details)

Uploaded Nov 25, 2025 Python 3

File details

Details for the file re_cue-1.0.1.tar.gz.

File metadata

Download URL: re_cue-1.0.1.tar.gz
Upload date: Nov 25, 2025
Size: 199.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for re_cue-1.0.1.tar.gz
Algorithm	Hash digest
SHA256	`b927ea24d1fe72590336862a1153df46857c4a2ff05ebdfa0a43e471c6d6339b`
MD5	`76b42de95caff13b3fe37b1423b897f6`
BLAKE2b-256	`fbd171d314fb475409927d50517617271a1087df3e84e55d0c2030e913e1804f`

Provenance

The following attestation bundles were made for re_cue-1.0.1.tar.gz:

Publisher: publish-package.yml on cue-3/re-cue

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: re_cue-1.0.1.tar.gz
- Subject digest: b927ea24d1fe72590336862a1153df46857c4a2ff05ebdfa0a43e471c6d6339b
- Sigstore transparency entry: 724307080
- Sigstore integration time: Nov 25, 2025
Source repository:
- Permalink: cue-3/re-cue@1db7b948192424ddf4c5495ebc8833f209f2f392
- Branch / Tag: refs/heads/main
- Owner: https://github.com/cue-3
- Access: internal
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-package.yml@1db7b948192424ddf4c5495ebc8833f209f2f392
- Trigger Event: workflow_dispatch

File details

Details for the file re_cue-1.0.1-py3-none-any.whl.

File metadata

Download URL: re_cue-1.0.1-py3-none-any.whl
Upload date: Nov 25, 2025
Size: 177.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for re_cue-1.0.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`766a64fde00c984b0e68c1089df8de907e5fbcd41e913fa425a4fecb85247482`
MD5	`557a85c3865671e6f737ac81b5381cb8`
BLAKE2b-256	`0572b6efee968bc345473fc51c8897747966708581c1d1e5d3ad83033024ca32`

Provenance

The following attestation bundles were made for re_cue-1.0.1-py3-none-any.whl:

Publisher: publish-package.yml on cue-3/re-cue

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: re_cue-1.0.1-py3-none-any.whl
- Subject digest: 766a64fde00c984b0e68c1089df8de907e5fbcd41e913fa425a4fecb85247482
- Sigstore transparency entry: 724307083
- Sigstore integration time: Nov 25, 2025
Source repository:
- Permalink: cue-3/re-cue@1db7b948192424ddf4c5495ebc8833f209f2f392
- Branch / Tag: refs/heads/main
- Owner: https://github.com/cue-3
- Access: internal
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-package.yml@1db7b948192424ddf4c5495ebc8833f209f2f392
- Trigger Event: workflow_dispatch
