RE-cue: Universal reverse engineering toolkit for multi-framework codebases

RE-cue - Python Implementation

A Python command-line tool for reverse-engineering specifications from existing codebases with multi-framework support. This implementation extends the original bash script with enhanced templating, cross-platform compatibility, and support for multiple technology stacks.

Features

  • Multi-Framework Support: Java (Spring Boot), Node.js (Express, NestJS), Python (Django, Flask, FastAPI), .NET (ASP.NET Core)
  • Automatic Discovery: Finds API endpoints, data models, views, and services
  • Multiple Formats: Generates Markdown and JSON specifications
  • OpenAPI Support: Creates OpenAPI 3.0 API contracts
  • Advanced Templating: Jinja2-powered templates with conditionals, loops, and filters
  • Interactive Use Case Refinement: Edit and improve generated use cases through a text-based interface
  • Comprehensive Testing: 90+ test cases for quality assurance
  • Minimal Dependencies: Only PyYAML and Jinja2 required
  • Cross-Platform: Works on macOS, Linux, and Windows
  • Interactive Progress: Real-time feedback with analysis stages
  • Performance Optimizations: Caching, parallel processing, and incremental analysis for large codebases

Installation

From Source

# Clone the repository
git clone https://github.com/cue-3/re-cue.git
cd re-cue/reverse-engineer-python

# Install the package
pip install -e .

# Or install directly
python setup.py install

Using pip

pip install re-cue

Usage

The Python CLI tool has the same interface as the bash script:

Basic Usage

# Generate specification document
recue --spec --description "forecast sprint delivery"

# Generate implementation plan
recue --plan

# Generate data model documentation
recue --data-model

# Generate OpenAPI contract
recue --api-contract

# Analyze a project at a specific path
recue --spec --path /path/to/project --description "external project"

# Generate everything
recue --spec --plan --data-model --api-contract --description "project description"

Options

--spec                 Generate specification document (spec.md)
--plan                 Generate implementation plan (plan.md)
--data-model           Generate data model documentation (data-model.md)
--api-contract         Generate API contract (api-spec.json)
--use-cases            Generate use case analysis (phase1-4 documents)
--refine-use-cases FILE Interactively refine existing use cases from FILE

-d, --description TEXT Project description (required for --spec)
-o, --output PATH      Output file path (default: specs/001-reverse/spec.md)
-f, --format FORMAT    Output format: markdown or json (default: markdown)
-p, --path PATH        Path to project directory (default: auto-detect)
-v, --verbose          Show detailed analysis progress
--help                 Show help message
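
As a rough illustration of how the options above fit together, the following sketch models a subset of them with argparse. It is not the project's actual cli.py: the build_parser and parse helpers are hypothetical names, and only the flag names, the markdown default, and the rule that --spec needs a description come from the list above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser mirroring some documented flags (not the real cli.py)."""
    parser = argparse.ArgumentParser(prog="recue")
    parser.add_argument("--spec", action="store_true", help="Generate spec.md")
    parser.add_argument("--plan", action="store_true", help="Generate plan.md")
    parser.add_argument("--data-model", action="store_true", help="Generate data-model.md")
    parser.add_argument("--api-contract", action="store_true", help="Generate api-spec.json")
    parser.add_argument("-d", "--description", help="Project description")
    parser.add_argument("-f", "--format", choices=["markdown", "json"], default="markdown")
    parser.add_argument("-p", "--path", help="Path to project directory")
    parser.add_argument("-v", "--verbose", action="store_true")
    return parser


def parse(argv):
    """Parse argv and enforce that --spec is accompanied by --description."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.spec and not args.description:
        # parser.error prints a usage message and exits with status 2
        parser.error("--description parameter is required")
    return args


args = parse(["--spec", "--description", "manage orders", "--format", "json"])
print(args.spec, args.description, args.format)
```

Calling parse(["--spec"]) in this sketch exits with a usage error, which corresponds to the "--description parameter is required" message covered under Troubleshooting.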

Interactive Use Case Refinement

After generating use cases, you can interactively refine them:

# Generate initial use cases
recue --use-cases /path/to/project

# Refine use cases interactively
recue --refine-use-cases re-myproject/phase4-use-cases.md

Features:

  • Edit use case names and descriptions
  • Modify actors (primary and secondary)
  • Add, edit, or remove preconditions and postconditions
  • Refine main scenario steps
  • Add extension scenarios for error handling
  • Automatic backup before saving changes
  • Preserves document structure and metadata

See docs/INTERACTIVE-USE-CASE-REFINEMENT.md for a detailed guide.

Performance Optimization Options (for Large Codebases)

For projects with 1000+ files, RE-cue offers several performance optimizations:

--cache               Enable result caching for faster re-runs (default: enabled)
--no-cache            Disable result caching
--clear-cache         Clear all cached results before analysis
--cache-stats         Display cache statistics and exit
--cleanup-cache       Clean up expired and invalid cache entries
--parallel            Enable parallel file processing (default: enabled)
--no-parallel         Disable parallel processing
--incremental         Enable incremental analysis - skip unchanged files (default: enabled)
--no-incremental      Disable incremental analysis - analyze all files
--max-workers N       Maximum number of worker processes (default: CPU count)

Performance Features:

  • Caching System: Stores analysis results for unchanged files (NEW)

    • File-level caching based on SHA-256 content hash
    • 5-10x speedup on re-runs for unchanged codebases
    • Persistent cache storage survives restarts
    • Automatic cache invalidation when files change
    • Support for multiple analysis types per file
    • Cache statistics tracking (hit rate, size, entries)
    • See Caching Documentation for details
  • Parallel Processing: Analyzes multiple files concurrently using multiprocessing

    • Automatically uses optimal worker count based on CPU cores
    • Graceful error handling with configurable thresholds
    • Clean shutdown on interruption (Ctrl+C)
  • Incremental Analysis: Skips unchanged files on re-analysis

    • Tracks file metadata (size, modification time)
    • In a benchmark on a 1200-file Python project, incremental analysis provided a 5.96x speedup on repeated runs. Actual speedup may vary depending on project size and file change frequency.
    • JSON-based state persistence across runs
    • Automatic change detection for modified files
    • Works alongside caching for maximum performance
  • Memory Efficient: Handles large files safely

    • Configurable file size limits (default: 10MB per file)
    • Stream-based reading with error recovery
    • Prevents memory exhaustion on huge files
  • Progress Reporting: Live updates during analysis

    • Real-time progress bars with percentage and ETA
    • Error tracking and summary reporting
    • Verbose mode for detailed diagnostics
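
To make the caching idea from the list above concrete, here is a minimal sketch of a file-level cache keyed by a SHA-256 content hash. It is an assumption-laden illustration, not RE-cue's implementation: FileCache, content_hash, and the JSON layout of the cache file are hypothetical names chosen for this example.

```python
import hashlib
import json
from pathlib import Path


def content_hash(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


class FileCache:
    """Hypothetical persistent cache: one entry per (file, analysis type)."""

    def __init__(self, cache_file: Path):
        self.cache_file = cache_file
        self.entries = json.loads(cache_file.read_text()) if cache_file.exists() else {}

    def get(self, path: Path, analysis_type: str):
        """Return the cached result, or None if missing or the file changed."""
        entry = self.entries.get(f"{path}:{analysis_type}")
        if entry and entry["hash"] == content_hash(path):
            return entry["result"]
        return None

    def put(self, path: Path, analysis_type: str, result) -> None:
        """Store a JSON-serializable result and persist the cache to disk."""
        self.entries[f"{path}:{analysis_type}"] = {
            "hash": content_hash(path),
            "result": result,
        }
        self.cache_file.write_text(json.dumps(self.entries))
```

Because the lookup compares the stored hash with the file's current hash, editing a file invalidates its entries automatically, and keeping the analysis type in the key allows several results per file.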

Example Usage:

# Analyze large codebase with all optimizations (default)
recue --spec --path ~/large-project

# View cache statistics
recue --cache-stats --path ~/large-project

# Clear cache before analysis
recue --clear-cache --spec --path ~/large-project

# Clean up invalid cache entries
recue --cleanup-cache --path ~/large-project

# Use 8 worker processes for faster analysis
recue --spec --max-workers 8 --path ~/large-project

# Force full re-analysis (disable caching and incremental)
recue --no-cache --no-incremental --spec --path ~/large-project

# Sequential processing for debugging
recue --spec --no-parallel --verbose --path ~/large-project

# Optimal for very large projects (1000+ files)
recue --spec --verbose --max-workers 16 --path ~/enterprise-app

Performance Benchmarks:

  • Test project: 225 files (50 controllers, 100 models, 75 services)
  • First analysis: ~0.023s
  • Re-analysis with caching (unchanged): ~0.002s (11x speedup) (NEW)
  • Re-analysis with incremental (unchanged): ~0.004s (5.96x speedup)
  • Memory usage: Minimal, scales linearly with worker count
  • Cache overhead: ~2MB for 1000 files
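
The incremental re-analysis measured above relies on detecting unchanged files from their metadata. The sketch below shows one way this could work, assuming a JSON state file that records size and modification time per file; snapshot, changed_files, and the state file format are hypothetical and are not taken from the RE-cue source.

```python
import json
from pathlib import Path


def snapshot(path: Path) -> dict:
    """Capture the metadata used to decide whether a file changed."""
    stat = path.stat()
    return {"size": stat.st_size, "mtime_ns": stat.st_mtime_ns}


def changed_files(files, state_file: Path):
    """Return the files whose metadata differs from the previous run.

    The current snapshot is written back to state_file so that the next
    run can skip everything that stayed the same.
    """
    previous = json.loads(state_file.read_text()) if state_file.exists() else {}
    current = {str(path): snapshot(path) for path in files}
    changed = [path for path in files if previous.get(str(path)) != current[str(path)]]
    state_file.write_text(json.dumps(current))
    return changed
```

On a first run every file is reported as changed; on a second run over an untouched tree the list is empty, so only the metadata check is paid for.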

Examples

# Generate spec with custom output location
recue --spec --description "manage orders" --output docs/api-spec.md

# Generate JSON format specification
recue --spec --description "track inventory" --format json

# Analyze a project in a different directory
recue --spec --path ~/projects/my-app --description "external codebase"

# Verbose mode for debugging
recue --spec --plan --verbose --description "process payments"

# Generate API contract for documentation
recue --api-contract --output api-docs/openapi.json

What Gets Generated

Specification (spec.md)

  • User stories with acceptance criteria
  • Functional requirements
  • Success criteria
  • Technical implementation details
  • Discovered endpoints, models, and services

Implementation Plan (plan.md)

  • Technical context and architecture
  • Complexity tracking
  • API contract documentation
  • Key decisions and rationale
  • Testing strategy

Data Model Documentation (data-model.md)

  • Detailed field information for each model
  • Relationships between models
  • Usage patterns
  • MongoDB/JPA annotations

API Contract (api-spec.json)

  • OpenAPI 3.0 specification
  • Complete endpoint documentation
  • Request/response schemas
  • Authentication requirements
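
For orientation, the following sketch assembles a minimal OpenAPI 3.0 document from a list of discovered endpoints. The build_openapi helper and the endpoint dictionaries are hypothetical; the real api-spec.json produced by the tool is likely richer (schemas, authentication), so treat this only as an outline of the document shape.

```python
import json


def build_openapi(title: str, version: str, endpoints: list) -> dict:
    """Build a minimal OpenAPI 3.0 document from endpoint descriptions."""
    paths = {}
    for endpoint in endpoints:
        operation = {
            "summary": endpoint.get("summary", ""),
            # Every operation needs at least one response entry
            "responses": {"200": {"description": "Successful response"}},
        }
        # Group operations by path, keyed by lowercase HTTP method
        paths.setdefault(endpoint["path"], {})[endpoint["method"].lower()] = operation
    return {
        "openapi": "3.0.3",
        "info": {"title": title, "version": version},
        "paths": paths,
    }


spec = build_openapi(
    "Orders API",
    "1.0.0",
    [
        {"method": "GET", "path": "/orders", "summary": "List orders"},
        {"method": "POST", "path": "/orders", "summary": "Create order"},
    ],
)
print(json.dumps(spec, indent=2))
```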

Advanced Templating with Jinja2

The Python version now uses Jinja2 as its template engine, enabling sophisticated template features:

Key Capabilities

Conditional Sections: Show content only when relevant

{% if actor_count > 0 %}
## Actors ({{actor_count}})
{% for actor in actors %}
- {{actor.name}} ({{actor.type}})
{% endfor %}
{% endif %}
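
The conditional section above can be tried out directly with the jinja2 package. This is a standalone sketch rather than RE-cue's template loader: the actor data is made up, and trim_blocks and lstrip_blocks are enabled here as an assumption to keep block tags from leaving blank lines.

```python
from jinja2 import Environment

# Same template as above, built line by line
TEMPLATE = "\n".join([
    "{% if actor_count > 0 %}",
    "## Actors ({{actor_count}})",
    "{% for actor in actors %}",
    "- {{actor.name}} ({{actor.type}})",
    "{% endfor %}",
    "{% endif %}",
])

env = Environment(trim_blocks=True, lstrip_blocks=True)
template = env.from_string(TEMPLATE)


def render_actors(actors):
    """Render the Actors section; an empty list yields no section at all."""
    return template.render(actors=actors, actor_count=len(actors))


print(render_actors([
    {"name": "Customer", "type": "end_user"},
    {"name": "Admin", "type": "internal"},
]))
```

With an empty actor list the function returns an empty string, which is the point of wrapping the section in the if block.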

Loops: Iterate over collections

{% for endpoint in endpoints %}
{{loop.index}}. {{endpoint.method}} {{endpoint.path}}
{% endfor %}

Filters: Transform data during rendering

{{project_name | upper}}           {# MY PROJECT #}
{{text | replace('_', ' ') | title}} {# Hello World #}
{{items | length}}                  {# 5 #}
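
A quick way to check what these filters produce is to render them from Python. The input values below are invented for the example; upper, replace, title, and length are built-in Jinja2 filters.

```python
from jinja2 import Template

template = Template(
    "{{ project_name | upper }}\n"
    "{{ text | replace('_', ' ') | title }}\n"
    "{{ items | length }}"
)

rendered = template.render(
    project_name="my project",
    text="hello_world",
    items=[1, 2, 3, 4, 5],
)
print(rendered)
# MY PROJECT
# Hello World
# 5
```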

Complex Logic: Multi-level conditionals

{% if score >= 90 %}A
{% elif score >= 80 %}B
{% else %}F
{% endif %}

Documentation

For complete templating documentation and examples:

All existing templates remain compatible - Jinja2 adds capabilities without breaking changes.

Analysis Stages

The tool performs discovery in 5 interactive stages with real-time progress feedback:

Starting project analysis...

Stage 1/5: Discovering API endpoints... ✓ Found X endpoints
Stage 2/5: Analyzing data models... ✓ Found X models
Stage 3/5: Discovering UI views... ✓ Found X views
Stage 4/5: Detecting backend services... ✓ Found X services
Stage 5/5: Extracting features... ✓ Identified X features

Analysis complete!

Each stage completes independently, providing immediate feedback on discovery progress. Use --verbose for detailed logging within each stage.
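
The staged output can be pictured as a simple loop over independent discovery steps. The sketch below is a hypothetical model of that flow, not the tool's phase_manager.py: run_stages and the stub lambdas are invented, and the counts are placeholders.

```python
from typing import Callable, List, Tuple

# (progress label, result message template, discovery function returning a count)
Stage = Tuple[str, str, Callable[[], int]]


def run_stages(stages: List[Stage]) -> List[str]:
    """Run each stage in order and collect one progress line per stage."""
    lines = ["Starting project analysis..."]
    total = len(stages)
    for number, (label, result_template, discover) in enumerate(stages, start=1):
        count = discover()
        lines.append(f"Stage {number}/{total}: {label}... ✓ {result_template.format(count)}")
    lines.append("Analysis complete!")
    return lines


# Stub discovery functions standing in for real analyzers
stages = [
    ("Discovering API endpoints", "Found {} endpoints", lambda: 12),
    ("Analyzing data models", "Found {} models", lambda: 7),
    ("Extracting features", "Identified {} features", lambda: 4),
]
for line in run_stages(stages):
    print(line)
```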

Project Structure

reverse_engineer/
├── __init__.py              # Package initialization
├── __main__.py              # Module entry point
├── cli.py                   # Command-line interface
├── analyzer.py              # Project analysis logic
├── generators.py            # Documentation generators
├── phase_manager.py         # Phase execution management
├── utils.py                 # Utility functions
└── templates/               # Jinja2 template system
    ├── template_loader.py   # Template loading logic
    ├── template_validator.py # Template validation
    ├── common/              # Common templates
    └── frameworks/          # Framework-specific templates

setup.py                     # Package setup
requirements.txt             # Dependencies
README-PYTHON.md             # This file
tests/                       # Test suite (90+ tests)

Supported Project Types

The tool can analyze multiple technology stacks:

  • Java: Spring Boot applications (2.x, 3.x) with Maven/Gradle
  • Node.js: Express and NestJS applications
  • Python: Django, Flask, and FastAPI applications
  • .NET: ASP.NET Core applications (6.0+)
  • Frontend: Vue.js, React, Angular applications
  • Multiple frameworks in the same project

For framework-specific details, see the Framework Guides.

Comparison with Bash Version

Feature              Bash Script      Python CLI
Endpoint Discovery   ✅               ✅
Model Analysis       ✅               ✅
View Discovery       ✅               ✅
Service Detection    ✅               ✅
OpenAPI Generation   ✅               ✅
Cross-Platform       ⚠️ (Unix only)   ✅ (All platforms)
Installation         Copy script      pip install
Speed                Fast             Fast
Extensibility        Limited          Easy to extend

Development

Running Tests

# Install development dependencies
pip install pytest pytest-cov

# Run tests
pytest tests/

# Run with coverage
pytest --cov=reverse_engineer tests/

Code Style

# Install development tools
pip install black flake8 mypy

# Format code
black reverse_engineer/

# Lint
flake8 reverse_engineer/

# Type check
mypy reverse_engineer/

Troubleshooting

Common Issues

"Error: Could not determine repository root"

  • Make sure you're running the command from within a git repository
  • Or ensure a .specify directory exists in your project
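
To see why this error appears, it can help to reproduce the lookup by hand. The sketch below walks up from a starting directory until it finds a .git entry or a .specify directory, which matches the two conditions listed above; find_repo_root is a hypothetical helper and the tool's own detection logic may differ.

```python
from pathlib import Path
from typing import Optional


def find_repo_root(start: Path) -> Optional[Path]:
    """Return the nearest ancestor containing .git or .specify, else None."""
    start = start.resolve()
    for candidate in [start, *start.parents]:
        # .git can be a directory or a file (worktrees), so use exists()
        if (candidate / ".git").exists() or (candidate / ".specify").is_dir():
            return candidate
    return None


print(find_repo_root(Path.cwd()))
```

If this prints None for your working directory, running git init or creating a .specify directory at the project root should satisfy either condition.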

"Error: --description parameter is required"

  • The --spec flag requires a description parameter
  • Use: recue --spec --description "your project description"

No endpoints found

  • Check that your project follows standard naming conventions
  • Controllers should be in src/main/java/.../controller/ or similar
  • Files should end with Controller.java
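
When no endpoints are found in a Java project, a quick check is to list the files that follow the naming convention above. This sketch only searches for files ending in Controller.java; find_controllers is a hypothetical helper for debugging, not the analyzer's actual discovery code.

```python
from pathlib import Path
from typing import List


def find_controllers(project_root: Path) -> List[Path]:
    """List files under project_root whose names end with Controller.java."""
    return sorted(
        path for path in project_root.rglob("*Controller.java") if path.is_file()
    )


for controller in find_controllers(Path(".")):
    print(controller)
```

An empty result suggests the project uses different file names or locations than the conventions listed above.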

Models not detected

  • Ensure model files are in standard locations
  • src/main/java/.../model/ or entity/ or domain/
  • Files should be plain Java POJOs with private fields

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

MIT License - see LICENSE file for details

Links

RE-cue: Universal Reverse Engineering Toolkit
