
Advanced YAML-based webpage generator with modular architecture


WhyML - Modular YAML Manifest Ecosystem

 ██╗    ██╗██╗  ██╗██╗   ██╗███╗   ███╗██╗     
 ██║    ██║██║  ██║╚██╗ ██╔╝████╗ ████║██║     
 ██║ █╗ ██║███████║ ╚████╔╝ ██╔████╔██║██║     
 ██║███╗██║██╔══██║  ╚██╔╝  ██║╚██╔╝██║██║     
 ╚███╔███╔╝██║  ██║   ██║   ██║ ╚═╝ ██║███████╗
  ╚══╝╚══╝ ╚═╝  ╚═╝   ╚═╝   ╚═╝     ╚═╝╚══════╝

🏗️ Modular YAML-based component generation and multi-format conversion ecosystem


🎯 Key Features

🏗️ Modular Architecture

WhyML has been completely refactored into a modular ecosystem of specialized packages, providing better maintainability, testing, and deployment flexibility:

📦 Core Packages

  • whyml-core - Core functionality (validation, loading, processing, utilities)
  • whyml-scrapers - Web scraping and analysis capabilities
  • whyml-converters - Multi-format conversion (HTML, React, Vue, PHP)
  • whyml-cli - Unified command-line interface
  • whyml - Main package orchestrating all modules

🎯 Benefits of Modular Design

  • 🔧 Targeted Installation: Install only the components you need
  • 🧪 Comprehensive Testing: 450+ test cases across all modules
  • ⚡ Performance: Optimized loading and processing
  • 🔄 Easy Maintenance: Clear separation of concerns
  • 📈 Scalability: Independent package updates and versioning

Overview

WhyML is a modular Python ecosystem that transforms YAML manifests into HTML, React, Vue, and PHP output. It supports component-based development with template inheritance, dependency resolution, and web scraping that converts existing pages into manifests.

Key Features

  • 🚀 Multi-Format Conversion: Generate HTML, React (JSX/TSX), Vue (SFC), and PHP from YAML manifests
  • 🔗 Template Inheritance: Advanced inheritance system with dependency resolution and circular dependency detection
  • 🎨 CSS Integration: Built-in support for CSS frameworks (Bootstrap, Tailwind, Foundation)
  • 🕷️ Advanced Web Scraping: Intelligent website-to-manifest conversion with:
    • Structure Simplification: Reduce HTML nesting depth and flatten unnecessary containers
    • Selective Section Generation: Extract only specific sections (metadata, analysis, imports, etc.)
    • Page Analysis: Automatic detection of page types, SEO analysis, and accessibility metrics
    • Testing Workflow: Complete scrape → YAML → HTML comparison with accuracy metrics
  • ⚡ Async Processing: High-performance asynchronous manifest loading and processing
  • 🧪 Comprehensive Testing: Extensive test suite with 95%+ coverage
  • 🛠️ CLI & API: Command-line interface and FastAPI server for integration

🚀 Quick Example: Complete Webpage Scraping & Regeneration

Here's a practical example showing how WhyML can scrape a webpage, simplify its structure, and regenerate it as clean HTML from a YAML manifest:

Step 1: Scrape a webpage and generate YAML manifest

whyml scrape https://example.com --output scraped-manifest.yaml --simplify-structure --max-depth 5


Step 2: Convert YAML manifest back to HTML

whyml convert --from scraped-manifest.yaml --to regenerated.html --as html


Step 3: Compare and validate (optional)

whyml scrape https://example.com --test-conversion --output-html regenerated.html



🎯 What This Achieves:

  • Converts complex webpage to maintainable YAML structure
  • Simplifies HTML while preserving semantic meaning
  • Enables easy customization through template variables
  • Supports regeneration to multiple formats (HTML, React, Vue, PHP)

Installation

🚀 Complete Ecosystem (Recommended)

# Install complete WhyML ecosystem
pip install whyml

This installs all modular packages: whyml-core, whyml-scrapers, whyml-converters, and whyml-cli.

📦 Modular Installation (Targeted)

Install only the components you need:

# Core functionality only
pip install whyml-core

# Core + web scraping
pip install whyml-core whyml-scrapers  

# Core + format conversion  
pip install whyml-core whyml-converters

# CLI interface (includes all dependencies)
pip install whyml-cli

# Custom combination
pip install whyml-core whyml-converters whyml-cli

🔧 Development Installation

git clone https://github.com/dynapsys/whyml.git
cd whyml
pip install -e .

# Install all modular packages in development mode
pip install -e ./whyml-core
pip install -e ./whyml-scrapers  
pip install -e ./whyml-converters
pip install -e ./whyml-cli

Quick Start

🚀 Complete Ecosystem Usage

import asyncio
from whyml import WhyMLProcessor

async def main():
    processor = WhyMLProcessor()
    
    # Convert YAML manifest to HTML
    html_result = await processor.convert_to_html('path/to/manifest.yaml')
    html_result.save_to_file('output.html')
    
    # Convert to React component
    react_result = await processor.convert_to_react('path/to/manifest.yaml')
    react_result.save_to_file('Component.tsx')

asyncio.run(main())

📦 Modular Usage

Use specific packages for targeted functionality:

import asyncio
from whyml_core.loading.manifest_loader import ManifestLoader
from whyml_core.processing.manifest_processor import ManifestProcessor
from whyml_converters.html_converter import HTMLConverter
from whyml_scrapers.url_scraper import URLScraper

async def main():
    # Core functionality - load and process
    loader = ManifestLoader()
    processor = ManifestProcessor()
    
    async with loader:
        manifest = await loader.load_manifest('manifest.yaml')
        processed = processor.process_manifest(manifest)
    
    # Convert to HTML
    html_converter = HTMLConverter()
    result = html_converter.convert(processed)
    result.save_to_file('output.html')
    
    # Web scraping
    scraper = URLScraper()
    async with scraper:
        scraped_manifest = await scraper.scrape_url('https://example.com')

asyncio.run(main())

⌨️ CLI Usage

# Validate manifest using whyml-cli
whyml validate manifest.yaml

# Scrape website using whyml-scrapers  
whyml scrape https://example.com --output scraped.yaml

# Convert using whyml-converters
whyml convert manifest.yaml --format html --output result.html

# Generate applications
whyml generate pwa --manifest manifest.yaml --output ./pwa-app

Example YAML Manifest

metadata:
  title: "Landing Page"
  description: "Modern landing page component"
  version: "1.0.0"

template_vars:
  primary_color: "#007bff"
  hero_text: "Welcome to Our Product"
  cta_text: "Get Started"

styles:
  hero:
    background: "linear-gradient(135deg, {{ primary_color }}, #0056b3)"
    padding: "80px 0"
    text-align: "center"
    color: "white"
  
  cta_button:
    background: "#28a745"
    padding: "15px 30px"
    border: "none"
    border-radius: "5px"
    color: "white"
    font-weight: "bold"
    cursor: "pointer"

structure:
  main:
    class: "hero-section"
    children:
      div:
        class: "container"
        children:
          - h1:
              text: "{{ hero_text }}"
              class: "display-4"
          - p:
              text: "Transform your ideas into reality with our powerful platform"
              class: "lead"
          - button:
              text: "{{ cta_text }}"
              class: "btn btn-success btn-lg"
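
The `{{ name }}` placeholders in the manifest above are filled from `template_vars`. The following is a minimal sketch of that substitution step, not WhyML's actual implementation: it uses a small regex as a stand-in for Jinja2, and the `substitute` helper is a hypothetical name introduced only for illustration.

```python
import re

# Hypothetical helper: a regex stand-in for Jinja2-style "{{ name }}" placeholders.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")

def substitute(node, variables):
    """Recursively replace {{ name }} placeholders in strings, dicts, and lists."""
    if isinstance(node, str):
        # Unknown names are left untouched so missing variables stay visible.
        return PLACEHOLDER.sub(lambda m: str(variables.get(m.group(1), m.group(0))), node)
    if isinstance(node, dict):
        return {key: substitute(value, variables) for key, value in node.items()}
    if isinstance(node, list):
        return [substitute(item, variables) for item in node]
    return node

# A trimmed copy of the manifest above, already parsed into Python objects.
manifest = {
    "template_vars": {"primary_color": "#007bff", "hero_text": "Welcome to Our Product"},
    "styles": {"hero": {"background": "linear-gradient(135deg, {{ primary_color }}, #0056b3)"}},
    "structure": {"h1": {"text": "{{ hero_text }}", "class": "display-4"}},
}

resolved = substitute(manifest, manifest["template_vars"])
print(resolved["styles"]["hero"]["background"])
print(resolved["structure"]["h1"]["text"])
```

Leaving unknown placeholders in place is a design choice of this sketch; a stricter processor could raise an error instead.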

Advanced Web Scraping

WhyML's scraper converts web pages into manifests and can simplify and analyze their structure along the way, which is useful for website refactoring, monitoring, and cross-platform development.

Structure Simplification

Reduce complex HTML structures while preserving content and semantic meaning:

# Limit nesting depth to reduce YAML complexity
whyml scrape https://example.com --max-depth 3

# Flatten unnecessary wrapper divs
whyml scrape https://example.com --flatten-containers

# Apply general structure simplification
whyml scrape https://example.com --simplify-structure

# Combine multiple simplification options
whyml scrape https://blog.example.com \
  --max-depth 2 \
  --flatten-containers \
  --simplify-structure
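
To make the two ideas behind `--max-depth` and `--flatten-containers` concrete, here is a rough sketch on an in-memory tree. The node shape (`tag`, `attrs`, `text`, `children`) and both function names are assumptions made for this example; the scraper's real data model and rules may differ.

```python
def limit_depth(node, max_depth, depth=1):
    """Drop children below max_depth (the root counts as depth 1)."""
    children = node.get("children", [])
    if depth >= max_depth:
        kept = []
    else:
        kept = [limit_depth(child, max_depth, depth + 1) for child in children]
    return {**node, "children": kept}

def flatten_containers(node):
    """Replace bare <div> wrappers that hold exactly one child with that child."""
    children = [flatten_containers(child) for child in node.get("children", [])]
    node = {**node, "children": children}
    is_bare_div = node.get("tag") == "div" and not node.get("attrs") and not node.get("text")
    if is_bare_div and len(children) == 1:
        return children[0]
    return node

# main > div > div > h1: two wrapper divs carry no information of their own.
page = {
    "tag": "main",
    "children": [
        {"tag": "div", "children": [
            {"tag": "div", "children": [
                {"tag": "h1", "text": "Title", "children": []},
            ]},
        ]},
    ],
}

flat = flatten_containers(page)
shallow = limit_depth(page, max_depth=2)
print([child["tag"] for child in flat["children"]])
print(shallow["children"][0]["children"])
```

Flattening keeps the heading and removes only the wrappers, whereas depth limiting discards everything below the cutoff, so the two options trade off differently between size and fidelity.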

Selective Section Generation

Extract only the sections you need for specific use cases:

# Extract only page analysis (page type detection, SEO metrics)
whyml scrape https://example.com --section analysis

# Get metadata and imports for quick inspection
whyml scrape https://example.com --section metadata --section imports

# Perfect for monitoring - extract only essential data
whyml scrape https://ecommerce-site.com --section analysis --section metadata

# Multiple sections for refactoring projects
whyml scrape https://legacy-site.com \
  --section structure \
  --section styles \
  --max-depth 3
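
Conceptually, `--section` keeps only the named top-level keys of the generated manifest. A small sketch of that filtering on an already-parsed manifest follows; `select_sections` is a hypothetical helper, and raising on unknown names is a choice made for this example rather than documented CLI behavior.

```python
def select_sections(manifest, sections):
    """Keep only the requested top-level sections, preserving manifest order."""
    wanted = set(sections)
    unknown = wanted - set(manifest)
    if unknown:
        raise KeyError(f"sections not present in manifest: {sorted(unknown)}")
    return {name: value for name, value in manifest.items() if name in wanted}

manifest = {
    "metadata": {"title": "Landing Page"},
    "styles": {"hero": {"padding": "80px 0"}},
    "structure": {"main": {"class": "hero-section"}},
    "analysis": {"page_type": "landing page"},
}

monitoring_view = select_sections(manifest, ["analysis", "metadata"])
print(list(monitoring_view))
```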

Testing & Comparison Workflow

Validate conversion accuracy with comprehensive testing:

# Complete round-trip testing: scrape → YAML → HTML → compare
whyml scrape https://example.com --test-conversion

# Save regenerated HTML for manual inspection
whyml scrape https://example.com \
  --test-conversion \
  --output-html regenerated.html

# Test with simplification settings
whyml scrape https://complex-site.com \
  --test-conversion \
  --max-depth 2 \
  --flatten-containers \
  --output-html simplified.html
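
The exact accuracy metrics reported by `--test-conversion` are not described here. As one possible illustration, the sketch below compares the opening-tag sequences of the original and regenerated HTML using only the standard library; the function names and the choice of metric are assumptions of this example.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser

class TagCollector(HTMLParser):
    """Collect the names of opening tags in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

def tag_sequence(html):
    collector = TagCollector()
    collector.feed(html)
    return collector.tags

def structural_similarity(original_html, regenerated_html):
    """Ratio in [0, 1] comparing the two pages' opening-tag sequences."""
    matcher = SequenceMatcher(None, tag_sequence(original_html), tag_sequence(regenerated_html))
    return matcher.ratio()

original = "<main><div><div><h1>Hi</h1><p>Text</p></div></div></main>"
regenerated = "<main><div><h1>Hi</h1><p>Text</p></div></main>"
score = structural_similarity(original, regenerated)
print(f"structural similarity: {score:.2f}")
```

A tag-sequence ratio ignores text and attributes, so a fuller comparison would likely combine it with a content-level check.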

Page Analysis Features

Automatic detection and analysis of web page characteristics:

  • Page Type Detection: blog, e-commerce, landing page, portfolio, etc.
  • Content Statistics: word count, element count, links, images
  • Structure Complexity: nesting depth, semantic elements analysis
  • SEO Analysis: meta descriptions, heading structure, alt attributes
  • Accessibility Metrics: alt text coverage, heading hierarchy, language attributes
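
A few of the metrics listed above can be approximated with the standard library alone. The sketch below is illustrative only: `PageStats`, `analyze`, and the returned field names are hypothetical and do not reflect the output format of WhyML's analyzer.

```python
from html.parser import HTMLParser

class PageStats(HTMLParser):
    """Gather simple counts while streaming through the HTML."""

    def __init__(self):
        super().__init__()
        self.element_count = 0
        self.images = 0
        self.images_with_alt = 0
        self.heading_levels = []
        self.lang = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        self.element_count += 1
        if tag == "html":
            self.lang = attrs.get("lang")
        elif tag == "img":
            self.images += 1
            if attrs.get("alt"):
                self.images_with_alt += 1
        elif len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.heading_levels.append(int(tag[1]))

def analyze(html):
    stats = PageStats()
    stats.feed(html)
    levels = stats.heading_levels
    # A heading "skip" is a jump of more than one level down, e.g. h1 -> h3.
    skips = sum(1 for prev, cur in zip(levels, levels[1:]) if cur - prev > 1)
    coverage = stats.images_with_alt / stats.images if stats.images else 1.0
    return {
        "element_count": stats.element_count,
        "alt_coverage": coverage,
        "heading_skips": skips,
        "lang": stats.lang,
    }

sample = (
    '<html lang="en"><body><h1>A</h1><h3>B</h3>'
    '<img src="a.png" alt="logo"><img src="b.png"></body></html>'
)
report = analyze(sample)
print(report)
```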

Real-World Use Cases

Website Refactoring

# Create simplified representations for legacy website modernization
whyml scrape https://legacy-corporate-site.com \
  --simplify-structure \
  --max-depth 3 \
  --flatten-containers \
  --output refactored-manifest.yaml

Cross-Platform Development

# Extract essential structure for mobile app development
whyml scrape https://web-app.com \
  --section structure \
  --section metadata \
  --max-depth 2 \
  --no-preserve-semantic

Website Monitoring

# Track page changes with essential data only
whyml scrape https://competitor-site.com \
  --section analysis \
  --section metadata \
  --output monitoring-$(date +%Y%m%d).yaml

Content Migration

# Test conversion accuracy for content migration projects
whyml scrape https://source-site.com \
  --test-conversion \
  --section structure \
  --section imports \
  --output-html migrated-preview.html

# Manifest fragment (the start of this example is missing)
    border-radius: "8px"
    font-size: "1.2rem"

interactions:
  cta_click: "handleCTAClick"
  scroll_tracking: "trackScrollPosition"

structure:
  div:
    class: "container"
    children:
      - section:
          class: "hero"
          children:
            - h1:
                text: "{{ hero_text }}"
            - button:
                class: "cta_button"
                text: "{{ cta_text }}"
                onClick: "cta_click"


Core Components

Manifest Loader

Handles YAML manifest loading with advanced features:

  • Async Loading: Non-blocking file and URL loading
  • Dependency Resolution: Automatic resolution of manifest dependencies
  • Template Inheritance: Support for extends relationships
  • Caching: TTL-based caching for performance
  • Error Handling: Comprehensive error reporting

import asyncio
from whyml.manifest_loader import ManifestLoader

async def main():
    # The loader is an async context manager, so it must run inside a coroutine
    async with ManifestLoader() as loader:
        manifest = await loader.load_manifest('manifest.yaml')

asyncio.run(main())

Manifest Processor

Processes loaded manifests with template resolution:

  • Template Variables: Jinja2-based template processing
  • Style Optimization: CSS optimization and merging
  • Validation: Schema validation and error detection
  • Inheritance Merging: Smart merging of inherited manifests

from whyml.manifest_processor import ManifestProcessor

processor = ManifestProcessor()
processed = processor.process_manifest(raw_manifest)

Format Converters

HTML Converter

Generates semantic, optimized HTML with integrated CSS:

from whyml.converters import HTMLConverter

converter = HTMLConverter(
    css_framework='bootstrap',
    optimize_output=True,
    include_meta_tags=True
)
result = converter.convert(manifest)
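
For intuition about what an HTML conversion involves, the sketch below walks a `structure` mapping of the shape used in the example manifest and emits indented HTML. It is a simplified stand-in, not the `HTMLConverter` implementation: the `render` function is hypothetical, every non-`text`, non-`children` key is treated as an attribute, and void elements such as `input` are not special-cased.

```python
from html import escape

def render(structure, indent=0):
    """Render a {tag: {attrs..., text, children}} mapping to indented HTML."""
    lines = []
    pad = "  " * indent
    for tag, spec in structure.items():
        spec = dict(spec)  # copy so the caller's manifest is not mutated
        text = spec.pop("text", None)
        children = spec.pop("children", [])
        if isinstance(children, dict):  # a single mapping of child tags
            children = [children]
        attrs = "".join(f' {name}="{escape(str(value))}"' for name, value in spec.items())
        lines.append(f"{pad}<{tag}{attrs}>")
        if text is not None:
            lines.append(f"{pad}  {escape(str(text))}")
        for child in children:
            lines.append(render(child, indent + 1))
        lines.append(f"{pad}</{tag}>")
    return "\n".join(lines)

structure = {
    "main": {
        "class": "hero-section",
        "children": {
            "div": {
                "class": "container",
                "children": [
                    {"h1": {"text": "Welcome", "class": "display-4"}},
                    {"button": {"text": "Get Started", "class": "btn btn-success btn-lg"}},
                ],
            }
        },
    }
}

html_output = render(structure)
print(html_output)
```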

React Converter

Creates React functional components with TypeScript support:

from whyml.converters import ReactConverter

converter = ReactConverter(
    use_typescript=True,
    component_type='functional',
    css_framework='tailwind'
)
result = converter.convert(manifest)

Vue Converter

Generates Vue 3 Single File Components:

from whyml.converters import VueConverter

converter = VueConverter(
    vue_version='3',
    use_composition_api=True,
    use_typescript=True
)
result = converter.convert(manifest)

PHP Converter

Creates modern PHP classes with templating:

from whyml.converters import PHPConverter

converter = PHPConverter(
    namespace='App\\Components',
    php_version='8.1',
    use_type_declarations=True
)
result = converter.convert(manifest)

Web Scraping

Intelligent website analysis and manifest generation:

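from bs4 import BeautifulSoup
from whyml.scrapers import URLScraper, WebpageAnalyzer

async def scrape_and_analyze(url, html):
    # Scrape the URL into a manifest
    async with URLScraper() as scraper:
        manifest = await scraper.scrape_url(url)

    # Assumption: `soup` is a BeautifulSoup document built from the page's HTML
    soup = BeautifulSoup(html, 'html.parser')
    analyzer = WebpageAnalyzer()
    analysis = analyzer.analyze_webpage(soup, url)
    return manifest, analysis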

Advanced Features

Template Inheritance

Create reusable base components:

# base-component.yaml
metadata:
  title: "Base Component"
  version: "1.0.0"

styles:
  container: "width: 100%; padding: 20px;"
  
structure:
  div:
    class: "container"
    children:
      h1:
        text: "{{ title }}"

# child-component.yaml
extends: "./base-component.yaml"

metadata:
  title: "Child Component"
  description: "Extends base component"

styles:
  content: "margin: 10px 0;"

structure:
  div:
    class: "container"
    children:
      - h1:
          text: "{{ title }}"
      - p:
          class: "content"
          text: "{{ description }}"
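
How a child manifest is combined with its base is not spelled out above. One plausible reading is a recursive merge in which child values win; the sketch below shows that reading together with circular `extends` detection. `deep_merge` and `resolve` are hypothetical names, and the in-memory `manifests` dict stands in for files on disk.

```python
def deep_merge(base, child):
    """Return a new dict where child values override base; nested dicts merge key by key."""
    merged = dict(base)
    for key, value in child.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

def resolve(name, manifests, seen=()):
    """Follow `extends` links, merging parents first; raise on circular chains."""
    if name in seen:
        raise ValueError("circular extends: " + " -> ".join(seen + (name,)))
    manifest = dict(manifests[name])
    parent = manifest.pop("extends", None)
    if parent is None:
        return manifest
    return deep_merge(resolve(parent, manifests, seen + (name,)), manifest)

manifests = {
    "./base-component.yaml": {
        "metadata": {"title": "Base Component", "version": "1.0.0"},
        "styles": {"container": "width: 100%; padding: 20px;"},
    },
    "./child-component.yaml": {
        "extends": "./base-component.yaml",
        "metadata": {"title": "Child Component", "description": "Extends base component"},
        "styles": {"content": "margin: 10px 0;"},
    },
}

child = resolve("./child-component.yaml", manifests)
print(child["metadata"])
print(sorted(child["styles"]))
```

With this strategy the child inherits `version` and the `container` style while overriding `title`; lists and scalars are replaced wholesale rather than merged.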

Dependency Management

imports:
  - "https://cdn.jsdelivr.net/npm/bootstrap@5.1.3/dist/css/bootstrap.min.css"
  - "./shared/styles.css"

dependencies:
  - "./components/header.yaml"
  - "./components/footer.yaml"
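
Dependencies have to be loaded before the manifests that use them, and a circular reference has to be reported rather than followed forever. The sketch below shows one way to get both with the standard library's `graphlib` (Python 3.9+); the `load_order` helper and the `./shared/nav.yaml` entry are made up for the example.

```python
from graphlib import CycleError, TopologicalSorter

def load_order(dependencies):
    """Return manifests ordered so each one comes after everything it depends on."""
    sorter = TopologicalSorter(dependencies)
    try:
        return list(sorter.static_order())
    except CycleError as error:
        # error.args[1] holds the nodes that form the cycle
        raise ValueError(f"circular dependency: {error.args[1]}") from error

# Each key maps a manifest to the manifests it depends on.
dependencies = {
    "page.yaml": ["./components/header.yaml", "./components/footer.yaml"],
    "./components/header.yaml": ["./shared/nav.yaml"],  # hypothetical nested dependency
}

order = load_order(dependencies)
print(order)
```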

Interactive Elements

interactions:
  button_click: "handleButtonClick"
  form_submit: "handleFormSubmit"
  state_counter: "useState(0)"
  effect_mount: "useEffect(() => {}, [])"

structure:
  form:
    onSubmit: "form_submit"
    children:
      - input:
          type: "text"
          placeholder: "Enter text"
      - button:
          onClick: "button_click"
          text: "Submit"

CLI Usage

Development Server (whyml run)

# Start development server (default: manifest.yaml on port 8080)
whyml run

# Custom manifest and port
whyml run -f manifest.yaml -p 8080 -h localhost

# Production deployment with TLS
whyml run -f manifest.yaml --port 443 --host yourdomain.com --tls-provider letsencrypt

# Development with file watching and auto-reload
whyml run -f manifest.yaml --watch --caddy-config Caddyfile.json

Natural Language Conversion

# Convert using intuitive syntax
whyml convert --from manifest.yaml --to index.html -as html
whyml convert --from manifest.yaml --to App.tsx -as react
whyml convert --from manifest.yaml --to App.vue -as vue
whyml convert --from manifest.yaml --to app.html -as spa
whyml convert --from manifest.yaml --to pwa-app.html -as pwa

# With environment variables and configuration
whyml convert --from manifest.yaml --to app.html -as pwa --env-file .env --config pwa.json

Application Generation

# Generate Progressive Web App
whyml generate pwa -f manifest.yaml -o ./pwa-app

# Generate Single Page Application
whyml generate spa -f manifest.yaml -o ./spa-app

# Generate mobile app configuration (APK via Capacitor)
whyml generate apk -f manifest.yaml -o ./mobile-app

# Generate desktop app (Tauri)
whyml generate tauri -f manifest.yaml -o ./desktop-app

# Generate Docker configuration
whyml generate docker -f manifest.yaml -o ./docker-config

# Generate Caddy server configuration
whyml generate caddy -f manifest.yaml -o ./Caddyfile.json

Legacy Commands (Still Supported)

# Validate manifest
whyml validate manifest.yaml

# Scrape website to manifest
whyml scrape https://example.com --output scraped-manifest.yaml

# Alternative server command (alias for run)
whyml serve -f manifest.yaml --port 3000 --watch

API Server

Start the FastAPI server for REST API access:

whyml server --port 8000

API Endpoints

  • POST /api/convert - Convert manifest to specified format
  • GET /api/manifest/{name} - Load manifest by name
  • POST /api/scrape - Scrape URL to manifest
  • POST /api/validate - Validate manifest structure
  • GET /api/health - Health check endpoint
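
The request and response schemas for these endpoints are not documented here, so the following is only a sketch of how a client might prepare a call to `POST /api/convert`. The payload shape (`manifest` and `format` keys) and the `build_convert_request` helper are assumptions; check the server's own schema before relying on them.

```python
import json
from urllib.request import Request

def build_convert_request(base_url, manifest, target_format):
    """Prepare (but do not send) a JSON POST to the convert endpoint."""
    # Assumed payload shape: {"manifest": ..., "format": ...}
    body = json.dumps({"manifest": manifest, "format": target_format}).encode("utf-8")
    return Request(
        url=f"{base_url.rstrip('/')}/api/convert",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_convert_request(
    "http://localhost:8000",
    {"metadata": {"title": "Landing Page"}},
    "html",
)
print(request.get_method(), request.full_url)
# Sending requires a running server: urllib.request.urlopen(request)
```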

Testing

Run the comprehensive test suite:

# Install development dependencies
pip install -r requirements-dev.txt

# Run all tests
pytest

# Run with coverage
pytest --cov=whyml --cov-report=html

# Run specific test modules
pytest tests/test_converters.py
pytest tests/test_manifest_loader.py

# ✅ Selective section generation
whyml scrape https://example.com --section analysis --section metadata

# ✅ Structure simplification for refactoring
whyml scrape https://tom.sapletta.com --max-depth 3 --flatten-containers --simplify-structure

# ✅ Complete testing workflow with comparison
whyml scrape https://example.com --test-conversion --output-html regenerated.html

# ✅ Monitoring-friendly simple extraction
whyml scrape https://blog.example.com --section analysis --max-depth 2

Architecture

┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   YAML Input    │────│  Manifest Loader │────│ Manifest Processor│
│   • Files       │    │  • Async Loading │    │ • Template Vars  │
│   • URLs        │    │  • Dependency    │    │ • Validation     │
│   • Inheritance │    │    Resolution    │    │ • Optimization   │
└─────────────────┘    └──────────────────┘    └─────────────────┘
                                                          │
                       ┌──────────────────────────────────┴───────────────────┐
                       │                                                      │
┌─────────────────┐    │                                    ┌─────────────────┐
│   Converters    │────┤                                    │   Web Scrapers  │
│   • HTML        │    │                                    │   • URL Scraper │
│   • React       │    │          WhyML Core               │   • Page Analysis│
│   • Vue         │    │                                    │   • Structure   │
│   • PHP         │    │                                    │     Detection   │
└─────────────────┘    │                                    └─────────────────┘
                       │                                                      │
┌─────────────────┐    │                                    ┌─────────────────┐
│   CLI/API       │────┤                                    │   Output        │
│   • Commands    │    │                                    │   • Files       │
│   • FastAPI     │    │                                    │   • Validation  │
│   • Development │    │                                    │   • Optimization│
└─────────────────┘    └──────────────────────────────────────────────────────┘

Configuration

Environment Variables

export WHYML_CACHE_SIZE=1000
export WHYML_CACHE_TTL=3600
export WHYML_DEFAULT_FORMAT=html
export WHYML_OUTPUT_DIR=./output
export WHYML_MANIFEST_DIR=./manifests
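
As an illustration of how an application could consume these variables, the sketch below reads them into a typed settings object. The `Settings` class, `load_settings`, and the fallback defaults (copied from the example values above) are assumptions of this example, not WhyML's actual configuration code.

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class Settings:
    # Defaults mirror the example values above; the real defaults may differ.
    cache_size: int = 1000
    cache_ttl: int = 3600
    default_format: str = "html"
    output_dir: str = "./output"
    manifest_dir: str = "./manifests"

def load_settings(env=None):
    """Build Settings from WHYML_* variables, falling back to the defaults."""
    env = os.environ if env is None else env
    defaults = Settings()
    return Settings(
        cache_size=int(env.get("WHYML_CACHE_SIZE", defaults.cache_size)),
        cache_ttl=int(env.get("WHYML_CACHE_TTL", defaults.cache_ttl)),
        default_format=env.get("WHYML_DEFAULT_FORMAT", defaults.default_format),
        output_dir=env.get("WHYML_OUTPUT_DIR", defaults.output_dir),
        manifest_dir=env.get("WHYML_MANIFEST_DIR", defaults.manifest_dir),
    )

# Passing a plain dict keeps the example independent of the real environment.
settings = load_settings({"WHYML_CACHE_TTL": "60", "WHYML_DEFAULT_FORMAT": "react"})
print(settings)
```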

Configuration File

# whyml.config.yaml
cache:
  size: 1000
  ttl: 3600

conversion:
  optimize_output: true
  include_meta_tags: true
  css_framework: "bootstrap"

validation:
  strict_mode: false
  allow_unknown_properties: true

scraping:
  user_agent: "WhyML-Scraper/1.0"
  timeout: 30
  extract_styles: true

🧪 Testing

WhyML features a comprehensive modular test suite with 450+ test cases across all packages:

📊 Modular Test Coverage

Package            Test Files   Test Cases   Coverage Areas
whyml-core         4 files      100+ tests   Validation, Loading, Processing, Utils
whyml-scrapers     3 files      80+ tests    URLScraper, WebpageAnalyzer, ContentExtractor
whyml-converters   4 files      120+ tests   HTML, React, Vue, PHP converters
whyml-cli          3 files      150+ tests   Commands, Workflows, Error handling
Integration        1 file       End-to-end   Cross-package workflows

🚀 Running Tests

# Run all tests across entire ecosystem
make test

# Run tests with coverage report
make test-coverage  

# Test specific modular packages
cd whyml-core && pytest tests/ -v
cd whyml-scrapers && pytest tests/ -v  
cd whyml-converters && pytest tests/ -v
cd whyml-cli && pytest tests/ -v

# Integration testing
pytest tests/test_modular_integration.py -v

# Performance benchmarks
pytest tests/ -k "performance" -v

🎯 Test Categories

  • 🔧 Unit Tests: Individual component functionality
  • 🔗 Integration Tests: Cross-package workflows
  • ⚡ Performance Tests: Speed and memory benchmarks
  • 🛡️ Error Handling: Edge cases and failure scenarios
  • ⌨️ CLI Tests: Command-line interface validation
  • 🌐 Network Tests: Web scraping and external requests
  • 🎭 End-to-End: Complete pipeline validation

✅ Test Quality Metrics

  • 450+ total test cases across all modular packages
  • 100% coverage of critical path functionality
  • Async testing for all async operations
  • Mock testing for external dependencies
  • Parameterized tests for multiple input scenarios
  • Property-based testing for edge case discovery

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

Development Setup

git clone https://github.com/dynapsys/whyml.git
cd whyml
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows
pip install -e .
pip install -r requirements-dev.txt

Running Tests

pytest                          # Run all tests
pytest --cov=whyml             # With coverage
pytest -v tests/test_*.py       # Verbose output
pytest --benchmark-only         # Performance tests

License

Licensed under the Apache License, Version 2.0. See LICENSE for details.

Changelog

v1.0.0 (2024-01-15)

  • Initial release
  • Core manifest loading and processing
  • HTML, React, Vue, PHP converters
  • Web scraping capabilities
  • Comprehensive test suite
  • CLI and API interfaces



Made with ❤️ by Tom Sapletta
