
flaqes 🔍

Python 3.13+ | License: MIT | Coverage: 100%

A schema critic for PostgreSQL databases

flaqes analyzes database structures and surfaces design tensions, trade-offs, and alternative approaches based on your stated intent. Think of it as a thoughtful colleague who reviews your schema and explains why things are the way they are, not just what they are.

Features

  • 🎯 Intent-Aware Analysis - Provides contextual advice based on your workload (OLTP, OLAP, or mixed)
  • 🔍 Role Detection - Identifies semantic roles (fact tables, dimensions, events, junctions, etc.) with confidence scores
  • 🎨 Pattern Recognition - Detects design patterns such as SCD Type 2, soft deletes, and polymorphic associations
  • ⚖️ Design Tensions - Surfaces trade-offs in your current design with alternatives and effort estimates
  • 📊 Comprehensive Reports - Generates structured reports in Markdown or JSON format
  • 📈 Mermaid ERD Diagrams - Generates entity-relationship diagrams for documentation
  • 🖥️ CLI Interface - Analyzes databases or DDL files from the command line
  • 📄 DDL Parsing - Analyzes schemas from DDL files without a database connection
  • 🔬 No Mutations - Analysis only; flaqes never modifies your database

Installation

# Basic installation (includes PostgreSQL support)
pip install flaqes

# With development dependencies
pip install flaqes[dev]

Or using uv:

uv pip install flaqes

Quick Start

Python API

import asyncio
from flaqes import analyze_schema, Intent

async def main():
    # Define your workload intent
    intent = Intent(
        workload="OLAP",
        write_frequency="low",
        read_patterns=["aggregation", "range_scan"],
        data_volume="large",
        evolution_rate="high",
    )
    
    # Analyze your database
    report = await analyze_schema(
        dsn="postgresql://user:pass@localhost/mydb",
        intent=intent,
    )
    
    # View the markdown report
    print(report.to_markdown())
    
    # Or export as JSON
    import json
    print(json.dumps(report.to_dict(), indent=2))

asyncio.run(main())

Command Line Interface

# Analyze a live PostgreSQL database
flaqes analyze postgresql://user:pass@localhost/mydb

# Analyze with workload intent
flaqes analyze postgresql://localhost/mydb --workload OLTP --volume small

# Output as JSON
flaqes analyze postgresql://localhost/mydb --format json

# Analyze DDL files (no database connection required)
flaqes analyze-ddl schema.sql

# Multiple DDL files
flaqes analyze-ddl schema.sql migrations/*.sql

# Introspect schema structure only
flaqes introspect --dsn postgresql://localhost/mydb
flaqes introspect --dsn postgresql://localhost/mydb --format json

# Generate Mermaid ERD diagram
flaqes diagram --ddl schema.sql
flaqes diagram --ddl schema.sql --wrap  # Wrap in markdown code block
flaqes diagram --ddl schema.sql --no-columns  # Tables only, no column details
flaqes diagram --dsn postgresql://localhost/mydb  # From live database

What Makes flaqes Different?

Unlike traditional schema validators or linters, flaqes:

  1. Understands Intent - Recommendations depend on your workload. A denormalized table might be problematic for OLTP but perfect for OLAP.

  2. Embraces Uncertainty - Every inference includes a confidence score and the signals that led to it. No black-box "best practices."

  3. Explains Trade-offs - Instead of saying "this is wrong," flaqes says "here's what you gain, here's what you risk, and here's when it might break."

  4. Never Mutates - flaqes is read-only. It analyzes and advises, never changes your database.

Example Output

# Schema Analysis Report: public

**Tables analyzed:** 12
**Workload:** OLAP
**Data volume:** large

## Summary

### Table Roles
- **FACT**: 3
- **DIMENSION**: 6
- **JUNCTION**: 2
- **EVENT**: 1

### Design Patterns
- **AUDIT_TIMESTAMPS**: 8
- **SOFT_DELETE**: 4
- **SCD_TYPE_2**: 2

### Design Tensions
- 🔴 **Critical**: 2
- 🟡 **Warning**: 5
- 🔵 **Info**: 3

---

## Design Tensions

### 🔴 Critical Issues

#### public.orders: Missing index on frequently joined column
**Risk:** Full table scans on a large table during joins will severely impact query performance.
**Breaking point:** When the table exceeds 100K rows or join queries exceed 1s response time.
**Alternatives:** 2
- Add B-tree index on customer_id (low effort)
- Partition table by order_date and add local indexes (medium effort)

API Reference

Core Functions

analyze_schema

Analyze a database schema and generate a comprehensive report.

async def analyze_schema(
    dsn: str,
    intent: Intent | None = None,
    tables: list[str] | None = None,
    schemas: list[str] | None = None,
    exclude_patterns: list[str] | None = None,
) -> SchemaReport:

Parameters:

  • dsn: Database connection string (e.g., "postgresql://user:pass@host/db")
  • intent: Optional workload intent for contextual analysis
  • tables: Optional list of specific tables to analyze
  • schemas: Optional list of schemas (default: ["public"])
  • exclude_patterns: Optional patterns to exclude (e.g., ["tmp_*"])

Returns: SchemaReport with analysis results
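The glob-style `exclude_patterns` can be pictured with Python's `fnmatch` module. Note two assumptions here: that flaqes uses shell-style wildcard semantics is an inference from the `"tmp_*"` example above, and `filter_tables` is a hypothetical helper, not part of the library:

```python
# Hypothetical illustration of glob-style exclude_patterns filtering,
# assuming fnmatch (shell wildcard) semantics; flaqes may match differently.
from fnmatch import fnmatch

def filter_tables(tables: list[str], exclude_patterns: list[str]) -> list[str]:
    """Keep only tables that match none of the exclusion patterns."""
    return [t for t in tables
            if not any(fnmatch(t, pat) for pat in exclude_patterns)]

tables = ["orders", "customers", "tmp_import", "tmp_staging"]
print(filter_tables(tables, ["tmp_*"]))  # -> ['orders', 'customers']
```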

introspect_schema

Introspect a database and return the raw schema graph.

async def introspect_schema(
    dsn: str,
    tables: list[str] | None = None,
    schemas: list[str] | None = None,
    exclude_patterns: list[str] | None = None,
) -> SchemaGraph:

generate_report

Generate a report from a schema graph.

def generate_report(
    graph: SchemaGraph,
    intent: Intent | None = None,
) -> SchemaReport:

DDL Parsing

Analyze schemas directly from DDL files:

from flaqes.introspection import parse_ddl, parse_ddl_file
from flaqes import generate_report, Intent

# Parse DDL string
ddl = """
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    email VARCHAR(255) NOT NULL UNIQUE,
    created_at TIMESTAMP DEFAULT NOW()
);
"""
graph = parse_ddl(ddl)

# Parse DDL file
graph = parse_ddl_file("schema.sql")

# Analyze
report = generate_report(graph, intent=Intent(workload="OLTP"))
print(report.to_markdown())

Intent Specification

The Intent dataclass captures your workload characteristics:

from flaqes import Intent

intent = Intent(
    workload="OLTP",           # "OLTP" | "OLAP" | "mixed"
    write_frequency="high",     # "high" | "medium" | "low"
    read_patterns=["point_lookup", "join_heavy"],  # List of patterns
    consistency="strong",       # "strong" | "eventual"
    evolution_rate="high",      # "high" | "medium" | "low" | "frozen"
    data_volume="medium",       # "small" | "medium" | "large" | "massive"
)

Common presets:

from flaqes.core.intent import (
    OLTP_INTENT,           # High-frequency transactional workload
    OLAP_INTENT,           # Analytics/reporting workload
    EVENT_SOURCING_INTENT, # Append-only event streams
    STARTUP_MVP_INTENT,    # Rapid iteration, schema flexibility
)
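Assuming the presets are ordinary Intent instances (the field values below are illustrative guesses, not the shipped defaults), a preset like OLTP_INTENT is conceptually just a module-level constant:

```python
# Conceptual stand-in for a preset: a module-level Intent instance.
# Intent here is a local mirror of the documented fields; the preset's
# actual default values in flaqes may differ.
from dataclasses import dataclass

@dataclass(frozen=True)
class Intent:
    workload: str
    write_frequency: str = "medium"
    read_patterns: tuple[str, ...] = ()
    consistency: str = "strong"
    evolution_rate: str = "medium"
    data_volume: str = "medium"

# Guessed values for a high-frequency transactional workload
OLTP_INTENT = Intent(
    workload="OLTP",
    write_frequency="high",
    read_patterns=("point_lookup", "join_heavy"),
)
print(OLTP_INTENT.workload, OLTP_INTENT.write_frequency)
```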

Lower-Level API

For custom analysis workflows:

from flaqes import introspect_schema
from flaqes.analysis import RoleDetector, PatternDetector, TensionAnalyzer

# Just introspect the schema
graph = await introspect_schema("postgresql://localhost/mydb")

# Run individual analyzers
role_detector = RoleDetector()
pattern_detector = PatternDetector()
tension_analyzer = TensionAnalyzer(intent=intent)

for table in graph:
    # Detect table role
    role_result = role_detector.detect(table, graph)
    print(f"{table.name}: {role_result.primary_role.name} ({role_result.confidence:.0%})")
    
    # Detect patterns
    patterns = pattern_detector.detect(table, graph)
    for pattern in patterns:
        print(f"  Pattern: {pattern.pattern_type.name}")
    
    # Analyze tensions
    tensions = tension_analyzer.analyze_table(table, graph)
    for tension in tensions:
        print(f"  Tension: {tension.description}")

SchemaReport API

report = await analyze_schema(dsn, intent=intent)

# Properties
report.table_count           # Number of tables analyzed
report.table_roles           # Dict[str, RoleResult]
report.patterns              # List of detected patterns
report.tensions              # List of design tensions
report.intent                # The intent used for analysis

# Export methods
report.to_markdown()         # Formatted markdown string
report.to_dict()             # JSON-serializable dictionary
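As a sketch of post-processing the report, the snippet below tallies tensions by severity. The `Tension` dataclass is a local stand-in with assumed fields; the real objects come from `report.tensions` and may be shaped differently:

```python
# Hypothetical post-processing sketch: tally design tensions by severity.
# Tension is a local stand-in; real objects come from report.tensions.
from collections import Counter
from dataclasses import dataclass

@dataclass
class Tension:
    table: str
    severity: str       # assumed levels: "critical" | "warning" | "info"
    description: str

def severity_counts(tensions: list[Tension]) -> Counter:
    """Count tensions per severity level, e.g. for a report summary."""
    return Counter(t.severity for t in tensions)

tensions = [
    Tension("public.orders", "critical", "Missing index on customer_id"),
    Tension("public.users", "warning", "Wide table with mixed concerns"),
    Tension("public.users", "info", "Soft-delete flag detected"),
]
print(severity_counts(tensions))
```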

Architecture

flaqes operates in three layers:

┌─────────────────────────────────────────────────────────────┐
│                   Intent-Aware Analysis                     │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │   Tension   │  │  Severity   │  │    Alternatives     │  │
│  │  Detection  │  │  Scoring    │  │    & Trade-offs     │  │
│  └─────────────┘  └─────────────┘  └─────────────────────┘  │
├─────────────────────────────────────────────────────────────┤
│                    Semantic Heuristics                      │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │    Role     │  │   Pattern   │  │    Confidence       │  │
│  │  Detection  │  │  Matching   │  │    Scoring          │  │
│  └─────────────┘  └─────────────┘  └─────────────────────┘  │
├─────────────────────────────────────────────────────────────┤
│                     Structural Facts                        │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │   Tables    │  │  Indexes &  │  │   Relationships     │  │
│  │ & Columns   │  │ Constraints │  │   & Cardinality     │  │
│  └─────────────┘  └─────────────┘  └─────────────────────┘  │
├─────────────────────────────────────────────────────────────┤
│              PostgreSQL Catalog / DDL Parser                │
└─────────────────────────────────────────────────────────────┘
  1. Structural Facts Layer (Objective)
     • Introspects database catalogs or parses DDL
     • Extracts tables, columns, keys, constraints, and indexes
     • Builds a complete SchemaGraph
  2. Semantic Heuristics Layer (Probabilistic)
     • Detects table roles from structural signals
     • Identifies design patterns from naming and structure
     • Attaches a confidence score to every inference
  3. Intent-Aware Analysis Layer (Advisory)
     • Analyzes design tensions against your stated intent
     • Proposes alternatives with trade-off explanations
     • Weights severity by workload characteristics
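The three layers can be caricatured in a few lines of self-contained Python. Everything below (the toy heuristic, thresholds, and advice strings) is invented for illustration and is not flaqes internals:

```python
# Conceptual sketch of the three-layer pipeline with local stand-ins:
# structural facts -> probabilistic heuristics -> intent-aware advice.
from dataclasses import dataclass, field

@dataclass
class Table:                       # layer 1: structural facts
    name: str
    columns: list[str]
    foreign_keys: list[str] = field(default_factory=list)

@dataclass
class RoleGuess:                   # layer 2: probabilistic heuristics
    role: str
    confidence: float

def detect_role(table: Table) -> RoleGuess:
    # Toy heuristic: two or more FKs and almost no other columns
    # suggests a junction table.
    if len(table.foreign_keys) >= 2 and len(table.columns) <= len(table.foreign_keys) + 1:
        return RoleGuess("JUNCTION", 0.9)
    return RoleGuess("DIMENSION", 0.5)

def advise(table: Table, role: RoleGuess, workload: str) -> str:
    # Layer 3: same structure, different advice depending on stated intent.
    if role.role == "JUNCTION" and workload == "OLAP":
        return f"{table.name}: consider denormalizing for aggregation-heavy reads"
    return f"{table.name}: no tension detected"

t = Table("user_roles", ["user_id", "role_id"], foreign_keys=["user_id", "role_id"])
guess = detect_role(t)
print(advise(t, guess, workload="OLAP"))
```

The point of the sketch is the shape of the flow, not the heuristic itself: the same `Table` facts yield different advice once the stated workload changes.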

Development Status

Version: 0.2.0

Completed ✅

  • ✅ PostgreSQL introspection (live database)
  • ✅ DDL parsing (offline analysis)
  • ✅ Role detection (fact, dimension, event, junction, config, lookup, etc.)
  • ✅ Pattern matching (SCD, soft delete, polymorphic, audit, JSONB, etc.)
  • ✅ Tension analysis (normalization, performance, evolution)
  • ✅ Report generation (Markdown, JSON)
  • ✅ CLI interface (flaqes analyze, flaqes analyze-ddl, flaqes introspect)
  • ✅ Comprehensive test suite (100% coverage)

Roadmap 🚧

  • MySQL support
  • SQLite support
  • Historical schema tracking
  • LLM integration for natural language explanations
  • VS Code extension

Requirements

  • Python 3.13+
  • PostgreSQL 12+ (for database introspection)
  • asyncpg (included by default)
  • Docker (for running integration tests)

Contributing

Contributions welcome! This is an early-stage project.

# Clone and install
git clone https://github.com/brunolnetto/flaqes.git
cd flaqes
uv pip install -e .[dev]

# Run tests
uv run pytest

# Run with integration tests (requires Docker)
uv run pytest --run-integration

# Check coverage
uv run pytest --cov=flaqes --cov-report=term-missing

See IMPLEMENTATION_PLAN.md for architecture details.

License

MIT License - see LICENSE file for details

Acknowledgments

Inspired by the need for thoughtful schema review tools that understand context and trade-offs rather than enforcing rigid "best practices."


flaqes - Because your schema deserves a thoughtful review, not just a lint check.
