Skip to main content

A schema critic for databases - analyze structure, surface trade-offs, propose alternatives

Project description

flaqes 🔍

A schema critic for PostgreSQL databases

flaqes analyzes database structures and surfaces design tensions, trade-offs, and alternative approaches based on your stated intent. Think of it as a thoughtful colleague who reviews your schema and explains why things are the way they are, not just what they are.

Features

  • 🎯 Intent-Aware Analysis - Provides contextual advice based on your workload (OLTP, OLAP, or mixed)
  • 🔍 Role Detection - Identifies semantic roles (fact tables, dimensions, events, junctions, etc.) with confidence scores
  • 🎨 Pattern Recognition - Detects design patterns like SCD Type 2, soft deletes, polymorphic associations, and more
  • ⚖️ Design Tensions - Surfaces trade-offs in your current design with alternatives and effort estimates
  • 📊 Comprehensive Reports - Generates structured reports in Markdown or JSON format
  • 🔬 No Mutations - Analysis only, never modifies your database

Installation

# Basic installation
pip install flaqes

# With PostgreSQL support (required for v0.1)
pip install flaqes[postgresql]

# Development installation
pip install flaqes[dev]

Or using uv:

uv pip install flaqes[postgresql]

Quick Start

Command-Line Interface

# Analyze entire database
flaqes analyze postgresql://user:pass@localhost/mydb

# Use OLAP intent preset
flaqes analyze --intent olap postgresql://localhost/mydb

# Analyze specific tables and save to file
flaqes analyze --tables users,orders --output report.md postgresql://localhost/mydb

# JSON output for automation
flaqes analyze --format json --output report.json postgresql://localhost/mydb

See the CLI Guide for comprehensive usage examples.

Python API

import asyncio
from flaqes import analyze_schema, Intent

async def main():
    # Define your workload intent
    intent = Intent(
        workload="OLAP",
        write_frequency="low",
        read_patterns=["aggregation", "range_scan"],
        data_volume="large",
        evolution_rate="high",
    )
    
    # Analyze your database
    report = await analyze_schema(
        dsn="postgresql://user:pass@localhost/mydb",
        intent=intent,
    )
    
    # View the markdown report
    print(report.to_markdown())
    
    # Or export as JSON
    import json
    print(json.dumps(report.to_dict(), indent=2))

asyncio.run(main())

What Makes flaqes Different?

Unlike traditional schema validators or linters, flaqes:

  1. Understands Intent - Recommendations depend on your workload. A denormalized table might be problematic for OLTP but perfect for OLAP.

  2. Embraces Uncertainty - Every inference includes a confidence score and the signals that led to it. No black-box "best practices."

  3. Explains Trade-offs - Instead of saying "this is wrong," flaqes says "here's what you gain, here's what you risk, and here's when it might break."

  4. Never Mutates - flaqes is read-only. It analyzes and advises, never changes your database.

Example Output

# Schema Analysis Report: public

**Tables analyzed:** 12
**Workload:** OLAP
**Data volume:** large

## Summary

### Table Roles
- **FACT**: 3
- **DIMENSION**: 6
- **JUNCTION**: 2
- **EVENT**: 1

### Design Patterns
- **AUDIT_TIMESTAMPS**: 8
- **SOFT_DELETE**: 4
- **SCD_TYPE_2**: 2

### Design Tensions
- 🔴 **Critical**: 2
- 🟡 **Warning**: 5
- 🔵 **Info**: 3

---

## Design Tensions

### 🔴 Critical Issues

#### public.orders: Missing index on frequently joined column
**Risk:** Full table scans on large table during joins will severely impact query performance.
**Breaking point:** When table exceeds 100K rows or join queries exceed 1s response time.
**Alternatives:** 2
- Add B-tree index on customer_id (low effort)
- Partition table by order_date and add local indexes (medium effort)

Documentation

Intent Specification

The Intent dataclass captures your workload characteristics:

from flaqes import Intent

intent = Intent(
    workload="OLTP" | "OLAP" | "mixed",
    write_frequency="high" | "medium" | "low",
    read_patterns=["point_lookup", "range_scan", "aggregation", "join_heavy"],
    consistency="strong" | "eventual",
    evolution_rate="high" | "medium" | "low" | "frozen",
    data_volume="small" | "medium" | "large" | "massive",
)

Common presets are available:

from flaqes.core.intent import (
    OLTP_INTENT,
    OLAP_INTENT,
    EVENT_SOURCING_INTENT,
    STARTUP_MVP_INTENT,
)

Lower-Level API

For custom analysis workflows:

from flaqes import introspect_schema
from flaqes.analysis import RoleDetector, PatternDetector, TensionAnalyzer

# Just introspect the schema
graph = await introspect_schema("postgresql://localhost/mydb")

# Run individual analyzers
role_detector = RoleDetector()
pattern_detector = PatternDetector()
tension_analyzer = TensionAnalyzer(intent=intent)

for table in graph:
    role_result = role_detector.detect(table, graph)
    print(f"{table.name}: {role_result.primary_role.name}")

Architecture

flaqes operates in three layers:

  1. Structural Facts Layer (Objective)

    • Introspects database catalogs
    • Extracts tables, columns, keys, constraints, indexes
    • Builds a complete SchemaGraph
  2. Semantic Heuristics Layer (Probabilistic)

    • Detects table roles via structural signals
    • Identifies design patterns via naming and structure
    • All with confidence scores
  3. Intent-Aware Analysis Layer (Advisory)

    • Analyzes design tensions based on stated intent
    • Proposes alternatives with trade-off explanations
    • Severity depends on workload characteristics

Development Status

Version: 0.1.0 (Alpha)

Completed ✅

  • ✅ PostgreSQL introspection
  • ✅ Role detection (fact, dimension, event, junction, etc.)
  • ✅ Pattern matching (SCD, soft delete, polymorphic, audit, etc.)
  • ✅ Tension analysis (normalization, performance, evolution)
  • ✅ Report generation (Markdown, JSON)
  • ✅ Comprehensive test suite
  • ✅ Command-line interface

Roadmap 🚧

  • DDL parsing for offline analysis
  • MySQL support
  • SQLite support
  • Historical schema tracking
  • LLM integration for natural language explanations

Requirements

  • Python 3.10+
  • PostgreSQL 12+ (for database introspection)
  • asyncpg (installed with flaqes[postgresql])

Contributing

Contributions welcome! This is an early-stage project. See IMPLEMENTATION_PLAN.md for architecture details.

License

MIT License - see LICENSE file for details

Acknowledgments

Inspired by the need for thoughtful schema review tools that understand context and trade-offs rather than enforcing rigid "best practices."


Note: flaqes is alpha software. The API may change in future versions. Use in production with caution.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

flaqes-0.1.0.tar.gz (124.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

flaqes-0.1.0-py3-none-any.whl (50.2 kB view details)

Uploaded Python 3

File details

Details for the file flaqes-0.1.0.tar.gz.

File metadata

  • Download URL: flaqes-0.1.0.tar.gz
  • Upload date:
  • Size: 124.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.8.2

File hashes

Hashes for flaqes-0.1.0.tar.gz
Algorithm Hash digest
SHA256 af3f033cd3d6f5eb91396c59e067eb1b7c68f719eab3b45aef7da2ab79997b9b
MD5 d60ce036b2fd4e6952284f7013d18907
BLAKE2b-256 885b155b7b8a67c10f81cf3745000efb1f61780b60fda42af32133d5ad4e1c8a

See more details on using hashes here.

File details

Details for the file flaqes-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: flaqes-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 50.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.8.2

File hashes

Hashes for flaqes-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 bd707990a5d121f2d476003000f0ed0e08d26473c37ea81a873dcd23d1c99967
MD5 1a4e2b09341543caf8ba24828d775621
BLAKE2b-256 5f6c8b6f0fd898212cec22fcc254e32d199e60662b1de1ad30d53c70553d4099

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page