Skip to main content

Advanced MySQL SQL Parser with Visualization Component Support

Project description

SQL Splitter 🎯

Advanced MySQL SQL Parser with Visualization Component Support

Python Version License Version

🚀 Features

  • 🎨 Field Type Classification: Automatically categorizes fields as column, aggregation, expression, or computed
  • 🔗 Aggregation Scope Tracking: Tracks which tables are involved in aggregation functions like COUNT(*)
  • 📊 Visualization-Ready Output: Enhanced JSON format perfect for SQL diagram generation
  • 🛠️ Advanced JOIN Detection: Handles complex nested JOINs and old-style comma-separated syntax
  • 🏷️ Smart Alias Resolution: Context-aware alias mapping and resolution
  • 🐬 MySQL Compatibility: Full MySQL syntax support with normalization
  • 📋 Comprehensive Metadata: Provides detailed parsing information for debugging and visualization

📦 Installation

pip install sql-splitter

Or install from source:

git clone https://github.com/alexkwok22/sql-splitter.git
cd sql-splitter
pip install -e .

🎯 Quick Start

Basic Usage

from sql_splitter import SQLParserAST

# Initialize parser
parser = SQLParserAST()

# Parse SQL query
sql = """
SELECT 
    users.name,
    COUNT(*) as total_orders,
    SUM(orders.amount) as total_revenue
FROM users 
JOIN orders ON users.id = orders.user_id 
WHERE users.status = 'active'
GROUP BY users.name
"""

result = parser.parse(sql)
print(result)

Enhanced JSON Output

{
  "success": true,
  "fields": [
    {
      "table": "users",
      "field": "users.name",
      "alias": "name",
      "fieldType": "column",
      "involvedTables": ["users"]
    },
    {
      "table": null,
      "field": "COUNT(*)",
      "alias": "total_orders",
      "fieldType": "aggregation",
      "aggregationScope": ["users", "orders"],
      "involvedTables": ["users", "orders"]
    },
    {
      "table": "orders",
      "field": "SUM(orders.amount)",
      "alias": "total_revenue", 
      "fieldType": "aggregation",
      "involvedTables": ["orders"]
    }
  ],
  "tables": ["users", "orders"],
  "joins": [
    {
      "type": "JOIN",
      "leftTable": "users",
      "leftField": "id",
      "rightTable": "orders", 
      "rightField": "user_id",
      "condition": "users.id = orders.user_id"
    }
  ],
  "whereConditions": ["users.status = 'active'"],
  "parser": "sqlsplit",
  "metadata": {
    "aliasMapping": {},
    "aggregationFields": ["total_orders", "total_revenue"],
    "computedFields": [],
    "unresolved": {
      "aliases": [],
      "fields": []
    }
  }
}

🎨 Visualization Components Support

Field Type Classification

SQL Splitter automatically classifies fields into four types:

  • column: Simple table columns (users.name)
  • aggregation: Aggregate functions (COUNT(*), SUM(amount))
  • expression: Complex expressions (DATE_FORMAT(created_at, '%Y-%m'))
  • computed: Conditional logic (CASE WHEN status = 1 THEN 'active' END)

Aggregation Scope Tracking

For visualization components, aggregation functions include aggregationScope to show which tables are involved:

# COUNT(*) shows all tables in the query
{
  "field": "COUNT(*)",
  "fieldType": "aggregation",
  "aggregationScope": ["users", "orders", "products"]  # All related tables
}

# Specific aggregations show only relevant tables
{
  "field": "SUM(orders.amount)",
  "fieldType": "aggregation", 
  "aggregationScope": ["orders"]  # Only orders table
}

🛠️ Advanced Features

MySQL Normalization

from sql_splitter import MySQLCompatibleNormalizer

normalizer = MySQLCompatibleNormalizer()
normalized_sql, rules, errors = normalizer.normalize_query(sql)

Old-Style JOIN Conversion

Automatically converts old-style comma-separated JOINs:

-- Input: Old-style
SELECT * FROM users a, orders b WHERE a.id = b.user_id

-- Output: Modern JOIN
SELECT * FROM users a JOIN orders b ON a.id = b.user_id

Context-Aware Alias Resolution

Handles complex alias scenarios:

# Resolves aliases like 'u' -> 'users', 'o' -> 'orders'
"metadata": {
  "aliasMapping": {
    "u": "users",
    "o": "orders"
  }
}

📚 Documentation

🧪 Testing

# Run basic tests
python -m pytest tests/

# Run with coverage
python -m pytest tests/ --cov=sql_splitter --cov-report=html

📋 Requirements

  • Python 3.7+
  • No external dependencies (pure Python implementation)

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Built for SQL visualization component developers
  • Supports complex MySQL queries and edge cases
  • Designed with performance and accuracy in mind

Made with ❤️ for the SQL visualization community

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sql_splitter-6.0.1.tar.gz (43.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

sql_splitter-6.0.1-py3-none-any.whl (48.2 kB view details)

Uploaded Python 3

File details

Details for the file sql_splitter-6.0.1.tar.gz.

File metadata

  • Download URL: sql_splitter-6.0.1.tar.gz
  • Upload date:
  • Size: 43.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.9.9

File hashes

Hashes for sql_splitter-6.0.1.tar.gz
Algorithm Hash digest
SHA256 e135c3ccb452daa98bb79060e58ed97bc53ba0c6e09084cf6c9b72345821d63c
MD5 24ee0efda0dfe69c991b5bc61e867470
BLAKE2b-256 262b24a47c508d44103f38447a37128ad1e3c566da40bdbf7f7f110dbd5de571

See more details on using hashes here.

File details

Details for the file sql_splitter-6.0.1-py3-none-any.whl.

File metadata

  • Download URL: sql_splitter-6.0.1-py3-none-any.whl
  • Upload date:
  • Size: 48.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.9.9

File hashes

Hashes for sql_splitter-6.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 2a62c8fe9899fabc70dda5aa851a573ea4a2e47a610bc110c2677b1732a83a7b
MD5 63ae0f8cfbf612f9cb7f3b04677678d7
BLAKE2b-256 2b509f6acee82781477267f505e633cf6f59f416adafbf9e3b29e512dff0785a

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page