Skip to main content

Advanced MySQL SQL Parser with Visualization Component Support

Project description

SQL Splitter 🎯

Advanced MySQL SQL Parser with Visualization Component Support

Python Version License Version

🚀 Features

  • 🎨 Field Type Classification: Automatically categorizes fields as column, aggregation, expression, or computed
  • 🔗 Aggregation Scope Tracking: Tracks which tables are involved in aggregation functions like COUNT(*)
  • 📊 Visualization-Ready Output: Enhanced JSON format perfect for SQL diagram generation
  • 🛠️ Advanced JOIN Detection: Handles complex nested JOINs and old-style comma-separated syntax
  • 🏷️ Smart Alias Resolution: Context-aware alias mapping and resolution
  • 🐬 MySQL Compatibility: Full MySQL syntax support with normalization
  • 📋 Comprehensive Metadata: Provides detailed parsing information for debugging and visualization

📦 Installation

pip install sql-splitter

Or install from source:

git clone https://github.com/alexkwok22/sql-splitter.git
cd sql-splitter
pip install -e .

🎯 Quick Start

Basic Usage

from sql_splitter import SQLParserAST

# Initialize parser
parser = SQLParserAST()

# Parse SQL query
sql = """
SELECT 
    users.name,
    COUNT(*) as total_orders,
    SUM(orders.amount) as total_revenue
FROM users 
JOIN orders ON users.id = orders.user_id 
WHERE users.status = 'active'
GROUP BY users.name
"""

result = parser.parse(sql)
print(result)

Enhanced JSON Output

{
  "success": true,
  "fields": [
    {
      "table": "users",
      "field": "users.name",
      "alias": "name",
      "fieldType": "column",
      "involvedTables": ["users"]
    },
    {
      "table": null,
      "field": "COUNT(*)",
      "alias": "total_orders",
      "fieldType": "aggregation",
      "aggregationScope": ["users", "orders"],
      "involvedTables": ["users", "orders"]
    },
    {
      "table": "orders",
      "field": "SUM(orders.amount)",
      "alias": "total_revenue", 
      "fieldType": "aggregation",
      "involvedTables": ["orders"]
    }
  ],
  "tables": ["users", "orders"],
  "joins": [
    {
      "type": "JOIN",
      "leftTable": "users",
      "leftField": "id",
      "rightTable": "orders", 
      "rightField": "user_id",
      "condition": "users.id = orders.user_id"
    }
  ],
  "whereConditions": ["users.status = 'active'"],
  "parser": "sqlsplit",
  "metadata": {
    "aliasMapping": {},
    "aggregationFields": ["total_orders", "total_revenue"],
    "computedFields": [],
    "unresolved": {
      "aliases": [],
      "fields": []
    }
  }
}

🎨 Visualization Components Support

Field Type Classification

SQL Splitter automatically classifies fields into four types:

  • column: Simple table columns (users.name)
  • aggregation: Aggregate functions (COUNT(*), SUM(amount))
  • expression: Complex expressions (DATE_FORMAT(created_at, '%Y-%m'))
  • computed: Conditional logic (CASE WHEN status = 1 THEN 'active' END)

Aggregation Scope Tracking

For visualization components, aggregation functions include aggregationScope to show which tables are involved:

# COUNT(*) shows all tables in the query
{
  "field": "COUNT(*)",
  "fieldType": "aggregation",
  "aggregationScope": ["users", "orders", "products"]  # All related tables
}

# Specific aggregations show only relevant tables
{
  "field": "SUM(orders.amount)",
  "fieldType": "aggregation", 
  "aggregationScope": ["orders"]  # Only orders table
}

🛠️ Advanced Features

MySQL Normalization

from sql_splitter import MySQLCompatibleNormalizer

normalizer = MySQLCompatibleNormalizer()
normalized_sql, rules, errors = normalizer.normalize_query(sql)

Old-Style JOIN Conversion

Automatically converts old-style comma-separated JOINs:

-- Input: Old-style
SELECT * FROM users a, orders b WHERE a.id = b.user_id

-- Output: Modern JOIN
SELECT * FROM users a JOIN orders b ON a.id = b.user_id

Context-Aware Alias Resolution

Handles complex alias scenarios:

# Resolves aliases like 'u' -> 'users', 'o' -> 'orders'
"metadata": {
  "aliasMapping": {
    "u": "users",
    "o": "orders"
  }
}

📚 Documentation

🧪 Testing

# Run basic tests
python -m pytest tests/

# Run with coverage
python -m pytest tests/ --cov=sql_splitter --cov-report=html

📋 Requirements

  • Python 3.7+
  • No external dependencies (pure Python implementation)

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Built for SQL visualization component developers
  • Supports complex MySQL queries and edge cases
  • Designed with performance and accuracy in mind

Made with ❤️ for the SQL visualization community

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sql_splitter-6.1.0.tar.gz (45.0 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

sql_splitter-6.1.0-py3-none-any.whl (49.2 kB view details)

Uploaded Python 3

File details

Details for the file sql_splitter-6.1.0.tar.gz.

File metadata

  • Download URL: sql_splitter-6.1.0.tar.gz
  • Upload date:
  • Size: 45.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.9.9

File hashes

Hashes for sql_splitter-6.1.0.tar.gz
Algorithm Hash digest
SHA256 28269494aa6307fb9c0f286716f60f88482591035695c28f34095605c489b85e
MD5 7d2d8fb3b0ccfafb252daeec0a56a187
BLAKE2b-256 8a667976de83db4e823bc0a7fd583c896944ff196afbe8993ce3280c25e5005d

See more details on using hashes here.

File details

Details for the file sql_splitter-6.1.0-py3-none-any.whl.

File metadata

  • Download URL: sql_splitter-6.1.0-py3-none-any.whl
  • Upload date:
  • Size: 49.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.9.9

File hashes

Hashes for sql_splitter-6.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 5e6b4737670704078da136ea1a141c825542bd73c5acd60f8d8cd25f06483503
MD5 79122285eb56de4b9d6324af22d64e0f
BLAKE2b-256 b46c24af72d5d25cd6e5cdda8890ad201721df38b302f72150d7aca3617a96a7

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page