Skip to main content

CLI tool and Python SDK for importing Excel files to SQLite with advanced transformations and validation

Project description

Excel to SQLite

A powerful CLI tool and Python SDK for importing Excel files into SQLite databases with advanced data transformation, validation, and quality profiling features.

Python PyPI License

Overview

Excel to SQLite simplifies the process of importing Excel data into SQLite databases. It provides automatic schema detection, data transformations, validation rules, and includes an intelligent Auto-Pilot mode for zero-configuration setup.

Key Features:

  • Smart Import with automatic schema detection
  • Flexible data transformations (value mappings, calculated columns)
  • Comprehensive validation system
  • Data quality profiling and scoring
  • Auto-Pilot mode with pattern detection
  • Python SDK for programmatic access
  • Rich terminal output with detailed progress reporting

Table of Contents

Installation

From PyPI

pip install excel-to-sql

With uv

uv pip install excel-to-sql

From Source

git clone https://github.com/wareflowx/excel-to-sql.git
cd excel-to-sql
uv sync

Quick Start

Excel to SQLite provides two ways to get started: choose Auto-Pilot for automatic setup or Manual Configuration for complete control.

Option 1: Auto-Pilot Mode (Recommended)

The fastest way to import Excel files. Auto-Pilot automatically detects patterns, suggests transformations, and guides you through the setup.

# Analyze Excel files and generate configuration automatically
excel-to-sql magic --data ./path/to/excels

# Interactive mode with step-by-step guidance
excel-to-sql magic --data ./path/to/excels --interactive

# Dry run to analyze without generating configuration
excel-to-sql magic --data ./path/to/excels --dry-run

Auto-Pilot detects:

  • Primary and foreign keys automatically
  • Value mappings (e.g., "1"/"0" to "Active"/"Inactive")
  • Data quality issues with prioritized recommendations
  • Optimal data types for each column

Option 2: Manual Configuration

For complete control over the import process.

# 1. Initialize project
excel-to-sql init

# 2. Add mapping configuration
excel-to-sql config add --type products --table products --pk id

# 3. Import Excel file
excel-to-sql import --file products.xlsx --type products

# 4. Export data back to Excel
excel-to-sql export --table products --output report.xlsx

# 5. Profile data quality
excel-to-sql profile --table products --output quality-report.html

CLI Reference

Project Commands

Initialize Project

Creates a new excel-to-sql project with the required directory structure.

excel-to-sql init

Creates:

  • data/ - SQLite database location
  • config/ - Configuration files
  • imports/ - Imported Excel files
  • exports/ - Exported Excel files

Import Command

Import an Excel file into the database.

excel-to-sql import --file data.xlsx --type products

Options:

  • --file, -f - Path to Excel file (required)
  • --type, -t - Type configuration name (required)
  • --force - Re-import even if content unchanged

Export Command

Export database data back to Excel.

excel-to-sql export --table products --output report.xlsx

Options:

  • --table - Table name to export
  • --output, -o - Output Excel file path

Profile Command

Generate data quality reports.

excel-to-sql profile --table products --output quality-report.html

Configuration Commands

Add Type Configuration

Create a new mapping type interactively.

excel-to-sql config add --type customers --table customers --pk id

Options:

  • --add-type - Name for the new type
  • --table - Target table name
  • --pk - Primary key column(s)
  • --file - Excel file for auto-detection (optional)

List Types

Show all configured mapping types.

excel-to-sql config --list

Show Type Details

Display configuration for a specific type.

excel-to-sql config --show products

Remove Type

Delete a mapping type.

excel-to-sql config --remove old_type

Magic Command (Auto-Pilot)

Automatic configuration and import with pattern detection.

# Automatic mode
excel-to-sql magic --data ./excels --output .excel-to-sql

# Interactive mode
excel-to-sql magic --data ./excels --interactive

# Dry run
excel-to-sql magic --data ./excels --dry-run

Options:

  • --data, -d - Directory containing Excel files (default: current directory)
  • --output, -o - Output directory for configuration (default: .excel-to-sql)
  • --dry-run - Analyze without generating configuration
  • --interactive, -i - Interactive guided setup

Python SDK

The Python SDK provides programmatic access to all excel-to-sql features.

Basic Usage

from excel_to_sql import ExcelToSqlite

# Initialize SDK
sdk = ExcelToSqlite()

# Import Excel file with transformations
result = sdk.import_excel(
    file_path="data.xlsx",
    type_name="products",
    tags=["q1-2024", "verified"]
)

print(f"Imported {result['rows_imported']} rows")

# Query data
df = sdk.query("SELECT * FROM products WHERE price > 100")
print(df.head())

# Profile data quality
profile = sdk.profile_table("products")
print(f"Quality score: {profile['summary']['quality_score']}")
print(f"Issues found: {len(profile['issues'])}")

# Export to Excel with multi-sheet support
sdk.export_to_excel(
    output="report.xlsx",
    sheet_mapping={
        "Products": "products",
        "Categories": "SELECT * FROM categories"
    }
)

Advanced Transformations

from excel_to_sql import ExcelToSqlite
from excel_to_sql.transformations import ValueMapping, CalculatedColumn
from excel_to_sql.validators import ValidationRule

sdk = ExcelToSqlite()

# Configure value mappings
value_mappings = {
    "status": {"1": "Active", "0": "Inactive"},
    "state": {"NY": "New York", "CA": "California"}
}

# Configure calculated columns
calculated_columns = [
    CalculatedColumn("total", "quantity * price"),
    CalculatedColumn("tax", "total * 0.1"),
    CalculatedColumn("grand_total", "total + tax")
]

# Configure validation rules
validation_rules = [
    ValidationRule("id", "unique"),
    ValidationRule("email", "regex", pattern=r"^[^@]+@[^@]+\.[^@]+$"),
    ValidationRule("age", "range", min_value=0, max_value=120)
]

Auto-Pilot Mode

Auto-Pilot mode provides zero-configuration import with intelligent pattern detection, quality scoring, and automated recommendations.

What Auto-Pilot Detects

Pattern Detection:

  • Primary Keys - Identifies unique columns automatically
  • Foreign Keys - Detects relationships between tables
  • Value Mappings - Finds code columns requiring translation
  • Split Fields - Identifies redundant status columns to combine
  • Data Types - Infers optimal SQL types from data

Quality Analysis:

  • Quality Score (0-100) with letter grades (A-D)
  • Issue Detection - Null values, duplicates, type mismatches
  • Statistical Analysis - Value distributions, outliers
  • Data Profiling - Column types, null percentages

Smart Recommendations:

  • Prioritized Suggestions (HIGH/MEDIUM/LOW)
  • Auto-fixable Issues with one-click corrections
  • Default Value Suggestions
  • French Code Detection (ENTRÉE→inbound, SORTIE→outbound, etc.)

Auto-Pilot Components

The Auto-Pilot system consists of five main components:

PatternDetector - Analyzes Excel files and detects patterns with confidence scores

from excel_to_sql.auto_pilot import PatternDetector

detector = PatternDetector()
patterns = detector.detect_patterns(df, "table_name")

# Returns: primary_key, foreign_keys, value_mappings, split_fields, confidence

QualityScorer - Generates comprehensive quality reports

from excel_to_sql.auto_pilot import QualityScorer

scorer = QualityScorer()
report = scorer.generate_quality_report(df, "table_name")

# Returns: score (0-100), grade (A-D), issues, column_stats

RecommendationEngine - Provides prioritized, actionable recommendations

from excel_to_sql.auto_pilot import RecommendationEngine

engine = RecommendationEngine()
recommendations = engine.generate_recommendations(
    df, "table_name", quality_report, patterns
)

# Returns prioritized recommendations (HIGH/MEDIUM/LOW)

AutoFixer - Automatically fixes data quality issues

from excel_to_sql.auto_pilot import AutoFixer

fixer = AutoFixer()
result = fixer.apply_auto_fixes(
    df, file_path, "Sheet1", recommendations, dry_run=False
)

# Fixes: null values, French codes, split fields with backup system

InteractiveWizard - Guided configuration workflow

from excel_to_sql.ui import InteractiveWizard

wizard = InteractiveWizard()
result = wizard.run_interactive_mode(
    excel_files, patterns_dict, quality_dict, output_path
)

When to Use Auto-Pilot

Perfect for:

  • Quick prototyping and testing
  • Ad-hoc data imports
  • Exploring new datasets
  • Learning the tool
  • Small to medium datasets

Not ideal for:

  • Production deployments (use generated config as template)
  • Complex custom transformations
  • Highly specialized business logic
  • Performance-critical operations

Configuration

Mapping configuration is stored in config/mappings.json:

{
  "products": {
    "target_table": "products",
    "primary_key": ["id"],
    "column_mappings": {
      "ID": {"target": "id", "type": "integer"},
      "Name": {"target": "name", "type": "string"},
      "Price": {"target": "price", "type": "float"}
    },
    "value_mappings": [
      {
        "column": "status",
        "mappings": {"1": "Active", "0": "Inactive"}
      }
    ],
    "calculated_columns": [
      {
        "name": "total",
        "expression": "quantity * price"
      }
    ],
    "validation_rules": [
      {
        "column": "id",
        "type": "unique"
      }
    ]
  }
}

Column Types

Supported SQL types:

  • string - TEXT columns (default)
  • integer - INTEGER with Int64 (nullable)
  • float - REAL columns
  • boolean - BOOLEAN (0/1)
  • date - TIMESTAMP (ISO-8601)

Examples

Example 1: E-commerce Product Import

# Initialize project
excel-to-sql init

# Use Auto-Pilot to analyze products
excel-to-sql magic --data ./products --interactive

# Review generated configuration
cat config/mappings.json

# Import with auto-generated configuration
excel-to-sql import --file products.xlsx --type products

Example 2: Data Migration with Validation

from excel_to_sql import ExcelToSqlite
from excel_to_sql.validators import ValidationRule

sdk = ExcelToSqlite()

# Add custom validation rules
rules = [
    ValidationRule("email", "regex", pattern=r"^[^@]+@[^@]+\.[^@]+$"),
    ValidationRule("age", "range", min_value=0, max_value=120),
    ValidationRule("id", "unique")
]

# Import with validation
result = sdk.import_excel(
    "customers.xlsx",
    "customers",
    validation_rules=rules
)

if result['validation_errors']:
    print(f"Found {len(result['validation_errors'])} validation errors")

Example 3: Quality Analysis

from excel_to_sql import ExcelToSqlite

sdk = ExcelToSqlite()

# Profile data quality
profile = sdk.profile_table("orders")

# Check quality score
score = profile['summary']['quality_score']
grade = profile['summary']['grade']
print(f"Quality Score: {score}/100 ({grade})")

# Review issues
for issue in profile['issues']:
    print(f"{issue['severity']}: {issue['column']} - {issue['message']}")

# Generate HTML report
sdk.generate_quality_report(
    "orders",
    output="quality-report.html"
)

Development

Running Tests

# Run all tests
uv run pytest

# Run with coverage
uv run pytest --cov=excel_to_sql --cov-report=html

# Run specific test file
uv run pytest tests/test_auto_pilot.py -v

Project Structure

excel-to-sqlite/
├── excel_to_sql/          # Main package
│   ├── cli.py            # CLI interface
│   ├── sdk/              # Python SDK
│   ├── entities/         # Domain entities
│   ├── transformations/  # Data transformations
│   ├── validators/       # Data validation
│   ├── profiling/        # Quality analysis
│   ├── auto_pilot/       # Auto-Pilot mode
│   └── ui/               # Interactive wizard
├── tests/                # Test suite
├── docs/                 # Documentation
└── config/               # Configuration files

Test Coverage

  • 200+ tests with comprehensive coverage
  • 85% coverage on core modules

  • Integration tests with real Excel fixtures
  • Unit tests for all components

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

For documentation improvements, see docs/issues/001-documentation-website.md.

Changelog

See CHANGELOG.md for version history and release notes.

Version 0.3.0

Auto-Pilot Mode - Zero-Configuration Import:

  • Pattern Detection - Automatic detection of PKs, FKs, value mappings, split fields
  • Quality Scoring - Multi-dimensional data quality analysis with grades (A-D)
  • Smart Recommendations - Prioritized, actionable suggestions (HIGH/MEDIUM/LOW)
  • Auto-Fix Capabilities - One-click corrections for common data issues
  • Interactive Wizard - Step-by-step guided configuration workflow
  • French Code Support - Automatic translation (ENTRÉE→inbound, SORTIE→outbound, etc.)
  • Split Field Detection - Intelligent COALESCE for redundant columns
  • CLI Integration - magic command with --interactive flag

Testing:

  • 143+ tests for Auto-Pilot components
  • Integration tests with real Excel fixtures
  • 85% coverage for core Auto-Pilot modules

Total: 200+ tests with comprehensive coverage across all features.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Links

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

excel_to_sql-0.3.1.tar.gz (145.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

excel_to_sql-0.3.1-py3-none-any.whl (72.6 kB view details)

Uploaded Python 3

File details

Details for the file excel_to_sql-0.3.1.tar.gz.

File metadata

  • Download URL: excel_to_sql-0.3.1.tar.gz
  • Upload date:
  • Size: 145.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.26 {"installer":{"name":"uv","version":"0.9.26","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":null,"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for excel_to_sql-0.3.1.tar.gz
Algorithm Hash digest
SHA256 51f92c58ab5cbd713e861726aecba6d740e79a0421f0aa3bcfb79952b3a649e4
MD5 a17d702d5a9a98627a42b9a0dfa0c7ea
BLAKE2b-256 942500be7e307ab5c738e709362570027216031f7bf584218ebb3a043acda9d2

See more details on using hashes here.

File details

Details for the file excel_to_sql-0.3.1-py3-none-any.whl.

File metadata

  • Download URL: excel_to_sql-0.3.1-py3-none-any.whl
  • Upload date:
  • Size: 72.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.26 {"installer":{"name":"uv","version":"0.9.26","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":null,"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":null}

File hashes

Hashes for excel_to_sql-0.3.1-py3-none-any.whl
Algorithm Hash digest
SHA256 2297cc494c5f480da578f225dc7d6babc405b26e1e8ee5bfb417a6b536c30238
MD5 de80fb43b96f46d47b232b75e27984a9
BLAKE2b-256 7a0fbb3fcd49bea4abe6e97de21de3c103adfd56a9aa416b4b17a424bb2392e6

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page