Skip to main content

SQL interface for pandas DataFrames using SQLite with modern Python support

Project description

sequelframe

A lightweight, modern Python library to run SQL queries on pandas DataFrames using SQLite. Execute SQL commands on your DataFrames with ease!

Python Version License: MIT

Features

  • Execute SQL queries on pandas DataFrames
  • Support for multiple file formats (CSV, Excel, JSON, Parquet, Feather)
  • Context manager support for automatic cleanup
  • Type hints for better IDE support
  • Parameterized queries for SQL injection protection
  • Modern Python packaging (pyproject.toml)
  • Zero configuration required

Installation

pip install sequelframe

For Excel file support:

pip install sequelframe[excel]

Quick Start

from sequelframe import SequelFrame
import pandas as pd

# From a DataFrame
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'age': [25, 30, 35],
    'city': ['New York', 'San Francisco', 'Los Angeles']
})

# Using context manager (recommended - auto cleanup!)
with SequelFrame(df) as db:
    result = db.query('SELECT * FROM data WHERE age > 25')
    print(result)

Usage

Initialization

from sequelframe import SequelFrame

# From a pandas DataFrame
db = SequelFrame(df)

# From a CSV file
db = SequelFrame('data.csv')

# From an Excel file
db = SequelFrame('data.xlsx')

# From other formats
db = SequelFrame('data.json')
db = SequelFrame('data.parquet')
db = SequelFrame('data.feather')

# With custom table name
db = SequelFrame(df, table_name='customers')

# With verbose output
db = SequelFrame(df, verbose=True)

# With custom temp directory
db = SequelFrame(df, temp_dir='/tmp/mydb')

Running SQL Queries

SELECT Queries

# Basic SELECT
result = db.query('SELECT * FROM data WHERE age > 25')

# Using runsql (alternative method)
result = db.runsql('SELECT name, age FROM data ORDER BY age DESC')

# Parameterized queries (prevents SQL injection)
result = db.query('SELECT * FROM data WHERE name = ?', ('Alice',))
result = db.query('SELECT * FROM data WHERE age > ? AND city = ?', (25, 'New York'))

INSERT, UPDATE, DELETE

# INSERT
db.runsql("INSERT INTO data (name, age, city) VALUES ('David', 28, 'Boston')")

# UPDATE
db.runsql("UPDATE data SET age = 31 WHERE name = 'Bob'")

# DELETE
db.runsql("DELETE FROM data WHERE age < 25")

# Using parameterized queries (safer)
db.runsql("INSERT INTO data (name, age, city) VALUES (?, ?, ?)",
          ('Eve', 27, 'Seattle'))

ALTER TABLE

# Add a column
db.runsql('ALTER TABLE data ADD COLUMN salary INTEGER')

# Update the new column
db.runsql('UPDATE data SET salary = 50000 WHERE name = "Alice"')

Utility Methods

# Show all data (optionally limit rows)
db.show()
db.show(limit=10)

# Get the current DataFrame
current_df = db.get_dataframe()

# Update the entire DataFrame
new_df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
db.update_dataframe(new_df)

# Get column names
print(db.columns)  # ['name', 'age', 'city']

# Get shape
print(db.shape)  # (3, 3)

# String representation
print(db)  # SequelFrame(table='data', shape=(3, 3), status='open')

Context Manager (Recommended)

The context manager automatically handles cleanup:

# Automatic cleanup when done
with SequelFrame('data.csv') as db:
    result = db.query('SELECT * FROM data WHERE age > 25')
    db.runsql('UPDATE data SET age = age + 1')
    final_data = db.show()
# Database and temp files are automatically cleaned up here

Manual Cleanup

If not using a context manager:

db = SequelFrame('data.csv')
try:
    result = db.query('SELECT * FROM data')
finally:
    db.kill()  # Clean up database and temp files

Advanced Examples

Joining Data from Multiple DataFrames

# Create two DataFrames
customers = pd.DataFrame({
    'customer_id': [1, 2, 3],
    'name': ['Alice', 'Bob', 'Charlie']
})

orders = pd.DataFrame({
    'order_id': [101, 102, 103],
    'customer_id': [1, 1, 2],
    'amount': [100, 150, 200]
})

# Load both into SQLite
with SequelFrame(customers, table_name='customers') as db:
    # Add the orders table
    orders.to_sql('orders', db._conn, if_exists='replace', index=False)

    # Join them
    result = db.query('''
        SELECT c.name, COUNT(o.order_id) as order_count, SUM(o.amount) as total
        FROM customers c
        LEFT JOIN orders o ON c.customer_id = o.customer_id
        GROUP BY c.customer_id, c.name
    ''')
    print(result)

Complex Aggregations

with SequelFrame('sales.csv') as db:
    monthly_report = db.query('''
        SELECT
            strftime('%Y-%m', date) as month,
            COUNT(*) as num_sales,
            SUM(amount) as total_sales,
            AVG(amount) as avg_sale,
            MAX(amount) as max_sale
        FROM data
        GROUP BY month
        ORDER BY month DESC
    ''')

Window Functions

with SequelFrame(df) as db:
    result = db.query('''
        SELECT
            name,
            age,
            ROW_NUMBER() OVER (ORDER BY age DESC) as rank,
            AVG(age) OVER () as avg_age
        FROM data
    ''')

New in Version 0.4.0

  • Modern packaging with pyproject.toml
  • Full type hint support
  • Context manager support (with statement)
  • Parameterized queries for security
  • Support for more file formats (JSON, Parquet, Feather)
  • Better error handling and messages
  • New utility methods (get_dataframe(), update_dataframe())
  • Property accessors (columns, shape)
  • Improved cleanup handling
  • Backward compatible with old sequelframe class name

Migration from 0.3.x

The library is fully backward compatible. Your old code will continue to work:

# Old style (still works)
from sequelframe import sequelframe
db = sequelframe('data.csv')
db.runsql('SELECT * FROM data')
db.kill()

# New style (recommended)
from sequelframe import SequelFrame
with SequelFrame('data.csv') as db:
    result = db.query('SELECT * FROM data')

Development

Install development dependencies:

pip install -e ".[dev]"

Run tests:

pytest

Format code:

black sequelframe/
ruff check sequelframe/

Type checking:

mypy sequelframe/

Why sequelframe?

  • Simple: Just pass a DataFrame or file path and start querying
  • Powerful: Full SQL support via SQLite
  • Safe: Parameterized queries prevent SQL injection
  • Modern: Type hints, context managers, and modern Python practices
  • Lightweight: Minimal dependencies (just pandas)
  • Flexible: Supports multiple file formats

Limitations

  • Uses SQLite, so you're limited to SQLite's SQL dialect
  • Data is temporarily copied to a SQLite database (memory overhead)
  • Not recommended for very large datasets (consider using DuckDB or Polars instead)

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

MIT License - see LICENSE file for details.

Author

Sina Mirshahi

Links

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sequelframe-0.4.0.tar.gz (11.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

sequelframe-0.4.0-py3-none-any.whl (8.7 kB view details)

Uploaded Python 3

File details

Details for the file sequelframe-0.4.0.tar.gz.

File metadata

  • Download URL: sequelframe-0.4.0.tar.gz
  • Upload date:
  • Size: 11.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.9

File hashes

Hashes for sequelframe-0.4.0.tar.gz
Algorithm Hash digest
SHA256 a6940801df9b2b3a7b3c2f885c160897c38e869bc02bb73e37501efca9d3fac5
MD5 1045a5468af7500d02e438d4da6ca59c
BLAKE2b-256 135e90c20a718325aa7a906d219b2a79974461916f401912a3f2dd964a2ea6d2

See more details on using hashes here.

File details

Details for the file sequelframe-0.4.0-py3-none-any.whl.

File metadata

  • Download URL: sequelframe-0.4.0-py3-none-any.whl
  • Upload date:
  • Size: 8.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.9

File hashes

Hashes for sequelframe-0.4.0-py3-none-any.whl
Algorithm Hash digest
SHA256 7e60924b2c964c48a699ae68f960c50df8533e990087313d4063926b80d96d9f
MD5 7b79262697744ce68b984df71cd1b98e
BLAKE2b-256 bf43deb95dd02c2b903e29f24934f7ca47bf5123d5f8005912b8fac0d7333771

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page