SQL interface for pandas DataFrames using SQLite with modern Python support
Project description
sequelframe
A lightweight, modern Python library to run SQL queries on pandas DataFrames using SQLite. Execute SQL commands on your DataFrames with ease!
Features
- Execute SQL queries on pandas DataFrames
- Support for multiple file formats (CSV, Excel, JSON, Parquet, Feather)
- Context manager support for automatic cleanup
- Type hints for better IDE support
- Parameterized queries for SQL injection protection
- Modern Python packaging (pyproject.toml)
- Zero configuration required
Installation
pip install sequelframe
For Excel file support:
pip install sequelframe[excel]
Quick Start
from sequelframe import SequelFrame
import pandas as pd
# From a DataFrame
df = pd.DataFrame({
'name': ['Alice', 'Bob', 'Charlie'],
'age': [25, 30, 35],
'city': ['New York', 'San Francisco', 'Los Angeles']
})
# Using context manager (recommended - auto cleanup!)
with SequelFrame(df) as db:
result = db.query('SELECT * FROM data WHERE age > 25')
print(result)
Usage
Initialization
from sequelframe import SequelFrame
# From a pandas DataFrame
db = SequelFrame(df)
# From a CSV file
db = SequelFrame('data.csv')
# From an Excel file
db = SequelFrame('data.xlsx')
# From other formats
db = SequelFrame('data.json')
db = SequelFrame('data.parquet')
db = SequelFrame('data.feather')
# With custom table name
db = SequelFrame(df, table_name='customers')
# With verbose output
db = SequelFrame(df, verbose=True)
# With custom temp directory
db = SequelFrame(df, temp_dir='/tmp/mydb')
Running SQL Queries
SELECT Queries
# Basic SELECT
result = db.query('SELECT * FROM data WHERE age > 25')
# Using runsql (alternative method)
result = db.runsql('SELECT name, age FROM data ORDER BY age DESC')
# Parameterized queries (prevents SQL injection)
result = db.query('SELECT * FROM data WHERE name = ?', ('Alice',))
result = db.query('SELECT * FROM data WHERE age > ? AND city = ?', (25, 'New York'))
INSERT, UPDATE, DELETE
# INSERT
db.runsql("INSERT INTO data (name, age, city) VALUES ('David', 28, 'Boston')")
# UPDATE
db.runsql("UPDATE data SET age = 31 WHERE name = 'Bob'")
# DELETE
db.runsql("DELETE FROM data WHERE age < 25")
# Using parameterized queries (safer)
db.runsql("INSERT INTO data (name, age, city) VALUES (?, ?, ?)",
('Eve', 27, 'Seattle'))
ALTER TABLE
# Add a column
db.runsql('ALTER TABLE data ADD COLUMN salary INTEGER')
# Update the new column
db.runsql('UPDATE data SET salary = 50000 WHERE name = "Alice"')
Utility Methods
# Show all data (optionally limit rows)
db.show()
db.show(limit=10)
# Get the current DataFrame
current_df = db.get_dataframe()
# Update the entire DataFrame
new_df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
db.update_dataframe(new_df)
# Get column names
print(db.columns) # ['name', 'age', 'city']
# Get shape
print(db.shape) # (3, 3)
# String representation
print(db) # SequelFrame(table='data', shape=(3, 3), status='open')
Context Manager (Recommended)
The context manager automatically handles cleanup:
# Automatic cleanup when done
with SequelFrame('data.csv') as db:
result = db.query('SELECT * FROM data WHERE age > 25')
db.runsql('UPDATE data SET age = age + 1')
final_data = db.show()
# Database and temp files are automatically cleaned up here
Manual Cleanup
If not using a context manager:
db = SequelFrame('data.csv')
try:
result = db.query('SELECT * FROM data')
finally:
db.kill() # Clean up database and temp files
Advanced Examples
Joining Data from Multiple DataFrames
# Create two DataFrames
customers = pd.DataFrame({
'customer_id': [1, 2, 3],
'name': ['Alice', 'Bob', 'Charlie']
})
orders = pd.DataFrame({
'order_id': [101, 102, 103],
'customer_id': [1, 1, 2],
'amount': [100, 150, 200]
})
# Load both into SQLite
with SequelFrame(customers, table_name='customers') as db:
# Add the orders table
orders.to_sql('orders', db._conn, if_exists='replace', index=False)
# Join them
result = db.query('''
SELECT c.name, COUNT(o.order_id) as order_count, SUM(o.amount) as total
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.name
''')
print(result)
Complex Aggregations
with SequelFrame('sales.csv') as db:
monthly_report = db.query('''
SELECT
strftime('%Y-%m', date) as month,
COUNT(*) as num_sales,
SUM(amount) as total_sales,
AVG(amount) as avg_sale,
MAX(amount) as max_sale
FROM data
GROUP BY month
ORDER BY month DESC
''')
Window Functions
with SequelFrame(df) as db:
result = db.query('''
SELECT
name,
age,
ROW_NUMBER() OVER (ORDER BY age DESC) as rank,
AVG(age) OVER () as avg_age
FROM data
''')
New in Version 0.4.0
- Modern packaging with
pyproject.toml - Full type hint support
- Context manager support (
withstatement) - Parameterized queries for security
- Support for more file formats (JSON, Parquet, Feather)
- Better error handling and messages
- New utility methods (
get_dataframe(),update_dataframe()) - Property accessors (
columns,shape) - Improved cleanup handling
- Backward compatible with old
sequelframeclass name
Migration from 0.3.x
The library is fully backward compatible. Your old code will continue to work:
# Old style (still works)
from sequelframe import sequelframe
db = sequelframe('data.csv')
db.runsql('SELECT * FROM data')
db.kill()
# New style (recommended)
from sequelframe import SequelFrame
with SequelFrame('data.csv') as db:
result = db.query('SELECT * FROM data')
Development
Install development dependencies:
pip install -e ".[dev]"
Run tests:
pytest
Format code:
black sequelframe/
ruff check sequelframe/
Type checking:
mypy sequelframe/
Why sequelframe?
- Simple: Just pass a DataFrame or file path and start querying
- Powerful: Full SQL support via SQLite
- Safe: Parameterized queries prevent SQL injection
- Modern: Type hints, context managers, and modern Python practices
- Lightweight: Minimal dependencies (just pandas)
- Flexible: Supports multiple file formats
Limitations
- Uses SQLite, so you're limited to SQLite's SQL dialect
- Data is temporarily copied to a SQLite database (memory overhead)
- Not recommended for very large datasets (consider using DuckDB or Polars instead)
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
MIT License - see LICENSE file for details.
Author
Sina Mirshahi
- Email: sina7th@gmail.com
- GitHub: @Sinamirshahi
Links
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file sequelframe-0.4.0.tar.gz.
File metadata
- Download URL: sequelframe-0.4.0.tar.gz
- Upload date:
- Size: 11.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
a6940801df9b2b3a7b3c2f885c160897c38e869bc02bb73e37501efca9d3fac5
|
|
| MD5 |
1045a5468af7500d02e438d4da6ca59c
|
|
| BLAKE2b-256 |
135e90c20a718325aa7a906d219b2a79974461916f401912a3f2dd964a2ea6d2
|
File details
Details for the file sequelframe-0.4.0-py3-none-any.whl.
File metadata
- Download URL: sequelframe-0.4.0-py3-none-any.whl
- Upload date:
- Size: 8.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
7e60924b2c964c48a699ae68f960c50df8533e990087313d4063926b80d96d9f
|
|
| MD5 |
7b79262697744ce68b984df71cd1b98e
|
|
| BLAKE2b-256 |
bf43deb95dd02c2b903e29f24934f7ca47bf5123d5f8005912b8fac0d7333771
|