SQL interface for pandas DataFrames using SQLite with modern Python support
Project description
sequelframe
A lightweight Python library to run SQL queries on pandas DataFrames using SQLite.
Features
- Full SQL on any pandas DataFrame — SELECT, INSERT, UPDATE, DELETE, CTEs, window functions
- Multi-table support — load several DataFrames as separate tables and JOIN across them
- In-memory mode — skip the temp file entirely for faster, zero-footprint usage
- Transaction control — commit or roll back a group of writes atomically
- Schema inspection — list tables, inspect columns
- Parameterized queries for SQL injection protection
- File format support: CSV, Excel, JSON, Parquet, Feather
- Context manager support for automatic cleanup
- Full type hints
Installation
pip install sequelframe
For Excel file support:
pip install sequelframe[excel]
Quick Start
import pandas as pd
from sequelframe import SequelFrame
df = pd.DataFrame({
'name': ['Alice', 'Bob', 'Charlie'],
'age': [25, 30, 35],
'city': ['New York', 'San Francisco', 'Los Angeles'],
})
with SequelFrame(df) as db:
result = db.query('SELECT * FROM data WHERE age > 25')
print(result)
Usage
Initialization
from sequelframe import SequelFrame
# From a pandas DataFrame
db = SequelFrame(df)
# From a file (CSV, Excel, JSON, Parquet, Feather)
db = SequelFrame('data.csv')
db = SequelFrame('data.xlsx')
db = SequelFrame('data.parquet')
# Custom table name
db = SequelFrame(df, table_name='customers')
# In-memory database — faster, no temp file written to disk
db = SequelFrame(df, in_memory=True)
# Custom directory for the temp .sqlite file
db = SequelFrame(df, temp_dir='/tmp/mydb')
# Print every SELECT result automatically
db = SequelFrame(df, verbose=True)
SELECT queries
# Basic SELECT — returns a DataFrame
result = db.query('SELECT * FROM data WHERE age > 25')
# runsql() is the same but also handles write statements
result = db.runsql('SELECT name, age FROM data ORDER BY age DESC')
# Parameterized queries (recommended — prevents SQL injection)
result = db.query('SELECT * FROM data WHERE name = ?', ('Alice',))
result = db.query('SELECT * FROM data WHERE age > ? AND city = ?', (25, 'New York'))
# CTEs and window functions work out of the box
result = db.query('''
WITH ranked AS (
SELECT name, age,
ROW_NUMBER() OVER (ORDER BY age DESC) AS rank
FROM data
)
SELECT * FROM ranked WHERE rank <= 3
''')
INSERT, UPDATE, DELETE
db.runsql("INSERT INTO data (name, age, city) VALUES ('David', 28, 'Boston')")
db.runsql("UPDATE data SET age = 31 WHERE name = 'Bob'")
db.runsql("DELETE FROM data WHERE age < 25")
# Parameterized writes
db.runsql("INSERT INTO data (name, age, city) VALUES (?, ?, ?)", ('Eve', 27, 'Seattle'))
db.runsql("UPDATE data SET age = ? WHERE name = ?", (32, 'Alice'))
After any write, db.df is automatically refreshed so it stays in sync.
Multiple tables and JOINs
Use add_table() to register additional DataFrames, then JOIN them in SQL:
customers = pd.DataFrame({
'customer_id': [1, 2, 3],
'name': ['Alice', 'Bob', 'Charlie'],
})
orders = pd.DataFrame({
'order_id': [101, 102, 103],
'customer_id': [1, 1, 2],
'amount': [100, 150, 200],
})
with SequelFrame(customers, table_name='customers', in_memory=True) as db:
db.add_table(orders, 'orders') # chainable: db.add_table(...).add_table(...)
result = db.query('''
SELECT c.name,
COUNT(o.order_id) AS order_count,
SUM(o.amount) AS total
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.name
''')
print(result)
Transactions
Wrap multiple writes in a single atomic transaction:
with SequelFrame(df, in_memory=True) as db:
with db.transaction():
db.runsql("INSERT INTO data VALUES ('Eve', 28, 'Miami')")
db.runsql("UPDATE data SET age = age + 1 WHERE name = 'Alice'")
# committed — db.df is refreshed automatically
# Any exception inside rolls the whole block back:
try:
with db.transaction():
db.runsql("DELETE FROM data")
raise RuntimeError("oops")
except RuntimeError:
pass
# table is still intact
Multi-statement scripts
db.execute_script('''
ALTER TABLE data ADD COLUMN score REAL DEFAULT 0.0;
UPDATE data SET score = age * 1.5;
''')
Schema inspection
# All tables in the database
db.tables() # ['data', 'orders']
# Column info for a table
db.schema() # primary table
db.schema('orders') # specific table
# returns: cid | name | type | notnull | dflt_value | pk
Utility methods
db.show() # all rows as a DataFrame
db.show(limit=10) # first 10 rows
db.columns # ['name', 'age', 'city']
db.shape # (3, 3)
db.row_count # live row count from the database
db.get_dataframe() # copy of the current primary-table DataFrame
db.update_dataframe(new_df) # replace primary table entirely
Context manager (recommended)
with SequelFrame('data.csv') as db:
result = db.query('SELECT * FROM data WHERE age > 25')
db.runsql('UPDATE data SET age = age + 1')
# connection closed and temp file deleted automatically
Manual cleanup
db = SequelFrame('data.csv')
try:
result = db.query('SELECT * FROM data')
finally:
db.kill()
Advanced examples
Aggregations
with SequelFrame('sales.csv') as db:
report = db.query('''
SELECT
strftime('%Y-%m', date) AS month,
COUNT(*) AS num_sales,
SUM(amount) AS total_sales,
AVG(amount) AS avg_sale,
MAX(amount) AS max_sale
FROM data
GROUP BY month
ORDER BY month DESC
''')
Window functions
with SequelFrame(df) as db:
result = db.query('''
SELECT
name,
age,
ROW_NUMBER() OVER (ORDER BY age DESC) AS rank,
AVG(age) OVER () AS avg_age
FROM data
''')
What's new in 0.5.0
in_memory=True— use SQLite:memory:instead of a temp fileadd_table(data, name)— register additional DataFrames for cross-table JOINstables()— list all tables in the databaseschema(table_name)— inspect columns viaPRAGMA table_infotransaction()— context manager for atomic commit / rollbackexecute_script(sql)— run multiple;-separated statements at oncerow_countproperty — live row count from the database- Fixed: SELECT queries were being executed twice internally
- Fixed: SQL comments at the start of a query (
-- foo\nSELECT ...) were misclassified as write statements - Fixed: table names are now validated to prevent SQL injection
Migration from 0.4.x
Fully backward compatible. The only new required change: table names must now be valid SQL identifiers (letters, digits, underscores). Names with spaces or special characters will raise ValueError.
# Old style (still works)
from sequelframe import sequelframe
db = sequelframe('data.csv')
db.runsql('SELECT * FROM data')
db.kill()
# Recommended style
from sequelframe import SequelFrame
with SequelFrame('data.csv') as db:
result = db.query('SELECT * FROM data')
Limitations
- SQLite SQL dialect only (no
RETURNING, noFULL OUTER JOINprior to SQLite 3.39, etc.) - All data lives in memory — not suitable for datasets larger than available RAM
- For very large datasets consider DuckDB or Polars
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
MIT License — see LICENSE for details.
Author
Sina Mirshahi
- Email: sina7th@gmail.com
- GitHub: @Sinamirshahi
Links
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file sequelframe-0.5.0.tar.gz.
File metadata
- Download URL: sequelframe-0.5.0.tar.gz
- Upload date:
- Size: 14.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
64e28d54e101472f170d1658f40fc767eb8b20f6d53df12d22589ba44647cff2
|
|
| MD5 |
37a8259d04ff2d84cbe9721cd9fa1ba8
|
|
| BLAKE2b-256 |
b8b32c3fc957d6d2c62f3efc5eb93be6442279e60e90b468115b8ed39e6723f8
|
File details
Details for the file sequelframe-0.5.0-py3-none-any.whl.
File metadata
- Download URL: sequelframe-0.5.0-py3-none-any.whl
- Upload date:
- Size: 11.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
91926cc2651321bfca7a09f9bd0c0c1af64e3131936b141cc2ed867b7cdef389
|
|
| MD5 |
44cae1080c32e1485aba9effb7f307bd
|
|
| BLAKE2b-256 |
cecbb0fd52d41be6c76f0b99179f254382fd7372594094619bea12c62de0d87f
|