Skip to main content

PEP 249 compliant DB-API driver for Excel files

Project description

excel-dbapi excel-dbapi

CI codecov PyPI Python 3.10+ License: MIT Docs

A local-first Python DB-API 2.0 connector for Excel files. Use SQL to query, insert, update, and delete rows in .xlsx workbooks — no database server required.

About and docs

Limitations

Before you begin, understand what excel-dbapi is not:

  • Not full SQL — this is a documented SQL subset (see docs/SQL_SPEC.md)
  • No concurrent writes — use a single-writer model
  • Not for large datasets — if your Excel file has 100k+ rows, use pandas directly or a database
  • No transactional rollback guarantees — rollback restores an in-memory snapshot, not a WAL
  • PandasEngine rewrites workbooks — formatting, charts, images, and formulas are dropped
  • Identifier grammar is limited — quoted table names are supported (for example "Sales 2024"), but column references must still be unquoted ASCII identifiers ([A-Za-z_][A-Za-z0-9_]*)

If you need relational features, use SQLite or PostgreSQL.

See the full SQL Specification for the exact SQL subset supported.

Current SQL feature set

  • SELECT with aliases, arithmetic/CASE expressions, DISTINCT, WHERE, GROUP BY, HAVING, ORDER BY, LIMIT, OFFSET
  • JOINs: INNER, LEFT, RIGHT, FULL OUTER, CROSS (with documented JOIN-specific restrictions)
  • Aggregates: COUNT, SUM, AVG, MIN, MAX, COUNT(DISTINCT col)
  • Subqueries in WHERE ... [NOT] IN (SELECT ...) and compound queries (UNION, UNION ALL, INTERSECT, EXCEPT)
  • DML/DDL: INSERT (single/multi-row and INSERT ... SELECT), UPSERT (ON CONFLICT), UPDATE, DELETE, CREATE/DROP/ALTER TABLE

For exact support/limitations per feature, use the matrix in docs/SQL_SPEC.md#2-authoritative-feature-matrix.


Who is this for?

  • Data analysts who want to query Excel files with SQL instead of manual filtering
  • Citizen developers automating small workflows with familiar SQL syntax
  • Educators teaching SQL concepts without setting up a database
  • Prototypers building quick data pipelines before moving to a real database

Installation

pip install excel-dbapi

See CHANGELOG for release history.


Quick Start

from excel_dbapi.connection import ExcelConnection

# Open an Excel file and query it
with ExcelConnection("sample.xlsx") as conn:
    cursor = conn.cursor()
    cursor.execute("SELECT * FROM Sheet1")
    print(cursor.fetchall())

Insert, Update, Delete

with ExcelConnection("sample.xlsx") as conn:
    cursor = conn.cursor()

    # Insert with parameter binding (recommended)
    cursor.execute("INSERT INTO Sheet1 (id, name) VALUES (?, ?)", (1, "Alice"))

    # Update
    cursor.execute("UPDATE Sheet1 SET name = 'Ann' WHERE id = 1")

    # Delete
    cursor.execute("DELETE FROM Sheet1 WHERE id = 2")

Multi-row Insert

with ExcelConnection("sample.xlsx") as conn:
    cursor = conn.cursor()

    # Insert multiple rows at once
    cursor.execute("INSERT INTO Sheet1 VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol')")

    # INSERT...SELECT: copy rows from another sheet
    cursor.execute("INSERT INTO Sheet2 (id, name) SELECT id, name FROM Sheet1 WHERE id > 1")

Create and Drop Sheets

with ExcelConnection("sample.xlsx") as conn:
    cursor = conn.cursor()
    cursor.execute("CREATE TABLE NewSheet (id, name)")
    cursor.execute("DROP TABLE NewSheet")

Engine Options

Engine Description Dependency
openpyxl (default) Fast sheet access openpyxl
pandas DataFrame-based operations pandas, openpyxl
graph Microsoft Graph API (remote Excel) httpx
conn = ExcelConnection("sample.xlsx", engine="openpyxl")  # default
conn = ExcelConnection("sample.xlsx", engine="pandas")

WHERE Operators

Operator Example Description
=, !=, <> WHERE id = 1 Equality / inequality
>, >=, <, <= WHERE score >= 80 Comparison
IS NULL / IS NOT NULL WHERE name IS NOT NULL NULL checks
IN WHERE name IN ('Alice', 'Bob') Set membership
BETWEEN WHERE score BETWEEN 70 AND 90 Inclusive range
LIKE / ILIKE WHERE name LIKE 'A%' Pattern matching (ILIKE = case-insensitive)
NOT LIKE / NOT ILIKE WHERE name NOT LIKE 'A%' Negated pattern matching
NOT IN WHERE id NOT IN (1, 2) Negated set membership
NOT BETWEEN WHERE x NOT BETWEEN 1 AND 5 Negated range
AND / OR / NOT WHERE x = 1 AND y = 2 Logical connectives

NULL semantics: Comparisons with NULL follow SQL three-valued logic (TRUE / FALSE / UNKNOWN). WHERE x = NULL returns no rows; use IS NULL instead.

LIKE patterns: % matches any sequence of characters, _ matches any single character.

with ExcelConnection("sample.xlsx") as conn:
    cursor = conn.cursor()

    # IN operator
    cursor.execute("SELECT * FROM Sheet1 WHERE name IN ('Alice', 'Bob')")

    # BETWEEN operator
    cursor.execute("SELECT * FROM Sheet1 WHERE score BETWEEN 70 AND 90")

    # LIKE operator
    cursor.execute("SELECT * FROM Sheet1 WHERE name LIKE 'A%'")

    # All operators support parameter binding
    cursor.execute("SELECT * FROM Sheet1 WHERE name IN (?, ?)", ("Alice", "Bob"))
    cursor.execute("SELECT * FROM Sheet1 WHERE score BETWEEN ? AND ?", (70, 90))
    cursor.execute("SELECT * FROM Sheet1 WHERE name LIKE ?", ("A%",))

Compound Queries (Set Operations)

with ExcelConnection("sample.xlsx") as conn:
    cursor = conn.cursor()

    cursor.execute("SELECT id FROM t1 UNION SELECT id FROM t2")
    cursor.execute("SELECT id FROM t1 UNION ALL SELECT id FROM t2")
    cursor.execute("SELECT id FROM t1 INTERSECT SELECT id FROM t2")
    cursor.execute("SELECT id FROM t1 EXCEPT SELECT id FROM t2")

Safety Defaults

Formula Injection Defense

By default, excel-dbapi sanitizes cell values on write (INSERT/UPDATE) to prevent formula injection attacks. Strings starting with =, +, -, @, \t, or \r are automatically prefixed with a single quote (') so they are stored as plain text, not executed as formulas.

# Default: sanitization ON (recommended)
with ExcelConnection("sample.xlsx") as conn:
    cursor = conn.cursor()
    cursor.execute("INSERT INTO Sheet1 (id, name) VALUES (?, ?)",
                   (1, "=SUM(A1:A10)"))
    # Stored as: '=SUM(A1:A10)  (safe, not executed as formula)

# Opt out if you intentionally write formulas
with ExcelConnection("sample.xlsx", sanitize_formulas=False) as conn:
    cursor = conn.cursor()
    cursor.execute("INSERT INTO Sheet1 (id, formula) VALUES (?, ?)",
                   (1, "=SUM(A1:A10)"))
    # Stored as: =SUM(A1:A10)  (executed as formula in Excel)

Transaction Example

with ExcelConnection("sample.xlsx", autocommit=False) as conn:
    cursor = conn.cursor()
    cursor.execute("UPDATE Sheet1 SET name = 'Ann' WHERE id = 1")
    conn.rollback()

When autocommit is enabled, rollback() is not supported.

Cursor Metadata

with ExcelConnection("sample.xlsx") as conn:
    cursor = conn.cursor()
    cursor.execute("SELECT id, name FROM Sheet1")
    print(cursor.description)
    print(cursor.rowcount)

Troubleshooting

"Column 'xyz' not found"

The column name in your SQL doesn't match any header in the sheet.

ProgrammingError: Column 'nmae' not found in Sheet1. Available columns: ['id', 'name', 'email']

Fix: Check the spelling. Column names must match the first row (header) of the sheet exactly.

"Table 'SheetX' not found"

The sheet name in your SQL doesn't match any sheet in the workbook.

ProgrammingError: Table 'Shee1' not found. Available sheets: ['Sheet1', 'Sheet2']

Fix: Check the sheet name spelling. Sheet names are resolved case-insensitively.

PandasEngine drops formatting

PandasEngine reads data into a DataFrame and writes it back. This process drops Excel formatting, charts, images, and formulas.

Fix: Use the default openpyxl engine if you need to preserve formatting.

Integer vs. string comparison (Pandas)

The Pandas engine preserves Python types. If a column contains integers, WHERE id = '2' (string) won't match — use WHERE id = 2 (no quotes).

Fix: Omit quotes around numeric values in WHERE clauses when using the Pandas engine.


Experimental: Remote Excel via Microsoft Graph API

Status: Experimental — API may change in future releases.

excel-dbapi can access remote Excel files on OneDrive/SharePoint via the Microsoft Graph API.

Supported Graph DSNs are ID-based:

  • msgraph://drives/{drive_id}/items/{item_id}
  • sharepoint://sites/{site_name}/drives/{drive_id}/items/{item_id}
  • onedrive://me/drive/items/{item_id}
pip install excel-dbapi[graph]
from excel_dbapi.connection import ExcelConnection

conn = ExcelConnection(
    "msgraph://drives/{drive_id}/items/{item_id}",
    engine="graph",
    credential=your_credential,
    autocommit=True,
)
cursor = conn.cursor()
cursor.execute("SELECT * FROM Sheet1")
print(cursor.fetchall())
conn.close()

The Graph backend is read-only by default. Write operations require explicit opt-in and a credential/token provider with appropriate Graph API permissions.

Graph metadata sync is best-effort for write operations: if worksheet mutation succeeds but metadata sync fails, excel-dbapi keeps the worksheet change and logs a warning.

For DSN formats and dependency choices, see the Usage Guide Graph section.


Related Projects

  • sqlalchemy-excel — SQLAlchemy dialect that uses excel-dbapi as its DB-API 2.0 driver. Use create_engine("excel:///file.xlsx") for full ORM support.

Documentation

Examples

  • examples/basic_usage.py
  • examples/write_operations.py
  • examples/transactions.py
  • examples/advanced_query.py
  • examples/pandas_engine.py

License

MIT License

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

excel_dbapi-0.4.1.tar.gz (215.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

excel_dbapi-0.4.1-py3-none-any.whl (90.6 kB view details)

Uploaded Python 3

File details

Details for the file excel_dbapi-0.4.1.tar.gz.

File metadata

  • Download URL: excel_dbapi-0.4.1.tar.gz
  • Upload date:
  • Size: 215.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for excel_dbapi-0.4.1.tar.gz
Algorithm Hash digest
SHA256 280dfa2ef06140a6fca35df0823951e2c73be7b6ff74ce21624e6eb2f37bbf98
MD5 0983d3ad6765f2bed0bed25db978094e
BLAKE2b-256 e9cb58fc153a65982cff37ff8a64ceadbd870174c7b6197f0a32d5c41cb06efa

See more details on using hashes here.

Provenance

The following attestation bundles were made for excel_dbapi-0.4.1.tar.gz:

Publisher: publish-pypi.yml on yeongseon/excel-dbapi

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file excel_dbapi-0.4.1-py3-none-any.whl.

File metadata

  • Download URL: excel_dbapi-0.4.1-py3-none-any.whl
  • Upload date:
  • Size: 90.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for excel_dbapi-0.4.1-py3-none-any.whl
Algorithm Hash digest
SHA256 cc99645e78a55788c83bf29425235b9a3f27d6bc6a7ae6d6d03a8726b800fb03
MD5 9921a5a002d72765bf7f446437ab0158
BLAKE2b-256 2de52c2fbb84e7099a654dba504ed966aab7cdb782fa2ae77da872d94c0c053c

See more details on using hashes here.

Provenance

The following attestation bundles were made for excel_dbapi-0.4.1-py3-none-any.whl:

Publisher: publish-pypi.yml on yeongseon/excel-dbapi

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page