Python DB API 2.0 (PEP 249) client for MongoDB
PyMongoSQL
PyMongoSQL is a Python DB API 2.0 (PEP 249) client for MongoDB. It provides a familiar SQL interface to MongoDB, allowing developers to use SQL to interact with MongoDB collections.
Objectives
PyMongoSQL implements the DB API 2.0 interfaces to provide SQL-like access to MongoDB, built on PartiQL syntax for querying semi-structured data. The project aims to:
- Bridge SQL and NoSQL: Provide SQL capabilities for MongoDB's nested document structures
- Standard SQL Operations: Support DQL (SELECT) and DML (INSERT, UPDATE, DELETE) operations with WHERE, ORDER BY, and LIMIT clauses
- Seamless Integration: Full compatibility with Python applications expecting DB API 2.0 compliance
- Easy Migration: Enable migration from traditional SQL databases to MongoDB without rewriting application code
Features
- DB API 2.0 Compliant: Full compatibility with Python Database API 2.0 specification
- PartiQL-based SQL Syntax: Built on PartiQL (SQL for semi-structured data), enabling seamless SQL querying of nested and hierarchical MongoDB documents
- Nested Structure Support: Query and filter deeply nested fields and arrays within MongoDB documents using standard SQL syntax
- SQLAlchemy Integration: Complete ORM and Core support with dedicated MongoDB dialect
- SQL Query Support: SELECT statements with WHERE conditions, field selection, and aliases
- DML Support: Full support for INSERT, UPDATE, and DELETE operations using PartiQL syntax
- Connection String Support: MongoDB URI format for easy configuration
Requirements
- Python: 3.9, 3.10, 3.11, 3.12, 3.13+
- MongoDB: 7.0+
Dependencies
- PyMongo (MongoDB Python Driver)
  - pymongo >= 4.15.0
- ANTLR4 (SQL Parser Runtime)
  - antlr4-python3-runtime >= 4.13.0
- JMESPath (JSON/Dict Path Query)
  - jmespath >= 1.0.0
Optional Dependencies
- SQLAlchemy (for ORM/Core support)
  - sqlalchemy >= 1.4.0 (SQLAlchemy 1.4+ and 2.0+ supported)
Installation
pip install pymongosql
Or install from source:
git clone https://github.com/your-username/PyMongoSQL.git
cd PyMongoSQL
pip install -e .
Quick Start
Table of Contents:
- Basic Usage
- Using Connection String
- Context Manager Support
- Using DictCursor for Dictionary Results
- Cursor vs DictCursor
- Query with Parameters
- Supported SQL Features
- Apache Superset Integration
- Limitations & Roadmap
- Contributing
- License
Basic Usage
from pymongosql import connect
# Connect to MongoDB
connection = connect(
    host="mongodb://localhost:27017",
    database="database"
)
cursor = connection.cursor()
cursor.execute('SELECT name, email FROM users WHERE age > 25')
print(cursor.fetchall())
Using Connection String
from pymongosql import connect
# Connect with authentication
connection = connect(
    host="mongodb://username:password@localhost:27017/database?authSource=admin"
)
cursor = connection.cursor()
cursor.execute('SELECT * FROM products WHERE category = ?', ['Electronics'])
for row in cursor:
    print(row)
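To make it clearer which part of the connection string carries which setting, the sketch below splits the example URI with the standard library only. `describe_mongo_uri` is a hypothetical helper for illustration, not part of PyMongoSQL, and the driver does its own parsing; this version also does not handle comma-separated host lists.

```python
from urllib.parse import parse_qs, urlsplit


def describe_mongo_uri(uri):
    """Split a single-host MongoDB URI into its main parts (hypothetical helper)."""
    parts = urlsplit(uri)
    # parse_qs returns a list per key; keep the last value given for each option
    options = {key: values[-1] for key, values in parse_qs(parts.query).items()}
    return {
        "scheme": parts.scheme,
        "username": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/") or None,
        "options": options,
    }


info = describe_mongo_uri(
    "mongodb://username:password@localhost:27017/database?authSource=admin"
)
print(info["database"], info["options"])
```

Here the path segment names the database and `authSource` arrives as a query option, which matches how the example above is laid out.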
Context Manager Support
from pymongosql import connect
with connect(host="mongodb://localhost:27017/database") as conn:
    with conn.cursor() as cursor:
        cursor.execute('SELECT COUNT(*) as total FROM users')
        result = cursor.fetchone()
        print(f"Total users: {result[0]}")
Using DictCursor for Dictionary Results
from pymongosql import connect
from pymongosql.cursor import DictCursor
with connect(host="mongodb://localhost:27017/database") as conn:
    with conn.cursor(DictCursor) as cursor:
        cursor.execute('SELECT COUNT(*) as total FROM users')
        result = cursor.fetchone()
        print(f"Total users: {result['total']}")
Cursor vs DictCursor
PyMongoSQL provides two cursor types for different result formats:
Cursor (default) - Returns results as tuples:
cursor = connection.cursor()
cursor.execute('SELECT name, email FROM users')
row = cursor.fetchone()
print(row[0]) # Access by index
DictCursor - Returns results as dict:
from pymongosql.cursor import DictCursor
cursor = connection.cursor(DictCursor)
cursor.execute('SELECT name, email FROM users')
row = cursor.fetchone()
print(row['name']) # Access by column name
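The relationship between the two row formats can be sketched with plain data. PEP 249 specifies that each `cursor.description` entry is a 7-item sequence whose first item is the column name; assuming the default cursor populates it that way, tuple rows can be paired with names as shown. `rows_as_dicts` is a hypothetical helper, and the sample values stand in for a real cursor.

```python
def rows_as_dicts(description, rows):
    """Pair tuple rows with column names taken from a PEP 249 style description."""
    # The first item of every description entry is the column name
    names = [column[0] for column in description]
    return [dict(zip(names, row)) for row in rows]


# Stand-in values shaped like cursor.description and cursor.fetchall()
description = [
    ("name", None, None, None, None, None, None),
    ("email", None, None, None, None, None, None),
]
rows = [("Alice", "alice@example.com"), ("Bob", "bob@example.com")]

dict_rows = rows_as_dicts(description, rows)
print(dict_rows[0]["name"])
```

In practice DictCursor removes the need for this step; the sketch only shows that both formats carry the same data.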
Query with Parameters
PyMongoSQL supports two styles of parameterized queries for safe value substitution:
Positional Parameters with ?
from pymongosql import connect
connection = connect(host="mongodb://localhost:27017/database")
cursor = connection.cursor()
cursor.execute(
    'SELECT name, email FROM users WHERE age > ? AND status = ?',
    [25, 'active']
)
Named Parameters with :name
from pymongosql import connect
connection = connect(host="mongodb://localhost:27017/database")
cursor = connection.cursor()
cursor.execute(
    'SELECT name, email FROM users WHERE age > :age AND status = :status',
    {'age': 25, 'status': 'active'}
)
Parameters are substituted into the MongoDB filter during execution, providing protection against injection attacks.
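Why binding is safer than building the SQL string by hand can be illustrated with a small sketch. It assumes a filter template in which `:name` strings mark placeholders; that representation and the `bind_named` helper are hypothetical and are not taken from PyMongoSQL's internals.

```python
def bind_named(template, params):
    """Replace ':name' placeholder strings in a filter template with bound values."""
    if isinstance(template, dict):
        return {key: bind_named(value, params) for key, value in template.items()}
    if isinstance(template, list):
        return [bind_named(value, params) for value in template]
    if isinstance(template, str) and template.startswith(":"):
        # The bound value is inserted as data; it is never re-parsed as SQL
        return params[template[1:]]
    return template


template = {"age": {"$gt": ":age"}, "status": ":status"}
bound = bind_named(template, {"age": 25, "status": "active' OR '1'='1"})
print(bound)
```

The hostile-looking status value ends up as one literal string to compare against, rather than as extra query logic.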
Supported SQL Features
SELECT Statements
- Field selection: SELECT name, age FROM users
- Wildcards: SELECT * FROM products
- Field aliases: SELECT name AS user_name, age AS user_age FROM users
- Nested fields: SELECT profile.name, profile.age FROM users
- Array access: SELECT items[0], items[1].name FROM orders
WHERE Clauses
- Equality: WHERE name = 'John'
- Comparisons: WHERE age > 25, WHERE price <= 100.0
- Logical operators: WHERE age > 18 AND status = 'active', WHERE age < 30 OR role = 'admin'
- Nested field filtering: WHERE profile.status = 'active'
- Array filtering: WHERE items[0].price > 100
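For orientation, MongoDB expresses conditions like these as filter documents built from operators such as `$gt`, `$and`, and `$or`. The sketch below shows one plausible mapping; the helper names are hypothetical, and the filters PyMongoSQL actually generates may be shaped differently.

```python
# SQL comparison operators and their MongoDB query operator counterparts
OPERATORS = {"=": "$eq", "!=": "$ne", ">": "$gt", ">=": "$gte", "<": "$lt", "<=": "$lte"}


def condition(field, op, value):
    """Build a single-field filter such as {'age': {'$gt': 25}}."""
    return {field: {OPERATORS[op]: value}}


def all_of(*conditions):
    """Combine filters so that every one must match (SQL AND)."""
    return {"$and": list(conditions)}


def any_of(*conditions):
    """Combine filters so that at least one must match (SQL OR)."""
    return {"$or": list(conditions)}


# WHERE age > 18 AND status = 'active'
adults = all_of(condition("age", ">", 18), condition("status", "=", "active"))
# WHERE items[0].price > 100 (MongoDB addresses array elements as "items.0.price")
pricey = condition("items.0.price", ">", 100)
print(adults)
print(pricey)
```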
Nested Field Support
- Single-level: profile.name, settings.theme
- Multi-level: account.profile.name, config.database.host
- Array access: items[0].name, orders[1].total
- Complex queries: WHERE customer.profile.age > 18 AND orders[0].status = 'paid'
Note: Avoid SQL reserved words (user, data, value, count, etc.) as unquoted field names. Use alternatives or bracket notation for arrays.
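How a dotted or indexed path addresses a value inside a document can be shown with a few lines of standard-library Python. This is only a sketch of the path semantics; `resolve_path` is a hypothetical helper and is not how PyMongoSQL resolves fields internally.

```python
import re

# Matches either a field name or a bracketed array index such as [1]
_TOKEN = re.compile(r"([^.\[\]]+)|\[(\d+)\]")


def resolve_path(document, path):
    """Follow a path such as 'orders[1].total'; return None if any step is missing."""
    current = document
    for key, index in _TOKEN.findall(path):
        try:
            current = current[int(index)] if index else current[key]
        except (KeyError, IndexError, TypeError):
            return None
    return current


doc = {
    "account": {"profile": {"name": "Ada"}},
    "orders": [{"total": 10}, {"total": 25}],
}
print(resolve_path(doc, "account.profile.name"))
print(resolve_path(doc, "orders[1].total"))
```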
Sorting and Limiting
- ORDER BY: ORDER BY name ASC, age DESC
- LIMIT: LIMIT 10
- Combined: ORDER BY created_at DESC LIMIT 5
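PyMongo describes sort order as a list of (field, direction) pairs, with 1 for ascending and -1 for descending. The sketch below converts an ORDER BY clause body into that form; `order_by_to_sort` is a hypothetical helper, shown only to make the correspondence concrete.

```python
def order_by_to_sort(clause):
    """Turn 'name ASC, age DESC' into [('name', 1), ('age', -1)]."""
    sort_spec = []
    for term in clause.split(","):
        pieces = term.split()
        if not pieces:
            continue  # tolerate a trailing comma
        field = pieces[0]
        # SQL defaults to ascending order when no direction is given
        direction = pieces[1].upper() if len(pieces) > 1 else "ASC"
        if direction not in ("ASC", "DESC"):
            raise ValueError(f"Unknown sort direction: {direction}")
        sort_spec.append((field, 1 if direction == "ASC" else -1))
    return sort_spec


print(order_by_to_sort("name ASC, age DESC"))
```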
INSERT Statements
PyMongoSQL supports inserting documents into MongoDB collections using both PartiQL-style object literals and standard SQL INSERT VALUES syntax.
PartiQL-Style Object Literals
Single Document
cursor.execute(
    "INSERT INTO Music {'title': 'Song A', 'artist': 'Alice', 'year': 2021}"
)
Multiple Documents (Bag Syntax)
cursor.execute(
    "INSERT INTO Music << {'title': 'Song B', 'artist': 'Bob'}, {'title': 'Song C', 'artist': 'Charlie'} >>"
)
Parameterized INSERT
# Positional parameters using ? placeholders
cursor.execute(
    "INSERT INTO Music {'title': '?', 'artist': '?', 'year': '?'}",
    ["Song D", "Diana", 2020]
)
Standard SQL INSERT VALUES
Single Row with Column List
cursor.execute(
    "INSERT INTO Music (title, artist, year) VALUES ('Song E', 'Eve', 2022)"
)
Multiple Rows
cursor.execute(
    "INSERT INTO Music (title, artist, year) VALUES ('Song F', 'Frank', 2023), ('Song G', 'Grace', 2024)"
)
Parameterized INSERT VALUES
# Positional parameters (?)
cursor.execute(
    "INSERT INTO Music (title, artist, year) VALUES (?, ?, ?)",
    ["Song H", "Henry", 2025]
)
# Named parameters (:name)
cursor.execute(
    "INSERT INTO Music (title, artist) VALUES (:title, :artist)",
    {"title": "Song I", "artist": "Iris"}
)
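The two INSERT styles describe the same data: a column list plus value rows corresponds to one document per row. A minimal sketch of that correspondence follows; `rows_to_documents` is a hypothetical helper and not a PyMongoSQL function.

```python
def rows_to_documents(columns, rows):
    """Pair a column list with each VALUES row to form one document per row."""
    documents = []
    for row in rows:
        if len(row) != len(columns):
            raise ValueError(f"Expected {len(columns)} values, got {len(row)}")
        documents.append(dict(zip(columns, row)))
    return documents


docs = rows_to_documents(
    ["title", "artist", "year"],
    [("Song F", "Frank", 2023), ("Song G", "Grace", 2024)],
)
print(docs)
```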
UPDATE Statements
PyMongoSQL supports updating documents in MongoDB collections using standard SQL UPDATE syntax.
Update All Documents
cursor.execute("UPDATE Music SET available = false")
Update with WHERE Clause
cursor.execute("UPDATE Music SET price = 14.99 WHERE year < 2020")
Update Multiple Fields
cursor.execute(
    "UPDATE Music SET price = 19.99, available = true WHERE artist = 'Alice'"
)
Update with Logical Operators
cursor.execute(
    "UPDATE Music SET price = 9.99 WHERE year = 2020 AND stock > 5"
)
Parameterized UPDATE
# Positional parameters using ? placeholders
cursor.execute(
    "UPDATE Music SET price = ?, stock = ? WHERE artist = ?",
    [24.99, 50, "Bob"]
)
Update Nested Fields
cursor.execute(
    "UPDATE Music SET details.publisher = 'XYZ Records' WHERE title = 'Song A'"
)
Check Updated Row Count
cursor.execute("UPDATE Music SET available = false WHERE year = 2020")
print(f"Updated {cursor.rowcount} documents")
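The effect of setting a nested field such as `details.publisher` can be previewed on an in-memory dictionary. The sketch mimics how MongoDB's `$set` operator treats dotted paths, creating embedded documents where they are missing; `apply_set` is a hypothetical helper that ignores array indexes and runs without a database.

```python
import copy


def apply_set(document, assignments):
    """Return a copy of the document with dotted-path assignments applied."""
    updated = copy.deepcopy(document)
    for path, value in assignments.items():
        *parents, leaf = path.split(".")
        target = updated
        for key in parents:
            # Create the embedded document when it does not exist yet
            target = target.setdefault(key, {})
        target[leaf] = value
    return updated


song = {"title": "Song A", "artist": "Alice"}
result = apply_set(song, {"details.publisher": "XYZ Records", "price": 19.99})
print(result)
```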
DELETE Statements
PyMongoSQL supports deleting documents from MongoDB collections using standard SQL DELETE syntax.
Delete All Documents
cursor.execute("DELETE FROM Music")
Delete with WHERE Clause
cursor.execute("DELETE FROM Music WHERE year < 2020")
Delete with Logical Operators
cursor.execute(
    "DELETE FROM Music WHERE year = 2019 AND available = false"
)
Parameterized DELETE
# Positional parameters using ? placeholders
cursor.execute(
    "DELETE FROM Music WHERE artist = ? AND year < ?",
    ["Charlie", 2021]
)
Check Deleted Row Count
cursor.execute("DELETE FROM Music WHERE available = false")
print(f"Deleted {cursor.rowcount} documents")
Transaction Support
PyMongoSQL supports DB API 2.0 transactions for ACID-compliant database operations. Use the begin(), commit(), and rollback() methods to manage transactions:
from pymongosql import connect
connection = connect(host="mongodb://localhost:27017/database")
try:
    connection.begin()  # Start transaction
    cursor = connection.cursor()
    cursor.execute('UPDATE accounts SET balance = 100 WHERE id = ?', [1])
    cursor.execute('UPDATE accounts SET balance = 200 WHERE id = ?', [2])
    connection.commit()  # Commit all changes
    print("Transaction committed successfully")
except Exception as e:
    connection.rollback()  # Rollback on error
    print(f"Transaction failed: {e}")
finally:
    connection.close()
Note: MongoDB requires a replica set or sharded cluster for transaction support. Standalone MongoDB servers do not support ACID transactions at the server level.
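The begin/commit/rollback pattern can be wrapped in a small context manager so it is not repeated at every call site. This is a sketch that relies only on the `begin()`, `commit()`, and `rollback()` methods described above; the `transaction` helper itself is hypothetical and not shipped with PyMongoSQL.

```python
from contextlib import contextmanager


@contextmanager
def transaction(connection):
    """Commit if the block succeeds; roll back and re-raise if it fails."""
    connection.begin()
    try:
        yield connection
    except Exception:
        connection.rollback()
        raise
    else:
        connection.commit()


# Usage, assuming `connection` came from pymongosql.connect() against a replica set:
# with transaction(connection) as conn:
#     cursor = conn.cursor()
#     cursor.execute('UPDATE accounts SET balance = 100 WHERE id = ?', [1])
```

Re-raising after the rollback keeps the failure visible to the caller instead of hiding it behind a log message.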
Apache Superset Integration
PyMongoSQL can be used as a database driver in Apache Superset for querying and visualizing MongoDB data:
- Install PyMongoSQL: Install PyMongoSQL on the Superset app server:
pip install pymongosql
- Create Connection: Connect to your MongoDB instance using the connection URI with superset mode:
  mongodb://username:password@host:port/database?mode=superset
  or for MongoDB Atlas:
  mongodb+srv://username:password@host/database?mode=superset
- Use SQL Lab: Write and execute SQL queries against MongoDB collections directly in Superset's SQL Lab
- Create Visualizations: Build charts and dashboards from your MongoDB queries using Superset's visualization tools
This allows seamless integration between MongoDB data and Superset's BI capabilities without requiring data migration to traditional SQL databases.
Limitations & Roadmap
Note: PyMongoSQL currently supports DQL (Data Query Language) and DML (Data Manipulation Language) operations. The following SQL features are not yet supported but are planned for future releases:
- Advanced DML Operations: REPLACE, MERGE, UPSERT
These features are on our development roadmap and contributions are welcome!
Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
License
PyMongoSQL is distributed under the MIT license.