DataFrame API with SQL pushdown execution and real SQL CRUD - the missing layer for SQL in Python

These details have not been verified by PyPI

Project links

Project description

Moltres

The Missing DataFrame Layer for SQL in Python

MOLTRES: Modern Operations Layer for Transformations, Relational Execution, and SQL

Installation • Quick Start • Documentation • Examples

Moltres combines a DataFrame API (like Pandas/Polars), SQL pushdown execution (no data loading into memory), and real SQL CRUD operations (INSERT, UPDATE, DELETE) in one unified interface.

Transform millions of rows using familiar DataFrame operations—all executed directly in SQL without materializing data. Update, insert, and delete with column-aware, type-safe operations.

✨ Features

🚀 PySpark-Style DataFrame API - Primary API with familiar operations (select, filter, join, groupBy, etc.) for seamless migration from PySpark
🐼 Optional Pandas-Style Interface - Comprehensive Pandas-like API with string accessor, query(), dtypes, shape, pivot, sample, concat, and more
🦀 Optional Polars-Style Interface - Polars LazyFrame-like API with expression-based operations, set operations, file I/O, CTEs, and more
🎯 98% PySpark API Compatibility - Near-complete compatibility for seamless migration
🗄️ SQL Pushdown Execution - All operations compile to SQL and run on your database—no data loading into memory
✏️ Real SQL CRUD - INSERT, UPDATE, DELETE operations with DataFrame-style syntax
📊 Multiple Formats - Read/write CSV, JSON, JSONL, Parquet, and more
🐼 Pandas & Polars Integration - Pass pandas/polars DataFrames directly to moltres operations
🌊 Streaming Support - Handle datasets larger than memory with chunked processing
⚡ Async Support - Full async/await support for all operations
🔒 Security First - Built-in SQL injection prevention and validation

📦 Installation

pip install moltres

# Optional: For async support
pip install moltres[async-postgresql]  # PostgreSQL
pip install moltres[async-mysql]       # MySQL
pip install moltres[async-sqlite]     # SQLite

# Optional: For pandas/polars result formats
pip install moltres[pandas,polars]

🚀 Quick Start

Basic DataFrame Operations

from moltres import col, connect
from moltres.expressions import functions as F

# Connect to your database
db = connect("sqlite:///example.db")

# DataFrame operations with SQL pushdown (no data loading into memory)
df = (
    db.table("orders")
    .select()
    .join(db.table("customers").select(), on=[col("orders.customer_id") == col("customers.id")])
    .where(col("active") == True)
    .group_by("country")
    .agg(F.sum(col("amount")).alias("total_amount"))
)

# Execute and get results (SQL is compiled and executed here)
results = df.collect()  # Returns list of dicts by default

Pandas-Style Interface

df = db.table("users").pandas()

# Pandas-style operations
df[['id', 'name']]  # Select columns
df.query('age > 25 and country == "USA"')  # Query with AND/OR
df['name'].str.upper()  # String accessor
df.groupby('country').agg(age='mean')  # GroupBy

📚 See the Pandas Interface Guide →

Polars-Style Interface

df = db.table("users").polars()

# Polars-style operations
df.select("id", "name", (col("age") * 2).alias("double_age"))
df.filter((col("age") > 25) & (col("country") == "USA"))
df.group_by("country").agg(F.sum(col("age")))

📚 See the Polars Interface Guide →

CRUD Operations

from moltres.io.records import Records

# Insert rows
Records.from_list([
    {"id": 1, "name": "Alice", "email": "alice@example.com"},
    {"id": 2, "name": "Bob", "email": "bob@example.com"},
], database=db).insert_into("users")

# Update rows
db.update("users", where=col("active") == 0, set={"active": 1})

# Delete rows
db.delete("users", where=col("email").is_null())

📚 See CRUD Operations Guide →

📖 Documentation

Getting Started

Getting Started Guide - Step-by-step introduction
Examples Directory - 19 comprehensive example files
Examples Guide - Common patterns and use cases

Interface Guides

Pandas Interface - Complete pandas-style API reference
Polars Interface - Complete Polars-style API reference
PySpark Migration - Migrating from PySpark

Core Topics

Reading Data - Tables, SQL, files
Writing Data - Tables, files, formats
Table Management - Create, drop, constraints
Schema Inspection - Reflection and inspection
Streaming - Large dataset handling
Async Operations - Async/await support

Advanced Topics

Performance Optimization - Query optimization and best practices
Error Handling - Exception handling and debugging
Best Practices - Production-ready patterns
Advanced Topics - Window functions, CTEs, transactions

Reference

Why Moltres? - Understanding the gap Moltres fills
Security Guide - Security best practices
Troubleshooting - Common issues and solutions
API Reference - Complete API documentation

📚 Examples

Comprehensive examples demonstrating all Moltres features:

01_connecting.py - Database connections (sync and async)
02_dataframe_basics.py - Basic DataFrame operations
03_async_dataframe.py - Asynchronous operations
04_joins.py - Join operations
05_groupby.py - GroupBy and aggregation
06_expressions.py - Column expressions and functions
07_file_reading.py - Reading files (CSV, JSON, Parquet)
08_file_writing.py - Writing DataFrames to files
09_table_operations.py - Table operations and mutations
10_create_dataframe.py - Creating DataFrames from Python data
11_window_functions.py - Window functions
12_sql_operations.py - Raw SQL and SQL operations
13_transactions.py - Transaction management
14_reflection.py - Schema inspection and reflection
15_pandas_polars_dataframes.py - Pandas/Polars integration
16_ux_features.py - UX improvements
17_sqlalchemy_models.py - SQLAlchemy ORM integration
18_pandas_interface.py - Pandas-style interface examples
19_polars_interface.py - Polars-style interface examples

See the examples directory for all example files.

🛠️ Supported Operations

DataFrame Operations

select() / selectExpr() - Project columns or SQL expressions
where() / filter() - Filter rows
join() - Join with other DataFrames
group_by() / groupBy() - Group rows
agg() - Aggregate functions
order_by() / orderBy() / sort() - Sort rows
limit() - Limit number of rows
distinct() - Remove duplicate rows
withColumn() - Add or rename columns
pivot() - Pivot operations
explode() - Explode array/JSON columns

Column Expressions

Arithmetic: +, -, *, /, %
Comparisons: ==, !=, <, >, <=, >=
Boolean: &, |, ~
Functions: 130+ functions including mathematical, string, date/time, aggregate, window, array, JSON, and utility functions
Window Functions: over(), partition_by(), order_by() - Full PySpark compatibility

📚 See Expressions Guide →

Supported SQL Dialects

✅ SQLite - Full support
✅ PostgreSQL - Full support with dialect-specific optimizations
✅ MySQL - Full support with dialect-specific optimizations
✅ DuckDB - Full support with PostgreSQL-compatible optimizations
✅ Other SQLAlchemy-supported databases - ANSI SQL fallback

🧪 Development

Setup

# Clone the repository
git clone https://github.com/eddiethedean/moltres.git
cd moltres

# Install in development mode
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install

Running Tests

# Run all tests
pytest

# Run tests in parallel
pytest -n 9

# Run with coverage
pytest --cov=src/moltres --cov-report=html

Code Quality

# Linting
ruff check .

# Formatting
ruff format .

# Type checking (strict mode enabled)
mypy src

Pre-Commit CI Checks

# Run all CI checks (linting, type checking, tests)
make ci-check

# Quick linting check only
make ci-check-lint

🤝 Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

Quick Start:

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add some amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

Before submitting:

Run tests: pytest
Check code quality: ruff check . && mypy src
Update documentation if needed

👤 Author

Odos Matthews

GitHub: @eddiethedean
Email: odosmatthews@gmail.com

🙏 Acknowledgments

Inspired by PySpark's DataFrame API style, but focused on SQL feature support rather than PySpark feature parity
Built on SQLAlchemy for database connectivity and SQL compilation
Thanks to all contributors and users

📄 License

MIT License - see LICENSE file for details.

Made with ❤️ for the Python data community

⬆ Back to Top

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

1.0.0

Apr 3, 2026

0.23.2

Dec 5, 2025

0.23.1

Dec 5, 2025

0.23.0

Dec 4, 2025

0.22.0

Dec 2, 2025

0.21.0

Dec 2, 2025

0.20.0

Dec 2, 2025

0.19.6

Dec 2, 2025

0.19.4

Dec 1, 2025

0.19.3

Dec 1, 2025

0.19.2

Nov 29, 2025

0.19.1

Nov 29, 2025

0.19.0

Nov 29, 2025

This version

0.18.0

Nov 27, 2025

0.17.0

Nov 27, 2025

0.16.0

Nov 26, 2025

0.15.0

Nov 26, 2025

0.14.0

Nov 25, 2025

0.13.0

Nov 25, 2025

0.12.0

Nov 25, 2025

0.11.0

Nov 24, 2025

0.10.0

Nov 24, 2025

0.9.0

Nov 24, 2025

0.8.0

Nov 22, 2025

0.7.0

Nov 22, 2025

0.6.0

Nov 21, 2025

0.5.0

Nov 21, 2025

0.4.0

Nov 21, 2025

0.3.0

Nov 20, 2025

0.2.0

Nov 20, 2025

0.1.0

Nov 20, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

moltres-0.18.0.tar.gz (264.4 kB view details)

Uploaded Nov 27, 2025 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

moltres-0.18.0-py3-none-any.whl (302.5 kB view details)

Uploaded Nov 27, 2025 Python 3

File details

Details for the file moltres-0.18.0.tar.gz.

File metadata

Download URL: moltres-0.18.0.tar.gz
Upload date: Nov 27, 2025
Size: 264.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.13

File hashes

Hashes for moltres-0.18.0.tar.gz
Algorithm	Hash digest
SHA256	`4134a5b430ce1e8cf2d36f12afe514015c2cee88aef7ae1915efb2a8ca82f6cf`
MD5	`c504b2746c325a5302f8e34cc9af6a3e`
BLAKE2b-256	`313484b91e08f0c188eada622627a24479030ca9ce7706962f35c01fbe9055d6`

See more details on using hashes here.

File details

Details for the file moltres-0.18.0-py3-none-any.whl.

File metadata

Download URL: moltres-0.18.0-py3-none-any.whl
Upload date: Nov 27, 2025
Size: 302.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.13

File hashes

Hashes for moltres-0.18.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f0783c6222d9f7220e898955abe65aade5e2981a389316c64fb9bb8988461be6`
MD5	`dffb77570171461e56c288acd9799ee6`
BLAKE2b-256	`ea87c7d06ee8b5420e86febd8c4fb5dfe639ed2bd1f4b15b1174d992b24be02b`

See more details on using hashes here.

moltres 0.18.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

Moltres

✨ Features

📦 Installation

🚀 Quick Start

Basic DataFrame Operations

Pandas-Style Interface

Polars-Style Interface

CRUD Operations

📖 Documentation

Getting Started

Interface Guides

Core Topics

Advanced Topics

Reference

📚 Examples

🛠️ Supported Operations

DataFrame Operations

Column Expressions

Supported SQL Dialects

🧪 Development

Setup

Running Tests

Code Quality

Pre-Commit CI Checks

🤝 Contributing

👤 Author

🙏 Acknowledgments

📄 License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes