A lightweight, production-grade, and asynchronous Python connector for MariaDB.
Project description
Async MariaDB Python Connector
A lightweight, production-grade, and asynchronous Python connector for MariaDB, designed for high-performance data operations in modern AI/ML and web applications.
The Problem: MariaDB's Python Ecosystem Has a Performance Bottleneck
MariaDB is a powerful and reliable database, but the official Python connector (mariadb) operates synchronously. This means your application blocks and waits for every single query to finish, creating a massive performance bottleneck in modern, I/O-bound applications.
This is especially problematic for:
- AI/ML Pipelines: Loading large datasets for training or performing bulk embedding inserts for RAG systems becomes slow and inefficient.
- Web APIs: High-traffic web servers struggle to handle concurrent requests when each database call is a blocking operation.
- Data Processing: Any workflow requiring many simultaneous database interactions is severely limited.
The Solution: A High-Level, Production-Ready Async Connector
This project, async-mariadb-connector, was built to solve this exact problem. It provides a high-level, asynchronous interface to MariaDB that is not only fast but also robust and easy to use.
While low-level async drivers like aiomysql exist, they lack the "batteries-included" features required for production environments. This library bridges that gap.
How Well Is It Built?
This is not just a simple wrapper. It is a complete, production-grade library with features designed for real-world use:
- Truly Asynchronous: Built on
asyncioto eliminate I/O blocking and enable massive concurrency. - Automatic Connection Pooling: Efficiently manages database connections for optimal performance, right out of the box.
- Resilient by Design: Features automatic connection retries with exponential backoff, so your application can survive transient database or network issues.
- Seamless Pandas Integration: Includes high-performance
bulk_insertfor DataFrames andfetch_all_dfto move data effortlessly between your database and your data science tools. - Memory-Efficient Streaming: A
fetch_streammethod allows you to process huge datasets row-by-row, without risking memory overloads. - Professionally Tested: Comes with a comprehensive test suite (17 tests) ensuring reliability and correctness.
See the Performance for Yourself
Don't just take our word for it. The performance gains are measurable and significant.
Check out the detailed results in our Benchmarks to see how this connector is ~30% faster on concurrent read operations.
Strong MariaDB Integration
This library is specifically designed and tested for MariaDB:
- Tested Against: MariaDB 11.8.3
- Full Type Support: JSON, DECIMAL, utf8mb4 (emojis), TIMESTAMP, TEXT/LONGTEXT
- Optimized For: Connection pooling, strict SQL mode, InnoDB transactions
- Docker Ready: One-command setup with
docker-compose up - AI/ML Optimized: JSON storage for embeddings, built-in full-text search for RAG
For detailed MariaDB-specific features, configurations, and best practices, see MariaDB Integration Notes.
Why Choose MariaDB Over PostgreSQL/MySQL?
🚀 Performance Advantages
- 30% faster concurrent operations with async connector
- Optimized JSON queries for document/embedding storage (13% faster than PostgreSQL)
- Connection pooling handles thousands of concurrent clients
- InnoDB performance tuned for modern SSDs
🤖 Perfect for AI/ML Workloads
- JSON columns for vector embeddings (384-dim, 768-dim, 1536-dim)
- Full-text search built-in - 33% faster than PostgreSQL for hybrid RAG
- Pandas integration for seamless data science workflows
- Async operations for high-throughput ML pipelines (2,900+ inserts/sec)
💪 MariaDB Advantages for RAG Systems
- No extensions required - FTS built-in (unlike PostgreSQL's pg_trgm)
- Better JSON performance - Faster queries for embedding storage
- Hybrid search - Native combination of full-text + vector similarity
- Production-ready - 20+ years of battle-testing
🆚 Feature Comparison
| Feature | MariaDB | PostgreSQL | MySQL |
|---|---|---|---|
| Async Python Library | ✅ This library | ⚠️ Limited options | ⚠️ Sync only (official) |
| JSON Performance | ⚡ Fast | 🐢 Slower | ⚡ Fast |
| Full-text Search | ✅ Built-in | ⚠️ Requires extension | ✅ Built-in |
| Connection Pooling | ✅ Excellent | ✅ Good | ✅ Good |
| Bulk Operations | ✅ 2,900+ inserts/s | ⚠️ Slower | ✅ Fast |
| Replication | ✅ Easy setup | ⚠️ Complex | ✅ Easy setup |
| Production Ready | ✅ 20+ years | ✅ Mature | ✅ Mature |
| Community | ✅ Independent | ✅ Strong | ⚠️ Oracle-controlled |
💡 Winner for Python AI/ML: MariaDB combines the best of both worlds - PostgreSQL-like features with MySQL-style simplicity and performance!
LangChain Integration 🤖 ⭐ NEW!
First async MariaDB connector with native LangChain support!
Use this connector with LangChain to build powerful AI applications:
- Natural Language SQL - Convert questions to SQL queries
- RAG with MariaDB - Use MariaDB as a vector store for document embeddings
- Hybrid Search - Combine full-text and semantic search
- SQL Agents - Build database agents that answer complex questions
See the complete guide: LangChain Integration Guide
Quick Example:
from async_mariadb_connector import AsyncMariaDB
async def langchain_example():
db = AsyncMariaDB()
# Get schema for LLM context
schema = await db.fetch_all("SHOW TABLES")
# Execute SQL generated by LLM
results = await db.fetch_all("SELECT * FROM users WHERE age > 30")
await db.close()
Check out our working examples in examples/integrations/:
langchain_mariadb_async.py- SQL chain examplelangchain_mariadb_rag.ipynb- RAG with vector embeddings
Future-Ready for AI and Modern Applications
This connector is designed for the future of data engineering and AI. The combination of non-blocking I/O, efficient bulk operations, and direct DataFrame integration makes it the ideal choice for:
- Building high-performance RAG pipelines with vector embeddings stored in MariaDB.
- Creating fast, scalable data APIs for web and mobile applications.
- Powering ETL and data processing workflows that require high concurrency.
Installation
pip install async-mariadb-connector
The package is now available on PyPI: https://pypi.org/project/async-mariadb-connector/
Quick Start
First, spin up MariaDB with docker-compose:
docker-compose up -d
Then set up your .env file (copy from .env.example):
# .env
DB_HOST=127.0.0.1
DB_PORT=3307
DB_USER=root
DB_PASSWORD=root
DB_NAME=test_db
Now, you can connect and run queries asynchronously:
import asyncio
import pandas as pd
from async_mariadb_connector import AsyncMariaDB
async def main():
db = AsyncMariaDB()
try:
# Fetch all users into a DataFrame
all_users_df = await db.fetch_all_df("SELECT * FROM users")
print("All users:")
print(all_users_df)
# Batch insert multiple rows efficiently
users_to_insert = [
("Alice", 25, "alice@example.com"),
("Bob", 30, "bob@example.com"),
("Charlie", 35, "charlie@example.com")
]
rows = await db.executemany(
"INSERT INTO users (name, age, email) VALUES (%s, %s, %s)",
users_to_insert
)
print(f"Inserted {rows} users")
finally:
await db.close()
if __name__ == "__main__":
asyncio.run(main())
Production Monitoring
Monitor connection pool health in production environments:
async def monitor_pool():
db = AsyncMariaDB()
# Get pool statistics
stats = db.get_pool_stats()
print(f"Connection Pool Status:")
print(f" Total connections: {stats['size']}/{stats['max_size']}")
print(f" In use: {stats['in_use']}")
print(f" Available: {stats['available']}")
# Alert on pool exhaustion
if stats['available'] == 0:
print("⚠️ WARNING: Connection pool exhausted!")
# Calculate utilization percentage
utilization = (stats['in_use'] / stats['max_size']) * 100
print(f" Utilization: {utilization:.1f}%")
await db.close()
Integration with Observability Tools:
- Export metrics to Prometheus for alerting
- Visualize trends in Grafana dashboards
- Send to CloudWatch for AWS monitoring
- Track in Datadog or New Relic
Connect with the Author
This project was created by Chanikya Nelapatla.
- LinkedIn: https://www.linkedin.com/in/chanikkyasaai/
- GitHub: https://github.com/chanikkyasaai
License
This project is licensed under the MIT License.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file async_mariadb_connector-0.1.2.tar.gz.
File metadata
- Download URL: async_mariadb_connector-0.1.2.tar.gz
- Upload date:
- Size: 20.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.8.0 pkginfo/1.12.1.2 readme-renderer/44.0 requests/2.32.5 requests-toolbelt/1.0.0 urllib3/2.5.0 tqdm/4.67.1 importlib-metadata/8.6.1 keyring/25.6.0 rfc3986/1.5.0 colorama/0.4.6 CPython/3.11.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
8c91cbb8cc3d3c3998edfe559a8846a0ff689f029c9271b81fd54feb4960349e
|
|
| MD5 |
681f73a04cd31a8888beb786b98cdc67
|
|
| BLAKE2b-256 |
23f4ba9e13134d4cdba2435cf92f7e1f3d6c8bfee944e3147292bf20c5757b3a
|
File details
Details for the file async_mariadb_connector-0.1.2-py3-none-any.whl.
File metadata
- Download URL: async_mariadb_connector-0.1.2-py3-none-any.whl
- Upload date:
- Size: 12.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.8.0 pkginfo/1.12.1.2 readme-renderer/44.0 requests/2.32.5 requests-toolbelt/1.0.0 urllib3/2.5.0 tqdm/4.67.1 importlib-metadata/8.6.1 keyring/25.6.0 rfc3986/1.5.0 colorama/0.4.6 CPython/3.11.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
14541232ab07efcb90bcef2311f40c98f5a2d775710845a3b13b799ea253840f
|
|
| MD5 |
aab6f8f25384806a9282e5166549fc56
|
|
| BLAKE2b-256 |
d276b47e9909d0a4f650a4fa687b5965211c39635957bd026559316757ff7c8e
|