A lightweight and fun DBAPI built on pandas/csv 🎉. Supported SQL: SELECT, INSERT, UPDATE, DELETE, JOIN, GROUP BY.
Project description
🐼 Pica - Simple SQL Interface for Pandas DataFrames
Pica is a lightweight Python library that provides a SQL interface for Pandas DataFrames, following the Python DB-API 2.0 specification. It allows you to interact with your DataFrames using familiar SQL syntax while leveraging the power of Pandas under the hood.
✨ Features
- 🔍 SQL-like interface for Pandas DataFrames
- 📊 Supports common SQL operations
- 🐍 Python DB-API 2.0 compliant
- 🚀 Easy to use and integrate
- 📝 CSV file support for persistence
🛠️ Installation
pip install pica-dbapi
🎯 Quick Start
import os
import sys
import pandas as pd
import pica
def main():
"""Basic example of using Pica with CSV files and DataFrames
CSVファイルとDataFrameを使用したPicaの基本的な使用例"""
# Create sample data
# サンプルデータの作成
users_data = {
'id': [1, 2, 3, 4, 5],
'name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
'age': [25, 30, 35, 28, 22],
'department': ['Sales', 'IT', 'Sales', 'Marketing', 'IT']
}
orders_data = {
'order_id': [1, 2, 3, 4, 5],
'user_id': [1, 2, 1, 3, 5],
'product': ['Laptop', 'Mouse', 'Keyboard', 'Monitor', 'Printer'],
'amount': [1000, 20, 50, 200, 150]
}
# Define schemas
# スキーマの定義
users_schema = {
'id': 'INTEGER',
'name': 'TEXT',
'age': 'INTEGER',
'department': 'TEXT'
}
orders_schema = {
'order_id': 'INTEGER',
'user_id': 'INTEGER',
'product': 'TEXT',
'amount': 'INTEGER'
}
# Create DataFrames
# DataFrameを作成
users_df = pd.DataFrame(users_data)
orders_df = pd.DataFrame(orders_data)
# Initialize connection with dataframes
# DataFrameを指定して接続を初期化
initial_tables = {
"users": users_df,
"orders": orders_df
}
conn = pica.connect(dataframes=initial_tables)
cursor = conn.cursor()
# Register schemas
# スキーマを登録
conn.register_schema("users", users_schema)
conn.register_schema("orders", orders_schema)
# Example 1: Basic SELECT with WHERE
# 基本的なSELECTとWHEREの例
print("\n=== Example 1: Basic SELECT with WHERE ===")
cursor.execute("SELECT name, age FROM users WHERE age > 25")
results = cursor.fetchall()
print("Users over 25:")
for row in results:
print(row)
# Example 2: GROUP BY with aggregation
# GROUP BYと集計の例
print("\n=== Example 2: GROUP BY with aggregation ===")
cursor.execute("""
SELECT department, COUNT(*) as count, AVG(age) as avg_age
FROM users
GROUP BY department
""")
results = cursor.fetchall()
print("Department statistics:")
for row in results:
print(row)
# Example 3: JOIN operation using two DataFrames
# 2つのDataFrameを使用したJOIN操作の例
print("\n=== Example 3: JOIN operation ===")
cursor.execute("""
SELECT
users.name as customer_name,
orders.product as product_name,
orders.amount as order_amount
FROM users
JOIN orders ON users.id = orders.user_id
ORDER BY amount DESC
""")
results = cursor.fetchall()
print("User orders:")
for row in results:
print(row)
# Example 4: Using with Pandas DataFrame directly
# Pandas DataFrameの直接使用例
print("\n=== Example 4: Using with Pandas DataFrame ===")
cursor.execute("""
SELECT name, age
FROM users
WHERE department = 'IT'
ORDER BY age DESC
""")
results = cursor.fetchall()
print("IT department members:")
for row in results:
print(row)
# --- Lazy-loading Example ---
# This example demonstrates the lazy-loading functionality where CSV files are loaded automatically
# if the connection is initialized without initial DataFrames.
print("\n=== Example 5: Lazy-loading ===")
# Set base_dir to the directory containing the CSV files (assuming they are placed in the same directory as this example file)
base_dir = os.path.dirname(__file__)
print("base_dir:", base_dir)
csv_files = [
os.path.join(base_dir, 'users.csv'),
os.path.join(base_dir, 'orders.csv')
]
print('file1:', os.path.exists(csv_files[0]))
print('file2:', os.path.exists(csv_files[1]))
# Create connection without providing initial dataframes to trigger lazy-loading
conn_lazy = pica.connect(base_dir=base_dir)
try:
cursor = conn_lazy.cursor()
cursor.execute("SELECT * FROM users")
results = cursor.fetchall()
print("Lazy-loaded users data:")
print(results)
except Exception as e:
print("Error during lazy-loading:", e)
if __name__ == "__main__":
main()
🔥 Supported SQL Operations
SELECT
- Basic SELECT with column selection
- WHERE clause with comparison operators (=, >, <, >=, <=, !=)
- GROUP BY with aggregate functions (COUNT, SUM, AVG, MAX, MIN)
- ORDER BY (ASC/DESC)
- JOIN operations
- Aliases (AS)
Example:
SELECT name, AVG(age) as avg_age
FROM users
WHERE age > 25
GROUP BY name
ORDER BY avg_age DESC
INSERT
- Basic INSERT INTO with VALUES
Example:
INSERT INTO users (name, age) VALUES ('David', 28)
UPDATE
- UPDATE with WHERE clause
Example:
UPDATE users SET age = 29 WHERE name = 'Alice'
DELETE
- DELETE with WHERE clause
Example:
DELETE FROM users WHERE age < 25
📊 Supported Data Types
- INTEGER
- REAL
- BOOLEAN
- DATE
- TEXT
🔄 Transaction Support
conn = pica.connect()
try:
# Perform operations
cursor = conn.cursor()
cursor.execute("UPDATE users SET age = 26 WHERE name = 'Alice'")
conn.commit()
except:
conn.rollback()
finally:
conn.close()
📝 License
MIT License
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file pica_dbapi-0.1.1.tar.gz.
File metadata
- Download URL: pica_dbapi-0.1.1.tar.gz
- Upload date:
- Size: 29.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.14
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
8116ba8289d7b2cd344002e41843a03f5fb5fa37d617c82ae30d7498a3e25236
|
|
| MD5 |
697178ce8e69af89643a262566ca1a96
|
|
| BLAKE2b-256 |
4ec555693418ebaf85573024c9cafcf04bbf809de788b30f6d0148548144880f
|
File details
Details for the file pica_dbapi-0.1.1-py3-none-any.whl.
File metadata
- Download URL: pica_dbapi-0.1.1-py3-none-any.whl
- Upload date:
- Size: 22.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.14
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
70499d9b273944b3d7c251e0bf488b0b73aa3bf0587b70536fab9f5afca4f15f
|
|
| MD5 |
e9c40307988e542546a47ef735fa65aa
|
|
| BLAKE2b-256 |
57d4a75ec8b3d9e387b2aa7056bf3772f81185fc964108bc2cd794df804faac2
|