A simple pandas extension that enables users to execute SQL statements against DataFrames using in-memory SQLite.
Project description
What is pandas-liteql?
pandas-liteql is a simple pandas extension that enables users to execute SQL statements against DataFrames using in-memory SQLite. It is meant to streamline data manipulation and analysis tasks. For more detailed information and examples on pandas-liteql, visit the documentation pages.
What pandas-liteql is not
pandas-liteql is not a competitor to libraries such as PySpark or DuckDB that can perform SQL queries on larger data sets and perform more advanced data science use-cases. Rather, it is inspired by those projects and similar libraries that have performed the same function, but have since been abandoned or were not as user-friendly.
Installing pandas-liteql
pandas-liteql requires a minimum of Python 3.7 and the following libraries:
Library | Version |
---|---|
Pandas | >= 1.3.5 |
SQLAlchemy | >= 1.4.36 |
Assuming these prerequisites are already installed, adding pandas-liteql is as simple as...
pip install pandas-liteql
Examples
Below are some usage examples to load, query, and drop data from the in-memory SQLite sessions established with pandas-liteql and pandas DataFrame integration.
Loading
Start by loading your DataFrame with the load
function. When pandas-liteql is imported, an in-memory SQLite session is created where data can be loaded to.
import pandas as pd
from src import pandas_liteql as lql
# Data set creation
person_data = {
'name': ['Bill', 'Ted', 'Abraham', 'Genghis', 'Napoleon'],
'age': [25, 24, 56, 64, 51],
'email': ['bill@excellent.com', 'ted@excellent.com',
'lincoln@excellent.com', 'khan@excellent.com',
'bonaparte@excellent.com']
}
# DataFrame creation
person_df = pd.DataFrame(data=person_data)
# Loading the DataFrame to in-memory SQLite as the 'person' table
# The 'person' variable is also a LiteQL class containing the table name and schema information
person = lql.load(df=person_df, table_name='person')
print(f'Table name: {person.name}')
print(person.schema)
Output:
Table name: person
name type nullable default autoincrement primary_key
0 index BIGINT True None auto 0
1 name TEXT True None auto 0
2 age BIGINT True None auto 0
3 email TEXT True None auto 0
Querying
Next, query the table using the query
function. Using SQL syntax, the loaded table can be queried and the results will be returned as a pandas DataFrame.
bill_and_ted = lql.query(sql='SELECT * FROM person WHERE age < 30')
print(bill_and_ted)
Output:
index name age email
0 0 Bill 25 bill@excellent.com
1 1 Ted 24 ted@excellent.com
Dropping
If finished with a table within the flow of a script, you can simply drop it with the drop
function to preserve memory.
lql.drop(table_name='person')
The DataFrame SQL Accessor
Lastly, for a more simplistic approach, you can use the liteql.sql
accessor to perform the same functions above in one line and return the result as a pandas DataFrame. This approach requires that you query from the liteql
table that is loaded from the DataFrame, queried, and then dropped.
import pandas as pd
import pandas_liteql as lql
# Data set creation
person_data = {
'name': ['Bill', 'Ted', 'Abraham', 'Genghis', 'Napoleon'],
'age': [25, 24, 56, 64, 51],
'email': ['bill@excellent.com', 'ted@excellent.com',
'lincoln@excellent.com', 'khan@excellent.com',
'bonaparte@excellent.com']
}
# DataFrame creation
person_df = pd.DataFrame(data=person_data)
bill_and_ted = person_df.liteql.sql('SELECT * FROM liteql WHERE age < 30')
print(bill_and_ted)
Output:
index name age email
0 0 Bill 25 bill@excellent.com
1 1 Ted 24 ted@excellent.com
Contributing
Currently, pandas-liteql will not be receiving any additional updates. Contributions will not be accepted here, but feel free to fork this project if you desire.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file pandas_liteql-0.5.3.tar.gz
.
File metadata
- Download URL: pandas_liteql-0.5.3.tar.gz
- Upload date:
- Size: 6.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.11.9
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | a7ef84dcf13ce07b483bcfdca0675a47925bd4762a69bff2e03b60ffd52bf345 |
|
MD5 | 139cfbe2e84db5cf4f5ca40dfe411f14 |
|
BLAKE2b-256 | cc826f70f8924011b869d0531dcaed7ea3732fa76527f19ceb44cef05565face |
File details
Details for the file pandas_liteql-0.5.3-py3-none-any.whl
.
File metadata
- Download URL: pandas_liteql-0.5.3-py3-none-any.whl
- Upload date:
- Size: 5.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.11.9
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 84154c73a2a749437edb96529992e9d02239b59b72a9d393f8bfbd0b3cb60690 |
|
MD5 | b35821c0a00edb5a4f464f0d341c05ea |
|
BLAKE2b-256 | d544e3ce82309670518e77de17a4caae2f429776b7b4814d98d5ed508fbde134 |