Skip to main content

A simple pandas extension that enables users to execute SQL statements against DataFrames using in-memory SQLite.

Project description

pandas-liteql-logo.png

What is pandas-liteql?

pandas-liteql is a simple pandas extension that enables users to execute SQL statements against DataFrames using in-memory SQLite. It is meant to streamline data manipulation and analysis tasks. For more detailed information and examples on pandas-liteql, visit the documentation pages.

What pandas-liteql is not

pandas-liteql is not a competitor to libraries such as PySpark or DuckDB that can perform SQL queries on larger data sets and perform more advanced data science use-cases. Rather, it is inspired by those projects and similar libraries that have performed the same function, but have since been abandoned or were not as user-friendly.

Installing pandas-liteql

pandas-liteql requires a minimum of Python 3.7 and the following libraries:

Library Version
Pandas >= 1.3.5
SQLAlchemy >= 1.4.36

Assuming these prerequisites are already installed, adding pandas-liteql is as simple as...

pip install pandas-liteql

Examples

Below are some usage examples to load, query, and drop data from the in-memory SQLite sessions established with pandas-liteql and pandas DataFrame integration.

Loading

Start by loading your DataFrame with the load function. When pandas-liteql is imported, an in-memory SQLite session is created where data can be loaded to.

import pandas as pd
from src import pandas_liteql as lql

# Data set creation
person_data = {
    'name': ['Bill', 'Ted', 'Abraham', 'Genghis', 'Napoleon'],
    'age': [25, 24, 56, 64, 51],
    'email': ['bill@excellent.com', 'ted@excellent.com',
              'lincoln@excellent.com', 'khan@excellent.com',
              'bonaparte@excellent.com']
}

# DataFrame creation
person_df = pd.DataFrame(data=person_data)

# Loading the DataFrame to in-memory SQLite as the 'person' table
# The 'person' variable is also a LiteQL class containing the table name and schema information
person = lql.load(df=person_df, table_name='person')

print(f'Table name: {person.name}')
print(person.schema)

Output:

Table name: person
    name    type  nullable default autoincrement  primary_key
0  index  BIGINT      True    None          auto            0
1   name    TEXT      True    None          auto            0
2    age  BIGINT      True    None          auto            0
3  email    TEXT      True    None          auto            0

Querying

Next, query the table using the query function. Using SQL syntax, the loaded table can be queried and the results will be returned as a pandas DataFrame.

bill_and_ted = lql.query(sql='SELECT * FROM person WHERE age < 30')

print(bill_and_ted)

Output:

   index  name  age               email
0      0  Bill   25  bill@excellent.com
1      1   Ted   24   ted@excellent.com

Dropping

If finished with a table within the flow of a script, you can simply drop it with the drop function to preserve memory.

lql.drop(table_name='person')

The DataFrame SQL Accessor

Lastly, for a more simplistic approach, you can use the liteql.sql accessor to perform the same functions above in one line and return the result as a pandas DataFrame. This approach requires that you query from the liteql table that is loaded from the DataFrame, queried, and then dropped.

import pandas as pd
import pandas_liteql as lql

# Data set creation
person_data = {
    'name': ['Bill', 'Ted', 'Abraham', 'Genghis', 'Napoleon'],
    'age': [25, 24, 56, 64, 51],
    'email': ['bill@excellent.com', 'ted@excellent.com',
              'lincoln@excellent.com', 'khan@excellent.com',
              'bonaparte@excellent.com']
}

# DataFrame creation
person_df = pd.DataFrame(data=person_data)

bill_and_ted = person_df.liteql.sql('SELECT * FROM liteql WHERE age < 30')

print(bill_and_ted)

Output:

   index  name  age               email
0      0  Bill   25  bill@excellent.com
1      1   Ted   24   ted@excellent.com

Contributing

Currently, pandas-liteql will not be receiving any additional updates. Contributions will not be accepted here, but feel free to fork this project if you desire.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pandas_liteql-0.5.3.tar.gz (6.0 kB view details)

Uploaded Source

Built Distribution

pandas_liteql-0.5.3-py3-none-any.whl (5.7 kB view details)

Uploaded Python 3

File details

Details for the file pandas_liteql-0.5.3.tar.gz.

File metadata

  • Download URL: pandas_liteql-0.5.3.tar.gz
  • Upload date:
  • Size: 6.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.11.9

File hashes

Hashes for pandas_liteql-0.5.3.tar.gz
Algorithm Hash digest
SHA256 a7ef84dcf13ce07b483bcfdca0675a47925bd4762a69bff2e03b60ffd52bf345
MD5 139cfbe2e84db5cf4f5ca40dfe411f14
BLAKE2b-256 cc826f70f8924011b869d0531dcaed7ea3732fa76527f19ceb44cef05565face

See more details on using hashes here.

File details

Details for the file pandas_liteql-0.5.3-py3-none-any.whl.

File metadata

File hashes

Hashes for pandas_liteql-0.5.3-py3-none-any.whl
Algorithm Hash digest
SHA256 84154c73a2a749437edb96529992e9d02239b59b72a9d393f8bfbd0b3cb60690
MD5 b35821c0a00edb5a4f464f0d341c05ea
BLAKE2b-256 d544e3ce82309670518e77de17a4caae2f429776b7b4814d98d5ed508fbde134

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page