A package for querying dataframes using SQL

dataframe_sql is a Python package that translates SQL syntax into operations on pandas DataFrames, functionality that the core pandas package does not provide.

Installation

pip install dataframe_sql

Usage

In this simple example, a DataFrame is read from a CSV file and registered as a temporary table. The query function then produces a new DataFrame from the SQL query.

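from pandas import read_csv
from dataframe_sql import register_temp_table, query

# Load the CSV file into a pandas DataFrame
my_table = read_csv("some_file.csv")

# Register the DataFrame under the table name used in SQL queries
register_temp_table(my_table, "my_table")

# Run the query; the result is a new DataFrame
result = query("""select * from my_table""")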

The package currently supports only pandas, but there are plans to support Dask, RAPIDS, and Modin in the future.

SQL Syntax

The SQL syntax for dataframe_sql is identical to that of sql_to_ibis, the package it is built on.

You can find the full SQL syntax here

Why use dataframe_sql?

Other packages also let you use SQL with pandas DataFrames, but packages such as pandasql run the query through a database on the backend, which defeats the purpose of using pandas to begin with. In the case of pandasql, which uses SQLite, this can result in major performance bottlenecks. dataframe_sql instead performs native pandas operations in memory on DataFrames, which avoids conflicts that may arise from using external databases.
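To make the database round trip concrete, here is a minimal sketch of the general pattern a database-backed approach follows: copy the data into SQLite, run the query there, and copy the results back. It uses only the standard-library sqlite3 module and is not pandasql's actual implementation. The function name query_via_sqlite, the table schema, and the sample rows are all hypothetical and chosen purely for illustration.

```python
import sqlite3

# Hypothetical rows standing in for the contents of a DataFrame.
rows = [("a", 1), ("b", 2), ("a", 3)]


def query_via_sqlite(rows, sql):
    """Copy rows into a throwaway SQLite database, run sql, copy results back."""
    conn = sqlite3.connect(":memory:")
    try:
        # Step 1: every row is copied out of Python and into the database.
        conn.execute("CREATE TABLE my_table (key TEXT, value INTEGER)")
        conn.executemany("INSERT INTO my_table VALUES (?, ?)", rows)
        # Step 2: SQLite, not the in-memory data structure, executes the query.
        cursor = conn.execute(sql)
        # Step 3: the result set is copied back into Python objects.
        return cursor.fetchall()
    finally:
        conn.close()


result = query_via_sqlite(
    rows, "SELECT key, SUM(value) FROM my_table GROUP BY key ORDER BY key"
)
print(result)  # [('a', 4), ('b', 2)]
```

Steps 1 and 3 are pure copying overhead that grows with the size of the data. That overhead is what a translation into native in-memory operations is meant to avoid.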
