Use SQL expressions to query Pandas DataFrames

SQLtoPandas (WIP):

Use SQL queries on Pandas DataFrames.

Are you comfortable with SQL but not Pandas? Or maybe you're comfortable with Pandas but not SQL? Well, this library allows querying of Pandas DataFrames using SQL syntax. Hopefully it will let you learn SQL if you already understand Pandas, or learn how Pandas DataFrames behave if you already know SQL.

Unlike pandasql, this package does not create a local sqlite3 database on the user's computer and query from that. Rather, it converts SQL commands directly into pandas code.

Requirements:

Install all required dependencies through pip when this package is released, or generate the conda dev environment with conda env create --file environment.yml and activate it with conda activate sqltopandas. Python 3.8 or later is required, because getting variable literal names with f-string debugging is a critical part of the code infrastructure. I don't think there is a nice way to make that backwards compatible, sadly.
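The f-string debugging mentioned above is presumably the = specifier added in Python 3.8, which echoes an expression's source text next to its value. The snippet below is a minimal illustration of that language feature only; the variable names are made up, and it does not show how the package uses the feature internally.

```python
# Python 3.8+ only: the "=" specifier echoes the expression text and its repr.
df_name = "sales"      # hypothetical variable, for illustration
row_count = 3          # hypothetical variable, for illustration

debug_text = f"{row_count=}"   # 'row_count=3'
debug_name = f"{df_name=}"     # "df_name='sales'"

# Splitting on "=" recovers the variable's literal name as a string.
variable_name = debug_text.split("=")[0]
print(debug_text, debug_name, variable_name)
```

On Python 3.7 and earlier the same f-string is a syntax error, which is why no simple backport exists.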

Usage:

Example usage:

>>> import numpy as np
>>> import pandas as pd
>>> import sqltopandas
>>> spd = sqltopandas.SQLtoPD()
>>> df = pd.DataFrame(np.array([[1, 1, 3], [5, 5, 6], [7, 8, 9]]), columns=['a', 'b', 'c'])
>>> df
   a  b  c
0  1  1  3
1  5  5  6
2  7  8  9
>>> spd.parse(df, 'SELECT a, b FROM df')
   a  b
0  1  1
1  5  5
2  7  8

>>> spd.parse(df, """SELECT a, b FROM df
...                  WHERE a!=1; """)
   a  b
1  5  5
2  7  8

>>> spd.parse(df, """SELECT a, b, c FROM df;
...                  ORDER BY a DESC;
...                  LIMIT 2;""")
   a  b  c
2  7  8  9
1  5  5  6
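For readers coming from SQL, it may help to see roughly what these three queries look like in plain pandas. This is a hand-written sketch of equivalent pandas expressions, not the code that sqltopandas actually generates.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([[1, 1, 3], [5, 5, 6], [7, 8, 9]]),
                  columns=['a', 'b', 'c'])

# SELECT a, b FROM df
select_result = df[['a', 'b']]

# SELECT a, b FROM df WHERE a != 1
where_result = df.loc[df['a'] != 1, ['a', 'b']]

# SELECT a, b, c FROM df, ordered by a descending, first 2 rows
order_limit_result = (
    df[['a', 'b', 'c']]
    .sort_values('a', ascending=False)
    .head(2)
)

print(select_result)
print(where_result)
print(order_limit_result)
```

As in the outputs above, the original index labels are preserved, so the filtered result keeps rows 1 and 2 and the sorted result lists row 2 before row 1.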

Obvious edge cases

DataFrame column names, as well as DataFrame names, cannot be SQL keywords. For example, a column named "SELECT" or "select" will raise an error.
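One way to avoid this is to check column names before querying. The helper below is hypothetical and not part of sqltopandas, and its keyword set is only a small illustrative sample, not the list the library actually checks.

```python
# Hypothetical helper, not part of sqltopandas. The keyword set is an
# illustrative sample, not the list the library actually uses.
SQL_KEYWORDS = {"select", "from", "where", "order", "by", "limit", "desc", "asc"}

def find_keyword_clashes(names):
    """Return the names that match a SQL keyword, ignoring case."""
    return [name for name in names if str(name).lower() in SQL_KEYWORDS]

clashes = find_keyword_clashes(["a", "b", "SELECT", "limit", "price"])
print(clashes)
```

Any clashing columns can then be renamed with DataFrame.rename(columns=...) before the DataFrame is passed to the parser.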

Syntactical Rules

SQL tends to be quite lax with syntax; this library is not. Each SQL statement must end with a ;, or it will not be parsed correctly. For example, SELECT ... FROM ... WHERE ...; is one statement, because it defines which columns and rows to select and which DataFrame to select them from. Think of each statement as a complete mathematical expression. SELECT ... FROM ...; WHERE ...; is incorrect because the WHERE clause has nothing to refer to. These rules may change as the package is updated, so visit this page for the most up-to-date documentation.
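To make the rule concrete, the sketch below splits a query into statements at each ; and flags a statement that begins with WHERE. Both functions are hypothetical illustrations; the library may split and validate queries quite differently.

```python
# Hypothetical illustration only; sqltopandas may split and validate
# queries differently.
def split_statements(query):
    """Split a query on ';' and drop empty pieces."""
    return [part.strip() for part in query.split(";") if part.strip()]

def starts_with_where(statement):
    """True if a statement begins with WHERE, so it has nothing to filter."""
    return statement.upper().startswith("WHERE")

good = split_statements("SELECT a, b FROM df WHERE a != 1;")
bad = split_statements("SELECT a, b FROM df; WHERE a != 1;")

print(good)
print([starts_with_where(s) for s in bad])
```

The first query yields a single statement that carries its own WHERE clause, while the second yields two statements, the second of which is a dangling WHERE.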

Contributing:

If you have read this far, I hope you've found this tool useful. I am always looking to learn more and develop as a programmer, so if you have any ideas or contributions, feel free to open a feature request or a pull request.
