Skip to main content

SQL queries on Pandas data frames

Project description

Seek well, pandas

seekwellpandas (SQL-pandas) is a pandas extension that provides SQL-inspired methods to manipulate DataFrames in a more intuitive way, closely resembling SQL syntax.

Features

seekwellpandas adds several SQL methods to your pandas DataFrames, among them:

  • SELECT(): Select specific columns, including negative selection.
  • WHERE(): Filter rows based on a condition.
  • GROUP_BY(): Group data by one or more columns.
  • HAVING(): Filter groups based on a condition.
  • ORDER_BY(): Sort data by one or more columns.
  • LIMIT(): Limit the number of returned rows.
  • JOIN(): Join two DataFrames.
  • UNION(): Union two DataFrames.
  • DISTINCT(): Remove duplicates.
  • INTERSECT(): Find the intersection between two DataFrames.
  • DIFFERENCE(): Find the difference between two DataFrames.
  • ADD_COLUMN(): Add a new column based on an expression.
  • RENAME_COLUMN(): Rename a column.
  • CAST(): Change the data type of a column.
  • DROP_COLUMN(): Remove one or more columns.
  • UNPIVOT(): Transform columns into rows (melt).
  • GROUP_HAVING(): Combine grouping and group filtering.

Installation

You can install seekwellpandas via pip:

pip install seekwellpandas

Usage

Here are some examples of how to use SeekwellPandas:

import pandas as pd
import seekwellpandas

# Create a sample DataFrame
df = pd.DataFrame({
    'A': [1, 2, 3, 4],
    'B': ['a', 'b', 'a', 'b'],
    'C': [10, 20, 30, 40]
})

# Select columns
result = df.SELECT('A', 'B')

# Negative selection
result = df.SELECT('-A')

# Filter rows redirecting to .query() (the _ avoids overlapping with pandas.DataFrame.where)
result = df.WHERE('A > 2')

# Group and aggregate
result = df.GROUP_BY('B').AVG('A', "mean_A")

# Sort data
result = df.ORDER_BY('C', ascending=False)

# Add a new column
result = df.ADD_COLUMN('D', 'A * C')

# Join two DataFrames (the _ avoids overlapping with pandas.DataFrame.join)
df2 = pd.DataFrame({'B': ['a', 'b'], 'D': [100, 200]})
result = df.JOIN(df2, on='B')

Contributing

Contributions are welcome! Feel free to open an issue or submit a pull request on my GitHub repository.

License

This project is licensed under the GPLv3 License. See the LICENSE file for details.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

seekwellpandas-0.3.0.tar.gz (25.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

seekwellpandas-0.3.0-py3-none-any.whl (29.9 kB view details)

Uploaded Python 3

File details

Details for the file seekwellpandas-0.3.0.tar.gz.

File metadata

  • Download URL: seekwellpandas-0.3.0.tar.gz
  • Upload date:
  • Size: 25.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for seekwellpandas-0.3.0.tar.gz
Algorithm Hash digest
SHA256 78dd4357b3e161538f3ef45b725bad658a0933947aa86a7af8c1b52f038d9765
MD5 dc250ff23d3db1d3181515e34c8a3a97
BLAKE2b-256 b897d095c7c3758d71885ff5b93a47dd78007820a7209704be2477aa597ca3c2

See more details on using hashes here.

File details

Details for the file seekwellpandas-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: seekwellpandas-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 29.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.7

File hashes

Hashes for seekwellpandas-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 7a64057333ad0080a918ca80b47e4e74c53da8d88116a4c9d7954ca30ae47a5e
MD5 8680dd3f8094fd0e774d5733c8cad176
BLAKE2b-256 4ca7969bbd27a6ceeb07a7fa8f6d32f628c2003698e081b76fc8b1e72294d058

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page