SQL queries on Pandas data frames

Project description

Seek well, pandas

seekwellpandas (SQL-pandas) is a pandas extension that adds SQL-inspired methods to DataFrames, so that data manipulation code reads much like an SQL query.

Features

seekwellpandas adds the following methods to your pandas DataFrames:

  • select(): Select specific columns, including negative selection.
  • where_(): Filter rows based on a condition.
  • group_by(): Group data by one or more columns.
  • having(): Filter groups based on a condition.
  • order_by(): Sort data by one or more columns.
  • limit(): Limit the number of returned rows.
  • join_(): Join two DataFrames.
  • union(): Union two DataFrames.
  • distinct(): Remove duplicates.
  • intersect(): Find the intersection between two DataFrames.
  • difference(): Find the difference between two DataFrames.
  • with_column(): Add a new column based on an expression.
  • rename_column(): Rename a column.
  • cast(): Change the data type of a column.
  • drop_column(): Remove one or more columns.
  • unpivot(): Transform columns into rows (melt).
  • group_having(): Combine grouping and group filtering.
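For orientation, the set-style methods and unpivot() can be compared with plain pandas operations. The sketch below uses only pandas, not seekwellpandas, and the mapping it shows (concat plus drop_duplicates for a union, merge for intersection and difference) is an assumption about the intended SQL semantics rather than a description of the library's internals; only the unpivot/melt correspondence is stated in the list above. The frames left, right, and wide are hypothetical sample data.

```python
import pandas as pd

# Hypothetical sample frames with identical columns
left = pd.DataFrame({'id': [1, 2, 3], 'tag': ['x', 'y', 'z']})
right = pd.DataFrame({'id': [2, 3, 4], 'tag': ['y', 'z', 'w']})

# UNION (distinct): stack both frames, then drop duplicate rows
union_rows = (
    pd.concat([left, right], ignore_index=True)
    .drop_duplicates()
    .reset_index(drop=True)
)

# INTERSECT: rows present in both frames (inner merge on all shared columns)
intersect_rows = left.merge(right, how='inner')

# EXCEPT / difference: rows of `left` that have no match in `right`
merged = left.merge(right, how='left', indicator=True)
difference_rows = merged[merged['_merge'] == 'left_only'].drop(columns='_merge')

# UNPIVOT: turn the q1/q2 columns into (quarter, value) rows via melt
wide = pd.DataFrame({'id': [1, 2], 'q1': [10, 20], 'q2': [30, 40]})
long = wide.melt(id_vars='id', var_name='quarter', value_name='value')

print(union_rows)
print(intersect_rows)
print(difference_rows)
print(long)
```

Whether seekwellpandas keeps or drops duplicate rows in union() is not stated here, so check the library's behaviour before relying on either interpretation.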

Installation

You can install seekwellpandas via pip:

pip install seekwellpandas

Usage

Here are some examples of how to use seekwellpandas:

import pandas as pd
import seekwellpandas

# Create a sample DataFrame
df = pd.DataFrame({
    'A': [1, 2, 3, 4],
    'B': ['a', 'b', 'a', 'b'],
    'C': [10, 20, 30, 40]
})

# Select columns
result = df.select('A', 'B')

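# Negative selection (exclude column 'A'). The '-' prefix is written inside the
# string here, since unary minus on a str (-'A') raises TypeError in Python;
# the string-prefix form is assumed to be what select() expects.
result = df.select('-A')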

# Filter rows; where_() delegates to DataFrame.query() (the trailing _ avoids a clash with pandas.DataFrame.where)
result = df.where_('A > 2')

# Group and aggregate
result = df.group_by('B').agg({'A': 'mean', 'C': 'sum'})

# Sort data
result = df.order_by('C', ascending=False)

# Add a new column
result = df.with_column('D', 'A * C')

# Join two DataFrames (the trailing _ avoids a clash with pandas.DataFrame.join)
df2 = pd.DataFrame({'B': ['a', 'b'], 'D': [100, 200]})
result = df.join_(df2, on='B')
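To make the expected results of the examples above concrete, here is a rough plain-pandas sketch of the same operations on the same sample DataFrame. Only the where_()/query() correspondence is stated above; the other mappings (groupby().filter() for having(), eval() for with_column(), sort_values().head() for order_by().limit()) are assumptions chosen to mirror the SQL meaning, not the library's confirmed implementation.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4],
    'B': ['a', 'b', 'a', 'b'],
    'C': [10, 20, 30, 40]
})

# WHERE A > 2: the same expression string that where_() forwards to query()
filtered = df.query('A > 2')

# GROUP BY B with aggregates: mean of A and sum of C per group
grouped = df.groupby('B').agg({'A': 'mean', 'C': 'sum'})

# HAVING SUM(C) > 50: keep only the rows of groups that pass the condition
kept = df.groupby('B').filter(lambda g: g['C'].sum() > 50)

# Computed column D = A * C, evaluated from an expression string
with_d = df.assign(D=df.eval('A * C'))

# ORDER BY C DESC LIMIT 2
top2 = df.sort_values('C', ascending=False).head(2)

print(filtered)
print(grouped)
print(kept)
print(with_d)
print(top2)
```

With this data, the filter keeps the rows where A is 3 and 4, group 'a' aggregates to a mean of 2.0 and a sum of 40, group 'b' to 3.0 and 60, and only group 'b' survives the HAVING-style condition.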

Contributing

Contributions are welcome! Feel free to open an issue or submit a pull request on the project's GitHub repository.

License

This project is licensed under the GNU General Public License v3 (GPLv3). See the LICENSE file for details.

Download files

Source Distribution

seekwellpandas-0.2.tar.gz (49.3 kB)

Uploaded Source

Built Distribution

seekwellpandas-0.2-py3-none-any.whl (17.7 kB)

Uploaded Python 3

File details

Details for the file seekwellpandas-0.2.tar.gz.

File metadata

  • Download URL: seekwellpandas-0.2.tar.gz
  • Upload date:
  • Size: 49.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.6

File hashes

Hashes for seekwellpandas-0.2.tar.gz
Algorithm Hash digest
SHA256 746b945e55972771252850915ef2eb00d74c0e0d66f97f09fbffa8079d3c4c6d
MD5 ce2c4fd7f549e0fee79f070dac1ba9c7
BLAKE2b-256 ef1ac1ea7efdf00c1a393cc10ddc414f7be20e54af3248e0c95c954911d86aff

File details

Details for the file seekwellpandas-0.2-py3-none-any.whl.

File metadata

  • Download URL: seekwellpandas-0.2-py3-none-any.whl
  • Upload date:
  • Size: 17.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.6

File hashes

Hashes for seekwellpandas-0.2-py3-none-any.whl
Algorithm Hash digest
SHA256 bcb97767e650c8a5c1ddcb312bc0bb5a8eb89c9a22c9e687c5d25fb87e5c4b00
MD5 e430405ceef00bece33d25e435ffc0bd
BLAKE2b-256 00888b3951db1781bbc266cfd90df1a259f72b8e96ca59d72dc4fd2cb8caab81
