nested-pandas
An extension of pandas for efficient representation of nested associated datasets.
Nested-Pandas extends the pandas package with tooling and support for nested dataframes packed into values of top-level dataframe columns. PyArrow is used internally to aid scalability and performance.
Nested-Pandas allows data like this:
To instead be represented like this:
Where the nested data is represented as nested dataframes:
# Each row of "object_nf" now has its own sub-dataframe of matched rows from "source_df"
object_nf.loc[0]["nested_sources"]
This allows powerful and straightforward operations, such as:
# Compute the mean flux for each row of "object_nf"
import numpy as np
def mean_flux(row):
    """Calculates the mean flux for each object"""
    return np.mean(row["nested_sources.flux"])
object_nf.map_rows(mean_flux, output_names="mean_flux")
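The row function above indexes its argument by the dotted name `nested_sources.flux`. To make that layout concrete, the following is a minimal sketch in plain Python, not the nested-pandas API: dicts and lists stand in for the dataframes, and the data, the `object_id` key, and the `nest_sources` helper are all hypothetical names introduced for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical flat inputs: one record per object, one record per measurement.
objects = {0: {"ra": 10.0, "dec": -5.0}, 1: {"ra": 20.0, "dec": 15.0}}
sources = [
    {"object_id": 0, "t": 1.0, "flux": 10.0},
    {"object_id": 0, "t": 2.0, "flux": 14.0},
    {"object_id": 1, "t": 1.5, "flux": 3.0},
    {"object_id": 1, "t": 2.5, "flux": 5.0},
    {"object_id": 1, "t": 3.5, "flux": 7.0},
]


def nest_sources(objects, sources, name="nested_sources"):
    """Pack each object's source rows into per-field lists stored on the object row."""
    packed = defaultdict(lambda: defaultdict(list))
    for src in sources:
        for field, value in src.items():
            if field != "object_id":
                packed[src["object_id"]][field].append(value)

    nested = {}
    for obj_id, base in objects.items():
        row = dict(base)
        # Expose the packed values under dotted "<nested name>.<field>" keys.
        for field, values in packed[obj_id].items():
            row[f"{name}.{field}"] = values
        nested[obj_id] = row
    return nested


def mean_flux(row):
    """Mean flux of one object's packed measurements."""
    return mean(row["nested_sources.flux"])


nested = nest_sources(objects, sources)
mean_by_object = {obj_id: mean_flux(row) for obj_id, row in nested.items()}
print(mean_by_object)
```

Each object row carries all of its measurements, so a per-object function only needs that one row. The real package stores the nested values in PyArrow-backed columns rather than Python lists.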
Nested-Pandas is motivated by time-domain astronomy use cases, which typically
involve two levels of information: data about astronomical objects, and an
associated set of N measurements of each object. Nested-Pandas offers a
performant and memory-efficient package for working with these types of datasets.
Its core advantages are:
- hierarchical column access
- efficient packing of nested information into inputs to custom user functions
- avoiding costly groupby operations
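For comparison, here is a sketch of the flat-table baseline that the last point refers to, written with plain pandas only. The table contents and the `object_id` column are hypothetical. The per-object mean flux needs a groupby over the whole source table, followed by a join back onto the object table.

```python
import pandas as pd

# Hypothetical flat tables: objects indexed by id, sources keyed by the same id.
object_df = pd.DataFrame(
    {"ra": [10.0, 20.0], "dec": [-5.0, 15.0]},
    index=pd.Index([0, 1], name="object_id"),
)
source_df = pd.DataFrame(
    {
        "object_id": [0, 0, 1, 1, 1],
        "flux": [10.0, 14.0, 3.0, 5.0, 7.0],
    }
)

# Group every source row by its object, then reduce each group to a mean.
per_object_mean = source_df.groupby("object_id")["flux"].mean().rename("mean_flux")

# Join the aggregated values back onto the object table by index.
result = object_df.join(per_object_mean)
print(result)
```

With a nested layout the grouping is done once, when the data is packed, instead of on every aggregation.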
This is a LINCC Frameworks project - find more information about LINCC Frameworks here.
Acknowledgements
This project is supported by Schmidt Sciences.
File details
Details for the file nested_pandas-0.6.9.tar.gz.
File metadata
- Download URL: nested_pandas-0.6.9.tar.gz
- Upload date:
- Size: 1.2 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 5710d18174ecebfcef0477c10aa673665a8ccb7027d37a9a59771b06ba51a8e6 |
| MD5 | cb971264d49aaf8e92a591fdd40fa907 |
| BLAKE2b-256 | 242b560ef5066e2776ae1cd2f4f7a03183e25525ac1afa4ba3ea77e02f493774 |
Provenance
The following attestation bundles were made for nested_pandas-0.6.9.tar.gz:
Publisher: publish-to-pypi.yml on lincc-frameworks/nested-pandas
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: nested_pandas-0.6.9.tar.gz
  - Subject digest: 5710d18174ecebfcef0477c10aa673665a8ccb7027d37a9a59771b06ba51a8e6
  - Sigstore transparency entry: 1359161868
  - Sigstore integration time:
  - Permalink: lincc-frameworks/nested-pandas@90bc189800f289483e846a26e7275f85096281d6
  - Branch / Tag: refs/tags/v0.6.9
  - Owner: https://github.com/lincc-frameworks
  - Access: public
  - Token Issuer: https://token.actions.githubusercontent.com
  - Runner Environment: github-hosted
  - Publication workflow: publish-to-pypi.yml@90bc189800f289483e846a26e7275f85096281d6
  - Trigger Event: release
File details
Details for the file nested_pandas-0.6.9-py3-none-any.whl.
File metadata
- Download URL: nested_pandas-0.6.9-py3-none-any.whl
- Upload date:
- Size: 79.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e95942eb462cc26beea82b1a5b1c04afc1af73519b559af6a694963777f7b6a4 |
| MD5 | 8f9ea116cd6cd2cdb3cee2c2356a8360 |
| BLAKE2b-256 | 2f1fb342b2483e398e0a8fb53d333426726c73a757f317e2c3b7be3869148508 |
Provenance
The following attestation bundles were made for nested_pandas-0.6.9-py3-none-any.whl:
Publisher: publish-to-pypi.yml on lincc-frameworks/nested-pandas
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: nested_pandas-0.6.9-py3-none-any.whl
  - Subject digest: e95942eb462cc26beea82b1a5b1c04afc1af73519b559af6a694963777f7b6a4
  - Sigstore transparency entry: 1359161904
  - Sigstore integration time:
  - Permalink: lincc-frameworks/nested-pandas@90bc189800f289483e846a26e7275f85096281d6
  - Branch / Tag: refs/tags/v0.6.9
  - Owner: https://github.com/lincc-frameworks
  - Access: public
  - Token Issuer: https://token.actions.githubusercontent.com
  - Runner Environment: github-hosted
  - Publication workflow: publish-to-pypi.yml@90bc189800f289483e846a26e7275f85096281d6
  - Trigger Event: release