An extension of pandas for efficient representation of nested associated datasets.

nested-pandas

Nested-Pandas extends the pandas package with tooling and support for nested dataframes packed into the values of top-level dataframe columns. PyArrow is used internally to improve scalability and performance.

nested-pandas API

Nested-Pandas allows data like this:

[Figure: pandas dataframes]

to instead be represented like this:

[Figure: nestedframe]

Where the nested data is represented as nested dataframes:

   # Each row of "object_nf" now has its own sub-dataframe of matched rows from "source_df"
   object_nf.loc[0]["nested_sources"]

[Figure: sub-dataframe]
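To make "matched rows" concrete, the sketch below uses plain pandas only (not the nested-pandas API) to pull out the rows of a flat source table that belong to one object. The toy tables, their column names, and the variable `sources_for_object_0` are hypothetical and exist only for illustration; the sub-dataframe shown above holds this kind of per-object selection.

```python
import pandas as pd

# Hypothetical toy data: one row per object, many rows per object in the source table.
object_df = pd.DataFrame(
    {"ra": [10.0, 20.0], "dec": [-5.0, 5.0]},
    index=pd.Index([0, 1], name="object_id"),
)
source_df = pd.DataFrame(
    {"time": [1.0, 2.0, 3.0, 1.0, 2.0], "flux": [1.0, 2.0, 3.0, 10.0, 20.0]},
    index=pd.Index([0, 0, 0, 1, 1], name="object_id"),
)

# Select every source row whose index label matches object 0.
# Passing a list to .loc keeps the result a DataFrame.
sources_for_object_0 = source_df.loc[[0]].reset_index(drop=True)
print(sources_for_object_0)
```

In the flat layout this lookup has to be repeated, or replaced with a groupby, for every object; the nested layout stores the result of that matching alongside each object row.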

Allowing powerful and straightforward operations, like:

   # Compute the mean flux for each row of "object_nf"
   import numpy as np

   def mean_flux(row):
       """Calculate the mean flux for a single object (one row of "object_nf")."""
       return np.mean(row["nested_sources.flux"])

   object_nf.map_rows(mean_flux, output_names="mean_flux")

[Figure: using reduce]

Nested-Pandas is motivated by time-domain astronomy use cases, which typically involve two levels of information: information about astronomical objects, and an associated set of N measurements for each of those objects. Nested-Pandas offers a performant and memory-efficient package for working with these types of datasets.

Its core advantages are:

  • hierarchical column access
  • efficient packing of nested information into inputs to custom user functions
  • avoiding costly groupby operations
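For comparison, here is a minimal sketch of the flat-table baseline that the last point refers to, written with plain pandas rather than the nested-pandas API. The toy data and column names are hypothetical. It computes a per-object mean flux with a groupby over the source table and then aligns the result back onto the object table by index.

```python
import pandas as pd

# Hypothetical flat tables: objects, and their measurements keyed by object_id.
object_df = pd.DataFrame(
    {"ra": [10.0, 20.0]},
    index=pd.Index([0, 1], name="object_id"),
)
source_df = pd.DataFrame(
    {"object_id": [0, 0, 0, 1, 1], "flux": [1.0, 2.0, 3.0, 10.0, 20.0]}
)

# Flat-table approach: group the measurements by object, then aggregate.
mean_flux = source_df.groupby("object_id")["flux"].mean()

# Attach the aggregate to the object table; pandas aligns on the shared index.
object_df["mean_flux"] = mean_flux
print(object_df)
```

Every new per-object statistic in this layout needs another groupby pass over the full source table. With the nested representation the sources are already packed per object, which is what lets a function be applied row by row instead.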

This is a LINCC Frameworks project; more information is available on the LINCC Frameworks website.

Acknowledgements

This project is supported by Schmidt Sciences.
