Array (and numpy) API for ONNX

Project description

MIT License

pandas-streaming processes files with pandas when they are too big to hold in memory yet too small for parallelization to bring a significant gain. The module replicates a subset of the pandas API and implements additional functionality for machine learning.

from pandas_streaming.df import StreamingDataFrame
sdf = StreamingDataFrame.read_csv("filename", sep="\t", encoding="utf-8")

for df in sdf:
    # process this chunk of data
    # df is a dataframe
    print(df)
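
The chunk-by-chunk loop above is typically used to accumulate a result without ever holding the whole file in memory. The following is a minimal sketch of that pattern using plain pandas only (`pandas.read_csv` with `chunksize`), not the pandas-streaming API itself; the in-memory buffer, the column names `key` and `value`, and the helper `sum_by_key` are assumptions made for illustration.

```python
import io
import pandas

# Hypothetical tab-separated data standing in for a file too big for memory.
data = "key\tvalue\na\t1\nb\t2\na\t3\nb\t4\na\t5\n"


def sum_by_key(buffer, chunksize=2):
    """Sum column 'value' per 'key', reading the input one chunk at a time."""
    totals = {}
    # With chunksize set, read_csv returns an iterator of dataframes.
    for chunk in pandas.read_csv(buffer, sep="\t", chunksize=chunksize):
        partial = chunk.groupby("key")["value"].sum()
        # Merge the partial sums of this chunk into the running totals.
        for key, value in partial.items():
            totals[key] = totals.get(key, 0) + int(value)
    return totals


totals = sum_by_key(io.StringIO(data))
print(totals)
```

Only one chunk and the small dictionary of running totals are alive at any time, which is the point of streaming the file.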

The module can also stream an existing dataframe.

import pandas
from pandas_streaming.df import StreamingDataFrame

df = pandas.DataFrame([dict(cf=0, cint=0, cstr="0"),
                       dict(cf=1, cint=1, cstr="1"),
                       dict(cf=3, cint=3, cstr="3")])

# wrap the in-memory dataframe so that it can be iterated chunk by chunk
sdf = StreamingDataFrame.read_df(df)

for chunk in sdf:
    # process this chunk of data
    # chunk is a dataframe (the name avoids shadowing df above)
    print(chunk)
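
To make concrete what streaming an existing dataframe amounts to, here is a small sketch in plain pandas that yields consecutive row slices. It is an illustration under assumptions, not how pandas-streaming is implemented; the helper name `iter_chunks` and its `chunksize` parameter are hypothetical.

```python
import pandas


def iter_chunks(df, chunksize):
    """Yield consecutive row slices of df with at most chunksize rows each."""
    for start in range(0, len(df), chunksize):
        yield df.iloc[start:start + chunksize]


df = pandas.DataFrame([dict(cf=0, cint=0, cstr="0"),
                       dict(cf=1, cint=1, cstr="1"),
                       dict(cf=3, cint=3, cstr="3")])

chunks = list(iter_chunks(df, chunksize=2))
shapes = [chunk.shape for chunk in chunks]
print(shapes)
```

Concatenating the chunks in order gives back the original dataframe, so any computation that can be expressed chunk by chunk sees every row exactly once.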

It also contains helpers to split datasets into train and test sets under unusual constraints.
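
The text does not say which constraints are meant. One plausible example, assumed here purely for illustration, is requiring that all rows sharing a key end up on the same side of the split. The sketch below shows that idea in plain pandas; the function `split_by_group` and the column names are hypothetical and are not the helpers shipped by the module.

```python
import random
import pandas


def split_by_group(df, group, test_size=0.25, seed=0):
    """Split df so that every value of column `group` is entirely in train or in test."""
    keys = sorted(df[group].unique())
    rng = random.Random(seed)
    rng.shuffle(keys)
    # Keep at least one group in the test set.
    n_test = max(1, round(len(keys) * test_size))
    test_keys = keys[:n_test]
    mask = df[group].isin(test_keys)
    return df[~mask], df[mask]


df = pandas.DataFrame({"user": ["a", "a", "b", "b", "c", "c", "d", "d"],
                       "x": range(8)})
train, test = split_by_group(df, "user")
print(sorted(train["user"].unique()), sorted(test["user"].unique()))
```

Splitting on groups rather than on rows avoids leaking information about the same user into both sets, at the cost of a test fraction that is only approximate.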

Download files

Source Distribution

pandas_streaming-0.5.1.tar.gz (34.4 kB)

Uploaded Source

Built Distribution

pandas_streaming-0.5.1-py3-none-any.whl (36.8 kB)

Uploaded Python 3

File details

Details for the file pandas_streaming-0.5.1.tar.gz.

File metadata

  • Download URL: pandas_streaming-0.5.1.tar.gz
  • Size: 34.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.12

File hashes

Hashes for pandas_streaming-0.5.1.tar.gz
Algorithm    Hash digest
SHA256       ad34c07cd271ea43832962f9eef9e16b3b8cd281748a55de95c2719fc7f7aae9
MD5          42933061542ed3d65536010c279fe184
BLAKE2b-256  6bf0b42921e2c35d7444fda7fa96b0dac34eecbd7c92e9de9e963ca002d14713

File details

Details for the file pandas_streaming-0.5.1-py3-none-any.whl.

File hashes

Hashes for pandas_streaming-0.5.1-py3-none-any.whl
Algorithm    Hash digest
SHA256       5c5780742c8c6fcf86b7871caa9b4b52a6c10463ed2bd4cebcb6bda9d06c59cc
MD5          e2e5adbdad3dfb8ba06c6aecb241116f
BLAKE2b-256  38bba7e7f01c416200ed03247c688e080d2b50bf697b3819aa368a63ba2c7cc2
