Memory-frugal torch dataset built from a collection of CSV files

csvsdataset

csvsdataset is a Python library for working with multiple CSV files as a single dataset. Its main functionality is provided by the CsvsDataset class in the csvsdataset.py module.

Installation

To install the csvsdataset library, run:

pip install csvsdataset

Usage

    from csvsdataset.csvsdataset import CsvsDataset
    
    # Initialize the CsvsDataset instance
    dataset = CsvsDataset(folder_path="path/to/your/csv/folder",
                          file_pattern="*.csv",
                          x_columns=["column1", "column2"],
                          y_column="target_column")
    
    # Iterate over the dataset
    for x_data, y_data in dataset:
        # Your processing code here
        pass
    
    # Access a specific item in the dataset
    x_data, y_data = dataset[42]

Memory frugality

Only the data from a small number of CSV files is kept in memory at any time. The rest is discarded on an LRU (least recently used) basis. This class is intended for cases where there are too many data files to load into memory at once.
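The idea can be illustrated with a minimal, standard-library sketch of an LRU cache over CSV files. This is not the library's actual implementation: the `LruCsvCache` class, its `max_files` parameter, and the `demo` function are hypothetical names used only for this example.

```python
import csv
import os
import tempfile
from collections import OrderedDict


class LruCsvCache:
    """Hypothetical helper: keep the rows of at most `max_files` CSV files in memory."""

    def __init__(self, max_files=2):
        self.max_files = max_files
        self._cache = OrderedDict()  # path -> list of row dicts, least recently used first
        self.loads = 0               # how many times a file was read from disk

    def rows(self, path):
        if path in self._cache:
            # Cache hit: mark this file as the most recently used.
            self._cache.move_to_end(path)
            return self._cache[path]
        # Cache miss: read the whole file from disk.
        with open(path, newline="") as handle:
            data = list(csv.DictReader(handle))
        self.loads += 1
        self._cache[path] = data
        if len(self._cache) > self.max_files:
            # Evict the least recently used file.
            self._cache.popitem(last=False)
        return data

    def cached_paths(self):
        return list(self._cache)


def demo():
    """Write three small CSV files and read them through a two-file cache."""
    with tempfile.TemporaryDirectory() as folder:
        paths = []
        for i in range(3):
            path = os.path.join(folder, f"part{i}.csv")
            with open(path, "w", newline="") as handle:
                writer = csv.writer(handle)
                writer.writerow(["column1", "target_column"])
                writer.writerow([i, i * 10])
            paths.append(path)

        cache = LruCsvCache(max_files=2)
        cache.rows(paths[0])
        cache.rows(paths[1])
        cache.rows(paths[0])  # hit: part0 becomes the most recently used
        cache.rows(paths[2])  # miss: part1 is evicted as least recently used
        names = [os.path.basename(p) for p in cache.cached_paths()]
        return names, cache.loads
```

In this sketch, re-reading a recently used file costs nothing, while a file that has been evicted is loaded from disk again on its next access. Memory use is bounded by `max_files` rather than by the size of the whole collection.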
