
General data pipeline for PyTorch Lightning

Synaesthesia 🧠🎨

Create general PyTorch data pipelines from simple Python, extendable to any sensors.

Overview 🌟

Synaesthesia is a Python library that forms the foundation of the dataset stack in any PyTorch or PyTorch Lightning project. It contains base datasets and structures that can be combined, sequenced, concatenated, and otherwise transformed through composition.

This library provides a flexible and modular approach to creating datasets and dataloaders for various applications. It's designed to handle different data types, including CSV and image datasets, with the ability to expand functionality through custom classes.

Key Features 🔑

  • Modular Design 🧩: Easily combine different dataset types and operations.
  • Multi-modal Support 🎛️: Handle various sensor modalities and information types.
  • Flexible Combinations 🔗:
    • Parallel combination of datasets (MultiSignalDataset)
    • Serial concatenation of datasets (CustomConcatDataset)
    • Sequential data retrieval (SequentialDataset)
  • Extensibility 🔌: Users can create custom dataset classes to extend functionality.
  • Built-in Support 📦: Ready-to-use implementations for CSV and image datasets.

Installation 💻

The easiest way to use Synaesthesia is to add it as a git submodule of your project:

git submodule add git@github.com:danieledema/synaesthesia.git .submodules/synaesthesia

Then use Poetry to manage the required packages by adding the submodule path as a dependency:

poetry add .submodules/synaesthesia

Main Components 🧱

DatasetBase 🏗️

The foundation class for all datasets in the library.

CustomConcatDataset 🔗

Allows concatenation of multiple datasets, preserving individual dataset properties.
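
As a rough illustration of what serial concatenation involves, the sketch below maps a global index onto the right underlying dataset. It is a plain-Python stand-in, not the library's implementation: the class name ConcatSketch and its attributes are hypothetical, and the real CustomConcatDataset may behave differently.

```python
from bisect import bisect_right
from itertools import accumulate


class ConcatSketch:
    """Hypothetical stand-in: serves several indexable datasets as one."""

    def __init__(self, datasets):
        self.datasets = list(datasets)
        # Cumulative lengths, e.g. lengths [3, 2] give offsets [3, 5].
        self.offsets = list(accumulate(len(d) for d in self.datasets))

    def __len__(self):
        return self.offsets[-1] if self.offsets else 0

    def __getitem__(self, idx):
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        # Pick the first dataset whose cumulative length exceeds idx.
        which = bisect_right(self.offsets, idx)
        start = self.offsets[which - 1] if which else 0
        return self.datasets[which][idx - start]


combined = ConcatSketch([["a0", "a1", "a2"], ["b0", "b1"]])
print(len(combined), combined[2], combined[3])
```

Each underlying dataset stays untouched, so any per-dataset properties remain available through the datasets attribute of the sketch.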

MultiSignalDataset 📡

Combines multiple single-signal datasets, supporting various aggregation and fill methods.
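
To make the idea of aggregation and fill methods concrete, here is a minimal sketch that aligns two signals on their timestamps. Everything in it is an assumption made for illustration: the function combine_signals, the option names "union", "intersection", and "last", and the dictionary-based signal format are hypothetical and are not taken from the library.

```python
def combine_signals(signals, aggregation="union", fill="last"):
    """Hypothetical helper: align several {timestamp: value} signals."""
    key_sets = [set(signal) for signal in signals.values()]
    if aggregation == "union":
        # Keep every timestamp seen in any signal.
        timestamps = sorted(set().union(*key_sets))
    elif aggregation == "intersection":
        # Keep only timestamps present in all signals.
        timestamps = sorted(set.intersection(*key_sets))
    else:
        raise ValueError(f"unknown aggregation: {aggregation}")

    rows = []
    last_seen = {name: None for name in signals}
    for t in timestamps:
        row = {"timestamp": t}
        for name, signal in signals.items():
            if t in signal:
                last_seen[name] = signal[t]
                row[name] = signal[t]
            else:
                # Missing sample: reuse the previous value or leave a gap.
                row[name] = last_seen[name] if fill == "last" else None
        rows.append(row)
    return rows


temperature = {0: 20.5, 1: 20.7, 2: 21.0}
camera = {0: "frame_0.png", 2: "frame_2.png"}
rows = combine_signals({"temperature": temperature, "camera": camera})
print(rows[1])
```

With the "union" setting the camera signal has no sample at timestamp 1, so the sketch reuses the previous frame; with "intersection" that timestamp would be dropped instead.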

SequentialDataset 🔢

Enables retrieval of data sequences from a base dataset, with customizable filtering and stride options.
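
The sketch below shows one way sequence retrieval with a stride can work. It assumes that stride means the spacing between consecutive samples inside a sequence, which may not match the library's definition; the class name SequenceSketch is hypothetical and no filtering is applied.

```python
class SequenceSketch:
    """Hypothetical stand-in: returns fixed-length sequences from a base dataset."""

    def __init__(self, base, n_samples, stride=1):
        self.base = base
        self.n_samples = n_samples
        self.stride = stride
        # Distance between the first and last index of one sequence.
        self.span = (n_samples - 1) * stride

    def __len__(self):
        # Only start indices whose full sequence fits inside the base dataset.
        return max(0, len(self.base) - self.span)

    def __getitem__(self, idx):
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        return [self.base[idx + k * self.stride] for k in range(self.n_samples)]


readings = list(range(10))
seq = SequenceSketch(readings, n_samples=3, stride=2)
print(len(seq), seq[0], seq[5])
```

Under this reading, a sequence of 3 samples with stride 2 spans 5 consecutive base indices, so a base dataset of 10 items yields 6 sequences.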

Filter Classes 🔍

Provides different strategies for data filtering and selection:

  • SkipNFilter
  • MultipleNFilter
  • ExponentialFilter

Usage Examples 📚

# Example 1: Creating a multi-signal dataset
csv_dataset = CSVDataset(...)
image_dataset = ImageDataset(...)
multi_dataset = MultiSignalDataset([csv_dataset, image_dataset])

# Example 2: Creating a sequential dataset
seq_dataset = SequentialDataset(csv_dataset, n_samples=5, stride=2)

# Example 3: Concatenating datasets
concat_dataset = CustomConcatDataset([dataset1, dataset2, dataset3])

Extending the Library 🚀

Users can create custom dataset classes by inheriting from DatasetBase and implementing required methods:

class MyCustomDataset(DatasetBase):
    def __init__(self, *args, **kwargs):
        super().__init__()
        # Custom initialization goes here

    def get_data(self, idx):
        # Implement the retrieval logic for the sample at index idx
        raise NotImplementedError

    # Implement the other methods required by DatasetBase
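
For a version that runs without the library installed, the following sketch replaces DatasetBase with a small abstract stand-in. Only the get_data method name comes from the snippet above; DatasetBaseSketch, InMemoryDataset, and the choice of __len__ as a second required method are assumptions, so check the real base class for the methods it actually expects.

```python
from abc import ABC, abstractmethod


class DatasetBaseSketch(ABC):
    """Stand-in for the library's DatasetBase; the real interface may differ."""

    @abstractmethod
    def __len__(self):
        ...

    @abstractmethod
    def get_data(self, idx):
        ...

    def __getitem__(self, idx):
        # Route indexing through get_data so subclasses implement one method.
        return self.get_data(idx)


class InMemoryDataset(DatasetBaseSketch):
    """Hypothetical custom dataset backed by a Python list."""

    def __init__(self, values):
        super().__init__()
        self.values = list(values)

    def __len__(self):
        return len(self.values)

    def get_data(self, idx):
        return {"idx": idx, "value": self.values[idx]}


dataset = InMemoryDataset([0.1, 0.2, 0.3])
print(len(dataset), dataset[1])
```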

Contributing 🤝

Contributions are welcome! Please feel free to submit a Pull Request.

License 📄

This project is licensed under the Apache-2.0 License. See the LICENSE file for details.
