A lightweight and extensible Python package for managing data, tailored for researchers working with structured data.

📦 dwrappr

A lightweight and extensible Python package for managing data, tailored for researchers working with structured data. In addition to general data management features, the package introduces a data structure specifically optimized for ML research. This common format enables researchers to efficiently test new algorithms and methods, streamlining collaboration and ensuring consistency in data management across projects.

🧩 Features

🗃️ Consistent dataset object structure for handling structured data in ML use cases
🔄 Support for building a file-based internal dataset collaboration platform for researchers
🧰 General utilities for managing data like saving and loading

🚀 Quickstart

To run the quickstart examples and get an overview of dwrappr's functionality, see IEEE_examples.

Additional functionalities are showcased in:

loading_dataset_from_file.py: Shows how to load a dataset from an existing dataset file
scanning_folder_for_datasets.py: Shows how to scan a folder for available datasets
dataset_functionalities.py: Shows some of the main functionalities of the DataSet class

👀 Functionality Insights

Scan folder for dataset

DATASET_FOLDER = "./data/datasets/"
available_datasets = DataSet.get_available_datasets_in_folder(
    DATASET_FOLDER
)
available_datasets.T
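
For orientation, a folder scan of this kind can be approximated with the standard library alone. The sketch below is not dwrappr's implementation: the helper name `list_dataset_files`, the `.joblib` suffix filter (taken from the dataset file names used in this README), and the returned fields are all assumptions made for illustration.

```python
from pathlib import Path


def list_dataset_files(folder, suffix=".joblib"):
    """Return one record per dataset file found directly in `folder`.

    Hypothetical helper: it only inspects file names and sizes and never
    opens the files, so it cannot report any dataset metadata.
    """
    records = []
    for path in sorted(Path(folder).glob(f"*{suffix}")):
        if path.is_file():
            records.append(
                {
                    "name": path.stem,
                    "path": str(path),
                    "size_bytes": path.stat().st_size,
                }
            )
    return records


if __name__ == "__main__":
    # Prints nothing if the folder is missing or holds no matching files.
    for record in list_dataset_files("./data/datasets/"):
        print(record)
```

Unlike this sketch, `DataSet.get_available_datasets_in_folder` returns an object that supports `.T`, so its result is presumably tabular.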

Loading specific dataset

DATASET_FILEPATH = "./data/datasets/manufacturing_process_ds.joblib"
ds = DataSet.load(DATASET_FILEPATH)

Generating dataset from raw data

import pandas as pd

RAW_DATA_FILEPATH = "./data/raw_data.csv"
# load raw data into a pandas.DataFrame
df = pd.read_csv(RAW_DATA_FILEPATH)
"""
<some manual dataset preprocessing steps
like dropping missing values and changing dtypes>
"""
# define metadata
meta = DataSetMeta(
    name = "example_dataset",
    synthetic_data=True,
    time_series=False,
    feature_names=["feature"],
    target_names=["target"]
)
#generate DataSet
ds = DataSet.from_dataframe(
    df=df,
    meta=meta
)
#saving dataset
ds.save("./data/example_dataset.joblib", drop_meta_json=True)
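
The `drop_meta_json=True` argument suggests that the metadata can also be written as a JSON file next to the dataset file. The source does not describe this behavior, so treat that reading as an assumption. The sketch below shows the general idea of such a sidecar file using only the standard library; `ExampleMeta`, `write_meta_sidecar`, and the `.json` naming rule are hypothetical and not part of dwrappr.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class ExampleMeta:
    """Hypothetical stand-in mirroring the metadata fields used above."""

    name: str
    synthetic_data: bool
    time_series: bool
    feature_names: list = field(default_factory=list)
    target_names: list = field(default_factory=list)


def write_meta_sidecar(meta, dataset_path):
    """Write `meta` as JSON next to `dataset_path` and return the JSON path.

    Assumed naming rule: same file name as the dataset, with a .json suffix.
    """
    json_path = Path(dataset_path).with_suffix(".json")
    json_path.parent.mkdir(parents=True, exist_ok=True)
    json_path.write_text(json.dumps(asdict(meta), indent=2), encoding="utf-8")
    return json_path
```

A human-readable sidecar of this kind lets collaborators inspect a dataset's name, features, and targets without loading the binary file.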

Split dataset

(train/test-split)

import numpy as np
import pandas as pd

n_instances = 100
# Create the 'product_id' feature with 7 different categorical values
product_ids = np.random.choice(['1001', '2002', '3003', '4004', '5005', '6006', '7007'], size=n_instances)
# Generate two additional numeric features
feature_1 = np.random.rand(n_instances) * 100  # Random numbers between 0 and 100
feature_2 = np.random.rand(n_instances) * 50   # Random numbers between 0 and 50
# Generate a numeric target
target = feature_1 * 0.5 + feature_2 * 0.3 + np.random.randn(n_instances) * 5  # Adding some noise
# Create a DataFrame
df = pd.DataFrame({
    'product_id': product_ids,
    'feature_1': feature_1,
    'feature_2': feature_2,
    'target': target
})

ds = DataSet.from_dataframe(
    df=df,
    meta = DataSetMeta(
        name = "example_dataset",
        synthetic_data=True,
        time_series=False,
        feature_names=["product_id", "feature_1", "feature_2"],
        target_names=["target"]
    )
)

train_ds, test_ds = ds.split_dataset(
    first_ds_size=0.5,
    shuffle=True,
    group_by_features=["product_id"]
)
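
A plausible reading of `group_by_features=["product_id"]` is that all rows sharing a `product_id` end up in the same subset, so no product appears in both train and test. The source does not state this, so it is an assumption. The standard-library sketch below illustrates that kind of group-aware split; `grouped_split` and its parameters are hypothetical and do not reflect how `split_dataset` is implemented.

```python
import random
from collections import defaultdict


def grouped_split(rows, group_key, first_size=0.5, seed=None):
    """Split `rows` (a list of dicts) so that no group spans both subsets.

    Whole groups are added to the first subset until it holds at least
    `first_size` of all rows; the remaining groups go to the second subset.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)

    # Shuffle the group keys, not the rows, so groups stay intact.
    keys = sorted(groups)
    random.Random(seed).shuffle(keys)

    target = first_size * len(rows)
    first, second = [], []
    for key in keys:
        (first if len(first) < target else second).extend(groups[key])
    return first, second


if __name__ == "__main__":
    rows = [{"product_id": str(1001 + i % 4), "value": i} for i in range(20)]
    train, test = grouped_split(rows, "product_id", first_size=0.5, seed=0)
    print(len(train), len(test))
```

Because whole groups are moved at once, the resulting subset sizes only approximate the requested fraction when group sizes are uneven.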

📄 Help

See Documentation for details.

🛠️ Package Installation

full version: pip install dwrappr
light version (excluding sklearn library): pip install dwrappr[light]

(keep package updated with pip install dwrappr --upgrade)
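
To confirm which version ended up in the active environment after installing or upgrading, the standard library's `importlib.metadata` can be queried. This is a minimal sketch; the helper name `installed_version` is an illustrative choice.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the installed dwrappr version, or None if it is not installed.
    print(installed_version("dwrappr"))
```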

🔧 Maintainer

This project is maintained by Nils

Release history

1.0.16: Sep 22, 2025
1.0.15: Sep 10, 2025
1.0.14: Sep 1, 2025
1.0.13: Jul 21, 2025 (this version)
1.0.12: Jul 21, 2025
1.0.11: Jul 21, 2025
1.0.10: Jul 21, 2025
1.0.9: Jul 21, 2025
1.0.8: Jun 23, 2025
1.0.7: Jun 17, 2025
1.0.6: Jun 17, 2025
1.0.5: Jun 16, 2025
1.0.4: Jun 16, 2025
1.0.3: Jun 16, 2025
1.0.2: Jun 16, 2025
1.0.1: Jun 10, 2025
0.0.13: Jun 10, 2025
0.0.12: Jun 9, 2025
0.0.11: May 21, 2025
0.0.10: May 20, 2025
0.0.9: May 20, 2025
0.0.8: May 19, 2025
0.0.7: May 19, 2025
0.0.6: May 14, 2025
0.0.5: May 14, 2025
0.0.4: May 13, 2025
0.0.3: May 13, 2025

Download files

Source Distribution

dwrappr-1.0.13.tar.gz (20.7 kB view details)

Uploaded Jul 21, 2025 Source

Built Distribution

dwrappr-1.0.13-py3-none-any.whl (20.5 kB view details)

Uploaded Jul 21, 2025 Python 3

File details

Details for the file dwrappr-1.0.13.tar.gz.

File metadata

Download URL: dwrappr-1.0.13.tar.gz
Upload date: Jul 21, 2025
Size: 20.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.13

File hashes

Hashes for dwrappr-1.0.13.tar.gz
Algorithm	Hash digest
SHA256	`80027061b349d967367f379277bee68853c22ab86daa541d8da8565ec6e20f32`
MD5	`3537566608a3fa00ee3451dd8417edf9`
BLAKE2b-256	`75e7a8cdbe440cc052eebbf735dc6197b0bbca9cd5c351ef31e07d8c4b36f782`

File details

Details for the file dwrappr-1.0.13-py3-none-any.whl.

File metadata

Download URL: dwrappr-1.0.13-py3-none-any.whl
Upload date: Jul 21, 2025
Size: 20.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.11.13

File hashes

Hashes for dwrappr-1.0.13-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f779ff7553e50584519aa951564850e1c4adcf27e2545d3c4399b74262596b40`
MD5	`16ab7cb54818262c27257014b32ddef2`
BLAKE2b-256	`e2695508029bc0d82ead1e7d114462a713fa78f6bc11cf17684cdf03d7a3f184`
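
The SHA256 digests listed above can be compared against a downloaded file before installing it. The following is a minimal sketch using the standard library's `hashlib`; the helper names and the local file path in the usage note are illustrative assumptions.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path, expected_hex):
    """Return True if the file's SHA256 digest equals `expected_hex`."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_sha256("dwrappr-1.0.13.tar.gz", "80027061b349d967367f379277bee68853c22ab86daa541d8da8565ec6e20f32")` should return `True` for an intact copy of the source distribution saved under that name in the current directory.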
