Pysoda package for Fairdataihub tools

pysoda

Overview

Pysoda is a tool for your Python workflows that helps you create datasets that comply with your favorite FAIR (Findable, Accessible, Interoperable, Reusable) data standards. At the moment, pysoda focuses primarily on neuromodulation, neurophysiology, and related data, following the SPARC guidelines for making data FAIR. We plan to extend the tool to support other standards, such as BIDS and FHIR, in the future.

Pysoda stems from SODA, a desktop application that simplifies organizing and sharing data that must comply with a FAIR data standard. While the SODA app is convenient for most investigators, those with coding proficiency may prefer to implement automated workflows. The backend of SODA contains many functions needed to prepare and submit a dataset that complies with the SPARC Data Structure (SDS), such as:

- Creating standard metadata files
- Generating manifest files
- Automatically complying with the file/folder naming conventions
- Validating against the official SDS validator
- Uploading a dataset to Pennsieve with SDS compliance (ignoring empty folders and non-allowed files, avoiding duplicate files and folders, etc.)
- And many more

Pysoda makes these functions, which have been thoroughly tested and validated, easy to integrate into automated workflows so that investigators do not have to rewrite them. It will be very similar to the pyfairdatatools Python package we are developing for our AI-READI project as part of the NIH Bridge2AI program.

Workflow

Import the pysoda package into your project and initialize the soda object with the supported standard of your choosing

from pysoda import soda_create
# initialize the soda object
# NOTE: soda_create returns the usual sodaJSONObj with additional methods for adding data and metadata [not in version 1].
# It is passed to the module functions the same way sodaJSONObj is passed to the backend of our API.

soda = soda_create(standard='sds')

# add a dataset name to the soda object
soda.set_dataset_name('my_dataset')

Structure your data

# get your base dataset files and folders structure
dataset_structure = soda.get_dataset_structure()

# fill out your dataset structure.
# NOTE: You will want to reference the
# dataset_structure key in the soda_schema.json file to understand the structure
# and what is required.
dataset_structure['folders'] = {
    'data': {
        'files': {
            'file1': {
                'path': '/home/user/file1.txt', 'relativePath': '/data/file1.txt', 'action': 'new'
            }, 
            'file2': {
                'path': '/home/user/file2.txt', 'relativePath': '/data/file2.txt', 'action': 'new'
            }
        }, 
        'folders': {
            'primary': {
                'files': {
                    'file3': {
                        'path': '/home/user/file3.txt', 'relativePath': '/data/primary/file3.txt', 'action': 'new'
                    }
                }
            }
        },
        'relativePath': '/data'
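    }
}
# root-level files and relativePath are assumed to be siblings of 'folders',
# mirroring the nested folder entries above; check soda_schema.json to confirm
dataset_structure['files'] = {}
dataset_structure['relativePath'] = '/'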


# map your imported data files to the entity structure defined in the soda schema [here](soda_schema.py)
entity_structure = soda.get_entity_structure()

# fill out your entity structure using the schema as a reference
# NOTE: data model not finalized
entity = {'subjectId': 'sub-1', 'metadata': {'age': '1 year', 'sex': 'female'}, 'data-file': '/data/file1.txt'}
entity_structure['subjects'].append(entity)
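
Writing the nested 'files'/'folders' layout by hand gets tedious for larger datasets. The sketch below mirrors a local directory into the same shape as the example above and then checks that each entity's 'data-file' points at a file in that structure. It does not call pysoda. The helper names (build_folder_node, collect_relative_paths, entities_with_missing_files) are hypothetical, and keying files by their file name is an assumption; soda_schema.json remains the reference for what is required.

```python
import os


def build_folder_node(local_dir, relative_path):
    """Recursively mirror local_dir as a nested 'files'/'folders' dict.

    Hypothetical helper: the keys follow the README example, not a
    verified pysoda API.
    """
    node = {'files': {}, 'folders': {}, 'relativePath': relative_path}
    for name in sorted(os.listdir(local_dir)):
        local_path = os.path.join(local_dir, name)
        # dataset-relative paths always use forward slashes
        child_relative = relative_path.rstrip('/') + '/' + name
        if os.path.isdir(local_path):
            node['folders'][name] = build_folder_node(local_path, child_relative)
        elif os.path.isfile(local_path):
            # assumption: files are keyed by their file name
            node['files'][name] = {
                'path': os.path.abspath(local_path),
                'relativePath': child_relative,
                'action': 'new',
            }
    return node


def collect_relative_paths(node):
    """Return the set of every file relativePath found under node."""
    paths = {entry['relativePath'] for entry in node.get('files', {}).values()}
    for child in node.get('folders', {}).values():
        paths |= collect_relative_paths(child)
    return paths


def entities_with_missing_files(entities, node):
    """Return the entities whose 'data-file' is not present in the structure."""
    known = collect_relative_paths(node)
    return [entity for entity in entities if entity.get('data-file') not in known]
```

For example, build_folder_node('/home/user/my_dataset', '/') returns a root node whose 'folders', 'files', and 'relativePath' values can be copied into dataset_structure, and entities_with_missing_files(entity_structure['subjects'], root) lists the subjects that reference a file you have not added yet.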

Create your dataset metadata

# import the metadata module from the pysoda package
from pysoda import metadata

# define your submission metadata
submission = soda.get_submission_metadata()

submission['consortium-data-standard'] = 'standard'
submission['funding-consortium'] = 'SPARC'
submission['award-number'] = '12345'
submission['milestone-acheieved'] = ['one', 'two', 'three']
submission['filepath'] = 'path/to/destination'

# create the Excel file for the submission metadata
metadata.submission.create(soda, file_output_location='path/to/output')


# repeat for the other metadata files
metadata.subjects.create(soda, file_output_location='path/to/output')
metadata.samples.create(soda, file_output_location='path/to/output')
metadata.performances.create(soda, file_output_location='path/to/output')
metadata.sites.create(soda, file_output_location='path/to/output')
metadata.code.create(soda, file_output_location='path/to/output')
metadata.manifest.create(soda, file_output_location='path/to/output')
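
The repeated create calls above can be collapsed into a loop. The sketch below assumes only what the snippet above shows, namely that each metadata kind is an attribute of the metadata module with a create(soda, file_output_location=...) function. The helper name create_all_metadata and the METADATA_KINDS tuple are hypothetical, and the module is passed in as an argument so the helper itself does not import pysoda.

```python
# metadata kinds taken from the calls shown in the README
METADATA_KINDS = (
    'submission', 'subjects', 'samples', 'performances',
    'sites', 'code', 'manifest',
)


def create_all_metadata(metadata_module, soda, output_dir, kinds=METADATA_KINDS):
    """Call <kind>.create(soda, file_output_location=output_dir) for each kind.

    Hypothetical helper: metadata_module is expected to be the module
    imported with `from pysoda import metadata`.
    """
    created = []
    for kind in kinds:
        # getattr raises AttributeError if the module has no such kind
        creator = getattr(metadata_module, kind)
        creator.create(soda, file_output_location=output_dir)
        created.append(kind)
    return created


# with pysoda installed, usage would look like:
# create_all_metadata(metadata, soda, 'path/to/output')
```

Passing a shorter kinds tuple restricts the run to the metadata files your dataset actually needs.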

Generate your dataset

Generate locally

from pysoda import generate

# set the generation options
soda.set_generate_dataset_options(destination='local', path='path/to/destination', dataset_name='my_dataset')

# generate the dataset
generate(soda)

Generate on Pennsieve

from pysoda import generate

# provide the Pennsieve API key and secret
soda.upload.auth(api_key='api_key', api_secret='api_secret')

# upload new dataset
# NOTE: You will need to download and start the Pennsieve Agent [here](https://app.pennsieve.io) to upload data to Pennsieve
dataset_id = generate(soda) # returns dataset_id

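# OR upload to an existing Pennsieve dataset
# set the generate options in the soda object
soda.set_generate_dataset_options(destination='existing-ps', if_existing='merge', if_existing_files='replace', dataset_id=dataset_id)
# NOTE: update_existing is not imported above; import it from the pysoda module that provides it in your installed version
update_existing(soda)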
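
Hard-coding the API key and secret in a script makes them easy to leak. One option is to read them from environment variables and pass them to soda.upload.auth. The sketch below is an assumption about how you might do that. The variable names PENNSIEVE_API_KEY and PENNSIEVE_API_SECRET and the helper read_pennsieve_credentials are hypothetical and are not defined by pysoda or Pennsieve.

```python
import os

# hypothetical environment variable names chosen for this example
KEY_VAR = 'PENNSIEVE_API_KEY'
SECRET_VAR = 'PENNSIEVE_API_SECRET'


def read_pennsieve_credentials(environ=None):
    """Return {'api_key': ..., 'api_secret': ...} read from the environment.

    Raises RuntimeError naming any variable that is unset or empty.
    """
    if environ is None:
        environ = os.environ
    missing = [name for name in (KEY_VAR, SECRET_VAR) if not environ.get(name)]
    if missing:
        raise RuntimeError('missing environment variables: ' + ', '.join(missing))
    return {'api_key': environ[KEY_VAR], 'api_secret': environ[SECRET_VAR]}


# with pysoda installed, usage would look like:
# soda.upload.auth(**read_pennsieve_credentials())
```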

Utilities

Compare a dataset on Pennsieve and a local dataset for differences

from pysoda import compare

# provide the Pennsieve API key and secret
soda.upload.auth(api_key='api_key', api_secret='api_secret')

# import the dataset from Pennsieve
soda.import_dataset(dataset_id='dataset_id')

# compare the Pennsieve dataset with the local dataset
results = compare(soda, local_dataset_location='path/to/local/dataset')
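
The README does not describe the format of the results returned by compare. To show what a local-versus-remote comparison involves, the sketch below diffs two listings of dataset-relative file paths using only the standard library. It is independent of pysoda's compare, and the helper names (list_relative_files, diff_file_listings) and the shape of the returned dict are hypothetical.

```python
import os


def list_relative_files(root):
    """Return dataset-relative paths ('/data/file1.txt' style) under root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            relative = os.path.relpath(full_path, root).replace(os.sep, '/')
            found.add('/' + relative)
    return found


def diff_file_listings(local_paths, remote_paths):
    """Split two path listings into local-only, remote-only, and shared."""
    local_paths, remote_paths = set(local_paths), set(remote_paths)
    return {
        'only_local': sorted(local_paths - remote_paths),
        'only_remote': sorted(remote_paths - local_paths),
        'in_both': sorted(local_paths & remote_paths),
    }
```

This compares paths only. Detecting files whose contents differ would additionally need sizes or checksums from both sides.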

Release history

Version	Release date
0.1.61	Aug 29, 2025
0.1.60	Aug 29, 2025
0.1.59	Aug 29, 2025
0.1.58	Aug 14, 2025
0.1.57	Jul 31, 2025
0.1.56	Jul 31, 2025
0.1.55	Jul 31, 2025
0.1.54	Jul 21, 2025
0.1.53	Jul 21, 2025
0.1.52	Jul 21, 2025
0.1.51	Jul 21, 2025
0.1.50	Jul 20, 2025
0.1.49	Jul 17, 2025
0.1.48	Jul 16, 2025 (this version)
0.1.47	Jul 16, 2025
0.1.46	Jul 16, 2025
0.1.45	Jul 16, 2025
0.1.44	Jul 16, 2025
0.1.43	Jul 16, 2025
0.1.42	Jul 16, 2025
0.1.41	Jul 16, 2025
0.1.40	Jul 16, 2025
0.1.39	Jul 15, 2025
0.1.38	Jul 15, 2025
0.1.37	Jul 15, 2025
0.1.36	Jul 15, 2025
0.1.35	Jul 14, 2025

Download files

Source Distribution

pysoda_fairdataihub_tools-0.1.48.tar.gz (133.0 kB view details)

Uploaded Jul 16, 2025 Source

Built Distribution

pysoda_fairdataihub_tools-0.1.48-py3-none-any.whl (178.6 kB view details)

Uploaded Jul 16, 2025 Python 3

File details

Details for the file pysoda_fairdataihub_tools-0.1.48.tar.gz.

File metadata

Download URL: pysoda_fairdataihub_tools-0.1.48.tar.gz
Upload date: Jul 16, 2025
Size: 133.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.1.3 CPython/3.9.23 Linux/6.11.0-1018-azure

File hashes

Hashes for pysoda_fairdataihub_tools-0.1.48.tar.gz
Algorithm	Hash digest
SHA256	`87598f34cb13b82af6a5e887b515d10512eef615b63c9c9e38b192134a50ded1`
MD5	`6c038fdbbcea03fe3b46b40f22e71e56`
BLAKE2b-256	`4c74973992b1655afa51c3eb6a94fed9831ea5020653b4771ef5d758c498b624`

File details

Details for the file pysoda_fairdataihub_tools-0.1.48-py3-none-any.whl.

File metadata

Download URL: pysoda_fairdataihub_tools-0.1.48-py3-none-any.whl
Upload date: Jul 16, 2025
Size: 178.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.1.3 CPython/3.9.23 Linux/6.11.0-1018-azure

File hashes

Hashes for pysoda_fairdataihub_tools-0.1.48-py3-none-any.whl
Algorithm	Hash digest
SHA256	`6e0930e6de60a04ce10fe9e0d03525e0e24c3f25a86be4b8aeee09c0d8591b3e`
MD5	`5b73f445c1a10484f067b211bec21550`
BLAKE2b-256	`45d744c369ce09e9b639d3e72e74eae80887a932c8ecfa101975c343454e36b0`
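
The published digests can be used to check that a downloaded file is intact. The sketch below computes a file's SHA256 with Python's standard hashlib module and compares it with an expected hex digest. The helper names (sha256_of_file, matches_published_hash) are hypothetical, and the file path is whatever location you downloaded the archive to.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as handle:
        # iter() with a sentinel stops when read() returns empty bytes
        for chunk in iter(lambda: handle.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Return True if the file's SHA256 equals the expected hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For the wheel listed above, the call would be matches_published_hash('pysoda_fairdataihub_tools-0.1.48-py3-none-any.whl', '6e0930e6de60a04ce10fe9e0d03525e0e24c3f25a86be4b8aeee09c0d8591b3e'), with the first argument adjusted to your download location.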

pysoda-fairdataihub-tools 0.1.48
