A package that can generate low-fidelity synthetic CDISC SDTM data based on intelligent sequence generators

Development Status
- 2 - Pre-Alpha
Intended Audience
- Science/Research
Operating System
- OS Independent
Programming Language

Project description

Synthetic SDTM (ssdtm)

This library provides a collection of functions to create synthetic CDISC SDTM data. Generation relies largely on intelligent sequence generators powered by domain knowledge.

Background

Dummy, or low-fidelity synthetic, SDTM data is valuable in multiple scenarios. A few use cases are listed below:

Testing and Validation of Systems:

System Configuration: Using dummy data allows for thorough testing and configuration of data management systems before real data is collected, ensuring that systems are correctly set up and can handle the expected data formats and volumes.
Software Validation: Dummy data is essential for validating the software tools used for data capture, processing, and analysis, ensuring they work correctly under various scenarios and edge cases.

Training and Education:

Staff Training: Dummy data provides a safe and realistic way to train clinical staff, data managers, and statisticians on data entry, management, and analysis processes without risking patient confidentiality or data integrity.
Protocol Familiarization: It helps the study team become familiar with the study protocols and data collection methods, improving overall preparedness and efficiency.

Protocol Development and Refinement:

CRF and Protocol Testing: Dummy data can be used to test and refine clinical trial protocols and case report forms (CRFs) before actual patient data is collected, identifying potential issues and making necessary adjustments early in the process.
Scenario Simulation: Simulating various scenarios using fake data helps in identifying and mitigating risks, ensuring the protocol is robust and ready for real-world application.

Quality Control:

Error Detection: By using dummy data, potential data entry errors, inconsistencies, and system flaws can be identified and corrected before the actual trial begins, enhancing data quality and reliability.
Process Optimization: It allows for the optimization of data collection and processing workflows, ensuring they are efficient and capable of handling real data smoothly.

Regulatory Compliance:

Compliance Testing: Testing with dummy data first helps ensure that all data handling and processing systems comply with regulatory standards and guidelines, reducing the risk of non-compliance during the actual trial.

Confidentiality and Security:

Safe Testing Environment: Using fake data protects patient confidentiality and adheres to privacy regulations during system testing and staff training, minimizing the risk of data breaches and ethical issues.
Security Assessment: Dummy data can be used to test the security measures of data management systems, ensuring they are robust enough to protect sensitive patient information when real data is collected.

Shorter Study Startup Time:

Test and validate the data pipelines: Access to realistic dummy data allows teams to test and validate the data entry and data transfer pipelines before the First-Patient-In milestone of a study, which shortens study startup time.

Free software: MIT license

Tutorial

How to install

$ pip install ssdtm
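
After installing, it can be useful to confirm which release ended up in the active environment. The sketch below uses only the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical and not part of ssdtm.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent.

    Hypothetical helper for illustration; it is not part of ssdtm.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed ssdtm version, or None if ssdtm is not installed.
print(installed_version("ssdtm"))
```

A result of `None` most likely means the package was installed into a different Python environment than the one running the check.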

Basic Usage

import ssdtm as sd

# Generate synthetic single-domain (adverse events) data for 5 patients
ae = sd.get_adverse_events(5)

# Generate synthetic single-domain (concomitant medication) data for 5 patients
cm = sd.get_conmeds(5)

# Generate synthetic single-domain (demographics) data for 5 patients
dm = sd.get_demographics(5)

# Generate synthetic single-domain (exposure) data for 5 patients
ex = sd.get_exposure(5)

# Generate lab analytes dataset for 8 patients, where each patient has data for 4 visits.
lb = sd.get_lab_analytes(8,4)

# Generate vital signs dataset for 8 patients, where each patient has data for 4 visits.
vs = sd.get_vital_signs(8,4)

# Generates CDISC SDTM data for 6 domains (ae, cm, dm, ex, lb, and vs)
data = sd.get_sdtm_data(8,4)
# Then you can access individual domain-specific dataframes as follows
data['cm']
data['dm']
data['vs']

# This generates and saves the SDTM data for 6 common SDTM domains in the local directory
sd.save_sdtm_data(8,4)

# Generate response dataset for 8 patients, assuming 5 visits per patient.
rs = sd.get_response(8)

# Generate tumor identification dataset for 8 patients, where each patient can have 1 to 5 tumors.
tu = sd.get_tumor_identification(8)

# Generate tumor results dataset for 8 patients, where each patient can have 1 to 5 tumors.
tr = sd.get_tumor_results(8)

# Generates CDISC SDTM data for 6 generic domains (ae, cm, dm, ex, lb, and vs) and additional therapeutic area specific domains (e.g. for 'oncology' we would have rs, tu and tr)
data = sd.get_sdtm_data(8,4, 'oncology')
# Then you can access individual domain-specific dataframes as follows
data['cm']
data['dm']
data['vs']
# And TA-specific individual domain dataframes as follows
data['rs']
data['tu']
data['tr']

# This generates and saves the SDTM data for 6 common SDTM domains and 3 therapeutic area specific domains in the local directory
sd.save_sdtm_data(8,4, 'oncology')
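
Since the multi-domain call returns one dataframe per domain, a quick sanity check is to confirm that every expected domain is present and to see how many rows each holds. The sketch below is a minimal illustration, not part of ssdtm: the helper `summarize_domains` and the constants are hypothetical, and it assumes the returned object behaves like a dict keyed by lowercase domain code whose values support `len()` (as pandas DataFrames do). Plain lists stand in for the dataframes so the snippet runs without ssdtm installed.

```python
# Domain codes taken from the usage examples above.
GENERIC_DOMAINS = ("ae", "cm", "dm", "ex", "lb", "vs")
TA_DOMAINS = {"oncology": ("rs", "tu", "tr")}


def summarize_domains(data, therapeutic_area=None):
    """Return ({domain: row_count}, [missing domains]) for a domain mapping.

    Hypothetical helper. Assumes `data` supports `in` and item access by
    domain code, and that len() of each value is its number of rows.
    """
    expected = GENERIC_DOMAINS + TA_DOMAINS.get(therapeutic_area, ())
    row_counts = {name: len(data[name]) for name in expected if name in data}
    missing = [name for name in expected if name not in data]
    return row_counts, missing


# Stand-in for the result of sd.get_sdtm_data(8,4, 'oncology'):
# each list plays the role of a dataframe, one element per row.
stand_in = {name: [None] * 8 for name in GENERIC_DOMAINS}
stand_in["rs"] = [None] * 40

row_counts, missing = summarize_domains(stand_in, "oncology")
print(row_counts)
print(missing)
```

With real output, the same call would take the object returned by `sd.get_sdtm_data(...)` in place of `stand_in`, provided the assumptions above hold for it.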

Release history

- 0.1.3 (this version): Aug 3, 2024
- 0.1.2: Aug 3, 2024
- 0.1.1: Nov 21, 2023

Download files

Source Distribution

ssdtm-0.1.3.tar.gz (8.6 kB)

Uploaded Aug 3, 2024 Source

Built Distribution

ssdtm-0.1.3-py3-none-any.whl (6.7 kB)

Uploaded Aug 3, 2024 Python 3

File details

Details for the file ssdtm-0.1.3.tar.gz.

File metadata

Download URL: ssdtm-0.1.3.tar.gz
Upload date: Aug 3, 2024
Size: 8.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.1.1 pkginfo/1.4.2 requests/2.31.0 setuptools/45.2.0 requests-toolbelt/0.8.0 tqdm/4.30.0 CPython/3.8.10

File hashes

Hashes for ssdtm-0.1.3.tar.gz
Algorithm	Hash digest
SHA256	`2221fae5c0dc635a4792895795389c76244399ff2d26a4afed4a024b5dd2ba38`
MD5	`9303a41b6abf848200ca57c4d883848e`
BLAKE2b-256	`a722aadc6ebfe79c1235658dc45da4c4d69fd32863c7f7fb45b7d2c1f6858e87`

File details

Details for the file ssdtm-0.1.3-py3-none-any.whl.

File metadata

Download URL: ssdtm-0.1.3-py3-none-any.whl
Upload date: Aug 3, 2024
Size: 6.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.1.1 pkginfo/1.4.2 requests/2.31.0 setuptools/45.2.0 requests-toolbelt/0.8.0 tqdm/4.30.0 CPython/3.8.10

File hashes

Hashes for ssdtm-0.1.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b1b570bd56c11388e00e71cf958ce4a47c53df32fa5462e822db77fff9a175ed`
MD5	`178252227551a6f9153c1772e4fb6420`
BLAKE2b-256	`501ea3fcb23e4f662bf18e53e283e95fb12fc3cb7399d6d56a331980ca04bf5c`
