omopsurvey

The 'omopsurvey' Python package transforms standardized response codes from the OMOP CDM survey variables into numeric values and simplifies data preparation by providing functionalities for mapping and converting response codes, as well as handling missing data, facilitating easier and more reliable data analysis.

These details have not been verified by PyPI

Project links

Project description

The 'omopsurvey' Python package offers a comprehensive solution for transforming standardized response codes from the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) survey variables into numeric values, streamlining the data preparation process. By automating the mapping and conversion of response codes, as well as the handling of missing data, it makes data analysis more accessible and reliable. The package provides a range of functions designed to help researchers and data analysts efficiently work through OMOP CDM survey data, ensuring accurate mappings of responses and effective management of data discrepancies. This package is a helpful tool for those working with OMOP CDM survey data, offering a path to more profound and accurate analyses by dramatically lowering the burden of data preprocessing.

response_set.py

map_answers(input_data): Maps survey responses to corresponding numeric and text values, and handles survey responses that fall outside the predefined cases.

Parameters:

input_data: DataFrame containing survey responses with columns question_concept_id and answer_concept_id.

Returns: The modified DataFrame with added columns answer_numeric and answer_text containing the mapped values.

create_dummies(input_data): Transforms survey data to include dummy variables for questions that allow multiple answers. Each response is converted into a separate row with a unique identifier, defined as '{question_concept_id}_{answer_concept_id}'. The function then removes the original rows to prevent data redundancy.

Parameters:

input_data: DataFrame containing survey responses with columns question_concept_id and answer_concept_id.

Returns: A new DataFrame with additional rows for select-all-that-apply questions, properly formatted for further analysis.

scale(input_data, variables, scale_name): Calculates a composite score for each participant based on specified variables. This function supports handling missing values and can calculate the score as either the sum or the mean of the selected variables. The results include diagnostic print statements about the calculation.

Parameters:

input_data: DataFrame containing the data to be scored.

variables: List of columns to be included in the score calculation.

score_name: Name of the column where the score will be stored.

na: Boolean indicating whether to include participants with missing data in the calculations (na = True or na = False).

method: Specifies the method for score calculation (method = 'sum' or method = 'mean').

Returns: The original DataFrame with an additional column containing the calculated scores for each participant.

pivot_data.py

pivot(input_data, file_name): Pivots a dataset to structure numeric survey responses in a wide format. The function checks if the specified file exists; if not, it prints an error message and returns. It reads the data from the file into a DataFrame, then pivots this DataFrame so that each row represents a respondent and each column represents a question, with cells containing the numeric answers. The column names are prefixed with 'q' to denote question IDs. The resulting pivot table is saved to a CSV file and then uploaded to a cloud storage bucket using Google Cloud's gsutil tool.

Parameters:

input_data: A pandas DataFrame containing the columns person_id, question_concept_id, and answer_numeric.

file_name: Optional; the name of the file to which the pivot table will be saved. Defaults to 'pivot_n.csv'.

pivot_text(input_data, file_name): Similar to pivot_data_numeric, but pivots text responses instead. The resulting pivot table is saved to a CSV file and then uploaded to a cloud storage bucket using Google Cloud's gsutil tool.

Parameters:

input_data: A pandas DataFrame containing the columns person_id, question_concept_id, and answer_text.

file_name: Optional; the name of the file to which the pivot table will be saved. Defaults to 'pivot_t.csv'.

recode_missing.py

recode(input_data): Processes input data (either in file format or as a pandas DataFrame) to handle missing values according to a specified list of codes. It replaces a predefined set of numeric codes with pandas NA values to standardize the representation of missing data across the dataset.

Parameters:

input_data: This can be either a path to a data file (CSV, TXT, or Excel) or a pandas DataFrame. The function adapts its behavior based on the type of input provided.

Returns: A pandas DataFrame with the missing values recoded as pandas NA.

codebooks.py

codebook(df): Processes input data to generate a structured HTML codebook, which includes a detailed listing of questions and responses formatted neatly. The function first cleans and deduplicates the data, then iteratively builds the formatted data, and finally writes it to an HTML file that is linked for download.

Parameters:

input_data: DataFrame to be processed into a codebook.

Returns: The HTML-formatted codebook and an IPython display link to the generated HTML file.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.0.8

May 19, 2024

0.0.7

May 14, 2024

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

omopsurvey-0.0.8.tar.gz (87.9 kB view details)

Uploaded May 19, 2024 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

omopsurvey-0.0.8-py3-none-any.whl (92.0 kB view details)

Uploaded May 19, 2024 Python 3

File details

Details for the file omopsurvey-0.0.8.tar.gz.

File metadata

Download URL: omopsurvey-0.0.8.tar.gz
Upload date: May 19, 2024
Size: 87.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/1.8.2 CPython/3.12.0 Darwin/23.4.0

File hashes

Hashes for omopsurvey-0.0.8.tar.gz
Algorithm	Hash digest
SHA256	`976d776609a18b1ec2eb09407d3b034e803e7b04fb69302f750df74e14ba06e1`
MD5	`f0187556a1b1cdff7557c5b5483f8fec`
BLAKE2b-256	`51e319a35a5f286d9c5b4692317c0aa5fdf778fcab911cab8ed3d5a88321970b`

See more details on using hashes here.

File details

Details for the file omopsurvey-0.0.8-py3-none-any.whl.

File metadata

Download URL: omopsurvey-0.0.8-py3-none-any.whl
Upload date: May 19, 2024
Size: 92.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/1.8.2 CPython/3.12.0 Darwin/23.4.0

File hashes

Hashes for omopsurvey-0.0.8-py3-none-any.whl
Algorithm	Hash digest
SHA256	`8eee2b5ba4abfc2105408db26ccfb9d56299ccedbe89f3245b6f58bdb13bebff`
MD5	`e62ace36292c0de05e9afe7b131221c4`
BLAKE2b-256	`b32e5d70481388e6f1bdf03828fb6d25f805e8b73438d48ea79fefab8b12c5b3`

See more details on using hashes here.

omopsurvey 0.0.8

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

response_set.py

pivot_data.py

recode_missing.py

codebooks.py

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes