Skip to main content

A package for comparing survey data between samples from Deliberative Polling experiments.

Project description

How To

Purpose

This guide is designed to assist researchers in efficiently leveraging this Python package for the analysis of survey data in Deliberative Polling experiments. Although this package was developed for the purposes of conducting Deliberative Polling, this package can be used to analyze survey data from any experiment.

Installation

This package has one function outputs which takes an IBM SPSS Statistics file ending in .SAV and outputs .xlsx and .docx documents containing comparisons of the responses of all specified treatment groups at all times, with data weighted by all statistical weights provided.

Note: If you are a Stanford affiliate, you can likely get SPSS for free at software.stanford.edu.

To run the function, install this Python package to your device by running Python in the terminal or an IDE.

The outputs function in this Python package

Note: To install Python, go to python.org/downloads. To activate Python in a terminal, run Python3.

```bash pip install DeliberativePolling ```

Then in a directory containing the .SAV file, run:

```python from DeliberativePolling import outputs outputs("your_file.SAV") ```

For example, outputs("Sample.SAV").

Data Format

While the survey data may be provided to you in .csv, .xlsx, or any other format, the function that outputs tables and reports requires that the data be an .SAV file format from IBM SPSS Statistics. It import data into SPSS, open SPSS and navigate to File and Import Data or simply copy and paste the data directly into the tab Data View

Metadata Configuration

Once the data has been imported into SPSS, you need to provide metadata about the variables in the data. In the "Measure" column of "Variable View", variables can be classified into three kinds of "Measure": "Nominal", "Ordinal", or "Scale".

Nominal Variables

These are categorical variables that do not have an intrinsic order. For example, Gender, where categories like male, female, and non-binary do not have a specific sequence. There may be some exceptions to this rule. For example, some variables like Income Level may have some order to them, so it may be tempting to classify them as Ordinal. In general, that would be a mistake.

Ordinal Variables

These are categorical variables with a clear, definable order. For example, data derived from a Likert scale ranging from 0 to 10. The values indicate a progression from least to most favorable (or vice versa).

Scale Variables

Variables not classified as either Nominal or Ordinal are listed under this category. These can be continuous or discrete variables. In order for outputs() to identify the different times, experiment groups, and participants in the data, it needs at least three variables classified as "Scale": "ID", "Time", and "Group".

Weight Variables

[Explain]

Variable Identification

  • ID: A variable indicating individuals within the sample. IDs are necessary because the code compares the responses of individuals at multiple times.
  • Time: A variable indicating when the responses were given.
  • Group: A variable indicating what kind of experiment treatment, for example, "Treatment" or "Control".

Data Cleaning

  1. Column Labels: Column labels explain the meaning of variables. For example, for the Nominal variable "Education" in Sample.SAV, a good label would be something short but more descriptive like "Highest Education Level".
  2. Value Labels: Value labels explain the meaning of the values in the variables. For example, the Variable "Age" in Sample.SAV has the values "1", "2", "3", and "4".

Note: Ensure that all values have labels, otherwise the outputs function will return an error message explaining which values are unlabeled.

Execution

Once the data has been cleaned and all metadata has been inputted, the outputs function can now be run. If there is missing metadata, then the outputs function will return an error and indicate what the missing data is.

Output

The outputs function will create a folder in the directory named "Outputs", which will contain all the outputting Tables and Reports. All tables and reports will be outputted in XLSX format. If the tables and reports are reasonable sized (under 50,000 cells) then they will also be exported in a Word document format.

Purpose: This guide is designed to assist professionals in efficiently leveraging a Python package tailored for the analysis of survey data in Deliberative Polling experiments.

The first step in analyzing a data set of surveys of treatment and control from a Deliberative Polling experiment is to produce tables comparing the responses of these groups before deliberation and after deliberations — and, in some cases, midway through deliberation and follow ups some period of time after the deliberation has concluded.

The function outputs() in this python package takes an SPSS file and outputs comparisons of the responses of all treatment groups at all times with data weighted by all statistical weights provided.

Note: If you are a Stanford affiliate you can likely get SPSS for free at (link) software.stanford.edu.

While the data may be provided to you as an XLSX or CSV file, the function that outputs tables and reports python package requires that the data be an SAV file from IBM SPSS Statistics. As such, you must ensure your data is in an .SAV file format from IBM SPSS Statistics.

Once the data has been imported into SPSS, you need to provide metadata about the variables in the data.

In the "Measure" column of "Variable View", variables can be classified into three kinds of "Measure": "Nominal", "Ordinal", or "Scale".

[Image of measure options]

Nominal Variables: These are categorical variables that do not have an intrinsic order. For exmaple, Gender, where categories like male, female, and non-binary do not have a specific sequence. There may be some exceptions to this rule. For example, some variables like Income Level may have some order to them, so it may be tempting to classiy them as Ordinal. In general, that would be a mistake. This code compares how Ordinal variables change with respect to Nominal variables. For example, how participants in a Deliberative Polling experiment change how they rate our democracy from 0 to 10 (Ordinal) with respect to a demographic variables like income level (Nominal).

Ordinal Variables: These are categorical variables with a clear, definable order. For example, data derived from a Likert scale ranging from 0 to 10. The values indicate a progression from least to most favorable (or vice versa). Again, these are the response variables we are interested in seeing how they change.

Scale Variables: Variables not classified as either Nominal or Ordinal are listed under this category. These can be continuous or discrete variables. In order for outputs() to identify the different times, experiment groups, and participants in the data, it needs at least three variables classified as "Scale": "ID", "Time", and "Group".

"ID" is a variable indicating individuals within the sample. IDs are necessary because the code compares the responses of individuals at multiple times. By matching IDs, it is clear how the invidiuals responses change from say, "Pre-Deliberation" to "Post-Deliberation". IDs need not be numeric. They may also be emails or any other identifying value that is consistent for the same individuals across survey responses.

"Time" is a variable indicating when the responses wre given. For example, a value of "T1" in "Time" would indicate that all the responses in that row were given at T1 (which usually means before deliberation). Whereas, a value of "T2" would indicate that responses were given after T1 (which would generally mean after deliberation). While using values of "T1" or "T2" is common in experimental analysis, you can use more descriptive values like "Pre-Deliberation" and "Post-Deliberation."

"Group" is variable indicating what kind of experiment treatment, for example, "Treatment" or "Control".

[Image of these variables]

CLEAN DATA "10-Strongly disgaree"

Once your variables are properly classified by Measure, you need to label the data. There are two types of labels.

Column Labels: Column labels explain the meaning of variables. Variables in SPSS cannot have spaces or punctuation in them. As such, you need to put full, descriptive labels in to explain the meaning of variables in the "Label" column in "Variable View". For example, for the Nominal variable "Education" in Sample.SAV, a good label would be something short but more descriptive like "Highest Education Level". Keep Nominal variable labels short, as they will go in the file name of ordinal tables. For Ordinal variable labels, you can put full, descriptive labels in. For example, for the Ordinal Variable "Question1" in Sample.SAV, the column label is the question itself: "How well does democracy function?". The length of column labels for ordinal variables is less important than for nominal values as these column labels do not appear in file names, merely cells within XLSX and DOCX files and so can be many characters long.

Value Labels: Value labels explain the meaning of the values in the variables. For example, the Variable "Age" in Sample.SAV has the values "1", "2", "3", and "4". Without prior knowledge of how the data has been coded, these values have no meaning. In the "Values" column in "Variable View", you input value labels to give these codes meaning. In "Age" in Sample.SAV, "1" corresponds to "18-30", "2" corresponds to "30-50", etc. Note that multiple values can share the same label. For example, for Ordinal variable "Question1" values 0 through 4 are labeled as "Poorly" and values "6-10" are labeled as "Well".

Note: Some survey programs label nonresponses like NA or "Don't Know" as 99 or 98. Because the average of Ordinal values are taken by outputs

[Image of age value labels]

Ensure that all values have labels, otherwise the outputs function will return an error message explaining which values are unlabeled.

Note that columns "Width", "Decimals", "Missing", "Columns", "Align" and "Role" in "Variable View" can generally be ignored when preparing the SPSS file for use in outputs().

In the "Type" column of "Variable View", Ordinal and Nominal variables must be set to "Numeric".

The data must be cleaned before use. In order to make all responses in Ordinal and Nominal variables numeric, Nominal values of, for example, "Male" should be recoded into numeric versions like "1" and the appropriate value labels indicating this coding put into the "Values" column in "Variable View". For Ordinal, for example, an entry of "10-Strongly disagree" should be rewritten as "10" in numeric with a value label put in to indicate that "10" is "Strongly Disagree" (or simply "Disagree" along with "9", "8", "7", and "6").

Once the data has been cleaned and all metadata has been inputted, the Outputs function can now be run. If there is missing metadata, then the Outputs function will return an error and indicate what the missing data is.

To run the function, install this Python package to your device by running Python in the terminal or an IDE.

Note: To install Python go to python.org/downloads. To activate Python in a terminal run "Python3".

In a Python terminal, run: "pip install DeliberativePolling"

\Then in a directory containing the .SAV file run: "from DeliberativePolling import outputs" "outputs("your_file.SAV")" For example, "outputs("Sample.SAV")"

The Outputs function will then creatre a folder in the directory named "Outputs", which will contain all the outputting Tables and Reports.

All tables and reports will be outputted in XLSX format. If the tables and reports are reasonable sized (under 50,000 cells) then they will also be exported in a Word document format.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

DeliberativePolling-0.1.9.tar.gz (12.1 kB view details)

Uploaded Source

Built Distribution

DeliberativePolling-0.1.9-py3-none-any.whl (18.5 kB view details)

Uploaded Python 3

File details

Details for the file DeliberativePolling-0.1.9.tar.gz.

File metadata

  • Download URL: DeliberativePolling-0.1.9.tar.gz
  • Upload date:
  • Size: 12.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.4

File hashes

Hashes for DeliberativePolling-0.1.9.tar.gz
Algorithm Hash digest
SHA256 30ca637bf2a3e338b1c01c388ac05ccfe3d5942492a4f20a2414fb072aaf7f83
MD5 ceb704210744ac6f0b628f27ad78aec0
BLAKE2b-256 50ccdf6543591ebc23156ed682c76a2842fe8ebd0179d46493e59be9ab964e65

See more details on using hashes here.

File details

Details for the file DeliberativePolling-0.1.9-py3-none-any.whl.

File metadata

File hashes

Hashes for DeliberativePolling-0.1.9-py3-none-any.whl
Algorithm Hash digest
SHA256 752cbc3de5abbeefc8daff1632d3a75d39d7dbc6bb47c220c5888f75424d5f90
MD5 0887ae117d0085f2c49f975a9f8fd337
BLAKE2b-256 2e2b230d3cdca51944d242c19f11325604b60d156c84330a0cbe65ba710f5a9c

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page