Automatically detect bad responses in surveys

Survey Dud Detector

Apply methods to detect bad responses in surveys.

Detect Straightlining

Straightlining is when a respondent picks the same point on a scale for every question (e.g., answering "Strongly Agree" to everything).

from survey_dud_detector import detect_straightlining

# df is a pandas DataFrame of survey responses, one row per respondent.
# Select the Likert-scale columns by name.
likert_cols = [c for c in df.columns if 'agree' in c or 'would' in c or 'favorable' in c]
straightlining = detect_straightlining(df[likert_cols])

# Examine incidence of straightlining (results are normalized to the share of
# questions examined, so 1 means every answer was the same)
print(straightlining.value_counts())

# Drop everyone who perfectly straightlined
df = df[straightlining < 1]
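
The internals of detect_straightlining are not shown here. As a rough illustration of the idea only, the sketch below scores each respondent by the share of their answers that match their own most common answer, so a perfect straightliner scores 1. The function name straightlining_share and the toy data are hypothetical, and the scoring rule is an assumption rather than the library's confirmed method. It uses plain Python so the logic is visible without pandas.

```python
from collections import Counter


def straightlining_share(answers):
    """Return the share of answers that match the respondent's most common answer.

    Hypothetical helper for illustration; not part of survey_dud_detector.
    """
    given = [a for a in answers if a is not None]  # ignore skipped questions
    if not given:
        return 0.0
    most_common_count = Counter(given).most_common(1)[0][1]
    return most_common_count / len(given)


# Toy data (made up): one list of Likert answers per respondent.
respondents = [
    ["Strongly Agree", "Strongly Agree", "Strongly Agree", "Strongly Agree"],
    ["Agree", "Agree", "Disagree", "Agree"],
    ["Disagree", "Agree", "Strongly Agree", "Neutral"],
]

scores = [straightlining_share(r) for r in respondents]
print(scores)  # [1.0, 0.75, 0.25]
```

Under this assumed rule, only the first respondent would be dropped by a filter that keeps scores below 1.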

Multiple Low Incidence Detection

Multiple low incidence is when a respondent gives an unlikely answer to several questions (e.g., saying they are Native American or that they are non-binary). An unlikely answer on its own is not a problem, but several of them together can indicate that someone is trolling (e.g., pretending to be a non-binary Native American who is Very Conservative and earns over $150K).

# Assumed to be importable the same way as detect_straightlining
from survey_dud_detector import detect_low_incidence

demographics = ['gender', 'race', 'education', 'urban_rural', 'politics', 'income', 'age', 'vote2016']
# Detect low incidence. The threshold sets how rare an answer must be to count as
# "low incidence" (0.04 means any answer with 4% or less occurrence counts)
low_incidence_counts = detect_low_incidence(df[demographics], low_incidence_threshold=0.04)

# Examine incidence of low incidence answers (results are the number of low incidence answers per respondent)
print(low_incidence_counts.value_counts())

# Look at the rows of people with a high number of low incidence answers
# before dropping them, in case the answers are actually legitimate.
print(df[low_incidence_counts >= 3])

# Drop everyone who gave three or more low incidence answers
df = df[low_incidence_counts < 3]
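
To make the counting concrete, here is a minimal sketch of one way such a count could be computed: for each question, find the share of respondents giving each answer, then count per respondent how many of their answers fall at or below the threshold. The function count_low_incidence, its threshold parameter, and the toy rows are hypothetical, and this is an assumption about the approach rather than the library's confirmed implementation.

```python
from collections import Counter


def count_low_incidence(rows, threshold=0.04):
    """Count, per respondent, the answers whose share in their column is <= threshold.

    Hypothetical helper for illustration; not part of survey_dud_detector.
    """
    if not rows:
        return []
    n = len(rows)
    columns = list(rows[0].keys())
    # Share of respondents giving each answer, computed per column.
    shares = {
        col: {answer: count / n
              for answer, count in Counter(r[col] for r in rows).items()}
        for col in columns
    }
    return [sum(shares[col][r[col]] <= threshold for col in columns) for r in rows]


# Toy data (made up): ten respondents, two demographic questions.
rows = (
    [{"gender": "Woman", "politics": "Moderate"}] * 5
    + [{"gender": "Man", "politics": "Liberal"}] * 4
    + [{"gender": "Non-binary", "politics": "Very Conservative"}]
)

# With only ten rows, a looser threshold than 0.04 is needed for anything to count as rare.
counts = count_low_incidence(rows, threshold=0.15)
print(counts)  # [0, 0, 0, 0, 0, 0, 0, 0, 0, 2]
```

The last respondent gives two rare answers. As the text above notes, a count like this is a prompt to inspect the row, not proof of trolling.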

Installation

pip3 install survey_dud_detector
