Python version of R careless package

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

cameronlyons

These details have not been verified by PyPI

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

Project description

Careless

A Python package for detecting careless responding in survey data using various statistical indices and methods.

Overview

When taking online surveys, participants sometimes respond to items without regard to their content. These types of responses, referred to as careless or insufficient effort responding, constitute significant problems for data quality, leading to distortions in data analysis and hypothesis testing, such as spurious correlations.

The careless package provides solutions designed to detect such careless/insufficient effort responses by allowing easy calculation of indices proposed in the literature. For a comprehensive review of these methods, see Curran (2016).

Features

- Multiple Detection Methods: Supports various indices for detecting careless responding
- Flexible Input: Works with lists, numpy arrays, and pandas DataFrames
- Robust Implementation: Handles missing data and edge cases, and provides comprehensive error handling
- Performance Optimized: Efficient algorithms for large datasets
- Comprehensive Documentation: Detailed docstrings with examples for all functions

Installation

From PyPI

pip install careless-py

From Source

git clone https://github.com/Cameron-Lyons/careless-py.git
cd careless-py
pip install -e .

Optional Dependencies

For enhanced functionality (e.g., advanced Mahalanobis distance methods), install with full dependencies:

pip install "careless-py[full]"   # quotes keep shells such as zsh from expanding the brackets

Using uv (Recommended for Development)

This project uses uv for fast, reproducible dependency management. Install uv first:

curl -LsSf https://astral.sh/uv/install.sh | sh

Then clone and install:

git clone https://github.com/Cameron-Lyons/careless-py.git
cd careless-py
uv sync --extra full   # Install with all optional dependencies

For development with all dev tools:

uv sync --extra dev

Run commands in the virtual environment:

uv run pytest          # Run tests
uv run ruff check .    # Run linter
uv run mypy src/       # Run type checker

Quick Start

import numpy as np
from careless import evenodd, irv, longstring, mahad, psychsyn

# Sample survey data (rows = participants, columns = items)
data = np.array([
    [1, 2, 3, 4, 5, 6, 7, 8],  # Participant 1
    [2, 2, 2, 2, 5, 5, 5, 5],  # Participant 2 (suspicious pattern)
    [3, 4, 3, 4, 6, 7, 6, 7],  # Participant 3
])

# Check even-odd consistency
factors = [4, 4]  # Two factors with 4 items each
consistency_scores = evenodd(data, factors)
print("Even-odd consistency scores:", consistency_scores)

# Check intra-individual response variability
irv_scores = irv(data)
print("IRV scores:", irv_scores)

# Check for long strings of identical responses
longest_strings = longstring(data)
print("Longest strings:", longest_strings)

Available Functions

Consistency Indices

`evenodd(x, factors, diag=False)`

Computes the Even-Odd Consistency Index by dividing unidimensional scales using an even-odd split.

Parameters:

x: Input data (2D array/list) where rows are individuals and columns are responses
factors: List of integers specifying the length of each factor
diag: Boolean to return diagnostic values (number of valid correlations per individual)

Returns:

Array of even-odd consistency scores (average correlations per individual)
If diag=True: Tuple of (scores, diagnostic_values)

Example:

from careless import evenodd

data = [[1, 2, 3, 4, 5, 6], [2, 3, 4, 5, 6, 7]]
factors = [4, 2]  # First factor has 4 items, second has 2
scores = evenodd(data, factors)
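
To make the index concrete, here is a minimal NumPy sketch of one common formulation described in Curran (2016): within each factor, average the odd-positioned and even-positioned items separately, then correlate the two sets of subscale means within each person. The name `evenodd_sketch` is hypothetical, and this is not necessarily how `careless.evenodd` is implemented; sign conventions, any Spearman-Brown correction, and NaN handling may differ.

```python
import numpy as np

def evenodd_sketch(x, factors):
    """Within-person correlation between odd- and even-item subscale means."""
    x = np.asarray(x, dtype=float)
    if sum(factors) != x.shape[1]:
        raise ValueError("factor lengths must sum to the number of items")
    if min(factors) < 2:
        raise ValueError("each factor needs at least one odd and one even item")
    odd_means, even_means = [], []
    start = 0
    for length in factors:
        block = x[:, start:start + length]
        odd_means.append(block[:, 0::2].mean(axis=1))   # 1st, 3rd, ... items
        even_means.append(block[:, 1::2].mean(axis=1))  # 2nd, 4th, ... items
        start += length
    odd = np.column_stack(odd_means)    # shape: (participants, factors)
    even = np.column_stack(even_means)
    scores = np.full(x.shape[0], np.nan)
    for i in range(x.shape[0]):
        # The correlation is undefined when either half has zero variance
        if odd[i].std() > 0 and even[i].std() > 0:
            scores[i] = np.corrcoef(odd[i], even[i])[0, 1]
    return scores

data = np.array([
    [1, 2, 5, 5, 3, 4],  # both halves rise and fall together across factors
    [1, 5, 3, 3, 5, 1],  # halves move in opposite directions
])
scores = evenodd_sketch(data, [2, 2, 2])
```

In this sketch the first participant gets a score close to 1 and the second a score of -1. With only two factors the correlation is computed from two points and can only be +1, -1, or undefined, so the index is most informative when the survey covers several scales.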

`psychsyn(x, pairs, method='synonyms', seed=None)`

Computes the Psychometric Synonyms/Antonyms Index based on correlated item pairs.

Parameters:

x: Input data (2D array/list)
pairs: List of item pairs [(item1, item2), ...]
method: 'synonyms' or 'antonyms'
seed: Random seed for reproducibility

Returns:

Array of psychometric synonym/antonym scores

Example:

from careless import psychsyn

data = [[1, 2, 3, 4], [2, 3, 4, 5]]
pairs = [(0, 1), (2, 3)]  # Item pairs to correlate
scores = psychsyn(data, pairs, method='synonyms')
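
The idea behind the index can be sketched in plain NumPy: for each participant, collect the responses to the first item of every pair and the responses to the second item, then correlate the two vectors. `psychsyn_sketch` is a hypothetical helper written for illustration; it assumes the pairs are already known and ignores the `method` and `seed` options, so it should not be read as the package's implementation.

```python
import numpy as np

def psychsyn_sketch(x, pairs):
    """Within-person correlation across pre-selected item pairs."""
    x = np.asarray(x, dtype=float)
    first = x[:, [a for a, _ in pairs]]    # first item of each pair
    second = x[:, [b for _, b in pairs]]   # second item of each pair
    scores = np.full(x.shape[0], np.nan)
    for i in range(x.shape[0]):
        # Leave the score as NaN when either vector is constant
        if first[i].std() > 0 and second[i].std() > 0:
            scores[i] = np.corrcoef(first[i], second[i])[0, 1]
    return scores

data = np.array([
    [1, 1, 3, 3, 5, 5],  # answers each synonym pair alike
    [1, 5, 3, 3, 5, 1],  # answers the pairs inconsistently
])
pairs = [(0, 1), (2, 3), (4, 5)]
scores = psychsyn_sketch(data, pairs)
```

For synonym pairs, attentive respondents are expected to score high; for antonym pairs the expected correlation is strongly negative.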

`psychant(x, pairs, seed=None)`

Convenience wrapper for psychsyn that computes the psychometric antonyms index.

Response Pattern Functions

`longstring(x, avg=False)`

Computes the longest (and optionally, average) length of consecutive identical responses.

Parameters:

x: Input data (2D array/list)
avg: Boolean to also return average string length

Returns:

Array of longest string lengths per individual
If avg=True: Tuple of (longest_strings, average_strings)

Example:

from careless import longstring

data = [[1, 1, 1, 2, 3], [1, 2, 3, 4, 5]]
longest, avg = longstring(data, avg=True)
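
As an illustration of what is being counted, the following sketch computes the run lengths for a single participant with NumPy. `longstring_sketch` is a hypothetical name, and treating every change of value as the start of a new run is an assumption about the definition rather than a description of the package internals.

```python
import numpy as np

def longstring_sketch(row):
    """Return (longest run, average run length) for one participant."""
    row = np.asarray(row)
    # Positions where the response differs from the previous one start a new run
    change = np.flatnonzero(row[1:] != row[:-1]) + 1
    bounds = np.concatenate(([0], change, [row.size]))
    run_lengths = np.diff(bounds)
    return int(run_lengths.max()), float(run_lengths.mean())

longest, average = longstring_sketch([1, 1, 1, 2, 3])   # runs: 3, 1, 1
no_repeats = longstring_sketch([1, 2, 3, 4, 5])         # five runs of length 1
```

Here `longest` is 3 and `average` is 5/3, while a participant who never repeats an answer gets 1 for both values.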

`irv(x, consecutive=None)`

Computes the Intra-individual Response Variability (IRV), the standard deviation of responses across consecutive items.

Parameters:

x: Input data (2D array/list)
consecutive: Number of consecutive items to analyze (default: all items)

Returns:

Array of IRV scores per individual

Example:

from careless import irv

data = [[1, 2, 3, 4, 5], [1, 1, 1, 1, 1]]
irv_scores = irv(data)
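
Over all items, the index reduces to a row-wise standard deviation, which can be sketched directly in NumPy. The use of the sample standard deviation (`ddof=1`) and of `np.nanstd` to skip missing responses are assumptions made for this sketch; the package may use different conventions.

```python
import numpy as np

data = np.array([[1, 2, 3, 4, 5], [1, 1, 1, 1, 1]], dtype=float)

# Row-wise sample standard deviation, ignoring missing responses
irv_sketch = np.nanstd(data, axis=1, ddof=1)
```

The second participant, who gave the same answer to every item, has an IRV of 0. Values near zero are the pattern Dunn et al. (2018) associate with insufficient effort responding.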

Statistical Outlier Functions

`mahad(x, method='classic', threshold=None, **kwargs)`

Computes Mahalanobis Distance to identify multivariate outliers.

Parameters:

x: Input data (2D array/list)
method: Detection method ('classic', 'robust', 'mcd', 'mve')
threshold: Custom threshold for outlier detection
**kwargs: Additional method-specific parameters

Returns:

Array of Mahalanobis distances per individual
If threshold provided: Tuple of (distances, outlier_flags)

Example:

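from careless import mahad
# More participants than items, so the covariance matrix can be inverted
data = [[1, 2, 3], [4, 5, 6], [1, 1, 1], [2, 4, 3], [5, 3, 4], [3, 5, 2]]
distances, outliers = mahad(data, method='robust', threshold=0.95)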
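
For reference, the classic (non-robust) distance can be written in a few lines of NumPy. This is a sketch under stated assumptions, not the package's code: `mahalanobis_sq` is a hypothetical helper, it returns squared distances from the sample centroid, and it uses a pseudo-inverse so that a singular covariance matrix does not raise an error.

```python
import numpy as np

def mahalanobis_sq(x):
    """Squared Mahalanobis distance of each row from the sample centroid."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean(axis=0)
    cov = np.cov(x, rowvar=False)      # items-by-items sample covariance
    cov_inv = np.linalg.pinv(cov)      # pseudo-inverse tolerates singular cov
    # Row-wise quadratic form: centered[i] @ cov_inv @ centered[i]
    return np.einsum("ij,jk,ik->i", centered, cov_inv, centered)

rng = np.random.default_rng(0)
data = rng.integers(1, 6, size=(50, 4)).astype(float)  # 50 participants, 4 items
d2 = mahalanobis_sq(data)
```

Under approximate multivariate normality, squared distances follow a chi-square distribution with as many degrees of freedom as there are items, so a cutoff is usually taken from a chi-square quantile. How `mahad` interprets its `threshold` argument is not specified above, so check the function's docstring before relying on a particular value.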

Advanced Usage

Working with Different Data Types

The package supports various input formats:

import numpy as np
import pandas as pd
from careless import evenodd

# Numpy arrays
data_np = np.array([[1, 2, 3], [4, 5, 6]])

# Lists
data_list = [[1, 2, 3], [4, 5, 6]]

# Pandas DataFrames (converted explicitly to a NumPy array)
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]])
data_df = df.to_numpy()

# All work the same way
scores = evenodd(data_np, [3])

Handling Missing Data

The functions handle missing data (NaN values) appropriately:

import numpy as np
from careless import evenodd

data_with_nans = np.array([
    [1, 2, np.nan, 4],
    [np.nan, 2, 3, 4],
    [1, 2, 3, 4]
])

# Missing responses are passed in as NaN; no imputation is needed beforehand
scores = evenodd(data_with_nans, [4])

Custom Thresholds and Parameters

from careless import irv, mahad, psychsyn

# `data` is assumed to be a 2D array with at least 6 item columns

# Custom Mahalanobis distance threshold
distances, outliers = mahad(data, threshold=0.99)

# Custom IRV analysis on consecutive items
irv_scores = irv(data, consecutive=5)

# Psychometric synonyms with custom pairs
pairs = [(0, 1), (2, 3), (4, 5)]
syn_scores = psychsyn(data, pairs, method='synonyms')

Performance Considerations

- Large Datasets: For datasets with >10,000 participants, consider processing in chunks
- Memory Usage: Functions create copies of input data for processing
- Parallel Processing: Consider using multiprocessing for very large datasets
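
A minimal sketch of chunked processing is shown below. The helper names `in_chunks` and `row_sd` are hypothetical, and `row_sd` is only a stand-in for any index that is computed independently for each participant. Chunking is safe for such row-wise indices, but not for statistics that depend on the whole sample, such as a Mahalanobis distance based on the sample covariance.

```python
import numpy as np

def row_sd(chunk):
    """Stand-in for a per-participant index: row-wise sample standard deviation."""
    return np.nanstd(chunk, axis=1, ddof=1)

def in_chunks(x, func, chunk_size=10_000):
    """Apply a row-wise function to blocks of participants and stitch the results."""
    x = np.asarray(x, dtype=float)
    parts = [func(x[i:i + chunk_size]) for i in range(0, x.shape[0], chunk_size)]
    return np.concatenate(parts)

rng = np.random.default_rng(1)
big = rng.integers(1, 6, size=(25_000, 10))   # 25,000 participants, 10 items
chunked = in_chunks(big, row_sd, chunk_size=10_000)
```

Because each block is independent, the same loop can be handed to a process pool if a single core is too slow.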

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Citation

If you use this package in your research, please cite:

@software{careless2024,
  title={Careless: Python package for detecting careless responding},
  author={Lyons, Cameron},
  year={2024},
  url={https://github.com/Cameron-Lyons/careless-py}
}

References

Curran, P. G. (2016). Methods for the detection of carelessly invalid responses in survey data. Journal of Experimental Social Psychology, 66, 4-19.
Dunn, A. M., Heggestad, E. D., Shanock, L. R., & Theilgard, N. (2018). Intra-individual response variability as an indicator of insufficient effort responding: Comparison to other indicators and relationships with individual differences. Journal of Business and Psychology, 33(1), 105-121.

Release history

- 1.1.0: Jan 2, 2026
- 1.0.6: Jan 2, 2026
- 1.0.5: Jan 2, 2026
- 1.0.4: Dec 31, 2025
- 1.0.3: Dec 29, 2025
- 1.0.2: Dec 28, 2025
- 1.0.1: Dec 28, 2025
- 1.0.0: Dec 27, 2025 (this version)
- 0.0.1: Dec 27, 2025

Download files

Source Distribution

careless_py-1.0.0.tar.gz (18.4 kB)

Uploaded Dec 27, 2025 Source

Built Distribution

careless_py-1.0.0-py3-none-any.whl (17.5 kB)

Uploaded Dec 27, 2025 Python 3

File details

Details for the file careless_py-1.0.0.tar.gz.

File metadata

Download URL: careless_py-1.0.0.tar.gz
Upload date: Dec 27, 2025
Size: 18.4 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for careless_py-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`db28cb51a4b48c968c9d42aed96e7cbd99c520192ee8d3f0c91337d4398589dc`
MD5	`9b3d319a13f067f7e218b21a819f2173`
BLAKE2b-256	`a798fba029e03097c1bda3b509367ee3bfb1a0bd4af23bb013760ea2a304afcf`
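
A downloaded file can be checked against the SHA256 digest listed above with the standard library. This is a minimal sketch: `sha256_of` is a hypothetical helper, and the file is assumed to sit in the current working directory under its original name.

```python
import hashlib
from pathlib import Path

def sha256_of(path, block_size=1 << 20):
    """Hex SHA256 digest of a file, read in blocks to keep memory use low."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()

# Digest published for careless_py-1.0.0.tar.gz
EXPECTED = "db28cb51a4b48c968c9d42aed96e7cbd99c520192ee8d3f0c91337d4398589dc"

sdist = Path("careless_py-1.0.0.tar.gz")  # assumed download location
if sdist.exists():
    print("match" if sha256_of(sdist) == EXPECTED else "MISMATCH")
```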

Provenance

The following attestation bundles were made for careless_py-1.0.0.tar.gz:

Publisher: python-publish.yml on Cameron-Lyons/careless-py

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: careless_py-1.0.0.tar.gz
- Subject digest: db28cb51a4b48c968c9d42aed96e7cbd99c520192ee8d3f0c91337d4398589dc
- Sigstore transparency entry: 780480996
- Sigstore integration time: Dec 27, 2025
Source repository:
- Permalink: Cameron-Lyons/careless-py@addcca0cd3cb1ce73945d2a72b70e938cd1b6423
- Branch / Tag: refs/heads/main
- Owner: https://github.com/Cameron-Lyons
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@addcca0cd3cb1ce73945d2a72b70e938cd1b6423
- Trigger Event: push

File details

Details for the file careless_py-1.0.0-py3-none-any.whl.

File metadata

Download URL: careless_py-1.0.0-py3-none-any.whl
Upload date: Dec 27, 2025
Size: 17.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for careless_py-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`fecad63086a74409e42aba6983fc491885b7694a719e2de5e5d7cf973b68e96e`
MD5	`09432543677689d356c5470847e9e555`
BLAKE2b-256	`2dd734b2726fac07bbd6d9204003abb2ee44cb6d3225ed29c31b9378858f2dad`

Provenance

The following attestation bundles were made for careless_py-1.0.0-py3-none-any.whl:

Publisher: python-publish.yml on Cameron-Lyons/careless-py

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: careless_py-1.0.0-py3-none-any.whl
- Subject digest: fecad63086a74409e42aba6983fc491885b7694a719e2de5e5d7cf973b68e96e
- Sigstore transparency entry: 780480998
- Sigstore integration time: Dec 27, 2025
Source repository:
- Permalink: Cameron-Lyons/careless-py@addcca0cd3cb1ce73945d2a72b70e938cd1b6423
- Branch / Tag: refs/heads/main
- Owner: https://github.com/Cameron-Lyons
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@addcca0cd3cb1ce73945d2a72b70e938cd1b6423
- Trigger Event: push

careless-py 1.0.0
