A pandas extension for survey analysis

Project description

Faster and more insightful analysis of survey results

This package lets you apply advanced Natural Language Processing (NLP) and machine learning functions to survey results directly within a dataframe.

It fills a gap: many NLP packages (such as spaCy, gensim and sentence_transformers) are not designed for data that lives in a spreadsheet (and is therefore imported into a dataframe), and many of the people tasked with analysing survey results are not data scientists.

For example, to extract the sentiment you can just type:

df.extract_sentiment(input_column="survey-comments")

It abstracts away much of the data transformation pipeline, giving you useful functionality with minimal code.

Examples

See Read the Docs for simple example notebooks. More detailed notebooks are in the repo under notebooks/.

Functionality

Clustering comments

It will group similar free-text comments together and assign a cluster ID. This is a useful step prior to any qualitative analysis.
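To make the idea of "assign a cluster ID" concrete, here is a minimal, library-free sketch. It is not the package's implementation: it swaps in a simple word-overlap (Jaccard) similarity and a greedy grouping rule, and the names `tokens`, `jaccard`, `assign_cluster_ids` and the `threshold` parameter are hypothetical, chosen only for illustration.

```python
def tokens(text):
    """Lowercase a comment and split it into a set of words."""
    return {w.strip(".,!?").lower() for w in text.split()} - {""}


def jaccard(a, b):
    """Share of words two comments have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def assign_cluster_ids(comments, threshold=0.5):
    """Greedy grouping: a comment joins the first cluster whose
    representative (its first member) is similar enough, otherwise
    it starts a new cluster. Illustrative only."""
    representatives = []  # token set of the first comment in each cluster
    ids = []
    for comment in comments:
        words = tokens(comment)
        for cluster_id, rep in enumerate(representatives):
            if jaccard(words, rep) >= threshold:
                ids.append(cluster_id)
                break
        else:
            representatives.append(words)
            ids.append(len(representatives) - 1)
    return ids


comments = [
    "The app is slow",
    "the app is very slow",
    "Great customer support",
    "customer support was great",
]
cluster_ids = assign_cluster_ids(comments)
print(list(zip(cluster_ids, comments)))
```

Each comment ends up with an integer ID, so comments sharing an ID can be read together during qualitative analysis. Word overlap misses paraphrases that share no words, which is one reason a real pipeline would likely rely on richer text representations.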

Sentiment Analysis

It will measure the sentiment in terms of positive / neutral / negative and assign a score to each of those parts, picking the highest-scoring one as the most likely overall sentiment.
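The "pick the highest-scoring part" step can be sketched in a few lines of plain Python. The scores below are made up for illustration, and the function name `overall_sentiment` and the dictionary layout are assumptions, not the package's actual output format.

```python
def overall_sentiment(scores):
    """Return the label whose score is highest."""
    return max(scores, key=scores.get)


# Hypothetical per-comment scores, one value per sentiment part.
scored_comments = [
    {"comment": "Loved the new layout",
     "scores": {"positive": 0.85, "neutral": 0.10, "negative": 0.05}},
    {"comment": "Checkout kept failing",
     "scores": {"positive": 0.05, "neutral": 0.20, "negative": 0.75}},
]

# Attach the most likely overall sentiment to each row.
for row in scored_comments:
    row["sentiment"] = overall_sentiment(row["scores"])

print([(row["comment"], row["sentiment"]) for row in scored_comments])
```

Keeping all three scores alongside the final label is useful: a comment scored 0.40 / 0.35 / 0.25 is a much weaker "positive" than one scored 0.85 / 0.10 / 0.05.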

Topic analysis

Uses TF-IDF and word co-occurrence to give high-level insight into the likely topics.
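For readers unfamiliar with the two techniques, the sketch below computes a basic TF-IDF score (term frequency times the log of inverse document frequency) and counts which word pairs appear in the same comment. It uses only the standard library and is an illustration of the general techniques, not the package's code; `tokenize`, `tf_idf` and `cooccurrence` are hypothetical helper names.

```python
import math
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase and split a comment, dropping surrounding punctuation."""
    words = (w.strip(".,!?").lower() for w in text.split())
    return [w for w in words if w]


def tf_idf(docs):
    """Return one {word: score} dict per document.
    score = (count / words in doc) * log(n_docs / docs containing word)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(w for toks in tokenized for w in set(toks))
    results = []
    for toks in tokenized:
        counts = Counter(toks)
        results.append({
            w: (c / len(toks)) * math.log(n_docs / doc_freq[w])
            for w, c in counts.items()
        })
    return results


def cooccurrence(docs):
    """Count how many documents contain each (sorted) pair of words."""
    pairs = Counter()
    for d in docs:
        pairs.update(combinations(sorted(set(tokenize(d))), 2))
    return pairs


docs = ["slow support", "slow delivery", "great support"]
scores = tf_idf(docs)
pairs = cooccurrence(docs)
print(scores[1])
print(pairs.most_common(3))
```

Words that appear in few comments score higher than words that appear in most of them, and frequently co-occurring pairs hint at which words belong to the same topic.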

Clustering likert questions (or other responses)

For strongly disagree ... neutral ... strongly agree type responses, it will group all those questions together to identify groups of respondents within your survey data. This can be much more useful than overall averages across the survey.
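The point about averages can be shown with a small, self-contained sketch. It encodes Likert answers as numbers and runs a bare-bones k-means, which is a stand-in technique chosen for brevity and not necessarily what the package does. The `LIKERT` mapping, the function names and the sample responses are all assumptions made for the example.

```python
# Hypothetical ordinal encoding of the answer scale.
LIKERT = {
    "strongly disagree": -2, "disagree": -1, "neutral": 0,
    "agree": 1, "strongly agree": 2,
}


def encode(row):
    """Turn one respondent's answers into a numeric vector."""
    return [LIKERT[answer.lower()] for answer in row]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=10):
    """Minimal k-means with deterministic init (first k points).
    Illustrative only; real use would need better initialisation."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each respondent to the nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Two questions, four respondents (made-up data).
responses = [
    ["Strongly agree", "Agree"],
    ["Strongly disagree", "Disagree"],
    ["Agree", "Strongly agree"],
    ["Disagree", "Strongly disagree"],
]
points = [encode(r) for r in responses]
averages = [sum(col) / len(points) for col in zip(*points)]
labels = kmeans(points, k=2)
print("per-question average:", averages)
print("cluster per respondent:", labels)
```

Here both per-question averages come out neutral, even though the respondents split cleanly into one group that agrees and one that disagrees. The cluster labels recover that split, which the averages hide.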

Visualisation

Functions that help you make sense of the clusters and topics identified with the functions above (in development).

Setup

If sentence transformers throws DLL errors, see: https://stackoverflow.com/questions/78484297/c-torch-lib-fbgemm-dll-or-one-of-its-dependencies/78794748#78794748

Download files

Source Distribution

pandas_survey_toolkit-1.0.4.tar.gz (1.2 MB)

Uploaded Source

Built Distribution

pandas_survey_toolkit-1.0.4-py3-none-any.whl (20.2 kB)

Uploaded Python 3

File details

Details for the file pandas_survey_toolkit-1.0.4.tar.gz.

File metadata

  • Download URL: pandas_survey_toolkit-1.0.4.tar.gz
  • Upload date:
  • Size: 1.2 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.4

File hashes

Hashes for pandas_survey_toolkit-1.0.4.tar.gz
Algorithm Hash digest
SHA256 668f8fc19adcdb412ecb4218e2c4801c503bd041f77f1aa80d2afb049469cefe
MD5 eab47f1300b293e2a71e9e9428e243e1
BLAKE2b-256 9b1b9c8435e74dbe5a49265d0705ee143ee72ea8f562298d0a497966dfbbc49d

File details

Details for the file pandas_survey_toolkit-1.0.4-py3-none-any.whl.

File hashes

Hashes for pandas_survey_toolkit-1.0.4-py3-none-any.whl
Algorithm Hash digest
SHA256 5c53413e2a1b4361483cce638c0a5919d6cf375ce3e31d5ca9f845d3654d115c
MD5 7d95319389c7b31faef403f67806c6e2
BLAKE2b-256 8a0fd6f74d096bc5ec91b64fec803fd2014486961ad07c3f1cc0c4bd11de23f3
