A pandas extension for survey analysis
Faster and more insightful analysis of survey results
This package lets you apply advanced Natural Language Processing (NLP) and Machine Learning functions to survey results directly within a dataframe.
It fills a gap: many NLP packages (such as spaCy, gensim and sentence_transformers) are not designed for data that lives in a spreadsheet (and is therefore imported into a dataframe), and many of the people tasked with analysing survey results are not data scientists.
For example, to extract the sentiment you can just type:
df.extract_sentiment(input_column="survey-comments")
The package abstracts away much of the data transformation pipeline, giving you useful functionality with minimal code.
Examples
See the Read the Docs site for simple example notebooks. More detailed notebooks are in the repo under notebooks/
Functionality
Clustering comments
It will group similar free-text comments together and assign a cluster ID. This is a useful step prior to any qualitative analysis.
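To make the idea of a cluster ID concrete, here is a minimal sketch in plain Python. It is not the package's implementation: it swaps in simple word-count vectors and a greedy similarity threshold so that it runs with the standard library alone. The function names (`vectorise`, `cosine`, `cluster_comments`), the threshold value and the sample comments are all hypothetical.

```python
import math
import re
from collections import Counter


def vectorise(comment):
    """Turn a comment into a bag-of-words count vector (a Counter)."""
    return Counter(re.findall(r"[a-z']+", comment.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_comments(comments, threshold=0.3):
    """Greedy clustering: a comment joins the first cluster whose seed
    comment is similar enough, otherwise it starts a new cluster."""
    seeds = []        # one representative vector per cluster
    cluster_ids = []  # one cluster ID per input comment
    for comment in comments:
        vec = vectorise(comment)
        for cluster_id, seed in enumerate(seeds):
            if cosine(vec, seed) >= threshold:
                cluster_ids.append(cluster_id)
                break
        else:
            # No existing cluster was close enough: open a new one.
            seeds.append(vec)
            cluster_ids.append(len(seeds) - 1)
    return cluster_ids


# Made-up survey comments, for illustration only.
comments = [
    "The parking is too expensive",
    "Parking is expensive and hard to find",
    "My manager is very supportive",
    "I have a supportive manager",
]
cluster_ids = cluster_comments(comments)
print(cluster_ids)
```

Word counts only match comments that share vocabulary. Sentence embeddings would also group comments that say the same thing in different words, which is why a real pipeline is likely to prefer them.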
Sentiment Analysis
It measures sentiment as positive / neutral / negative, assigns a score to each of the three, and picks the highest-scoring one as the most likely overall sentiment.
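The "pick the highest score" step can be sketched in a few lines of plain Python. The scores below are made up, and the label names and the `overall_sentiment` helper are assumptions for illustration, not the package's actual output columns or API.

```python
def overall_sentiment(scores):
    """Return the label whose score is highest."""
    return max(scores, key=scores.get)


# Hypothetical per-comment scores, one dict per survey comment.
comment_scores = [
    {"positive": 0.81, "neutral": 0.15, "negative": 0.04},
    {"positive": 0.10, "neutral": 0.25, "negative": 0.65},
    {"positive": 0.30, "neutral": 0.55, "negative": 0.15},
]

labels = [overall_sentiment(scores) for scores in comment_scores]
print(labels)
```

Keeping the three individual scores alongside the overall label is useful: a comment scored 0.45 positive and 0.40 negative is labelled positive, but it is clearly more mixed than one scored 0.95 positive.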
Topic analysis
Uses TF-IDF and word co-occurrence to gain high-level insights into the likely topics.
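As a rough illustration of the two techniques named above, the following standard-library sketch computes textbook (unsmoothed) TF-IDF scores and counts how often word pairs appear in the same comment. It is a sketch under stated assumptions, not the package's code: the helper names and sample comments are hypothetical, and libraries usually apply smoothing and normalisation that are omitted here.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokenise(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf(documents):
    """Return one {word: tf-idf score} dict per document."""
    tokenised = [tokenise(doc) for doc in documents]
    n_docs = len(tokenised)
    # Document frequency: number of documents containing each word.
    doc_freq = Counter(word for tokens in tokenised for word in set(tokens))
    scores = []
    for tokens in tokenised:
        counts = Counter(tokens)
        scores.append({
            # term frequency * inverse document frequency
            word: (count / len(tokens)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return scores


def cooccurrence(documents):
    """Count how many documents each unordered word pair shares."""
    pairs = Counter()
    for doc in documents:
        words = sorted(set(tokenise(doc)))
        pairs.update(combinations(words, 2))
    return pairs


# Made-up survey comments, for illustration only.
comments = [
    "parking is expensive",
    "parking is full",
    "canteen food is expensive",
]
scores = tfidf(comments)
pairs = cooccurrence(comments)
print(scores[1])
print(pairs.most_common(3))
```

A word that appears in every comment ("is" here) scores zero, while a word unique to one comment ("full") scores highest within that comment. Pairs that keep turning up together hint at a topic.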
Clustering likert questions (or other responses)
For "strongly disagree ... neutral ... strongly agree" type responses, it groups all of those questions together to identify groups of respondents within your survey data. This can be much more useful than overall averages across the survey.
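The sketch below shows one way this could work, under stated assumptions: responses are mapped to the numbers 1 to 5 and respondents are grouped with a small, deterministic k-means written in plain Python. The mapping, the function names and the sample answers are hypothetical, and the package may use a different encoding or clustering algorithm.

```python
# Assumed 5-point scale; real surveys may use different wording.
LIKERT_SCALE = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}


def encode(responses):
    """Convert one respondent's text answers to numbers."""
    return [LIKERT_SCALE[r.lower()] for r in responses]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=10):
    """Plain k-means. The first k points seed the centroids,
    so the result is deterministic for a given input order."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Made-up answers from four respondents to three questions.
answers = [
    ["Strongly agree", "Agree", "Strongly agree"],
    ["Strongly disagree", "Disagree", "Disagree"],
    ["Agree", "Agree", "Strongly agree"],
    ["Disagree", "Strongly disagree", "Neutral"],
]
points = [encode(row) for row in answers]
labels = kmeans(points, k=2)
print(labels)
```

In this toy data the per-question averages fall between 2.75 and 3.75, which looks like mild indifference, yet the clustering separates a satisfied group from a dissatisfied one.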
Visualisation
Functions to help make sense of the clusters and topics you have identified using the above functions (in development)
Setup
If sentence_transformers throws DLL errors, see: https://stackoverflow.com/questions/78484297/c-torch-lib-fbgemm-dll-or-one-of-its-dependencies/78794748#78794748
Download files
File details
Details for the file pandas_survey_toolkit-1.0.1.tar.gz.
File metadata
- Download URL: pandas_survey_toolkit-1.0.1.tar.gz
- Upload date:
- Size: 1.2 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f357bf03de018512bb73d091c0b3cfcd0f6607785a94e36a5bd5f0f707d97dd0 |
| MD5 | f20dbfe0695a8345a1cca0d8f2f5ed3a |
| BLAKE2b-256 | 0bbaa4cd392ca91c54bac0cabc27ccb96803650405934329eb43107638b7e836 |
File details
Details for the file pandas_survey_toolkit-1.0.1-py3-none-any.whl.
File metadata
- Download URL: pandas_survey_toolkit-1.0.1-py3-none-any.whl
- Upload date:
- Size: 20.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7de0a312be3e415cbaab2d4fec041ca0c651a042b0e667376255273e9155c7f8 |
| MD5 | d2819ab181876cf5391c9d4727367dfd |
| BLAKE2b-256 | 0ae98fe69719f13c2bad2e3a851e45d9e06b9af04636a68bd6d73d6a8331527f |