OpenStax response validator server


response_validation_app

Implements a simple unsupervised method for classifying short- to medium-length student responses to questions.

Installation

This was developed in Python 3.6.

It may be installed as a package from PyPI using pip:

pip install response-validator

Development

After cloning the repository, you can install it in editable mode:

pip install -e .

Note that this step silently downloads several NLTK corpora and adds them to the deployed tree.

Additional functionality for running algorithm tests, etc. can be enabled by installing additional libraries:

pip install -r requirements.txt

Usage

Once the package is installed, running python -m validator.app starts the Flask development web server.

The recommended method for production deployment is to use a WSGI-compliant server, such as gunicorn:

pip install gunicorn
gunicorn validator.app:app
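
As a rough sketch of how the gunicorn deployment could be configured, the settings can be kept in a Python config file. The file name and all values below are illustrative assumptions and do not come from this project; bind, workers and timeout are standard gunicorn settings.

```python
# gunicorn.conf.py: illustrative values only, not taken from this project
bind = "127.0.0.1:8000"   # address and port the server listens on
workers = 2               # number of worker processes
timeout = 30              # seconds before an unresponsive worker is restarted
```

With such a file in place, the server would be started with gunicorn -c gunicorn.conf.py validator.app:app.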

API

The main route for the app is /validate, which accepts a plaintext response (response) that will be checked. It can also accept a number of optional arguments:

uid (e.g., '1000@1', default None): This is the uid for the question pertaining to the response. The uid is used to compute domain-specific and module-specific vocabulary to aid in the classification process. If the version of the question specified is not available, any version of the same qid (question id without the version, e.g. 1000) will be used.
remove_stopwords (True or False, default True): Whether or not stopwords (e.g., 'the', 'and') will be removed from the response. This is generally advised since these words carry little predictive value.
tag_numeric (True, False or auto, default auto): Whether numerical values will be tagged (e.g., 123.7 is tagged with a special 'numeric_type_float' identifier). While there are certainly responses for which this would be helpful, a large share of garbage student responses consists of random number pressing, which limits the utility of this option. Auto enables a mode that only does numeric tag processing if the question this response pertains to (as found via the uid above) requires a numeric answer.
spelling_correction (True, False or auto, default auto): Whether the app will attempt to correct misspellings. This is done by identifying unknown words in the response and seeing if a closely related known word can be substituted. Currently, the app only attempts spelling correction on words of at least 5 characters in length and only considers candidate words that are within an edit distance of 2 from the misspelled word. When running in auto mode, the app will first attempt to determine validity without spelling correction. Only if the response is not valid will it reassess validity with spelling correction.
spell_correction_max (integer, default 10): The maximum number of spelling corrections to apply.
remove_nonwords (True or False, default True): Words that are not recognized (after possibly attempting spelling correction) are flagged with a special 'nonsense_word' tag. This is done primarily to combat keyboard mashes (e.g., 'asdfljasdfk') that make up a large percentage of invalid student responses.
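
To make the spelling_correction limits concrete (words of at least 5 characters, candidates within an edit distance of 2), here is a minimal sketch of that kind of candidate filter. It is not the app's actual implementation: the function names, the tiny vocabulary in the usage note, and the use of plain Levenshtein distance are all assumptions made for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum inserts, deletes and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete ch_a
                               current[j - 1] + 1,       # insert ch_b
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def correction_candidates(word, vocabulary, min_length=5, max_distance=2):
    """Hypothetical helper: known words close enough to an unknown word.

    Returns an empty list for known words and for words shorter than
    min_length, mirroring the limits described above.
    """
    if word in vocabulary or len(word) < min_length:
        return []
    scored = [(edit_distance(word, known), known) for known in vocabulary]
    # Closest candidates first; ties are broken alphabetically.
    return [known for dist, known in sorted(scored) if dist <= max_distance]
```

For example, with the made-up vocabulary {'mitochondria', 'membrane', 'nucleus'}, correction_candidates('mitocondria', vocabulary) yields the single candidate 'mitochondria', while a keyboard mash such as 'asdfljasdfk' yields no candidates and would be left for the nonsense_word tagging.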

Once the app is running, you can send requests using curl, requests, etc. Here is an example using Python's requests library (assuming that the app is running on the default local development port):

import json
import requests

# Query parameters for /validate; every option except 'response' is optional.
params = {'response': 'this is my answer to the question alkjsdfl',
          'uid': '100@2',
          'remove_stopwords': True,
          'tag_numeric': 'auto',
          'spelling_correction': True,
          'remove_nonwords': True}
r = requests.get('http://127.0.0.1:5000/validate', params=params)
print(json.dumps(r.json(), indent=2))

The response looks like this:
{
  "bad_word_count": 1,
  "common_word_count": 2,
  "computation_time": 0.001367330551147461,
  "domain_word_count": 0,
  "inner_product": -1.6,
  "innovation_word_count": 0,
  "num_spelling_correction": 1,
  "processed_response": "answer question nonsense_word",
  "remove_nonwords": true,
  "remove_stopwords": true,
  "response": "this is my answer to the question alkjsdfl",
  "spelling_correction": true,
  "spelling_correction_used": true,
  "tag_numeric": "auto",
  "tag_numeric_input": "auto",
  "uid_found": false,
  "uid_used": null,
  "valid": false
}
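
Because the options are sent as plain query parameters, a mistyped option name is easy to overlook in client code. The following client-side helper is a small sketch of one way to guard against that. The helper name and the decision to raise an error are assumptions for illustration, and the list of allowed names is simply copied from the options documented above.

```python
# Option names documented for the /validate route.
ALLOWED_OPTIONS = {
    "uid", "remove_stopwords", "tag_numeric",
    "spelling_correction", "spell_correction_max", "remove_nonwords",
}


def build_validate_params(response, **options):
    """Hypothetical helper: build a params dict, rejecting unknown names."""
    unknown = set(options) - ALLOWED_OPTIONS
    if unknown:
        raise ValueError("unknown option(s): " + ", ".join(sorted(unknown)))
    params = {"response": response}
    params.update(options)
    return params
```

The returned dictionary can be passed as the params argument of requests.get, as in the request above.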

TODO:

While the app is fully functional, there are some other things that will need to be addressed:

Currently there is no security for this app (anything can call it). I am not sure how this is usually handled in Tutor, but it should not be too difficult to add an API key or similar security measures.
The Procfile will need to be changed a bit depending on how and where we wish to deploy.
By far the largest element of the processing time for a response is devoted to spelling correction. While this does provide a very strong performance improvement for short responses, we may wish to automatically disable this in the case where the response is too long (longer than a paragraph).
Depending on UX, we may want to return more granular information about the response rather than a simple valid/non-valid label. We can modify this easily enough as the need arises.
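
As a sketch of the idea of disabling spelling correction for long responses, a caller could pick the spelling_correction value from the response length before sending the request. The function name and the 100-word cutoff are hypothetical; the text above only says "longer than a paragraph" and does not fix a threshold.

```python
def spelling_mode_for(response, max_words=100):
    """Hypothetical policy: 'auto' for short responses, False for long ones.

    The 100-word default is an arbitrary stand-in for "a paragraph".
    """
    word_count = len(response.split())
    return "auto" if word_count <= max_words else False
```

The result would be passed as the spelling_correction argument to /validate.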

Release history

5.0.1 (Sep 14, 2020)
5.0.0 (Aug 24, 2020)
4.0.0 (Aug 7, 2020)
3.1.0 (Feb 17, 2020)
3.0.3 (Jan 17, 2020)
3.0.2 (Nov 11, 2019)
3.0.1 (Oct 28, 2019)
2.3.0 (Sep 30, 2019)
2.2.1 (Sep 3, 2019, this version)
2.1.0 (Aug 13, 2019)
2.0.0 (May 13, 2019)

Download files


Source Distributions

No source distribution files are available for this release.

Built Distribution

response_validator-2.2.1-py2.py3-none-any.whl (39.2 MB view hashes)

Uploaded Sep 3, 2019 Python 2 Python 3

Hashes for response_validator-2.2.1-py2.py3-none-any.whl
Algorithm	Hash digest
SHA256	`b3ec9436319eab7228ec8775d6b9c6826023fd765ffadc53da155bcceea3f233`
MD5	`300f82a36ee0d562c608698f9ee4905c`
BLAKE2b-256	`02b6ea36b745b3a558feafa20a86c93b8fa8c149d5e61734a918148e4e0deb03`