Project description
speech-analytics is a simple module for processing speech data collected as part of the Calpy project.
Documentation
class ConversationAnalysis
Parameters:
filename (str): The name of a calpy-style data file to analyse.
model_type (Optional[str]): The type of spaCy model to use in the analysis. Default is 'en_core_web_sm'.
Methods:
add_analysis(analysis_type: str)
Adds the requested type of analysis to the data. Options are:
TOKENIZE: Tokenize the data in the utterances. The tokens created include raw text, part-of-speech tags, lemma, dependency information, and whether each word is a stop word.
UTTERANCE_LENGTH: Adds information about the number of words and number of tokens in an utterance.
TURNS: Combines utterances into turns (i.e. multiple consecutive utterances by the same speaker are considered one turn).
PREPROCESS: Runs analysis with TOKENIZE, UTTERANCE_LENGTH, and TURNS. Doing so ensures all other methods work.
REMOVE_AUX_VERBS: Removes anything classified as an auxiliary verb (based on the POS tagging done in tokenization). If tokenization has not occurred before the removal of auxiliary verbs, add_analysis will be called with the TOKENIZE parameter.
GRAMMAR_CORRECTION: Adds attempted corrections to grammar. Note that this analysis does not remove the original text (both the original text and the suggested corrections will be available). Utterances will have grammatical corrections suggested, but turns will only have suggested corrections if this is called after add_analysis with TURNS.
The name of each analysis type is a constant provided in the module.
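To illustrate what TURNS does, the sketch below merges consecutive utterances by the same speaker into one turn. It does not use speech-analytics itself: the `Utterance` record, its fields, and the `combine_into_turns` helper are hypothetical stand-ins, since the calpy-style data format is not described here.

```python
from itertools import groupby
from typing import NamedTuple


class Utterance(NamedTuple):
    # Hypothetical record; the real calpy-style fields may differ.
    speaker: str
    start: float  # seconds
    end: float    # seconds
    text: str


def combine_into_turns(utterances):
    """Group consecutive utterances by the same speaker into turns."""
    return [
        list(group)
        for _, group in groupby(utterances, key=lambda u: u.speaker)
    ]


utterances = [
    Utterance("A", 0.0, 1.0, "Hello there."),
    Utterance("A", 1.5, 2.0, "How are you?"),
    Utterance("B", 2.5, 3.0, "Fine, thanks."),
    Utterance("A", 3.5, 4.0, "Good."),
]
turns = combine_into_turns(utterances)

# Speaker A's first two utterances form one turn; A's later utterance
# starts a new turn because B spoke in between.
print([(turn[0].speaker, len(turn)) for turn in turns])
```

`itertools.groupby` only groups adjacent items, which matches the "consecutive utterances" wording above.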
get_tokens()
Returns the raw token information. If no token information is available, this
method will call add_analysis(TOKENIZE) in order to derive it.
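Several getters in this class derive missing data on first access. The pattern can be sketched roughly as follows. `LazyAnalysis` is a hypothetical stand-in rather than the library's implementation, the string value of `TOKENIZE` is an assumption, and a whitespace split replaces the spaCy pipeline.

```python
TOKENIZE = "TOKENIZE"  # assumed value; the real constant may differ


class LazyAnalysis:
    """Hypothetical stand-in showing derive-on-first-access."""

    def __init__(self, utterances):
        self.utterances = utterances
        self._tokens = None
        self.tokenize_calls = 0  # only here to make the behaviour visible

    def add_analysis(self, analysis_type):
        if analysis_type != TOKENIZE:
            raise ValueError(f"unsupported analysis type: {analysis_type}")
        self.tokenize_calls += 1
        # Whitespace split as a placeholder for real tokenization.
        self._tokens = [text.split() for text in self.utterances]

    def get_tokens(self):
        # Run the analysis only if it has not been run yet.
        if self._tokens is None:
            self.add_analysis(TOKENIZE)
        return self._tokens


analysis = LazyAnalysis(["hello world", "ok"])
first = analysis.get_tokens()   # triggers add_analysis(TOKENIZE)
second = analysis.get_tokens()  # reuses the stored tokens
```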
get_utterances()
Returns the raw utterances. This information will not include utterance
length unless add_analysis(UTTERANCE_LENGTH) is called first.
get_turns()
Returns the raw turns. If turns have not been processed, this method will call
add_analysis(TURNS) first.
get_turn_info()
Returns the raw turn information. If no turn information is available, this
method will call add_analysis(TURNS) in order to derive it.
get_grammar_corrections(by_turn=True)
Returns a list of tuples, each containing original text and corrected text.
By default, this method will return grammar corrections based on turns
(calling add_analysis(GRAMMAR_CORRECTION) where necessary). If by_turn is set
to False, grammar corrections for utterances will be returned instead.
get_pos_tags(by_turn=True)
Returns the POS tags for each turn (if by_turn is True, else each utterance).
The return value is formatted as a list of lists, where each internal list
consists of tuples of (token, pos_tag).
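The nested return shape can be pictured with hand-written data. The tags below are Universal POS labels assigned by hand for illustration, not output from the library, and the filtering step is only a sketch of the idea behind REMOVE_AUX_VERBS.

```python
# One inner list per turn (or utterance); each item is (token, pos_tag).
pos_tags = [
    [("I", "PRON"), ("have", "AUX"), ("eaten", "VERB"), (".", "PUNCT")],
    [("She", "PRON"), ("is", "AUX"), ("running", "VERB")],
]

# Drop anything tagged as an auxiliary verb, keeping the nesting intact.
without_aux = [
    [(token, tag) for token, tag in unit if tag != "AUX"]
    for unit in pos_tags
]

for unit in without_aux:
    print(" ".join(token for token, _ in unit))
```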
get_turn_length(turn, words=True)
Returns the number of words in a turn. If words is set to False, the method instead
returns the number of tokens in the turn.
get_turn_duration(turn)
Returns the number of seconds in a turn.
get_utterance_length(utterance, words=True)
Returns the number of words in an utterance. If words is set to False, the method instead
returns the number of tokens in the utterance.
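The documentation does not say how words differ from tokens. One plausible reading, taken here purely as an assumption, is that tokens include punctuation marks while words do not. The hypothetical `count_length` helper below sketches that reading with a simple regular expression in place of the library's tokenizer.

```python
import re


def count_length(text, words=True):
    """Count words, or all tokens when words is False (assumed semantics)."""
    # Split into runs of word characters and single punctuation marks.
    tokens = re.findall(r"\w+|[^\w\s]", text)
    if words:
        # Keep only tokens that start with a letter or digit.
        return sum(1 for token in tokens if token[0].isalnum())
    return len(tokens)


text = "Yes, I agree."
word_count = count_length(text)                # Yes / I / agree
token_count = count_length(text, words=False)  # adds "," and "."
```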
get_utterance_duration(utterance)
Returns the number of seconds in an utterance.
get_pause_length(turn)
Returns the total number of seconds between utterances in a turn.
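The duration and pause measures described above can be sketched with plain timestamps. The representation of a turn as a list of (start, end) pairs in seconds, and both helper names, are assumptions made for this example; the library's internal format is not documented here.

```python
def turn_duration(turn):
    """Seconds from the start of the first utterance to the end of the last."""
    return turn[-1][1] - turn[0][0]


def pause_length(turn):
    """Total seconds of silence between consecutive utterances in a turn."""
    return sum(nxt[0] - prev[1] for prev, nxt in zip(turn, turn[1:]))


# Hypothetical turn: three utterances as (start, end) pairs in seconds.
turn = [(0.0, 1.0), (1.5, 2.0), (2.25, 3.0)]

duration = turn_duration(turn)  # 3.0 - 0.0
pauses = pause_length(turn)     # (1.5 - 1.0) + (2.25 - 2.0)
```

A turn with a single utterance has no gaps, so its pause length under this sketch is 0.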
get_average_turn_length()
Returns the average turn length for each speaker, as a dictionary mapping
speaker codes to average turn length.
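A minimal sketch of the per-speaker average, assuming turns are available as (speaker, text) pairs and that turn length means a whitespace-separated word count. Both are simplifications chosen for this example, and `average_turn_length` is a hypothetical helper, not the library method.

```python
from collections import defaultdict


def average_turn_length(turns):
    """Map each speaker code to the mean word count of that speaker's turns."""
    lengths = defaultdict(list)
    for speaker, text in turns:
        lengths[speaker].append(len(text.split()))
    return {
        speaker: sum(counts) / len(counts)
        for speaker, counts in lengths.items()
    }


turns = [
    ("A", "hello there how are you"),  # 5 words
    ("B", "fine thanks"),              # 2 words
    ("A", "good"),                     # 1 word
]
averages = average_turn_length(turns)
```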
get_speaker_turns(speaker)
Returns a list of all turns taken by the speaker.
get_speaker_utterances(speaker)
Returns a list of all utterances spoken by the speaker.
get_speaker_names()
Returns the names (ids) of all speakers in the conversation.
File details
Details for the file speech-analytics-0.1.8.tar.gz.
File metadata
- Download URL: speech-analytics-0.1.8.tar.gz
- Upload date:
- Size: 8.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.2.1 CPython/3.10.0 Darwin/19.6.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | f0c33b5248559ecd950f40dd63be9966e5633d6e735ccdc7f9f2792191f430c3
MD5 | 5d85a0abcb898f2cd3e54f6bc37ad40b
BLAKE2b-256 | 0dcf040718e420ec537ee2dfb8c546b94ffaaddb3fbe4f991b74251489e5e277
File details
Details for the file speech_analytics-0.1.8-py3-none-any.whl.
File metadata
- Download URL: speech_analytics-0.1.8-py3-none-any.whl
- Upload date:
- Size: 7.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.2.1 CPython/3.10.0 Darwin/19.6.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 7136a700f894f36cd98bebc35a39669ef876829e9a7d0de9440b75afa3334676
MD5 | e0abcf66886cdf4092df858d7ce0605b
BLAKE2b-256 | c95daf99dd8de568a90046c8fd2d0d275685f761604b546c65cdb073be8ad867