Project description

speech-analytics is a simple module for processing speech data collected as part of the Calpy project.

Documentation

class ConversationAnalysis

Parameters:

filename (str): The name of a calpy-style data file to analyse.

model_type (Optional[str]): The name of the spaCy model to use in the analysis. Default is 'en_core_web_sm'.

Methods:

add_analysis(analysis_type: str)
Adds the requested type of analysis to the data. Options are:

  • TOKENIZE: Tokenize the data in the utterances. The tokens created include raw text, part-of-speech tags, lemma, dependency information, and whether each word is a stop word.
  • UTTERANCE_LENGTH: Adds information about the number of words and number of tokens in an utterance.
  • TURNS: Combines utterances into turns (i.e. multiple consecutive utterances by the same speaker would be considered one turn).
  • PREPROCESS: Runs analysis with TOKENIZE, UTTERANCE_LENGTH, TURNS. Doing so will ensure all other methods work.
  • REMOVE_AUX_VERBS: Removes anything classified as an auxiliary verb (based on the POS tagging done in tokenization). If the data has not yet been tokenized, add_analysis(TOKENIZE) is called first.
  • GRAMMAR_CORRECTION: Adds attempted grammar corrections. Note that this analysis does not remove the original text (both the original text and the suggested corrections will be available). Utterances will always have grammatical corrections suggested, but turns will only have suggested corrections if this is called after add_analysis(TURNS).

The name of each analysis type is a constant provided by the module.
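To make the TURNS option more concrete, here is a minimal sketch of grouping consecutive utterances by the same speaker into turns. It is not the package's implementation: the `Utterance` record and the `merge_into_turns` helper are hypothetical names, and the real calpy-style fields may differ.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List


@dataclass
class Utterance:
    # Hypothetical record shape; the real calpy-style fields may differ.
    speaker: str
    start: float  # seconds
    end: float    # seconds
    text: str


def merge_into_turns(utterances: List[Utterance]) -> List[List[Utterance]]:
    """Group consecutive utterances by the same speaker into one turn."""
    # groupby only merges adjacent items, so a speaker who talks again
    # later in the conversation starts a new turn.
    return [list(group) for _, group in groupby(utterances, key=lambda u: u.speaker)]


utterances = [
    Utterance("A", 0.0, 1.2, "Hi there."),
    Utterance("A", 1.5, 2.8, "How are you?"),
    Utterance("B", 3.0, 3.9, "Fine, thanks."),
    Utterance("A", 4.1, 4.6, "Good."),
]
turns = merge_into_turns(utterances)
print([(turn[0].speaker, len(turn)) for turn in turns])
# [('A', 2), ('B', 1), ('A', 1)]
```

The first two utterances by speaker A collapse into one turn, while A's later utterance forms a separate turn because B spoke in between.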

get_tokens()
Returns the raw token information. If no token information is available, this method will call add_analysis(TOKENIZE) in order to derive it.

get_utterances()
Returns the raw utterances. This information will not include utterance length unless add_analysis(UTTERANCE_LENGTH) is called first.

get_turns()
Returns the raw turns. If turns have not been processed, this method will call add_analysis(TURNS) first.

get_turn_info()
Returns the raw turn information. If no turn information is available, this method will call add_analysis(TURNS) in order to derive it.
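Several of the getters above derive missing data on first access. The sketch below illustrates that pattern in isolation; `LazyConversation` is a hypothetical stand-in, the value of its `TOKENIZE` constant is an assumption, and whitespace splitting stands in for the spaCy pipeline the package uses.

```python
TOKENIZE = "TOKENIZE"  # stand-in constant; the real module's value is not documented here


class LazyConversation:
    """Hypothetical class showing the derive-on-first-access pattern."""

    def __init__(self, utterances):
        self._utterances = list(utterances)
        self._tokens = None       # nothing derived yet
        self.tokenize_calls = 0   # counts how often tokenization actually ran

    def add_analysis(self, analysis_type):
        if analysis_type == TOKENIZE:
            # Whitespace splitting is only a placeholder for real tokenization.
            self._tokens = [text.split() for text in self._utterances]
            self.tokenize_calls += 1
        else:
            raise ValueError(f"unsupported analysis type: {analysis_type}")

    def get_tokens(self):
        # Run the analysis only if its result is not already cached.
        if self._tokens is None:
            self.add_analysis(TOKENIZE)
        return self._tokens


conversation = LazyConversation(["hello there", "how are you"])
first = conversation.get_tokens()
second = conversation.get_tokens()
print(conversation.tokenize_calls)  # 1
```

The second call returns the cached result, so callers do not need to track which analyses have already run.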

get_grammar_corrections(by_turn=True)
Returns a list of tuples each containing original text and corrected text. By default, this method will return grammar corrections based on turns (calling add_analysis(GRAMMAR_CORRECTION) where necessary). If by_turn is set to False, grammar corrections for utterances will be returned instead.

get_pos_tags(by_turn=True)
Returns the POS tags for each turn (if by_turn is True) or for each utterance (if by_turn is False). The return value is a list of lists, where each inner list consists of (token, pos_tag) tuples.
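As an illustration of working with that list-of-lists shape, the following sketch counts tag frequencies across a conversation. The `pos_tags` value is made-up sample data shaped like the documented return value; the exact tag set the package returns is an assumption.

```python
from collections import Counter

# Made-up data shaped like the documented return value:
# one inner list per turn, each holding (token, pos_tag) tuples.
pos_tags = [
    [("I", "PRON"), ("am", "AUX"), ("here", "ADV")],
    [("You", "PRON"), ("left", "VERB")],
]

# Flatten the nested lists and count how often each tag occurs.
tag_counts = Counter(tag for segment in pos_tags for _, tag in segment)
print(tag_counts.most_common(1))  # [('PRON', 2)]
```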

get_turn_length(turn, words=True)
Returns the number of words in a turn. If words is set to False, the method instead returns the number of tokens in the turn.

get_turn_duration(turn)
Returns the number of seconds in a turn.

get_utterance_length(utterance, words=True)
Returns the number of words in an utterance. If words is set to False, the method instead returns the number of tokens in the utterance.

get_utterance_duration(utterance)
Returns the number of seconds in an utterance.

get_pause_length(turn)
Returns the total number of seconds between utterances in a turn.
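One plausible way to compute such a total is to sum the gaps between the end of each utterance and the start of the next. This is a sketch under assumptions, not the package's code: the `(start, end)` pair representation is hypothetical, and clamping overlapping utterances to a zero-length pause is a choice made here.

```python
def pause_length(turn):
    """Sum the silent gaps inside a turn.

    `turn` is a list of (start, end) pairs in seconds, ordered by start
    time (a hypothetical representation of the utterances in a turn).
    """
    total = 0.0
    for (_, prev_end), (next_start, _) in zip(turn, turn[1:]):
        # Assumption: overlapping utterances contribute no pause.
        total += max(0.0, next_start - prev_end)
    return total


turn = [(0.0, 1.0), (1.5, 2.5), (3.0, 4.0)]
pause = pause_length(turn)
print(pause)  # 1.0
```

A turn with a single utterance has no internal gaps, so its pause length under this definition is 0.0.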

get_average_turn_length()
Returns the average turn length for each speaker, as a dictionary mapping speaker codes to average turn length.
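The shape of that result can be sketched as follows. The input format (a list of `(speaker, length)` pairs) is hypothetical, and measuring length in words is an assumption, since the unit is not stated above.

```python
from collections import defaultdict


def average_turn_length(turns):
    """Map each speaker code to the mean length of that speaker's turns.

    `turns` is a list of (speaker, length) pairs; the length unit
    (words here) is an assumption.
    """
    totals = defaultdict(lambda: [0, 0])  # speaker -> [summed length, turn count]
    for speaker, length in turns:
        totals[speaker][0] += length
        totals[speaker][1] += 1
    return {speaker: total / count for speaker, (total, count) in totals.items()}


averages = average_turn_length([("A", 4), ("B", 2), ("A", 6), ("B", 3)])
print(averages)  # {'A': 5.0, 'B': 2.5}
```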

get_speaker_turns(speaker)
Returns a list of all turns taken by the speaker.

get_speaker_utterances(speaker)
Returns a list of all utterances spoken by the speaker.

get_speaker_names()
Returns the names (ids) of all speakers in the conversation.
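The per-speaker getters can be combined into simple summaries. The sketch below computes total speaking time per speaker using only the method names documented above. Because the package's import path and data format are not shown here, it runs against `StubAnalysis`, a hypothetical stand-in with the same method names. The helper `speaking_time_by_speaker` is also an illustrative name, not part of the package.

```python
def speaking_time_by_speaker(analysis):
    """Total seconds of turn time per speaker.

    Works with any object exposing get_speaker_names(),
    get_speaker_turns(speaker) and get_turn_duration(turn).
    """
    return {
        speaker: sum(
            analysis.get_turn_duration(turn)
            for turn in analysis.get_speaker_turns(speaker)
        )
        for speaker in analysis.get_speaker_names()
    }


class StubAnalysis:
    """Stand-in so the sketch runs without the package or a data file."""

    def __init__(self, turns):
        self._turns = turns  # list of (speaker, start, end) in seconds

    def get_speaker_names(self):
        return sorted({speaker for speaker, _, _ in self._turns})

    def get_speaker_turns(self, speaker):
        return [turn for turn in self._turns if turn[0] == speaker]

    def get_turn_duration(self, turn):
        _, start, end = turn
        return end - start


stub = StubAnalysis([("A", 0.0, 2.0), ("B", 2.5, 3.0), ("A", 3.5, 5.0)])
totals = speaking_time_by_speaker(stub)
print(totals)  # {'A': 3.5, 'B': 0.5}
```

In principle the same function could be passed a ConversationAnalysis instance in place of the stub, assuming the turn objects returned by get_speaker_turns are accepted by get_turn_duration.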


