Grammatical information extraction methods designed for the analysis of historical and contemporary textual corpora.

Project description

posextract

posextract offers grammatical information extraction methods designed for the analysis of historical and contemporary textual corpora. It traverses the syntactic dependency relations between parts of speech and returns sequences of words that share a grammatical relationship. See our article for more. You can install posextract with pip (pip install posextract).
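To make the idea of traversing dependency relations concrete, here is a minimal sketch on a hand-annotated sentence. It is not posextract's implementation: the Token class, the svo_triples helper, and the dependency labels ("nsubj", "dobj", "ROOT") are assumptions chosen for illustration, and a real pipeline would get these annotations from a dependency parser.

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    dep: str   # dependency label (assumed label set: "nsubj", "aux", "dobj", "ROOT")
    head: int  # index of this token's head; the root points at itself


def svo_triples(tokens):
    """Collect (subject, verb, object) tuples by following head links.

    Hypothetical helper: only the root verb is inspected, which is
    enough for a simple one-clause sentence.
    """
    triples = []
    for i, tok in enumerate(tokens):
        if tok.dep != "ROOT":
            continue
        # Direct dependents of the root verb.
        children = [t for t in tokens if t.head == i and t is not tok]
        subjects = [t.text for t in children if t.dep == "nsubj"]
        objects = [t.text for t in children if t.dep == "dobj"]
        triples.extend((s, tok.text, o) for s in subjects for o in objects)
    return triples


# Hand-written annotation of "Landlords may exercise oppression."
sentence = [
    Token("Landlords", "nsubj", 2),
    Token("may", "aux", 2),
    Token("exercise", "ROOT", 2),
    Token("oppression", "dobj", 2),
]

print(svo_triples(sentence))  # [('Landlords', 'exercise', 'oppression')]
```

The auxiliary "may" is attached to the verb but is neither subject nor object, so it drops out of the triple.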

Usage

  • extract_triples to extract subject-verb-object (SVO) and subject-verb-adjective complement (SVA) triples
  • extract_adj_noun_pairs to extract adjective-noun pairs
  • extract_subj_verb_pairs to extract subject-verb pairs

Required Parameters:

  • input: the name of a CSV file or an input string
  • output: the name of the output file

Optional Parameters:

  • --data_column: the column to extract triples from
  • --id_column: a unique ID field, if a CSV file is given
  • --lemma: whether to lemmatize parts of speech
  • --post-combine-adj: combine triples (adjective predicate with object)

Examples

Interactively:

from posextract import extract_triples

extract_triples(dataframe, sentence, unique_id)

Over CLI:

posextract can extract grammatical triples from text:

python -m posextract.extract_triples "Landlords may exercise oppression." output.csv

# Output: Landlords exercise oppression.

posextract can extract SVO/SVA relationships separately, or it can combine the adjective into a single triple with --post-combine-adj:

python -m posextract.extract_triples "The soldiers were terminally ill." output.csv

# Output: soldiers-were-terminally, soldiers-were-ill

python -m posextract.extract_triples "The soldiers were terminally ill." output.csv --post-combine-adj

# Output: soldiers-were-terminally-ill

With a .csv file as input:

python -m posextract.extract_triples --data_column sentence --id_column sentence_id input.csv output.csv
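For reference, an input file matching the command above could be prepared as in this minimal sketch. The column names sentence_id and sentence come from the command; the write_input_csv helper and the two sample rows are illustrative assumptions, and the exact layout posextract expects beyond those two columns is not documented here.

```python
import csv

# Sample rows (assumed): one ID column and one text column,
# named to match --id_column and --data_column in the command above.
rows = [
    {"sentence_id": 1, "sentence": "Landlords may exercise oppression."},
    {"sentence_id": 2, "sentence": "The soldiers were terminally ill."},
]


def write_input_csv(path, rows):
    """Write rows to a CSV file with a header line (hypothetical helper)."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["sentence_id", "sentence"])
        writer.writeheader()
        writer.writerows(rows)


write_input_csv("input.csv", rows)
```

The csv module quotes fields automatically where needed, so sentences containing commas stay in a single column.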

For More Information

See our Wiki.
