Grammatical information extraction methods designed for the analysis of historical and contemporary textual corpora.
posextract
posextract offers grammatical information extraction methods designed for the analysis of historical and contemporary textual corpora. It traverses the syntactic dependency relations between parts-of-speech and returns sequences of words that share a grammatical relationship. See our article for more.
Users have the option of:
- Extracting subject-verb-object (SVO) and subject-verb-adjective complement (SVA) triples
- Extracting adjective-noun pairs
- Extracting subject-verb pairs
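The traversal idea can be illustrated with a small, library-free sketch. It assumes a hand-annotated dependency parse using spaCy-style labels (`nsubj`, `dobj`, `aux`, `ROOT`). The `Token` class and the `extract_svo` function are hypothetical names chosen for illustration; they are not part of posextract's API and do not reproduce its implementation.

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    pos: str   # coarse part-of-speech tag
    dep: str   # dependency label relative to the head
    head: int  # index of the head token (the root points to itself)


def extract_svo(tokens):
    """Return (subject, verb, object) triples found in a parsed sentence."""
    triples = []
    for i, tok in enumerate(tokens):
        if tok.pos != "VERB":
            continue
        # Direct dependents of this verb, excluding the verb itself.
        children = [t for j, t in enumerate(tokens) if t.head == i and j != i]
        subjects = [t.text for t in children if t.dep == "nsubj"]
        objects = [t.text for t in children if t.dep == "dobj"]
        triples.extend((s, tok.text, o) for s in subjects for o in objects)
    return triples


# Hand-annotated parse of "Landlords may exercise oppression."
sentence = [
    Token("Landlords", "NOUN", "nsubj", 2),
    Token("may", "AUX", "aux", 2),
    Token("exercise", "VERB", "ROOT", 2),
    Token("oppression", "NOUN", "dobj", 2),
    Token(".", "PUNCT", "punct", 2),
]

print(extract_svo(sentence))
```

In this toy parse the auxiliary "may" is tagged `AUX`, so it is skipped and only the main verb contributes a triple.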
Usage
Required Parameters:
input
can be the name of a csv file or an input string
output
name of the output file
Optional Parameters:
--data_column
specify the column to extract triples from
--id_column
specify a unique ID field if csv file is given
--lemma
specify whether to lemmatize parts-of-speech
--post-combine-adj
combine triples (adjective predicate with object)
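When calling the tool from a script, the parameters above can be assembled into an argument list. The sketch below is one possible way to do so; `build_command` is a hypothetical helper, not something posextract provides. It uses only the flags demonstrated in this README's examples and leaves out `--lemma`, because the text does not show whether that option takes a value.

```python
import sys


def build_command(input_arg, output_file, data_column=None, id_column=None,
                  post_combine_adj=False):
    """Assemble an argument list for `python -m posextract.extract_triples`."""
    cmd = [sys.executable, "-m", "posextract.extract_triples"]
    if data_column is not None:
        cmd += ["--data_column", data_column]
    if id_column is not None:
        cmd += ["--id_column", id_column]
    # Required positional arguments: input (csv file or string), then output file.
    cmd += [input_arg, output_file]
    if post_combine_adj:
        cmd.append("--post-combine-adj")
    return cmd


cmd = build_command("input.csv", "output.csv",
                    data_column="sentence", id_column="sentence_id")
print(cmd[1:])
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` in an environment where posextract is installed.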
Examples
posextract can extract grammatical triples from text:
python -m posextract.extract_triples "Landlords may exercise oppression." output.csv --post-combine-adj
# Output: Landlords exercise oppression.
posextract can extract SVO/SVA relationships separately, or it can combine the adjective into a single SVO triple with --post-combine-adj:
python -m posextract.extract_triples "The soldiers were terminally ill." output.csv
# Output: soldiers-were-terminally, soldiers-were-ill
python -m posextract.extract_triples "The soldiers were terminally ill." output.csv --post-combine-adj
# Output: soldiers-were-terminally-ill
If provided a .csv file:
`python -m posextract.extract_triples --data_column sentence --id_column sentence_id input.csv output.csv`
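For reference, an input file matching the command above could be prepared as follows. This is a minimal sketch: the column names `sentence_id` and `sentence` are taken from the command, while the `write_input_csv` helper and the numeric ID scheme are assumptions made for illustration.

```python
import csv


def write_input_csv(path, sentences):
    """Write sentences to a csv file with an ID column and a text column."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["sentence_id", "sentence"])
        # Assumed ID scheme: consecutive integers starting at 1.
        for i, text in enumerate(sentences, start=1):
            writer.writerow([i, text])


write_input_csv("input.csv", [
    "Landlords may exercise oppression.",
    "The soldiers were terminally ill.",
])
```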
For More Information...
... see our Wiki:
Hashes for posextract-1.0.2-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | ad6bcce3f5fe3cf4d976fee2e657032e64625df7bb759f88972a53940ea683f0
MD5 | addcfb5ac30566eca5d89b83c0b5e655
BLAKE2b-256 | 255305e4cd0a216fdc14d7aeb5eabe9db9829fd6c7bac09b8097d0c9d5f06688