Package for retrieving collocations from text with spaCy
Collocater
Collocater is a Python library for retrieving the collocations found in a message. The ontology it operates on was scraped from the Online OXFORD Collocation Dictionary.
Collocater can be added as a component to spaCy's processing pipeline, so that a message's collocations can be retrieved the same way its named entities can.
Check out Collocations Finder to learn more about the project.
Installation
pip install collocater --no-deps
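Because `--no-deps` tells pip to skip the package's declared dependencies, spaCy and the English model loaded in the usage example have to be present in the environment already. A minimal sketch of a pre-flight check, assuming the three module names below are the ones you need (`missing_modules` is a hypothetical helper, not part of Collocater):

```python
import importlib.util

# Modules imported by the usage example; en_core_web_sm is the spaCy model it loads.
REQUIRED = ("spacy", "en_core_web_sm", "collocater")


def missing_modules(names=REQUIRED):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("Missing:", ", ".join(missing) if missing else "nothing")
```

Anything reported as missing can then be installed separately before running the example.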
Usage
from collocater import collocater
import spacy
from pprint import pprint
collie = collocater.Collocater.loader()
nlp = spacy.load('en_core_web_sm')
nlp.add_pipe(collie)  # spaCy v2-style call: the component object is passed directly
text = "If this isn't a bunch of beautiful flowers I don't know what is!"
doc = nlp(text)
print(doc._.collocs)  # prints [bunch of, bunch of beautiful flowers, beautiful flowers]
# Collocation spans with their character offsets and labels:
colls = [(col.text, col.start_char, col.end_char, col.label_) for col in doc._.collocs]
pprint(colls)  # prints [
# ('bunch of', 16, 24, 'bunch_noun__prep'),
# ('bunch of beautiful flowers', 16, 42, 'flower_noun__quant'),
# ('beautiful flowers', 25, 42, 'flower_noun__adj')
# ]
# The component can also be called directly on a raw string:
print(collie(text))
# {'beautiful flowers': {'coll_type': 'flower_noun__adj', 'location': [7, 9]},
#  'bunch of': {'coll_type': 'bunch_noun__prep', 'location': [5, 7]},
#  'bunch of beautiful flowers': {'coll_type': 'flower_noun__quant',
#                                 'location': [5, 9]}}
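The dictionary returned by `collie(text)` can be post-processed with plain Python. The sketch below works on the sample output shown above and makes two assumptions that the source does not state explicitly: that labels follow the pattern `<headword>_<pos>__<collocate type>`, and that `location` holds token offsets with an exclusive end. `split_label` and `nested_in` are hypothetical helpers, not part of Collocater.

```python
# Sample output copied from the usage example; no Collocater import is needed here.
collocations = {
    "beautiful flowers": {"coll_type": "flower_noun__adj", "location": [7, 9]},
    "bunch of": {"coll_type": "bunch_noun__prep", "location": [5, 7]},
    "bunch of beautiful flowers": {"coll_type": "flower_noun__quant", "location": [5, 9]},
}


def split_label(label):
    """Split a label such as 'flower_noun__adj' into (headword, pos, collocate type).

    Assumes the pattern '<headword>_<pos>__<collocate type>'.
    """
    head, _, collocate = label.partition("__")
    headword, _, pos = head.rpartition("_")
    return headword, pos, collocate


def nested_in(collocations):
    """Map each collocation to the other collocations whose span contains it."""
    result = {}
    for text, info in collocations.items():
        start, end = info["location"]
        result[text] = sorted(
            other
            for other, other_info in collocations.items()
            if other != text
            and other_info["location"][0] <= start
            and end <= other_info["location"][1]
        )
    return result


if __name__ == "__main__":
    for text, info in collocations.items():
        print(text, split_label(info["coll_type"]), nested_in(collocations)[text])
```

Grouping by containment like this is one way to keep only the longest match when overlapping collocations such as "bunch of" and "bunch of beautiful flowers" are both reported.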
Contributing
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Please make sure to update tests as appropriate.
License
Download files
File details
Details for the file collocater-0.3.tar.gz.
File metadata
- Download URL: collocater-0.3.tar.gz
- Upload date:
- Size: 3.0 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.4.2 requests/2.21.0 setuptools/40.6.3 requests-toolbelt/0.9.1 tqdm/4.46.0 CPython/3.7.1
File hashes
Algorithm | Hash digest
---|---
SHA256 | 155a53cff5b0d371968d3a0b9df13bdbd4823398578a9721c08ae8f145134a0d
MD5 | 50b8f971cdf3270e2233b0b2eb8f15ce
BLAKE2b-256 | 84b89baceb184e180ec3d858c2822e61ac81dfcdf5992360ff8957d1d6370fa2
File details
Details for the file collocater-0.3-py3-none-any.whl.
File metadata
- Download URL: collocater-0.3-py3-none-any.whl
- Upload date:
- Size: 3.1 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.4.2 requests/2.21.0 setuptools/40.6.3 requests-toolbelt/0.9.1 tqdm/4.46.0 CPython/3.7.1
File hashes
Algorithm | Hash digest
---|---
SHA256 | a81b2f772b17995625abc14e21bf716ab184236cce461fbb94d89f7469ba0ac8
MD5 | 495bbcfe6e01ce0f9181b135ff3b4c09
BLAKE2b-256 | 1412ab8f758614d743f8101d3099577b1b3d3b837ccdb477b39d50279d682095