Package for retrieving collocations from text with spaCy

Project description

Collocater

Collocater is a Python library for retrieving the collocations found in a message. The ontology it operates on was scraped from the Online Oxford Collocation Dictionary.

Collocater can be added as a component to spaCy's processing pipeline, so that a message's collocations can be retrieved the same way its named entities can.

Check out Collocations Finder to learn more about the project.

Installation

pip install collocater --no-deps

Usage

from pprint import pprint

import collocater
import spacy

# Load the Collocater component and add it to a spaCy pipeline.
# Note: passing the component object to add_pipe is the spaCy 2.x style;
# spaCy 3.x expects the name of a registered component instead.
collie = collocater.Collocater.loader()
nlp = spacy.load('en_core_web_sm')
nlp.add_pipe(collie)

text = "If this isn't a bunch of beautiful flowers I don't know what is!"
doc = nlp(text)
print(doc._.collocs) # prints [bunch of, bunch of beautiful flowers, beautiful flowers]

# Collocation spans found in the text, with character offsets and labels:
colls = [(col.text, col.start_char, col.end_char, col.label_) for col in doc._.collocs]
pprint(colls) # prints [
#                          ('bunch of', 16, 24, 'bunch_noun__prep'),
#                          ('bunch of beautiful flowers', 16, 42, 'flower_noun__quant'),
#                          ('beautiful flowers', 25, 42, 'flower_noun__adj')
#                          ]

# Calling the component directly on the raw string returns a dictionary:
print(collie(text))
#{'beautiful flowers': {'coll_type': 'flower_noun__adj', 'location': [7, 9]},
# 'bunch of': {'coll_type': 'bunch_noun__prep', 'location': [5, 7]},
# 'bunch of beautiful flowers': {'coll_type': 'flower_noun__quant',
#                                'location': [5, 9]}}
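
The dictionary output above can be post-processed with plain Python. The following is a minimal sketch, not part of Collocater itself: `parse_label` and `group_by_headword` are hypothetical helpers, and the label layout `<headword>_<pos>__<relation>` is an assumption inferred only from the sample output shown here. The `found` dictionary is copied from that output, so the snippet runs without spaCy or Collocater installed.

```python
from collections import defaultdict

# Sample result copied from the output shown above; in practice this would
# be the dictionary returned by calling the Collocater object on a string.
found = {
    "beautiful flowers": {"coll_type": "flower_noun__adj", "location": [7, 9]},
    "bunch of": {"coll_type": "bunch_noun__prep", "location": [5, 7]},
    "bunch of beautiful flowers": {"coll_type": "flower_noun__quant", "location": [5, 9]},
}


def parse_label(label):
    """Split a label such as 'flower_noun__adj' into its three parts.

    Assumes the layout '<headword>_<pos>__<relation>', which is inferred
    from the sample output only and may not hold for every label.
    """
    head, _, relation = label.partition("__")
    headword, _, pos = head.rpartition("_")
    return headword, pos, relation


def group_by_headword(collocations):
    """Map each headword to the (phrase, relation) pairs found for it."""
    grouped = defaultdict(list)
    for phrase, info in collocations.items():
        headword, _pos, relation = parse_label(info["coll_type"])
        grouped[headword].append((phrase, relation))
    # Sort each list so the result does not depend on dictionary ordering.
    return {head: sorted(pairs) for head, pairs in grouped.items()}


grouped = group_by_headword(found)
print(grouped)
```

Grouping by headword makes it easy to see that both "beautiful flowers" and "bunch of beautiful flowers" are collocations of "flower", while "bunch of" belongs to "bunch".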

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

License

MIT
