Package for working with semantic spaces.
Project description
Semantic spaces module
This is a python module that allows to compute semantic metrics based on distributional semantics models.
For example, to find words that are semantically similar to the word 'brain':
from semspaces.space import SemanticSpace
space = SemanticSpace.from_csv('space.w2v.gz')
space.most_similar(['brain'])
{'brain': [(u'brain', 0.0),
(u'brains', 0.34469844325620635),
(u'cerebrum', 0.4426992023455152),
(u'cerebellum', 0.4483798859566903),
(u'cortical', 0.469348588934828),
(u'brainstem', 0.4791188497952641),
(u'cortex', 0.479544888313173),
(u'ganglion', 0.49717579235842546),
(u'thalamus', 0.5030885466349713),
(u'thalamic', 0.5059524199702277)]}
The module wraps dense and sparse matrix implementations to provide convenience methods for computing semantic statistics as well as easy input and output of the data.
Installation
pip install -r requirements.txt
python setup.py install
Semantic spaces
You can download a set of validated semantic spaces for English and Dutch here (see Mandera, Keuleers, & Brysbaert, in press).
Contribute
- Issue Tracker: https://github.com/pmandera/semspaces/issues
- Source Code: https://github.com/pmandera/semspaces
Authors
The tool was developed at Center for Reading Research, Ghent University by Paweł Mandera.
License
The project is licensed under the Apache License 2.0.
References
Mandera, P., Keuleers, E., & Brysbaert, M. (in press). Explaining human performance in psycholinguistic tasks with models of semantic similarity based on prediction and counting: A review and empirical validation. Journal of Memory and Language.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distributions
Built Distribution
Hashes for semspaces-0.1.4-py2.py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | 531493e65501ebdfb908582f56faeedcb349ff0630dffe6f2af57230faa5d698 |
|
MD5 | 893da4a809950c1569e15cf7339d3748 |
|
BLAKE2b-256 | bfeeaf246677a2e171cd145cf7a0006001994725ead997495c6e89f605e8c67d |