Feature set algebra for linguistics

Features is a simple implementation of feature set algebra in Python.

Linguistic analyses commonly use sets of binary or privative features to refer to different groups of linguistic objects: for example, a group of phonemes that share some phonological features like [-consonantal, +high], or a set of morphemes that occur in the context of a specific person/number combination like [-participant, GROUP]. Usually, the features are applied such that only some of their combinations are valid, while others are impossible (i.e. refer to no object), for example [+high, +low] or [-participant, +speaker].
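
As a rough illustration of valid versus impossible combinations, the sketch below looks up which objects a feature combination refers to, using the six-way person/number table from the Quickstart. It uses plain Python sets rather than this package; OBJECTS and referents are hypothetical names chosen for the example.

```python
# Hypothetical stand-in data: rows copied from the person/number table
# in the Quickstart (objects mapped to the features they carry).
OBJECTS = {
    '1s': {'+1', '-2', '-3', '+sg', '-pl'},
    '1p': {'+1', '-2', '-3', '+pl', '-sg'},
    '2s': {'-1', '+2', '-3', '+sg', '-pl'},
    '2p': {'-1', '+2', '-3', '+pl', '-sg'},
    '3s': {'-1', '-2', '+3', '+sg', '-pl'},
    '3p': {'-1', '-2', '+3', '+pl', '-sg'},
}


def referents(features):
    """Return the sorted names of all objects carrying every given feature."""
    wanted = set(features.split())
    return sorted(name for name, has in OBJECTS.items() if wanted <= has)


print(referents('-1 +sg'))  # ['2s', '3s'] (a valid combination)
print(referents('+1 +3'))   # [] (impossible: refers to no object)
```

A combination counts as impossible here exactly when the lookup comes back empty.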

With this package, such feature systems can be defined with a contingency table (feature matrix) and stored under a section name in a plain-text configuration file. Each feature system can then be loaded by name and provides its own FeatureSet subclass, which implements all comparisons and operations between its feature sets according to the given definition (compatibility, entailment, intersection, unification, etc.).
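
To make the idea of a feature matrix concrete, here is a minimal sketch that reads a table in the layout printed in the Quickstart into a plain mapping. TABLE and parse_table are hypothetical names; this is not the package's own parser, and the actual configuration file format is covered in the docs.

```python
# Table text in the layout that print(fs.context) shows in the Quickstart.
TABLE = """
  |+1|-1|+2|-2|+3|-3|+sg|+pl|-sg|-pl|
1s|X |  |  |X |  |X |X  |   |   |X  |
1p|X |  |  |X |  |X |   |X  |X  |   |
2s|  |X |X |  |  |X |X  |   |   |X  |
2p|  |X |X |  |  |X |   |X  |X  |   |
3s|  |X |  |X |X |  |X  |   |   |X  |
3p|  |X |  |X |X |  |   |X  |X  |   |
"""


def parse_table(text):
    """Map each object name to the frozenset of features marked with X."""
    lines = [line for line in text.splitlines() if line.strip()]
    # Splitting on '|' gives the row label first and an empty cell last.
    header = [cell.strip() for cell in lines[0].split('|')[1:-1]]
    table = {}
    for line in lines[1:]:
        cells = [cell.strip() for cell in line.split('|')]
        name, marks = cells[0], cells[1:-1]
        table[name] = frozenset(
            feature for feature, mark in zip(header, marks) if mark == 'X')
    return table


context = parse_table(TABLE)
print(sorted(context['2p']))  # ['+2', '+pl', '-1', '-3', '-sg']
```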

Features creates the complete lattice structure between the possible feature sets of each feature system and lets you navigate and visualize their relations using the Graphviz graph layout software.
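
The lattice can be pictured with a brute-force sketch: every valid feature set is the set of features shared by some group of objects, and one set sits directly above another if it is a proper subset with no valid set in between. The code below is only an illustration over the Quickstart's person/number table. It does not use the FCA algorithms of the concepts package, and all names in it are hypothetical.

```python
from itertools import combinations

# Hypothetical reconstruction of the Quickstart's person/number table.
PERSON = {'1': ['+1', '-2', '-3'], '2': ['-1', '+2', '-3'], '3': ['-1', '-2', '+3']}
NUMBER = {'s': ['+sg', '-pl'], 'p': ['+pl', '-sg']}
CONTEXT = {p + n: frozenset(PERSON[p] + NUMBER[n]) for p in PERSON for n in NUMBER}
ALL_FEATURES = frozenset().union(*CONTEXT.values())


def intent(objects):
    """Features shared by all given objects (all features for no objects)."""
    shared = ALL_FEATURES
    for name in objects:
        shared &= CONTEXT[name]
    return shared


# Brute force: the shared features of every subset of objects.
names = sorted(CONTEXT)
valid_sets = {intent(subset)
              for size in range(len(names) + 1)
              for subset in combinations(names, size)}


def upper_neighbors(feature_set):
    """Valid proper subsets with no other valid set strictly in between."""
    above = [a for a in valid_sets if a < feature_set]
    return [a for a in above if not any(a < c for c in above)]


print(len(valid_sets))  # 22
first_person = intent(['1s', '1p'])  # the set written '+1 -2 -3'
print(sorted(sorted(a) for a in upper_neighbors(first_person)))  # [['-2'], ['-3']]
```

The two neighbors found for the first-person set agree with the upper_neighbors result shown in the Quickstart.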

Installation

This package runs under Python 2.7 and 3.5+. Use pip to install it:

$ pip install features

This will also install the concepts package from PyPI providing the Formal Concept Analysis (FCA) algorithms on which this package is based.

Quickstart

Load a predefined feature system by name (in this case features for a six-way person/number distinction, cf. the definitions in the bundled config.ini in the source repository).

>>> import features

>>> fs = features.FeatureSystem('plural')

>>> print(fs.context)  # doctest: +ELLIPSIS
<Context object mapping 6 objects to 10 properties [3011c283] at 0x...>
      |+1|-1|+2|-2|+3|-3|+sg|+pl|-sg|-pl|
    1s|X |  |  |X |  |X |X  |   |   |X  |
    1p|X |  |  |X |  |X |   |X  |X  |   |
    2s|  |X |X |  |  |X |X  |   |   |X  |
    2p|  |X |X |  |  |X |   |X  |X  |   |
    3s|  |X |  |X |X |  |X  |   |   |X  |
    3p|  |X |  |X |X |  |   |X  |X  |   |

Create feature sets from strings or string sequences. Feature strings are parsed on input; string sequences and feature or extent strings are returned in their canonical order (definition order):

>>> fs('+1 +sg'), fs(['+2', '+2', '+sg']), fs(['+sg', '+3'])
(FeatureSet('+1 +sg'), FeatureSet('+2 +sg'), FeatureSet('+3 +sg'))

>>> fs('SG1').concept.intent
('+1', '-2', '-3', '+sg', '-pl')

>>> fs('1').string, fs('1').string_maximal, fs('1').string_extent
('+1', '+1 -2 -3', '1s 1p')

Use feature algebra: intersection (join), union/unification (meet), and set inclusion (extension/subsumption). Compare feature sets (logical connectives).

>>> fs('+1 +sg') % fs('+2 +sg')
FeatureSet('-3 +sg')

>>> fs('-3') ^ fs('+1') ^ fs('-pl')
FeatureSet('+1 +sg')

>>> fs('+3') > fs('-1') and fs('+pl') < fs('+2 -sg')
True

>>> fs('+1').incompatible_with(fs('+3')) and fs('+sg').complement_of(fs('+pl'))
True
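
One way to read the results above is in terms of the table itself: expand each feature set to its maximal form (all features shared by the objects it refers to), then intersect those forms for the join, or pool the features and expand again for the meet. The sketch below follows that reading with plain Python sets. It is an assumption-laden illustration with hypothetical helper names, not the package's implementation.

```python
# Hypothetical reconstruction of the Quickstart's person/number table.
PERSON = {'1': ['+1', '-2', '-3'], '2': ['-1', '+2', '-3'], '3': ['-1', '-2', '+3']}
NUMBER = {'s': ['+sg', '-pl'], 'p': ['+pl', '-sg']}
CONTEXT = {p + n: frozenset(PERSON[p] + NUMBER[n]) for p in PERSON for n in NUMBER}
ALL_FEATURES = frozenset().union(*CONTEXT.values())


def extent(text):
    """Objects that carry every feature in the space-separated string."""
    wanted = frozenset(text.split())
    return frozenset(name for name, has in CONTEXT.items() if wanted <= has)


def maximal(text):
    """All features shared by the objects the feature string refers to."""
    shared = ALL_FEATURES
    for name in extent(text):
        shared &= CONTEXT[name]
    return shared


def join(a, b):
    """Intersection: features common to both maximal forms (compare %)."""
    return maximal(a) & maximal(b)


def meet(*texts):
    """Union/unification: pool all features, then expand (compare ^)."""
    return maximal(' '.join(texts))


def incompatible(a, b):
    """True if no object carries the features of both."""
    return not extent(a + ' ' + b)


print(sorted(join('+1 +sg', '+2 +sg')))  # ['+sg', '-3', '-pl']
print(sorted(meet('-3', '+1', '-pl')))   # ['+1', '+sg', '-2', '-3', '-pl']
print(maximal('+3') > maximal('-1'))     # True
print(incompatible('+1', '+3'))          # True
```

The printed sets are the maximal forms of FeatureSet('-3 +sg') and FeatureSet('+1 +sg'), the results shown in the examples above.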

Navigate the created subsumption lattice (Hasse graph) of all valid feature sets:

>>> fs('+1').upper_neighbors, fs('+1').lower_neighbors
([FeatureSet('-3'), FeatureSet('-2')], [FeatureSet('+1 +sg'), FeatureSet('+1 +pl')])

>>> fs('+1').upset()
[FeatureSet('+1'), FeatureSet('-3'), FeatureSet('-2'), FeatureSet('')]

>>> for f in fs:  # doctest: +ELLIPSIS
...     print('[%s] <-> {%s}' % (f.string_maximal, f.string_extent))
[+1 -1 +2 -2 +3 -3 +sg +pl -sg -pl] <-> {}
[+1 -2 -3 +sg -pl] <-> {1s}
...
[-1] <-> {2s 2p 3s 3p}
[] <-> {1s 1p 2s 2p 3s 3p}

See the docs on how to define, load, and use your own feature systems.

Further reading

See also

  • concepts – Formal Concept Analysis with Python

  • fileconfig – Config file sections as objects

  • graphviz – Simple Python interface for Graphviz

License

Features is distributed under the MIT license.
