Feature set algebra for linguistics

Features is a simple implementation of feature set algebra in Python.

Linguistic analyses commonly use sets of binary or privative features to refer to different groups of linguistic objects: for example, a group of phonemes that share some phonological features like [-consonantal, +high], or a set of morphemes that occur in the context of a specific person/number combination like [-participant, GROUP]. Usually, the features are applied in such a way that only some of their combinations are valid, while others are impossible (i.e. refer to no object), for example [+high, +low] or [-participant, +speaker].
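
To make the notion of a valid combination concrete, here is a minimal pure-Python sketch that does not use this package. The names `OBJECTS`, `extent`, and `is_valid` are hypothetical helpers chosen for illustration; the data is the six-way person/number system shown in the Quickstart.

```python
# Hypothetical toy table (not the package's API): each object maps to
# the set of features it carries.
OBJECTS = {
    '1s': {'+1', '-2', '-3', '+sg', '-pl'},
    '1p': {'+1', '-2', '-3', '+pl', '-sg'},
    '2s': {'-1', '+2', '-3', '+sg', '-pl'},
    '2p': {'-1', '+2', '-3', '+pl', '-sg'},
    '3s': {'-1', '-2', '+3', '+sg', '-pl'},
    '3p': {'-1', '-2', '+3', '+pl', '-sg'},
}


def extent(features):
    """Return the sorted names of all objects carrying every given feature."""
    wanted = set(features)
    return sorted(name for name, feats in OBJECTS.items() if wanted <= feats)


def is_valid(features):
    """A combination is valid if it refers to at least one object."""
    return bool(extent(features))


print(extent(['-1', '+sg']))     # ['2s', '3s']
print(is_valid(['+1', '+2']))    # False: no object is both first and second person
print(is_valid(['+sg', '+pl']))  # False: no object is both singular and plural
```

A combination with an empty extent plays the same role as [+high, +low] in the text: it is well-formed as a set of features but refers to no object.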

With this package, such feature systems can be defined with a simple contingency table (feature matrix) and stored under a section name in a plain-text configuration file. Each feature system can then be loaded by name and provides its own FeatureSet subclass, which implements all comparisons and operations between its feature sets according to the given definition (compatibility, entailment, intersection, unification, etc.).

Features creates the complete lattice structure between the possible feature sets of each feature system and lets you navigate and visualize their relations using the Graphviz graph layout software.
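
As a rough illustration of what such a lattice contains, the following sketch enumerates the closed object sets (extents) of the person/number table from the Quickstart by brute force and writes the Hasse diagram as Graphviz DOT text. It is not how this package or the concepts package computes the lattice; all names (`OBJECTS`, `all_extents`, `covers`, `to_dot`) are hypothetical, and brute force is only reasonable for very small tables.

```python
from itertools import combinations

# Hypothetical toy table (not the package's API).
OBJECTS = {
    '1s': {'+1', '-2', '-3', '+sg', '-pl'},
    '1p': {'+1', '-2', '-3', '+pl', '-sg'},
    '2s': {'-1', '+2', '-3', '+sg', '-pl'},
    '2p': {'-1', '+2', '-3', '+pl', '-sg'},
    '3s': {'-1', '-2', '+3', '+sg', '-pl'},
    '3p': {'-1', '-2', '+3', '+pl', '-sg'},
}
ALL_FEATURES = frozenset().union(*OBJECTS.values())


def intent(names):
    """Features shared by all given objects (every feature if none are given)."""
    shared = set(ALL_FEATURES)
    for name in names:
        shared &= OBJECTS[name]
    return frozenset(shared)


def extent(features):
    """Objects that carry all of the given features."""
    return frozenset(n for n, feats in OBJECTS.items() if features <= feats)


def all_extents():
    """Brute force: close every subset of objects, keep the distinct results."""
    return {extent(intent(names))
            for size in range(len(OBJECTS) + 1)
            for names in combinations(OBJECTS, size)}


def covers(extents):
    """Pairs (lower, upper) with no extent strictly in between (Hasse edges)."""
    return [(lo, up) for lo in extents for up in extents
            if lo < up and not any(lo < mid < up for mid in extents)]


def label(ext):
    """Quoted DOT node name built from the object names of an extent."""
    return '"{%s}"' % ' '.join(sorted(ext))


def to_dot(extents):
    """Render the cover relation as DOT text, edges pointing downwards."""
    edges = sorted('  %s -> %s;' % (label(up), label(lo))
                   for lo, up in covers(extents))
    return '\n'.join(['digraph lattice {'] + edges + ['}'])


extents = all_extents()
print(len(extents))  # 22
print(to_dot(extents))
```

The DOT text can be passed to the Graphviz command-line tools for layout. Each extent corresponds to exactly one valid feature set (its shared features), so counting extents counts the nodes of the lattice.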

Installation

This package runs under Python 2.7 and 3.3+. Use pip to install it:

$ pip install features

This will also install the concepts package from PyPI providing the Formal Concept Analysis (FCA) algorithms on which this package is based.

Quickstart

Load a predefined feature system by name, in this case the features for a six-way person/number distinction (cf. the definitions in the bundled config.ini in the source repository):

>>> import features

>>> fs = features.FeatureSystem('plural')

>>> print(fs.context)  # doctest: +ELLIPSIS
<Context object mapping 6 objects to 10 properties at 0x...>
      |+1|-1|+2|-2|+3|-3|+sg|+pl|-sg|-pl|
    1s|X |  |  |X |  |X |X  |   |   |X  |
    1p|X |  |  |X |  |X |   |X  |X  |   |
    2s|  |X |X |  |  |X |X  |   |   |X  |
    2p|  |X |X |  |  |X |   |X  |X  |   |
    3s|  |X |  |X |X |  |X  |   |   |X  |
    3p|  |X |  |X |X |  |   |X  |X  |   |
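
The printed table carries the whole definition of the feature system: rows are objects, columns are features, and an X marks that the object has the feature. As a sketch of how such a table can be read into a plain mapping, the hypothetical helper `parse_table` below parses the text layout shown above. It is an illustration only and makes no claim about how the package parses its configuration.

```python
# The table text as printed above (leading indentation is irrelevant).
TABLE = """
  |+1|-1|+2|-2|+3|-3|+sg|+pl|-sg|-pl|
1s|X |  |  |X |  |X |X  |   |   |X  |
1p|X |  |  |X |  |X |   |X  |X  |   |
2s|  |X |X |  |  |X |X  |   |   |X  |
2p|  |X |X |  |  |X |   |X  |X  |   |
3s|  |X |  |X |X |  |X  |   |   |X  |
3p|  |X |  |X |X |  |   |X  |X  |   |
"""


def parse_table(text):
    """Map each object name to the set of features marked in its row."""
    lines = [line.strip() for line in text.strip().splitlines()]
    # Header cells sit between the first and the last '|' of the first line.
    header = [cell.strip() for cell in lines[0].split('|')[1:-1]]
    table = {}
    for line in lines[1:]:
        # Drop the empty string after the trailing '|', then split off the name.
        name, *cells = line.split('|')[:-1]
        table[name.strip()] = {feature for feature, cell in zip(header, cells)
                               if cell.strip()}
    return table


table = parse_table(TABLE)
print(sorted(table['1s']))  # ['+1', '+sg', '-2', '-3', '-pl']
```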

Create feature sets from strings or string sequences. Feature strings are parsed, and you can get back string sequences as well as feature or extent strings in their canonical order (definition order):

>>> fs('+1 +sg'), fs(['+2', '+2', '+sg']), fs(['+sg', '+3'])
(FeatureSet('+1 +sg'), FeatureSet('+2 +sg'), FeatureSet('+3 +sg'))

>>> fs('SG1').concept.intent
('+1', '-2', '-3', '+sg', '-pl')

>>> fs('1').string, fs('1').string_maximal, fs('1').string_extent
('+1', '+1 -2 -3', '1s 1p')

Use feature algebra: intersection (join), union/unification (meet), set inclusion (extension/subsumption). Do feature set comparisons (logical connectives):

>>> fs('+1 +sg') % fs('+2 +sg')
FeatureSet('-3 +sg')

>>> fs('-3') ^ fs('+1') ^ fs('-pl')
FeatureSet('+1 +sg')

>>> fs('+3') > fs('-1') and fs('+pl') < fs('+2 -sg')
True

>>> fs('+1').incompatible_with(fs('+3')) and fs('+sg').complement_of(fs('+pl'))
True
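
The results above can be reproduced by hand with two mappings between features and objects. The following pure-Python sketch is an assumption about the underlying idea, not the package's implementation: `join` keeps the features shared by all objects that either set refers to, `meet` collects every feature implied by combining both sets, and two sets are incompatible when no object satisfies both. All names are hypothetical, and results are full (maximal) feature sets rather than the shortened strings the package prints.

```python
# Hypothetical toy table (not the package's API).
OBJECTS = {
    '1s': {'+1', '-2', '-3', '+sg', '-pl'},
    '1p': {'+1', '-2', '-3', '+pl', '-sg'},
    '2s': {'-1', '+2', '-3', '+sg', '-pl'},
    '2p': {'-1', '+2', '-3', '+pl', '-sg'},
    '3s': {'-1', '-2', '+3', '+sg', '-pl'},
    '3p': {'-1', '-2', '+3', '+pl', '-sg'},
}
ALL_FEATURES = frozenset().union(*OBJECTS.values())


def extent(features):
    """Objects that carry all of the given features."""
    wanted = set(features)
    return frozenset(n for n, feats in OBJECTS.items() if wanted <= feats)


def intent(names):
    """Features shared by all given objects (every feature if none are given)."""
    shared = set(ALL_FEATURES)
    for name in names:
        shared &= OBJECTS[name]
    return frozenset(shared)


def join(a, b):
    """Generalization: what all objects covered by a or by b have in common."""
    return intent(extent(a) | extent(b))


def meet(a, b):
    """Unification: everything implied when a and b hold together."""
    return intent(extent(a) & extent(b))


def incompatible(a, b):
    """True if no object satisfies both feature sets."""
    return not (extent(a) & extent(b))


print(sorted(join({'+1', '+sg'}, {'+2', '+sg'})))     # ['+sg', '-3', '-pl']
print(sorted(meet(meet({'-3'}, {'+1'}), {'-pl'})))    # ['+1', '+sg', '-2', '-3', '-pl']
print(incompatible({'+1'}, {'+3'}))                   # True
```

The first result corresponds to FeatureSet('-3 +sg') and the second to FeatureSet('+1 +sg') in the examples above, once redundant features such as -pl are left out.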

Navigate the created subsumption lattice (Hasse graph) of all valid feature sets:

>>> fs('+1').upper_neighbors, fs('+1').lower_neighbors
([FeatureSet('-3'), FeatureSet('-2')], [FeatureSet('+1 +sg'), FeatureSet('+1 +pl')])

>>> fs('+1').upset()
[FeatureSet('+1'), FeatureSet('-3'), FeatureSet('-2'), FeatureSet('')]

>>> for f in fs:  # doctest: +ELLIPSIS
...     print('[%s] <-> {%s}' % (f.string_maximal, f.string_extent))
[+1 -1 +2 -2 +3 -3 +sg +pl -sg -pl] <-> {}
[+1 -2 -3 +sg -pl] <-> {1s}
...
[-1] <-> {2s 2p 3s 3p}
[] <-> {1s 1p 2s 2p 3s 3p}
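
For readers curious how neighbors in such a lattice can be found without enumerating every feature set first, here is a small sketch under stated assumptions. It works on extents (closed object sets): the upper neighbors of an extent are the minimal closures obtained by adding one more object. The helpers `closure`, `upper_neighbors`, and `upset` are hypothetical names and not the package's API.

```python
# Hypothetical toy table (not the package's API).
OBJECTS = {
    '1s': {'+1', '-2', '-3', '+sg', '-pl'},
    '1p': {'+1', '-2', '-3', '+pl', '-sg'},
    '2s': {'-1', '+2', '-3', '+sg', '-pl'},
    '2p': {'-1', '+2', '-3', '+pl', '-sg'},
    '3s': {'-1', '-2', '+3', '+sg', '-pl'},
    '3p': {'-1', '-2', '+3', '+pl', '-sg'},
}
ALL_FEATURES = frozenset().union(*OBJECTS.values())


def extent(features):
    """Objects that carry all of the given features."""
    wanted = set(features)
    return frozenset(n for n, feats in OBJECTS.items() if wanted <= feats)


def intent(names):
    """Features shared by all given objects (every feature if none are given)."""
    shared = set(ALL_FEATURES)
    for name in names:
        shared &= OBJECTS[name]
    return frozenset(shared)


def closure(names):
    """Smallest extent that contains the given objects."""
    return extent(intent(names))


def upper_neighbors(ext):
    """Minimal extents strictly above ext, found by adding one object at a time."""
    candidates = {closure(ext | {name}) for name in OBJECTS if name not in ext}
    return {c for c in candidates
            if not any(other < c for other in candidates)}


def upset(ext):
    """Breadth-first walk upwards: ext itself and every extent above it."""
    seen, queue = [ext], [ext]
    while queue:
        current = queue.pop(0)
        for neighbor in sorted(upper_neighbors(current), key=sorted):
            if neighbor not in seen:
                seen.append(neighbor)
                queue.append(neighbor)
    return seen


start = extent({'+1'})
for ext in upset(start):
    print(sorted(intent(ext)), sorted(ext))
```

Starting from the extent of +1, the walk visits four extents, matching the four feature sets returned by `fs('+1').upset()` above.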

See the docs on how to define, load, and use your own feature systems.

Further reading

See also

  • concepts – Formal Concept Analysis with Python

  • fileconfig – Config file sections as objects

  • graphviz – Simple Python interface for Graphviz

License

Features is distributed under the MIT license.
