
Feature set algebra for linguistics

Features is a simple implementation of feature set algebra in Python.

Linguistic analyses commonly use sets of binary or privative features to refer to different groups of linguistic objects: for example, a group of phonemes that share some phonological features like [-consonantal, +high], or a set of morphemes that occur in the context of a specific person/number combination like [-participant, GROUP]. Usually, the features are applied such that only some of their combinations are valid, while others are impossible (i.e. refer to no object), for example [+high, +low] or [-participant, +speaker].

With this package, such feature systems can be created with a simple contingency table definition (feature matrix) and saved under a section in a configuration file. Each feature system can then be loaded and provides its own FeatureSet subclass that implements all comparisons and operations between its feature sets according to the given definition (compatibility, entailment, intersection, unification, etc.).

Features creates the complete lattice structure between the possible feature sets of each feature system and lets you navigate and visualize their relations using the Graphviz graph layout library.


Installation

$ pip install features

This will also install the concepts package from PyPI, which provides the Formal Concept Analysis (FCA) algorithms that this package is built on.

Features is essentially a convenience wrapper around the FCA functionality of concepts.


Features includes some predefined feature systems you can try immediately. To load a feature system, pass its name to features.FeatureSystem:

>>> import features

>>> fs = features.FeatureSystem('plural')

>>> fs
<FeatureSystem('plural') of 6 atoms 22 featuresets>

The built-in feature systems are defined in the config.ini file in the package directory (usually Lib/site-packages/features in your Python directory).

The definition of a feature system is stored in its context object:

>>> print fs.context  # doctest: +ELLIPSIS
<Context object mapping 6 objects to 10 properties at 0x...>
      |+1|-1|+2|-2|+3|-3|+sg|+pl|-sg|-pl|
    1s|X |  |  |X |  |X |X  |   |   |X  |
    1p|X |  |  |X |  |X |   |X  |X  |   |
    2s|  |X |X |  |  |X |X  |   |   |X  |
    2p|  |X |X |  |  |X |   |X  |X  |   |
    3s|  |X |  |X |X |  |X  |   |   |X  |
    3p|  |X |  |X |X |  |   |X  |X  |   |

Check the documentation of concepts for further information on its full functionality.

>>> fs.context.objects
('1s', '1p', '2s', '2p', '3s', '3p')

>>> fs.context.properties
('+1', '-1', '+2', '-2', '+3', '-3', '+sg', '+pl', '-sg', '-pl')
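
The relationship between a context and its feature sets can be sketched without the library. The following Python 3 snippet is only an illustration, not the implementation used by features or concepts: the names OBJECTS, PROPERTIES, TABLE, extent and intent are made up here, and the table is copied by hand from the 'plural' context shown above.

```python
# Hand-copied from the 'plural' context; not read from the package itself.
OBJECTS = ('1s', '1p', '2s', '2p', '3s', '3p')
PROPERTIES = ('+1', '-1', '+2', '-2', '+3', '-3', '+sg', '+pl', '-sg', '-pl')
TABLE = {
    '1s': {'+1', '-2', '-3', '+sg', '-pl'},
    '1p': {'+1', '-2', '-3', '+pl', '-sg'},
    '2s': {'-1', '+2', '-3', '+sg', '-pl'},
    '2p': {'-1', '+2', '-3', '+pl', '-sg'},
    '3s': {'-1', '-2', '+3', '+sg', '-pl'},
    '3p': {'-1', '-2', '+3', '+pl', '-sg'},
}


def extent(features):
    """Return the objects that have every one of the given features."""
    wanted = set(features)
    return tuple(o for o in OBJECTS if wanted <= TABLE[o])


def intent(objects):
    """Return the features shared by every one of the given objects."""
    objects = list(objects)
    return tuple(p for p in PROPERTIES if all(p in TABLE[o] for o in objects))


print(extent(['+1', '+sg']))  # ('1s',)
print(intent(['1s', '1p']))   # ('+1', '-2', '-3')
print(extent(PROPERTIES))     # () since no object has all features
```

Asking for all features at once yields no object, and asking for no feature yields every object.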

Feature sets

Every feature system contains a contradictory feature set that has all features and refers to no object:

>>> fs.infimum
FeatureSet('+1 -1 +2 -2 +3 -3 +sg +pl -sg -pl')

>>> fs.infimum.concept.extent
()

It also contains a maximally general, tautological feature set that has no features and refers to all objects:

>>> fs.supremum
FeatureSet('')

>>> fs.supremum.concept.extent
('1s', '1p', '2s', '2p', '3s', '3p')

Use the feature system to iterate over all defined feature sets in shortlex extent order:

>>> for f in fs:
...     print f, f.concept.extent
[+1 -1 +2 -2 +3 -3 +sg +pl -sg -pl] ()
[+1 +sg] ('1s',)
[+1 +pl] ('1p',)
[+2 +sg] ('2s',)
[+2 +pl] ('2p',)
[+3 +sg] ('3s',)
[+3 +pl] ('3p',)
[+1] ('1s', '1p')
[-3 +sg] ('1s', '2s')
[-2 +sg] ('1s', '3s')
[-3 +pl] ('1p', '2p')
[-2 +pl] ('1p', '3p')
[+2] ('2s', '2p')
[-1 +sg] ('2s', '3s')
[-1 +pl] ('2p', '3p')
[+3] ('3s', '3p')
[+sg] ('1s', '2s', '3s')
[+pl] ('1p', '2p', '3p')
[-3] ('1s', '1p', '2s', '2p')
[-2] ('1s', '1p', '3s', '3p')
[-1] ('2s', '2p', '3s', '3p')
[] ('1s', '1p', '2s', '2p', '3s', '3p')
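
To see where the 22 feature sets come from, here is a brute-force sketch in plain Python 3. It is an assumption-laden illustration rather than the algorithm used by concepts: it closes every subset of objects and sorts the resulting extents by length and then by object position, which reproduces the shortlex order of the listing above. All names (TABLE, extent, intent, all_extents) are hypothetical.

```python
from itertools import combinations

# Hand-copied from the 'plural' context; not read from the package itself.
OBJECTS = ('1s', '1p', '2s', '2p', '3s', '3p')
PROPERTIES = ('+1', '-1', '+2', '-2', '+3', '-3', '+sg', '+pl', '-sg', '-pl')
TABLE = {
    '1s': {'+1', '-2', '-3', '+sg', '-pl'},
    '1p': {'+1', '-2', '-3', '+pl', '-sg'},
    '2s': {'-1', '+2', '-3', '+sg', '-pl'},
    '2p': {'-1', '+2', '-3', '+pl', '-sg'},
    '3s': {'-1', '-2', '+3', '+sg', '-pl'},
    '3p': {'-1', '-2', '+3', '+pl', '-sg'},
}


def extent(features):
    wanted = set(features)
    return tuple(o for o in OBJECTS if wanted <= TABLE[o])


def intent(objects):
    objects = list(objects)
    return tuple(p for p in PROPERTIES if all(p in TABLE[o] for o in objects))


def all_extents():
    """Close every subset of objects and return the distinct results."""
    closed = set()
    for size in range(len(OBJECTS) + 1):
        for subset in combinations(OBJECTS, size):
            closed.add(extent(intent(subset)))
    position = OBJECTS.index
    # Shortlex: shorter extents first, ties broken by object position.
    return sorted(closed, key=lambda e: (len(e), [position(o) for o in e]))


extents = all_extents()
print(len(extents))  # 22
```

Enumerating all subsets is exponential in the number of objects, so this only works for toy contexts like this one.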


You can call the feature system with an iterable of features to retrieve one of its feature sets:

>>> fs(['+1', '+sg'])
FeatureSet('+1 +sg')

Usually, it is more convenient to let the system extract the features from a string:

>>> fs('+1 +sg')
FeatureSet('+1 +sg')

Leading plusses can be omitted. Spaces are optional. Case, order, and duplication of features are ignored.

>>> fs('2 pl')
FeatureSet('+2 +pl')

>>> fs('SG3sg')
FeatureSet('+3 +sg')

Note that commas are not allowed inside the string.
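
The lenient string notation can be approximated with a small tokenizer. This is a hypothetical Python 3 sketch, not the parser that features actually uses: NAMES, TOKEN and parse_features are invented for illustration and only cover the feature names of the 'plural' system.

```python
import re

# Feature names of the 'plural' system, longest first so 'sg' wins over
# any shorter alternative at the same position.
NAMES = ('sg', 'pl', '1', '2', '3')
TOKEN = re.compile(r'([+-]?)(%s)' % '|'.join(NAMES))


def parse_features(text):
    """Split a string like 'SG3sg' into signed features, ignoring case,
    spaces, and duplicates; a missing sign counts as '+'."""
    text = text.lower().replace(' ', '')
    features, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError('cannot parse %r at position %d' % (text, pos))
        feature = (match.group(1) or '+') + match.group(2)
        if feature not in features:
            features.append(feature)
        pos = match.end()
    return features


print(parse_features('2 pl'))   # ['+2', '+pl']
print(parse_features('SG3sg'))  # ['+sg', '+3']
```

Because every character must belong to a known feature, a string containing a comma raises ValueError in this sketch.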


Feature sets are singletons. The constructor is also idempotent:

>>> fs('1sg') is fs('1sg')
True

>>> fs(fs('1sg')) is fs('1sg')
True

All the different ways to notate a feature set map to the same instance:

>>> fs('+1 -2 -3 -sg +pl') is fs('1pl')
True

>>> fs('+sg') is fs('-pl')
True

Notations are equivalent when they refer to the same set of objects (have the same extent).
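
One way to get this behaviour is to cache instances by extent. The Python 3 sketch below assumes that approach for illustration only; the class FeatureSetSketch, the cache and the get function are hypothetical and say nothing about how features implements its singletons.

```python
# Hand-copied from the 'plural' context; not read from the package itself.
OBJECTS = ('1s', '1p', '2s', '2p', '3s', '3p')
TABLE = {
    '1s': {'+1', '-2', '-3', '+sg', '-pl'},
    '1p': {'+1', '-2', '-3', '+pl', '-sg'},
    '2s': {'-1', '+2', '-3', '+sg', '-pl'},
    '2p': {'-1', '+2', '-3', '+pl', '-sg'},
    '3s': {'-1', '-2', '+3', '+sg', '-pl'},
    '3p': {'-1', '-2', '+3', '+pl', '-sg'},
}


def extent(features):
    wanted = set(features)
    return tuple(o for o in OBJECTS if wanted <= TABLE[o])


class FeatureSetSketch:
    """Minimal stand-in for a feature set, identified by its extent."""

    def __init__(self, objects):
        self.extent = objects


_instances = {}


def get(features):
    """Return the one shared instance for whatever the features refer to."""
    key = extent(features)
    if key not in _instances:
        _instances[key] = FeatureSetSketch(key)
    return _instances[key]


print(get(['+sg']) is get(['-pl']))  # True
print(get(['+1', '-2', '-3', '-sg', '+pl']) is get(['+1', '+pl']))  # True
```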


Compatibility tests:

>>> fs('+1').incompatible_with(fs('+3'))
True

>>> fs('sg').complement_of(fs('pl'))
True

>>> fs('-1').subcontrary_with(fs('-2'))
True

Set inclusion (subsumption):

>>> fs('') < fs('-3') <= fs('-3') < fs('+1') < fs('1sg')
True
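
These tests can be read as statements about extents. The Python 3 sketch below assumes the usual logical definitions (incompatible: no shared object; complement: the extents partition all objects; subcontrary: the extents overlap and together cover all objects); whether the library defines them in exactly this way is an assumption, and the function names are made up.

```python
# Hand-copied from the 'plural' context; not read from the package itself.
OBJECTS = ('1s', '1p', '2s', '2p', '3s', '3p')
TABLE = {
    '1s': {'+1', '-2', '-3', '+sg', '-pl'},
    '1p': {'+1', '-2', '-3', '+pl', '-sg'},
    '2s': {'-1', '+2', '-3', '+sg', '-pl'},
    '2p': {'-1', '+2', '-3', '+pl', '-sg'},
    '3s': {'-1', '-2', '+3', '+sg', '-pl'},
    '3p': {'-1', '-2', '+3', '+pl', '-sg'},
}
EVERYTHING = frozenset(OBJECTS)


def extent(features):
    wanted = set(features)
    return frozenset(o for o in OBJECTS if wanted <= TABLE[o])


def incompatible(a, b):
    return not (extent(a) & extent(b))


def complement(a, b):
    return incompatible(a, b) and (extent(a) | extent(b)) == EVERYTHING


def subcontrary(a, b):
    return bool(extent(a) & extent(b)) and (extent(a) | extent(b)) == EVERYTHING


print(incompatible(['+1'], ['+3']))  # True
print(complement(['+sg'], ['+pl']))  # True
print(subcontrary(['-1'], ['-2']))   # True
# A feature set with fewer features has a larger extent:
print(extent([]) > extent(['-3']) > extent(['+1']) > extent(['+1', '+sg']))  # True
```

Note that the comparison runs the other way round on extents: the more general feature set is the one with the larger extent.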


Intersection (join, closest feature set that subsumes the given ones):

>>> fs('1sg') % fs('2sg')
FeatureSet('-3 +sg')

Intersect an iterable of feature sets:

>>> fs.join([fs('+1'), fs('+2'), fs('1sg')])
FeatureSet('-3')

Unification (meet, closest feature set that implies the given ones):

>>> fs('-1') ^ fs('-2')
FeatureSet('+3')

Unify an iterable of feature sets:

>>> fs.meet([fs('+1'), fs('+sg'), fs('-3')])
FeatureSet('+1 +sg')
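
In terms of extents, the join collects the objects of all operands and the meet keeps only the objects they have in common. The following Python 3 sketch illustrates this under that assumption; it returns full intents rather than the shortened notation the library prints, and the names join and meet are reused here only for readability.

```python
# Hand-copied from the 'plural' context; not read from the package itself.
OBJECTS = ('1s', '1p', '2s', '2p', '3s', '3p')
PROPERTIES = ('+1', '-1', '+2', '-2', '+3', '-3', '+sg', '+pl', '-sg', '-pl')
TABLE = {
    '1s': {'+1', '-2', '-3', '+sg', '-pl'},
    '1p': {'+1', '-2', '-3', '+pl', '-sg'},
    '2s': {'-1', '+2', '-3', '+sg', '-pl'},
    '2p': {'-1', '+2', '-3', '+pl', '-sg'},
    '3s': {'-1', '-2', '+3', '+sg', '-pl'},
    '3p': {'-1', '-2', '+3', '+pl', '-sg'},
}


def extent(features):
    wanted = set(features)
    return {o for o in OBJECTS if wanted <= TABLE[o]}


def intent(objects):
    objects = list(objects)
    return tuple(p for p in PROPERTIES if all(p in TABLE[o] for o in objects))


def join(*feature_sets):
    """Features shared by all objects that any operand refers to."""
    objects = set().union(*(extent(f) for f in feature_sets))
    return intent(objects)


def meet(*feature_sets):
    """Features shared by the objects that every operand refers to."""
    objects = set(OBJECTS).intersection(*(extent(f) for f in feature_sets))
    return intent(objects)


print(join(['+1', '+sg'], ['+2', '+sg']))  # ('-3', '+sg', '-pl')
print(meet(['-1'], ['-2']))                # ('-1', '-2', '+3')
```

The first result corresponds to FeatureSet('-3 +sg') above, since '-pl' is already implied by '+sg' in this context.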


Immediately implied/subsumed neighbors:

>>> fs('+1').upper_neighbors
[FeatureSet('-3'), FeatureSet('-2')]

>>> fs('+1').lower_neighbors
[FeatureSet('+1 +sg'), FeatureSet('+1 +pl')]

Complete set of implied/subsumed neighbors:

>>> fs('+1').upset
[FeatureSet('+1'), FeatureSet('-3'), FeatureSet('-2'), FeatureSet('')]

>>> fs('+1').downset  # doctest: +NORMALIZE_WHITESPACE
[FeatureSet('+1 -1 +2 -2 +3 -3 +sg +pl -sg -pl'),
 FeatureSet('+1 +sg'), FeatureSet('+1 +pl'), FeatureSet('+1')]
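
Neighbors, upsets and downsets follow from the inclusion order on extents. The Python 3 sketch below rebuilds all extents by brute force and filters them; it is an illustration under that assumption, not the lattice code from concepts, and every name in it is hypothetical.

```python
from itertools import combinations

# Hand-copied from the 'plural' context; not read from the package itself.
OBJECTS = ('1s', '1p', '2s', '2p', '3s', '3p')
TABLE = {
    '1s': {'+1', '-2', '-3', '+sg', '-pl'},
    '1p': {'+1', '-2', '-3', '+pl', '-sg'},
    '2s': {'-1', '+2', '-3', '+sg', '-pl'},
    '2p': {'-1', '+2', '-3', '+pl', '-sg'},
    '3s': {'-1', '-2', '+3', '+sg', '-pl'},
    '3p': {'-1', '-2', '+3', '+pl', '-sg'},
}
ALL_FEATURES = set.union(*TABLE.values())


def extent(features):
    wanted = set(features)
    return frozenset(o for o in OBJECTS if wanted <= TABLE[o])


def closure(objects):
    """Smallest extent of the lattice that contains the given objects."""
    objects = list(objects)
    shared = set.intersection(*(TABLE[o] for o in objects)) if objects else ALL_FEATURES
    return extent(shared)


EXTENTS = {closure(s) for n in range(len(OBJECTS) + 1)
           for s in combinations(OBJECTS, n)}


def upset(e):
    """The extent itself plus all more general ones (larger extents)."""
    return [x for x in EXTENTS if x >= e]


def downset(e):
    """The extent itself plus all more specific ones (smaller extents)."""
    return [x for x in EXTENTS if x <= e]


def upper_neighbors(e):
    above = [x for x in EXTENTS if x > e]
    return [x for x in above if not any(y < x for y in above)]


def lower_neighbors(e):
    below = [x for x in EXTENTS if x < e]
    return [x for x in below if not any(y > x for y in below)]


first_person = extent(['+1'])
print(len(upset(first_person)), len(downset(first_person)))  # 4 4
```

For '+1' this gives the extents of '-3' and '-2' as upper neighbors and those of '+1 +sg' and '+1 +pl' as lower neighbors, matching the output above.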


Create a graph of the feature system lattice:

>>> dot = fs.graphviz()

>>> print dot.source  # doctest: +ELLIPSIS, +NORMALIZE_WHITESPACE
// <FeatureSystem('plural') of 6 atoms 22 featuresets>
digraph plural {
graph [margin=0]
edge [arrowtail=none dir=back penwidth=.5]
    f0 [label="+1 &minus;1 +2 &minus;2 +3 &minus;3 +sg +pl &minus;sg &minus;pl"]
    f1 [label="+1 +sg"]
            f1 -> f0
    f2 [label="+1 +pl"]
            f2 -> f0
...

Check the documentation of this package for details on the resulting object.


Create an INI-file with your configurations, for example:

# phonemes.ini - define distinctive features

[vowels]
description = Distinctive vowel place features
str_maximal = true
context =
   |+high|-high|+low|-low|+back|-back|+round|-round|
  i|  X  |     |    |  X |     |  X  |      |  X   |
  y|  X  |     |    |  X |     |  X  |  X   |      |
  ɨ|  X  |     |    |  X |  X  |     |      |  X   |
  u|  X  |     |    |  X |  X  |     |  X   |      |
  e|     |  X  |    |  X |     |  X  |      |  X   |
  ø|     |  X  |    |  X |     |  X  |  X   |      |
  ʌ|     |  X  |    |  X |  X  |     |      |  X   |
  o|     |  X  |    |  X |  X  |     |  X   |      |
  æ|     |  X  |  X |    |     |  X  |      |  X   |
  œ|     |  X  |  X |    |     |  X  |  X   |      |
  ɑ|     |  X  |  X |    |  X  |     |      |  X   |
  ɒ|     |  X  |  X |    |  X  |     |  X   |      |
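
To make the table format concrete, here is a Python 3 sketch that reads such a section with the standard library's configparser and turns the context into a mapping from objects to features. It is not the loader used by features or fileconfig; the section name vowels_mini, the reduced four-vowel table and the function parse_context are assumptions made for illustration.

```python
import configparser

# Reduced, hypothetical example in the same layout as phonemes.ini.
INI_TEXT = """
[vowels_mini]
description = Reduced vowel table for illustration
str_maximal = true
context =
   |+high|-high|+back|-back|
  i|  X  |     |     |  X  |
  u|  X  |     |  X  |     |
  e|     |  X  |     |  X  |
  o|     |  X  |  X  |     |
"""


def parse_context(text):
    """Return (properties, table) from a '|'-separated cross table."""
    lines = [line for line in text.splitlines() if line.strip()]
    # Header row: empty object cell, then one cell per property.
    properties = [cell.strip() for cell in lines[0].split('|')[1:-1]]
    table = {}
    for line in lines[1:]:
        cells = line.split('|')
        name = cells[0].strip()
        table[name] = {prop for prop, cell in zip(properties, cells[1:-1])
                       if cell.strip().upper() == 'X'}
    return properties, table


parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)
section = parser['vowels_mini']
properties, table = parse_context(section['context'])

print(properties)                        # ['+high', '-high', '+back', '-back']
print(sorted(table['u']))                # ['+back', '+high']
print(section.getboolean('str_maximal')) # True
```

The indented lines after "context =" are read by configparser as one multi-line value, which is why the table rows need leading whitespace.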

Add your config file, overriding existing sections with the same name:

>>> features.Config.add('docs/phonemes.ini')

If the filename is relative, it is resolved relative to the file where the add method was called. Check the documentation of the fileconfig package for details.

Load your feature system:

>>> fs = features.FeatureSystem('vowels')

>>> fs
<FeatureSystem('vowels') of 12 atoms 55 featuresets>

Retrieve feature sets, extents and intents:

>>> print fs('+high')
[+high -low]

>>> fs('high round').concept.extent
(u'y', u'u')

>>> fs.lattice[('i', 'e', 'o')].intent
(u'-low',)

Logical relations between feature pairs:

>>> fs.context.relations()  # doctest: +ELLIPSIS, +NORMALIZE_WHITESPACE
[<u'+high' Complement u'-high'>, <u'+low' Complement u'-low'>,
 <u'+back' Complement u'-back'>, <u'+round' Complement u'-round'>,
 <u'+high' Incompatible u'+low'>,
 <u'+high' Implication u'-low'>, <u'+low' Implication u'-high'>,
 <u'-high' Subcontrary u'-low'>,
 <u'+high' Orthogonal u'+back'>, <u'+high' Orthogonal u'-back'>,
 ...]

See also

  • concepts – Formal Concept Analysis with Python
  • fileconfig – Config file sections as objects
  • graphviz – Simple Python interface for Graphviz


License

Features is distributed under the MIT license.

Source distribution (108.1 kB), uploaded Jan 20, 2014.