Skip to main content

A logical phonology library

Project description

logical-phonology

A Python library implementing the core primitives and operations of Logical Phonology (LP). Given a universal feature set, the library provides a structured way to build and manipulate the formal objects that LP is built on — segments, natural classes, natural class sequences, words, and segment inventories.

All objects are constructed through a central FeatureSystem instance, which enforces feature validity and provides a clean, consistent API for building LP structures.

Logical Phonology is a formal framework for phonological computation. For an introduction to the theory and its literature, see the Logical Phonology Archive.

Installation

pip install logical-phonology

Development

To install the development dependencies:

pip install -e ".[dev]"

Available scripts:

  • hatch run make_api_reference — regenerate the API reference docs
  • ./scripts/publish.sh — bump version, rebuild docs, and publish to PyPI

Quick Start

Defining a Feature System

All objects in logical-phonology are built from a FeatureSystem, which defines the universal set of valid features for your analysis:

import logical_phonology as lp

fs = lp.FeatureSystem.from_features(["F1", "F2", "F3"])

Building Segments

Segments are feature bundles constructed through the feature system:

seg1 = fs.segment({"F1": lp.POS, "F2": lp.NEG})
seg2 = fs.segment({"F1": lp.NEG})  # partial specification is allowed
seg3 = fs.segment({})              # fully underspecified segment

Building an Inventory

An inventory gives names to segments, enabling tokenization and rendering. Here we use an example with Syllabic, High, Low, and Back features:

fs = lp.FeatureSystem(frozenset(["Syllabic", "High", "Low", "Back"]))

inv = fs.inventory({
    "a": fs.segment({"Syllabic": lp.POS, "Low": lp.POS, "High": lp.NEG, "Back": lp.NEG}),
    "i": fs.segment({"Syllabic": lp.POS, "Low": lp.NEG, "High": lp.POS, "Back": lp.NEG}),
    "u": fs.segment({"Syllabic": lp.POS, "Low": lp.NEG, "High": lp.POS, "Back": lp.POS}),
    "p": fs.segment({"Syllabic": lp.NEG}),
    "t": fs.segment({"Syllabic": lp.NEG}),  # alias for same segment as "p"
    "k": fs.segment({"Syllabic": lp.NEG}),  # alias for same segment as "p"
})

Segments can be looked up by name, returning the underlying feature bundle:

inv["a"]  # returns Segment({"Syllabic": POS, "Low": POS, "High": NEG, "Back": NEG})
inv["p"]  # returns Segment({"Syllabic": NEG})

Since p, t, and k all map to a segment with the same specification, name_of returns the canonical form for all of them. On the other hand, a, i, and u are fully distinguishable from every other segment:

inv.name_of(inv["p"])  # returns "{-Syllabic}"
inv.name_of(inv["t"])  # returns "{-Syllabic}"
inv.name_of(inv["k"])  # returns "{-Syllabic}"
inv.name_of(inv["a"])  # returns "a"
inv.name_of(inv["i"])  # returns "i"
inv.name_of(inv["u"])  # returns "u"

Extending an Inventory

Inventories are immutable — extend returns a new inventory with additional segments:

inv2 = inv.extend({
    "e": fs.segment({"Syllabic": lp.POS, "Low": lp.NEG, "High": lp.NEG, "Back": lp.NEG}),
})

"e" in inv   # False — original inventory unchanged
"e" in inv2  # True

Building Words

Words are ordered sequences of segments. They can be constructed manually through the feature system:

word = fs.word([
    fs.segment({"Syllabic": lp.NEG}), 
    fs.segment({"Syllabic": lp.POS, "Low": lp.POS, "High": lp.NEG, "Back": lp.NEG}), 
    fs.segment({"Syllabic": lp.NEG})
])
# OR, more conveniently...
word = fs.word([inv["p"], inv["a"], inv["t"]])  # "pat"

Or even more conveniently, words can be tokenized directly from a string using the inventory:

word = inv.tokenize("pat")   # unspaced — uses recursive tokenization
word = inv.tokenize("p a t") # spaced — uses whitespace as delimiter

Rendering converts a word back to a string:

inv.render(word)  # returns '{-Syllabic}a{-Syllabic}', as p and t are aliases of the same segment

Note: If you want render to always produce human-readable names, avoid aliases in your inventory. This can be enforced by passing allow_aliases=False to fs.inventory(). See the API reference for details.

Boundaries

Reserved boundary pseudo-segments can be added to mark the beginning and end of a word:

bounded = fs.add_boundaries(word)  # ⋉pat⋊
inv.render(bounded)                # returns "⋉pat⋊"

Boundaries are pseudo-segments with reserved features — BOS for and EOS for — and can be accessed directly from the feature system or inventory:

fs.BOS  # Segment({"BOS": POS})
fs.EOS  # Segment({"EOS": POS})

Bounded words can also be tokenized directly:

inv.tokenize("⋉pat⋊")  # returns Word with boundaries included

Natural Classes and Natural Class Sequences

A natural class is a partial feature specification that defines a set of segments. Natural classes are constructed through the feature system:

syllabic = fs.natural_class({"Syllabic": lp.POS})   # all vowels
consonant = fs.natural_class({"Syllabic": lp.NEG})  # all consonants
universal_natural_class = fs.natural_class({})      # matches any segment

Membership is checked with the in operator:

inv["a"] in syllabic   # True
inv["p"] in syllabic   # False
inv["p"] in consonant  # True

To iterate over all segments in a natural class, use over() with an inventory:

for seg in syllabic.over(inv):
    print(inv.name_of(seg))  # prints "a", "i", "u"

A natural class sequence is an ordered list of natural classes, defining a set of words. They are useful for matching patterns like CV syllables:

cv = fs.natural_class_sequence([consonant, syllabic])

inv["p"] in consonant  # True — single segment membership
inv.tokenize("pa") in cv  # True — word matches CV pattern
inv.tokenize("ap") in cv  # False — VC, not CV

To iterate over all words matching a sequence over an inventory of segments:

for word in cv.over(inv):
    print(inv.render(word))  # prints all CV combinations

Natural class sequences also support substring matching via matches_at and find_all:

word = inv.tokenize("pat")
consonant_nc = fs.natural_class_sequence([consonant])
consonant_nc.find_all(word)  # returns [0, 2] — consonants at positions 0 and 2
cv.find_all(word)  # returns [0] — CV pattern only at position 0

Design Philosophy

Single entry point. FeatureSystem is the central factory object for the library. All LP primitives are constructed through a FeatureSystem instance, ensuring that feature validity is enforced at construction time.

It is possible to construct objects directly without a FeatureSystem, but this bypasses validation and is discouraged.

Immutability. All objects are frozen dataclasses. Once constructed, nothing can be mutated. This makes objects safe to use as dictionary keys and in sets, and ensures that operations like subtract, unify, and project return new segments rather than modifying existing ones.

Make illegal states unrepresentable. Reserved features cannot appear in user-defined segments. Unknown features are rejected at construction. Aliased segments receive canonical forms automatically.

Partial specifications. Segments can be underspecified. A segment need not specify every feature, and a natural class only specifies the features it cares about.

Separation of concerns. This library is purely about LP primitives and operations.

Structured exceptions. Every error carries structured attributes so callers can handle errors programmatically rather than parsing error strings.

Zero dependencies. The library is implemented entirely in the Python standard library, there are no external runtime dependencies.

API Reference

Full API documentation is available in the API Reference.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

logical_phonology-0.10.1.tar.gz (39.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

logical_phonology-0.10.1-py3-none-any.whl (28.5 kB view details)

Uploaded Python 3

File details

Details for the file logical_phonology-0.10.1.tar.gz.

File metadata

  • Download URL: logical_phonology-0.10.1.tar.gz
  • Upload date:
  • Size: 39.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.15

File hashes

Hashes for logical_phonology-0.10.1.tar.gz
Algorithm Hash digest
SHA256 1830ec9649ef745b18ca6a9f116a4c65b69c97bde07b5db1c06d4445d78ae60a
MD5 64d84972d90351b990e8c947c2d7c24d
BLAKE2b-256 aa912d29bb22c7b1600054c2bdf97f52a13e10007bbebdb89b77cfec0f4900d0

See more details on using hashes here.

File details

Details for the file logical_phonology-0.10.1-py3-none-any.whl.

File metadata

File hashes

Hashes for logical_phonology-0.10.1-py3-none-any.whl
Algorithm Hash digest
SHA256 17554e07766d1829c7af4e657165b38230d0990fd858af295a0e3f347d12aa2a
MD5 a18e5562778aca08ffda5cba20c472bb
BLAKE2b-256 27c1aff45aa84e9f08dcd9479f237a2d5dd909a930c22dac9919b23d63a9db28

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page