A logical phonology library

logical-phonology

A Python library implementing the core primitives and operations of Logical Phonology (LP). Given a universal feature set, the library provides a structured way to build and manipulate the formal objects that LP is built on — segments, natural classes, natural class sequences, words, and segment inventories.

All objects are constructed through a central FeatureSystem instance, which enforces feature validity and provides a clean, consistent API for building LP structures.

Logical Phonology is a formal framework for phonological computation. For an introduction to the theory and its literature, see the Logical Phonology Archive.

Installation

pip install logical-phonology

Development

To install the development dependencies:

pip install -e ".[dev]"

Available scripts:

  • hatch run make_api_reference — regenerate the API reference docs
  • ./scripts/publish.sh — bump version, rebuild docs, and publish to PyPI

Quick Start

Defining a Feature System

All objects in logical-phonology are built from a FeatureSystem, which defines the universal set of valid features for your analysis:

import logical_phonology as lp

fs = lp.FeatureSystem(frozenset(["F1", "F2", "F3"]))

Building Segments

Segments are feature bundles constructed through the feature system:

seg1 = fs.segment({"F1": lp.POS, "F2": lp.NEG})
seg2 = fs.segment({"F1": lp.NEG})  # partial specification is allowed
seg3 = fs.segment({})              # fully underspecified segment

Building an Inventory

An inventory gives names to segments, enabling tokenization and rendering. Here we use an example with Syllabic, High, Low, and Back features:

fs = lp.FeatureSystem(frozenset(["Syllabic", "High", "Low", "Back"]))

inv = fs.inventory({
    "a": fs.segment({"Syllabic": lp.POS, "Low": lp.POS, "High": lp.NEG, "Back": lp.NEG}),
    "i": fs.segment({"Syllabic": lp.POS, "Low": lp.NEG, "High": lp.POS, "Back": lp.NEG}),
    "u": fs.segment({"Syllabic": lp.POS, "Low": lp.NEG, "High": lp.POS, "Back": lp.POS}),
    "p": fs.segment({"Syllabic": lp.NEG}),
    "t": fs.segment({"Syllabic": lp.NEG}),  # alias for same segment as "p"
    "k": fs.segment({"Syllabic": lp.NEG}),  # alias for same segment as "p"
})

Segments can be looked up by name, returning the underlying feature bundle:

inv["a"]  # returns Segment({"Syllabic": POS, "Low": POS, "High": NEG, "Back": NEG})
inv["p"]  # returns Segment({"Syllabic": NEG})

Since p, t, and k all map to segments with the same specification, name_of cannot pick one name over the others and instead returns the canonical feature-bundle form for all three. By contrast, a, i, and u each have a unique specification, so name_of returns their names:

inv.name_of(inv["p"])  # returns "{-Syllabic}"
inv.name_of(inv["t"])  # returns "{-Syllabic}"
inv.name_of(inv["k"])  # returns "{-Syllabic}"
inv.name_of(inv["a"])  # returns "a"
inv.name_of(inv["i"])  # returns "i"
inv.name_of(inv["u"])  # returns "u"
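The naming behaviour above can be pictured with a small stand-alone sketch. It does not use or reproduce the library's code: plain dicts stand in for segments, `canonical_names` is a hypothetical helper, and the comma-separated, alphabetically sorted layout for multi-feature bundles is an assumption (the README only shows the single-feature form `{-Syllabic}`).

```python
from collections import defaultdict

POS, NEG = "+", "-"  # stand-ins for lp.POS and lp.NEG

def canonical_names(inventory):
    """Map each feature bundle to one display name (hypothetical helper)."""
    names_by_bundle = defaultdict(list)
    for name, features in inventory.items():
        # frozenset of (feature, value) pairs makes the bundle hashable
        names_by_bundle[frozenset(features.items())].append(name)

    display = {}
    for bundle, names in names_by_bundle.items():
        if len(names) == 1:
            # Exactly one name points at this bundle, so the name is unambiguous.
            display[bundle] = names[0]
        else:
            # Several aliases share the bundle: fall back to a feature listing.
            listing = ",".join(value + feature for feature, value in sorted(bundle))
            display[bundle] = "{" + listing + "}"
    return display

toy_inventory = {
    "a": {"Syllabic": POS, "Low": POS},
    "p": {"Syllabic": NEG},
    "t": {"Syllabic": NEG},  # alias of "p"
}
display = canonical_names(toy_inventory)
print(display[frozenset({("Syllabic", NEG)})])                 # {-Syllabic}
print(display[frozenset({("Syllabic", POS), ("Low", POS)})])   # a
```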

Extending an Inventory

Inventories are immutable — extend returns a new inventory with additional segments:

inv2 = inv.extend({
    "e": fs.segment({"Syllabic": lp.POS, "Low": lp.NEG, "High": lp.NEG, "Back": lp.NEG}),
})

"e" in inv   # False — original inventory unchanged
"e" in inv2  # True

Building Words

Words are ordered sequences of segments. They can be constructed manually through the feature system:

word = fs.word([
    fs.segment({"Syllabic": lp.NEG}),
    fs.segment({"Syllabic": lp.POS, "Low": lp.POS, "High": lp.NEG, "Back": lp.NEG}),
    fs.segment({"Syllabic": lp.NEG}),
])
# OR, more conveniently...
word = fs.word([inv["p"], inv["a"], inv["t"]])  # "pat"

Or even more conveniently, words can be tokenized directly from a string using the inventory:

word = inv.tokenize("pat")   # unspaced — uses recursive tokenization
word = inv.tokenize("p a t") # spaced — uses whitespace as delimiter
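To give a feel for what recursive tokenization of an unspaced string involves, here is a stand-alone sketch. It is not the library's implementation. The function `tokenize_names`, the multi-character name "ts", the longest-name-first preference, and the `None` return on failure are all assumptions made for illustration.

```python
def tokenize_names(text, names):
    """Split text into a list of known names, or return None if no split exists."""
    if text == "":
        return []
    # Assumption: try longer names first so "ts" wins over "t" followed by "s".
    for name in sorted(names, key=len, reverse=True):
        if text.startswith(name):
            rest = tokenize_names(text[len(name):], names)
            if rest is not None:
                return [name] + rest
            # Otherwise backtrack and try the next candidate name.
    return None

names = ["p", "a", "t", "ts"]  # "ts" is a hypothetical two-character name
print(tokenize_names("pat", names))   # ['p', 'a', 't']
print(tokenize_names("pats", names))  # ['p', 'a', 'ts']
print(tokenize_names("pax", names))   # None
print("p a t".split())                # ['p', 'a', 't']
```

The last line shows why the spaced form needs no recursion: whitespace already marks every name boundary.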

Rendering converts a word back to a string:

inv.render(word)  # returns '{-Syllabic}a{-Syllabic}', as p and t are aliases of the same segment

Note: If you want render to always produce human-readable names, avoid aliases in your inventory. This can be enforced by passing allow_aliases=False to fs.inventory(). See the API reference for details.

Boundaries

Reserved boundary pseudo-segments can be added to mark the beginning and end of a word:

bounded = fs.add_boundaries(word)  # ⋉pat⋊
inv.render(bounded)                # returns "⋉pat⋊"

Boundaries are pseudo-segments with reserved features (BOS for ⋉ and EOS for ⋊) and can be accessed directly from the feature system or inventory:

fs.BOS  # Segment({"BOS": POS})
fs.EOS  # Segment({"EOS": POS})

Bounded words can also be tokenized directly:

inv.tokenize("⋉pat⋊")  # returns Word with boundaries included

Natural Classes and Natural Class Sequences

A natural class is a partial feature specification that defines a set of segments. Natural classes are constructed through the feature system:

syllabic = fs.natural_class({"Syllabic": lp.POS})   # all vowels
consonant = fs.natural_class({"Syllabic": lp.NEG})  # all consonants
universal_natural_class = fs.natural_class({})      # matches any segment

Membership is checked with the in operator:

inv["a"] in syllabic   # True
inv["p"] in syllabic   # False
inv["p"] in consonant  # True
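One way to read these membership results: a segment belongs to a natural class when it carries every feature value the class specifies. The sketch below illustrates that reading with plain dicts. It is a stand-alone approximation, not the library's code, and `in_class` is a hypothetical helper.

```python
POS, NEG = "+", "-"  # stand-ins for lp.POS and lp.NEG

def in_class(segment, natural_class):
    """True when the segment has every feature value the class asks for."""
    return all(segment.get(feature) == value
               for feature, value in natural_class.items())

a = {"Syllabic": POS, "Low": POS, "High": NEG, "Back": NEG}
p = {"Syllabic": NEG}
syllabic = {"Syllabic": POS}
consonant = {"Syllabic": NEG}

print(in_class(a, syllabic))   # True
print(in_class(p, syllabic))   # False
print(in_class(p, consonant))  # True
print(in_class(p, {}))         # True: the empty specification matches any segment
```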

To iterate over all segments in a natural class, use over() with an inventory:

for seg in syllabic.over(inv):
    print(inv.name_of(seg))  # prints "a", "i", "u"

A natural class sequence is an ordered list of natural classes, defining a set of words. Natural class sequences are useful for matching patterns like CV syllables:

cv = fs.natural_class_sequence([consonant, syllabic])

inv["p"] in consonant  # True — single segment membership
inv.tokenize("pa") in cv  # True — word matches CV pattern
inv.tokenize("ap") in cv  # False — VC, not CV

To iterate over all words matching a sequence over an inventory of segments:

for word in cv.over(inv):
    print(inv.render(word))  # prints all CV combinations
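Enumerating every word that fits a sequence amounts to taking the Cartesian product of the segments allowed in each slot. The following stand-alone sketch shows the idea with `itertools.product`. The helpers `in_class` and `words_over` are hypothetical, and the toy inventory deliberately has no aliases so that names and segments line up one to one.

```python
from itertools import product

POS, NEG = "+", "-"  # stand-ins for lp.POS and lp.NEG

def in_class(segment, natural_class):
    return all(segment.get(f) == v for f, v in natural_class.items())

def words_over(sequence, inventory):
    """List every name string whose i-th segment fits the i-th natural class."""
    slots = [
        [name for name, seg in inventory.items() if in_class(seg, nc)]
        for nc in sequence
    ]
    return ["".join(combo) for combo in product(*slots)]

toy_inventory = {
    "a": {"Syllabic": POS, "Low": POS, "High": NEG, "Back": NEG},
    "i": {"Syllabic": POS, "Low": NEG, "High": POS, "Back": NEG},
    "u": {"Syllabic": POS, "Low": NEG, "High": POS, "Back": POS},
    "p": {"Syllabic": NEG},
}
cv = [{"Syllabic": NEG}, {"Syllabic": POS}]
print(words_over(cv, toy_inventory))  # ['pa', 'pi', 'pu']
```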

Natural class sequences also support substring matching via matches_at and find_all:

word = inv.tokenize("pat")
consonant_nc = fs.natural_class_sequence([consonant])
consonant_nc.find_all(word)  # returns [0, 2] — consonants at positions 0 and 2
cv.find_all(word)  # returns [0] — CV pattern only at position 0
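Substring matching of this kind can be understood as sliding a window the length of the sequence across the word and testing each position. The sketch below is a stand-alone illustration with plain lists and dicts. It reuses the names matches_at and find_all for readability only; the signatures here are assumptions, not the library's API.

```python
POS, NEG = "+", "-"  # stand-ins for lp.POS and lp.NEG

def in_class(segment, natural_class):
    return all(segment.get(f) == v for f, v in natural_class.items())

def matches_at(sequence, word, start):
    """True when the sequence fits the word beginning at index start."""
    if start < 0 or start + len(sequence) > len(word):
        return False
    return all(in_class(word[start + i], nc) for i, nc in enumerate(sequence))

def find_all(sequence, word):
    """Return every start index at which the sequence matches."""
    last_start = len(word) - len(sequence)
    return [i for i in range(last_start + 1) if matches_at(sequence, word, i)]

p = {"Syllabic": NEG}
a = {"Syllabic": POS, "Low": POS, "High": NEG, "Back": NEG}
pat = [p, a, p]  # "t" shares its specification with "p" in the README inventory

consonant = {"Syllabic": NEG}
syllabic = {"Syllabic": POS}
print(find_all([consonant], pat))            # [0, 2]
print(find_all([consonant, syllabic], pat))  # [0]
```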

Design Philosophy

Single entry point. FeatureSystem is the central factory object for the library. All LP primitives are constructed through a FeatureSystem instance, ensuring that feature validity is enforced at construction time.

It is possible to construct objects directly without a FeatureSystem, but this bypasses validation and is discouraged.

Immutability. All objects are frozen dataclasses. Once constructed, nothing can be mutated. This makes objects safe to use as dictionary keys and in sets, and ensures that operations like subtract, unify, and project return new segments rather than modifying existing ones.
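The practical effect of frozen dataclasses can be seen in a small stand-alone example. The `Bundle` class and its `with_feature` method are hypothetical and only mimic the pattern; they are not the library's Segment class or any of its operations.

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class Bundle:
    # A frozenset of (feature, value) pairs keeps the object hashable.
    features: frozenset

    def with_feature(self, feature, value):
        """Return a new Bundle; the original is left untouched."""
        kept = {(f, v) for f, v in self.features if f != feature}
        return Bundle(frozenset(kept | {(feature, value)}))

original = Bundle(frozenset({("F1", "+")}))
updated = original.with_feature("F2", "-")

try:
    original.features = frozenset()  # assignment on a frozen instance
except FrozenInstanceError:
    mutation_blocked = True
else:
    mutation_blocked = False

lookup = {original: "first", updated: "second"}  # usable as dictionary keys
print(mutation_blocked)   # True
print(lookup[original])   # first
```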

Make illegal states unrepresentable. Reserved features cannot appear in user-defined segments. Unknown features are rejected at construction. Aliased segments receive canonical forms automatically.

Partial specifications. Segments can be underspecified. A segment need not specify every feature, and a natural class only specifies the features it cares about.

Separation of concerns. This library is purely about LP primitives and operations.

Structured exceptions. Every error carries structured attributes so callers can handle errors programmatically rather than parsing error strings.
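As an illustration of the pattern, an exception can expose the offending value as an attribute so that callers never need to parse the message. The class `UnknownFeatureError`, its attributes, and the `validate` helper below are hypothetical names chosen for this sketch; consult the API reference for the library's actual exception types.

```python
class UnknownFeatureError(ValueError):
    """Hypothetical error carrying the offending feature as data."""

    def __init__(self, feature, valid_features):
        super().__init__(f"unknown feature: {feature!r}")
        self.feature = feature
        self.valid_features = valid_features

def validate(features, valid_features):
    """Reject any feature name that the feature system does not define."""
    for feature in features:
        if feature not in valid_features:
            raise UnknownFeatureError(feature, valid_features)
    return dict(features)

valid = frozenset({"F1", "F2", "F3"})
try:
    validate({"F9": "+"}, valid)
except UnknownFeatureError as err:
    offending = err.feature  # read the attribute instead of parsing str(err)

print(offending)  # F9
```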

Zero dependencies. The library is implemented entirely in the Python standard library; there are no external runtime dependencies.

API Reference

Full API documentation is available in the API Reference.

License

This project is licensed under the MIT License. See the LICENSE file for details.
