logical-phonology

A Python library implementing the core primitives and operations of Logical Phonology (LP). Given a universal feature set, the library provides a structured way to build and manipulate the formal objects that LP is built on — segments, natural classes, natural class sequences, words, and segment inventories.

All objects are constructed through a central FeatureSystem instance, which enforces feature validity and provides a clean, consistent API for building LP structures.

Logical Phonology is a formal framework for phonological computation. For an introduction to the theory and its literature, see the Logical Phonology Archive.

Installation

pip install logical-phonology

Development

To install the development dependencies:

pip install -e ".[dev]"

Available scripts:

  • hatch run make_api_reference — regenerate the API reference docs
  • hatch run publish — bump version, rebuild docs, and publish to PyPI

Quick Start

Defining a Feature System

All objects in logical-phonology are built from a FeatureSystem, which defines the universal set of valid features for your analysis:

import logical_phonology as lp

fs = lp.FeatureSystem(frozenset(["F1", "F2", "F3"]))

Building Segments

Segments are feature bundles constructed through the feature system:

seg1 = fs.segment({"F1": lp.POS, "F2": lp.NEG})
seg2 = fs.segment({"F1": lp.NEG})  # partial specification is allowed
seg3 = fs.segment({})              # fully underspecified segment

Building an Inventory

An inventory gives names to segments, enabling tokenization and rendering. Here we use an example with Syllabic, High, Low, and Back features:

fs = lp.FeatureSystem(frozenset(["Syllabic", "High", "Low", "Back"]))

inv = fs.inventory({
    "a": fs.segment({"Syllabic": lp.POS, "Low": lp.POS, "High": lp.NEG, "Back": lp.NEG}),
    "i": fs.segment({"Syllabic": lp.POS, "Low": lp.NEG, "High": lp.POS, "Back": lp.NEG}),
    "u": fs.segment({"Syllabic": lp.POS, "Low": lp.NEG, "High": lp.POS, "Back": lp.POS}),
    "p": fs.segment({"Syllabic": lp.NEG}),
    "t": fs.segment({"Syllabic": lp.NEG}),  # alias for same segment as "p"
    "k": fs.segment({"Syllabic": lp.NEG}),  # alias for same segment as "p"
})

Segments can be looked up by name, returning the underlying feature bundle:

inv["a"]  # returns Segment({"Syllabic": POS, "Low": POS, "High": NEG, "Back": NEG})
inv["p"]  # returns Segment({"Syllabic": NEG})

Since p, t, and k all map to a segment with the same specification, name_of returns the canonical form for all of them. On the other hand, a, i, and u are fully distinguishable from every other segment:

inv.name_of(inv["p"])  # returns "{-Syllabic}"
inv.name_of(inv["t"])  # returns "{-Syllabic}"
inv.name_of(inv["k"])  # returns "{-Syllabic}"
inv.name_of(inv["a"])  # returns "a"
inv.name_of(inv["i"])  # returns "i"
inv.name_of(inv["u"])  # returns "u"

Extending an Inventory

Inventories are immutable — extend returns a new inventory with additional segments:

inv2 = inv.extend({
    "e": fs.segment({"Syllabic": lp.POS, "Low": lp.NEG, "High": lp.NEG, "Back": lp.NEG}),
})

"e" in inv   # False — original inventory unchanged
"e" in inv2  # True

Building Words

Words are ordered sequences of segments. They can be constructed manually through the feature system:

word = fs.word([
    fs.segment({"Syllabic": lp.NEG}), 
    fs.segment({"Syllabic": lp.POS, "Low": lp.POS, "High": lp.NEG, "Back": lp.NEG}), 
    fs.segment({"Syllabic": lp.NEG})
])
# OR, more conveniently...
word = fs.word([inv["p"], inv["a"], inv["t"]])  # "pat"

Or even more conveniently, words can be tokenized directly from a string using the inventory:

word = inv.tokenize("pat")   # unspaced — uses recursive tokenization
word = inv.tokenize("p a t") # spaced — uses whitespace as delimiter
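The two tokenization modes can be pictured with a small stand-alone sketch. It works on symbol names only and does not use logical_phonology; the `tokenize` function below, its longest-match-first preference, and its `None` return on failure are illustrative assumptions rather than the library's documented behavior.

```python
# Hypothetical sketch of tokenizing a string into known symbol names.
# The real inv.tokenize may resolve ambiguity or report errors differently.
from typing import Optional


def tokenize(text: str, names: set[str]) -> Optional[list[str]]:
    """Split text into known names, backtracking when a choice dead-ends."""
    if " " in text:
        # Spaced input: whitespace is the delimiter.
        parts = text.split()
        return parts if all(part in names for part in parts) else None
    if text == "":
        return []
    # Unspaced input: try longer names first, then recurse on the remainder.
    for end in range(len(text), 0, -1):
        head = text[:end]
        if head in names:
            rest = tokenize(text[end:], names)
            if rest is not None:
                return [head] + rest
    return None


names = {"a", "i", "u", "p", "t", "k"}
print(tokenize("pat", names))    # ['p', 'a', 't']
print(tokenize("p a t", names))  # ['p', 'a', 't']
print(tokenize("pxt", names))    # None
```

Backtracking matters once names overlap: with the names "ab", "a", and "bc", the input "abc" only tokenizes as "a" followed by "bc".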

Rendering converts a word back to a string:

inv.render(word)  # returns '{-Syllabic}a{-Syllabic}', as p and t are aliases of the same segment

Note: If you want render to always produce human-readable names, avoid aliases in your inventory. This can be enforced by passing allow_aliases=False to fs.inventory(). See the API reference for details.

Boundaries

Reserved boundary pseudo-segments can be added to mark the beginning and end of a word:

bounded = fs.add_boundaries(word)  # ⋉pat⋊
inv.render(bounded)                # returns "⋉{-Syllabic}a{-Syllabic}⋊" here, since p and t are aliases; "⋉pat⋊" with an alias-free inventory

Boundaries are pseudo-segments with reserved features, BOS for ⋉ and EOS for ⋊, and can be accessed directly from the feature system or inventory:

fs.BOS  # Segment({"BOS": POS})
fs.EOS  # Segment({"EOS": POS})

Bounded words can also be tokenized directly:

inv.tokenize("⋉pat⋊")  # returns Word with boundaries included

Natural Classes and Natural Class Sequences

A natural class is a partial feature specification that defines a set of segments. Natural classes are constructed through the feature system:

syllabic = fs.natural_class({"Syllabic": lp.POS})   # all vowels
consonant = fs.natural_class({"Syllabic": lp.NEG})  # all consonants
universal_natural_class = fs.natural_class({})      # matches any segment

Membership is checked with the in operator:

inv["a"] in syllabic   # True
inv["p"] in syllabic   # False
inv["p"] in consonant  # True
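One way to read these results is as a subset test: a segment belongs to a natural class when it agrees with every feature value the class specifies. The sketch below models this with plain dictionaries. The `is_member` helper and the "+" and "-" string values are assumptions made for illustration; they are not part of the library's API.

```python
# Hypothetical stand-alone model of natural class membership. It does not
# use logical_phonology and makes no claim about the library's internals.
POS, NEG = "+", "-"


def is_member(segment: dict[str, str], natural_class: dict[str, str]) -> bool:
    """Return True if the segment agrees with every feature the class specifies."""
    return all(segment.get(feature) == value for feature, value in natural_class.items())


a = {"Syllabic": POS, "Low": POS, "High": NEG, "Back": NEG}
p = {"Syllabic": NEG}
syllabic = {"Syllabic": POS}
consonant = {"Syllabic": NEG}
universal = {}  # specifies nothing, so every segment qualifies

print(is_member(a, syllabic))   # True
print(is_member(p, syllabic))   # False
print(is_member(p, universal))  # True
```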

To iterate over all segments in a natural class, use over() with an inventory:

for seg in syllabic.over(inv):
    print(inv.name_of(seg))  # prints "a", "i", "u"

A natural class sequence is an ordered list of natural classes, defining a set of words. They are useful for matching patterns like CV syllables:

cv = fs.natural_class_sequence([consonant, syllabic])

inv["p"] in consonant  # True — single segment membership
inv.tokenize("pa") in cv  # True — word matches CV pattern
inv.tokenize("ap") in cv  # False — VC, not CV

To iterate over all words matching a sequence over an inventory of segments:

for word in cv.over(inv):
    print(inv.render(word))  # prints all CV combinations
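Enumerating a sequence over an inventory amounts to a Cartesian product: pick one matching segment per position, in every possible combination. The following sketch shows the idea on symbol names with a small alias-free inventory. The `names_in` and `words_over` helpers and the dictionary representation are hypothetical and stand apart from the library.

```python
from itertools import product

POS, NEG = "+", "-"

# Hypothetical alias-free inventory; the dict representation is illustrative.
inventory = {
    "a": {"Syllabic": POS, "Low": POS, "High": NEG, "Back": NEG},
    "i": {"Syllabic": POS, "Low": NEG, "High": POS, "Back": NEG},
    "u": {"Syllabic": POS, "Low": NEG, "High": POS, "Back": POS},
    "p": {"Syllabic": NEG},
}


def names_in(natural_class, inventory):
    """Names of inventory segments that agree with every feature in the class."""
    return [
        name for name, seg in inventory.items()
        if all(seg.get(f) == v for f, v in natural_class.items())
    ]


def words_over(pattern, inventory):
    """Yield every string whose i-th symbol belongs to the i-th natural class."""
    for combo in product(*(names_in(nc, inventory) for nc in pattern)):
        yield "".join(combo)


cv = [{"Syllabic": NEG}, {"Syllabic": POS}]
print(list(words_over(cv, inventory)))  # ['pa', 'pi', 'pu']
```

The number of results is the product of the class sizes at each position, so long sequences over large inventories grow quickly.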

Natural class sequences also support substring matching via matches_at and find_all:

word = inv.tokenize("pat")
consonant_nc = fs.natural_class_sequence([consonant])
consonant_nc.find_all(word)  # returns [0, 2] — consonants at positions 0 and 2
cv.find_all(word)  # returns [0] — CV pattern only at position 0
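Substring matching of this kind can be understood as a sliding window: test the pattern at each start position and collect the positions where every class accepts the segment under it. The sketch below reproduces the two results above with plain dictionaries. Its `matches_at` and `find_all` functions are stand-alone illustrations that borrow the method names; their signatures are assumptions and may not match the library's.

```python
POS, NEG = "+", "-"

Bundle = dict[str, str]  # hypothetical representation of a feature bundle


def in_class(segment: Bundle, natural_class: Bundle) -> bool:
    """A segment is in a class if it agrees with every specified feature."""
    return all(segment.get(f) == v for f, v in natural_class.items())


def matches_at(pattern: list[Bundle], word: list[Bundle], start: int) -> bool:
    """Check whether the pattern matches the word beginning at index start."""
    window = word[start:start + len(pattern)]
    return len(window) == len(pattern) and all(
        in_class(seg, nc) for seg, nc in zip(window, pattern)
    )


def find_all(pattern: list[Bundle], word: list[Bundle]) -> list[int]:
    """Return every start index at which the pattern matches."""
    return [i for i in range(len(word) - len(pattern) + 1) if matches_at(pattern, word, i)]


p = {"Syllabic": NEG}
a = {"Syllabic": POS, "Low": POS, "High": NEG, "Back": NEG}
t = {"Syllabic": NEG}
word = [p, a, t]
consonant = {"Syllabic": NEG}
syllabic = {"Syllabic": POS}

print(find_all([consonant], word))            # [0, 2]
print(find_all([consonant, syllabic], word))  # [0]
```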

Design Philosophy

Single entry point. FeatureSystem is the central factory object for the library. All LP primitives are constructed through a FeatureSystem instance, ensuring that feature validity is enforced at construction time.

It is possible to construct objects directly without a FeatureSystem, but this bypasses validation and is discouraged.

Immutability. All objects are frozen dataclasses. Once constructed, nothing can be mutated. This makes objects safe to use as dictionary keys and in sets, and ensures that operations like subtract, unify, and project return new segments rather than modifying existing ones.
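The practical effect of frozen dataclasses can be seen in a few lines of standard-library Python. The `Bundle` class and its `with_feature` method below are hypothetical stand-ins, not the library's Segment class or any of its operations.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Bundle:
    """Hypothetical stand-in for an immutable feature bundle."""

    features: frozenset[tuple[str, str]]

    def with_feature(self, name: str, value: str) -> "Bundle":
        # Build and return a new object; the receiver is left unchanged.
        kept = {(n, v) for n, v in self.features if n != name}
        return Bundle(frozenset(kept | {(name, value)}))


low = Bundle(frozenset({("Low", "+")}))
low_back = low.with_feature("Back", "-")

try:
    low.features = frozenset()  # assignment on a frozen instance raises
except FrozenInstanceError:
    mutated = False
else:
    mutated = True

# Frozen dataclasses with hashable fields can serve as dictionary keys.
counts = {low: 1, low_back: 2}
print(mutated)                                       # False
print(counts[Bundle(frozenset({("Low", "+")}))])     # 1
```

Equal bundles hash alike, which is why a freshly built `Bundle` with the same features finds the existing dictionary entry.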

Make illegal states unrepresentable. Reserved features cannot appear in user-defined segments. Unknown features are rejected at construction. Aliased segments receive canonical forms automatically.

Partial specifications. Segments can be underspecified. A segment need not specify every feature, and a natural class only specifies the features it cares about.

Separation of concerns. This library is purely about LP primitives and operations.

Structured exceptions. Every error carries structured attributes so callers can handle errors programmatically rather than parsing error strings.
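As a rough illustration of what structured attributes buy the caller, consider the sketch below. The `UnknownFeatureError` class, its attributes, and the `check_features` helper are invented for this example; the library's real exception names and fields are listed in the API reference and may differ.

```python
class UnknownFeatureError(ValueError):
    """Hypothetical error type carrying the offending feature as data."""

    def __init__(self, feature: str, valid_features: frozenset[str]) -> None:
        super().__init__(f"unknown feature {feature!r}")
        self.feature = feature
        self.valid_features = valid_features


def check_features(spec: dict[str, str], valid_features: frozenset[str]) -> None:
    """Raise if the specification mentions a feature outside the valid set."""
    for feature in spec:
        if feature not in valid_features:
            raise UnknownFeatureError(feature, valid_features)


offending = None
try:
    check_features({"Syllabic": "+", "Nasal": "-"}, frozenset({"Syllabic", "High"}))
except UnknownFeatureError as err:
    # The caller reads attributes instead of parsing the message text.
    offending = err.feature

print(offending)  # Nasal
```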

Zero dependencies. The library is implemented entirely with the Python standard library and has no external runtime dependencies.

API Reference

Full API documentation is available in the API Reference.

License

This project is licensed under the MIT License. See the LICENSE file for details.
