Flexible and Extensible Object for Harmony Representation — a library for representing, storing, querying, and converting musical harmony

FlexOHR

Flexible and Extensible Object for Harmony Representation

A Python library for representing, storing, querying, and converting musical harmony.


Installation

# Core package (no heavy dependencies)
pip install flexohr

# With database/storage support (numpy, pandas, duckdb, rustworkx)
pip install flexohr[database]

# With grammar parsing support
pip install flexohr[grammar]

# Full installation for development
pip install flexohr[dev]

Requirements: Python 3.11+


What Problem Does FlexOHR Solve?

Harmony annotations are trapped in format silos. DCML Roman numerals, RomanText labels, chord symbols, and pitch-class sets each encode overlapping information in incompatible syntaxes. Converting between them is lossy, ad-hoc, and breaks at edge cases. Meanwhile, every annotation standard bakes in assumptions about Western tonal music that silently exclude quintal harmony, microtonal systems, spectral data, and anything beyond 12 pitch classes.

FlexOHR introduces a single internal object — the OHR (Object for Harmony Representation) — that captures what any of these formats can express, plus structures none of them can. Annotations enter through format-specific codecs, live internally as OHR objects, and exit through any codec — including formats they were never written in.


Core Idea: The OHR as a Recursive Container

An OHR is not a chord type. It is a container of components with a reference point:

OHR = reference_component + reference_ohr + body[Component | OHR, ...]
  • A Component wraps a value (absolute point, relative vector, or undefined) plus arbitrary properties (tone function, confidence, voice, ...). In pitch-based paradigms, values are pitchspace types; in other paradigms (spectral, rhythmic, etc.), they can be any type that the paradigm module defines.
  • The reference_component anchors relative components in absolute pitch space.
  • The reference_ohr provides broader context (e.g., a scale that gives meaning to scale degrees).
  • The body is a list of Components and/or nested OHRs.

This recursion is the key insight. A chord is an OHR whose body contains interval-defined components. A scale is an OHR whose body is an ordered sequence of intervals. A pedal tone over a chord progression is an OHR whose body contains a pedal Component and a nested progression-OHR. A Roman numeral like V65/vi is an OHR with a reference_ohr (the tonicized key) containing a chord-OHR. The same structure handles all of these without special cases.
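The recursion can be illustrated with plain dataclasses. This is a minimal sketch, not the library's implementation: the class bodies, the `depth()` and `leaves()` helpers, and the use of strings as values are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Optional, Union


@dataclass(frozen=True)
class Component:
    """Hypothetical stand-in: a value plus free-form properties."""
    value: Any
    properties: tuple[tuple[str, Any], ...] = ()


@dataclass(frozen=True)
class OHR:
    """Hypothetical stand-in: a reference point plus a body of Components and nested OHRs."""
    reference_component: Optional[Component] = None
    reference_ohr: Optional[OHR] = None
    body: tuple[Union[Component, OHR], ...] = ()

    def depth(self) -> int:
        """Nesting depth: 1 for a flat body, more when OHRs are nested."""
        nested = [item.depth() for item in self.body if isinstance(item, OHR)]
        return 1 + max(nested, default=0)

    def leaves(self) -> list[Component]:
        """All Components in the body, flattened depth-first."""
        out: list[Component] = []
        for item in self.body:
            out.extend(item.leaves() if isinstance(item, OHR) else [item])
        return out


# A chord: a root as reference, interval-defined components in the body.
g_major = OHR(reference_component=Component("G"),
              body=(Component("M3"), Component("P5")))
c_major = OHR(reference_component=Component("C"),
              body=(Component("M3"), Component("P5")))

# A progression is an OHR of OHRs; a pedal adds one Component next to it.
progression = OHR(body=(g_major, c_major))
pedal = OHR(body=(Component("G", (("role", "pedal"),)), progression))
```

The chord, the progression, and the pedal structure are all instances of the same class; only the contents of `body` differ.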

Why Not a Flat Chord Enum?

A flat enumeration of chord types (major triad, dominant seventh, ...) works only for the ~20 chord types that Western tonal theory names. It cannot represent:

  • Arbitrary interval stacks (quintal harmony, Messiaen modes)
  • Partial voicings where only some tones are present
  • Components with uncertainty (GNN predictions with confidence scores)
  • Nested harmonic structures (pedal tones, applied chords, polychords)
  • Non-12-EDO pitch systems

By making the OHR a container of components rather than a named type, FlexOHR handles all of these with one mechanism. Named chord types become a convenience layer: a ChordQuality.major_triad is shorthand for "body = root + M3 + P5", not a fundamental data type.


The Three Shapes: OHR, OHRView, OHRCollection

FlexOHR represents harmony at three granularities, all sharing the same FlexohrProtocol interface. The protocol is called FlexohrProtocol rather than HarmonyProtocol because OHRs also represent structures that are not traditionally called "harmony". Scales are ordered horizontal OHRs (specific intervals first, interval classes second); collections in the music-theoretic sense are unordered, direction-less horizontal OHRs (interval classes first, intervals second).

Shape | What It Is | When to Use
--- | --- | ---
OHR | Frozen dataclass. Single object. Carries _ohr_id. | Constructing, inspecting, or transforming one chord/scale/collection.
OHRCollection | Five DuckDB tables (ohr_nodes, components, closure, properties, edges). | Bulk operations: an entire piece, corpus queries, structural/property queries.
OHRView | Cached proxy into an OHRCollection. | Scalar access without materialization: coll[42].root().

Why Three and Not One?

Musical analysis operates at two scales simultaneously: you reason about individual chords (scalar) and you compute over thousands of annotations (columnar). A single object model forces a choice — either it's slow for bulk operations or awkward for inspection. The three-shape design means:

  • OHR is the mental model. It's what you think about. Frozen, immutable, hashable. On construction, it auto-registers in the default OHRCollection (disable via _register=False or set_default_collection(None)).
  • OHRCollection is the workhorse. Five DuckDB tables with O(1) structural queries (closure table), sub-ms property filters, paradigm-specific views, and SQL-native UDFs.
  • OHRView bridges them. coll[i] returns a cached view (OHRs are immutable, so no live queries needed). Call .materialize() for a standalone OHR.

All three implement FlexohrProtocol, so code written for one works on any:

def dominant_roots(h: FlexohrProtocol) -> SPC | pd.Series:
    """Works on OHR, OHRView, or OHRCollection."""
    return h.root()[h.chord_quality == ChordQuality.dominant_seventh]
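
How one function can serve both a scalar and a columnar shape can be sketched with `typing.Protocol`. The names `RootedLike`, `ScalarChord`, `ChordTable`, and `distinct_roots` are hypothetical, and plain strings and lists stand in for SPC values and pandas Series.

```python
from typing import Protocol, Union, runtime_checkable


@runtime_checkable
class RootedLike(Protocol):
    """Hypothetical miniature of a shared protocol: anything exposing root()."""
    def root(self) -> Union[str, list[str]]: ...


class ScalarChord:
    """Stands in for a single OHR: root() returns one value."""
    def __init__(self, root: str) -> None:
        self._root = root

    def root(self) -> str:
        return self._root


class ChordTable:
    """Stands in for a collection: root() returns a whole column."""
    def __init__(self, roots: list[str]) -> None:
        self._roots = roots

    def root(self) -> list[str]:
        return list(self._roots)


def distinct_roots(h: RootedLike) -> set[str]:
    """Accepts either shape; the scalar case is normalized to a one-element list."""
    result = h.root()
    return set([result] if isinstance(result, str) else result)
```

Neither class inherits from the protocol. Structural typing means that exposing a compatible `root()` method is enough.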

Given vs. Inferred: Provenance at Every Level

Every OHR distinguishes what was given (explicitly stated in the original annotation) from what was inferred (derived by FlexOHR).

ohr = OHR.from_label("V65", standard="dcml", globalkey="C", localkey="I")

ohr.root(source=Source.given)     # SPC("G") — the label said "V" in C major
ohr.bass(source=Source.given)     # None — "65" implies the bass but doesn't name it
ohr.bass(source=Source.inferred)  # SPC("B") — inferred from quality + inversion
ohr.bass()                        # SPC("B") — default: given if available, else inferred

Why Track This?

Because downstream tasks need to know the difference. A music theorist comparing two analyses wants to see only what each analyst actually wrote, not what a parser filled in. A machine learning pipeline training on annotations needs to distinguish ground truth from derived features. Without provenance tracking, this information is silently lost the moment you parse a label.

The mechanism is lightweight: a source= parameter on accessor methods, backed by optional {column}_source provenance columns in OHRCollection. No overhead when you don't need it.
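
A `source=` accessor of this kind can be sketched in a few lines. The `Tracked` holder is hypothetical and strings stand in for SPC values; only the fallback rule (given if available, else inferred) is taken from the description above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Source(Enum):
    given = "given"
    inferred = "inferred"


@dataclass(frozen=True)
class Tracked:
    """Hypothetical holder for one attribute with separate given and inferred values."""
    given: Optional[str] = None
    inferred: Optional[str] = None

    def get(self, source: Optional[Source] = None) -> Optional[str]:
        if source is Source.given:
            return self.given
        if source is Source.inferred:
            return self.inferred
        # Default: prefer what the annotation stated, fall back to what was derived.
        return self.given if self.given is not None else self.inferred


# "V65" in C major: the numeral states the root, while "65" only implies the bass.
root = Tracked(given="G")
bass = Tracked(inferred="B")
```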


Storage: Five-Table Model + DuckDB

OHRCollection stores data across five DuckDB tables:

  1. ohr_nodes — one row per OHR (label, standard, ordered, horizontal, refs).
  2. components — one row per component in any OHR body (value_raw, value_type, position, status). Paradigm-agnostic: handles pitch classes, frequency bins, spectra.
  3. closure — one row per ancestor-descendant pair (ancestor_id, descendant_id, depth). Enables O(1) structural queries.
  4. properties — key-value store for OHR-level and component-level properties, with an optional certainty column for probabilistic values.
  5. edges — transformation and relation edges between OHRs, with serialized transformation type + parameters.

Why Five Tables?

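The closure table stores every ancestor-descendant pair explicitly, so structural queries ("all descendants of X", "subtree rooted at Z") need no recursive traversal: each one is a single lookup on ancestor_id or descendant_id, which is the sense in which they are O(1). Decomposing an OHR into rows stays natural at the same time. DuckDB operates directly on these tables with sub-ms property filters and paradigm-specific UDFs.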
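
The closure-table idea can be tried out with the standard library. This sketch uses `sqlite3` as a stand-in for DuckDB so that it runs without extra dependencies; the three-node hierarchy and the `descendants` helper are made up for illustration, and only the column names (ancestor_id, descendant_id, depth) come from the table description above.

```python
import sqlite3

# Toy hierarchy: OHR 1 contains OHR 2, which contains OHR 3.
parent_of = {2: 1, 3: 2}

rows = []
for node in (1, 2, 3):
    rows.append((node, node, 0))        # every node is its own ancestor at depth 0
    ancestor, depth = parent_of.get(node), 1
    while ancestor is not None:         # walk up once, at insert time
        rows.append((ancestor, node, depth))
        ancestor, depth = parent_of.get(ancestor), depth + 1

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE closure (ancestor_id INTEGER, descendant_id INTEGER, depth INTEGER)"
)
conn.executemany("INSERT INTO closure VALUES (?, ?, ?)", rows)


def descendants(ohr_id: int) -> list[tuple[int, int]]:
    """Strict descendants of ohr_id with their depth, without a recursive query."""
    cur = conn.execute(
        "SELECT descendant_id, depth FROM closure "
        "WHERE ancestor_id = ? AND depth > 0 ORDER BY depth",
        (ohr_id,),
    )
    return cur.fetchall()
```

The cost of walking the hierarchy is paid once when rows are inserted. Every later structural question is a plain filter on one column.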

Paradigm-specific convenience columns (root_fifths, chord_quality, etc.) are DuckDB views over the five tables, not separate storage. Each paradigm registers its views at import time. This keeps the core storage paradigm-agnostic.

What Is Settled

  • Five-table model — LOCKED (Phase 3).
  • DuckDB as query engine — LOCKED.
  • Paradigm-specific columns as DuckDB views — LOCKED.
  • Property annotations via certainty column — LOCKED.
  • Transformation type hierarchy serialized to edges table — LOCKED.
  • Auto-registration in default collection — LOCKED.
  • JSON dict format as serialization/interchange — this is the reference format.

Codec System: Annotations In, Annotations Out

FlexOHR doesn't invent a new annotation standard. It provides a common internal representation that annotation standards convert to and from:

DCML label ──→ DCMLCodec.parse()    ──→ OHR ──→ RomanTextCodec.emit() ──→ RomanText label
                                        ↕
                                   OHRCollection (internal)
                                        ↕
FlexOHR syntax ──→ FlexOHRCodec.parse() ──→ OHR ──→ DCMLCodec.emit() ──→ DCML label

Each codec implements a StandardCodec protocol: parse(label, context) → OHR and emit(ohr) → label. A codec always integrates with its paradigm's module and uses the appropriate validators according to a config file controlled by the user through text or through a configuration API. Enum members map between formats via a CodecRegistry populated from declarative JSON tables.

Why Codecs Instead of Direct Conversion?

Direct A→B conversion between N standards requires N*(N-1) converters. The codec pattern requires N codecs (one per standard) and gets all N*N conversions for free via the common internal representation. Adding a new standard (e.g., lead-sheet chord symbols) means writing one codec, not updating every existing converter.
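
The hub-and-spoke arrangement can be sketched with two toy codecs. Everything here is illustrative: `PrefixFlatCodec`, the dict used as the internal object, and the `convert` helper are hypothetical, and the `context` argument of the real protocol is omitted. The only detail taken from this document is that RomanText writes a flat as "-" where DCML writes "b".

```python
from typing import Protocol


class StandardCodec(Protocol):
    """Simplified protocol: the context argument is left out of this sketch."""
    def parse(self, label: str) -> dict: ...
    def emit(self, ohr: dict) -> str: ...


class PrefixFlatCodec:
    """Toy codec that differs from its siblings only in how a flat is spelled."""
    def __init__(self, flat_sign: str) -> None:
        self.flat_sign = flat_sign

    def parse(self, label: str) -> dict:
        flat = label.startswith(self.flat_sign)
        numeral = label[len(self.flat_sign):] if flat else label
        return {"numeral": numeral, "flat": flat}

    def emit(self, ohr: dict) -> str:
        return (self.flat_sign if ohr["flat"] else "") + ohr["numeral"]


CODECS: dict[str, StandardCodec] = {
    "dcml": PrefixFlatCodec("b"),
    "romantext": PrefixFlatCodec("-"),
}


def convert(label: str, source: str, target: str) -> str:
    """Any-to-any conversion through the shared internal representation."""
    return CODECS[target].emit(CODECS[source].parse(label))


def direct_converters(n: int) -> int:
    """Number of pairwise converters needed without a shared representation."""
    return n * (n - 1)
```

With five standards, the hub needs five codecs where direct conversion would need `direct_converters(5)`, that is 20, separate converters.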

Round-trip fidelity is a design goal: parse → emit in the same standard should reproduce the original label. Cross-standard conversion is deterministic but inherently lossy where standards differ in expressiveness.

Each codec must declare clear guardrails: which subtypes of OHRs it can convert to and from. For example, a Roman-numeral-based (and therefore tertian-harmony-based) codec must reject OHRs that are not pitch-based. This means paradigm modules must integrate into a taxonomy of theory types that reflects what each theory's first-class citizens are (pitch classes, frequency bins, rhythmic patterns, etc.).

Supported Standards

Standard | Source Parser | Key Capability
--- | --- | ---
DCML | ms3 | Figured bass inversions, changes, pedal, applied chords
RomanText | music21 | music21 quality vocabulary, offset-based timing
FlexOHR native | DHParser (EBNF grammar) | Recursive nesting, interval stacks, multi-system

Planned for later: Harte standard, Forte sets, Jazz chords (e.g., MuseScore flavour, Realbook flavour, etc.), clusters, spectra — depending on the use cases that already have or still need an encoding standard.


Paradigm Agnosticism

FlexOHR's Component.value is not restricted to any particular value domain. The core API must be as agnostic as possible — a spectral-harmony OHR's body will have nothing to do with pitchspace, yet it should still follow the "nested collection of components and OHRs" paradigm as closely as possible.

The pitch-based paradigm (the first and most developed) provides a 3x4 grid of 12 types via the pitchspace module (three levels by four kinds of value):

Level | Pitch Class | Pitch | Interval Class | Interval
--- | --- | --- | --- | ---
Enharmonic (semitones) | EPC | EP | EIC | EI
Specific (line of fifths) | SPC | SP | SIC | SI
Generic (diatonic steps) | GPC | GP | GIC | GI

But the architecture doesn't depend on this grid. A Component holding a FrequencyBin, CentsValue, or a rhythmic-pattern value works identically — same OHR construction, same FlexohrProtocol methods, same codec pipeline. Each paradigm module defines its own value types and registers them. pyarrow ExtensionType subclasses carry type semantics into the storage layer.
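
The difference between the specific and enharmonic levels can be made concrete with a little arithmetic. This sketch assumes a line-of-fifths encoding with C = 0, G = 1, F = -1; that convention is an assumption here, not something this document specifies. The function `spc_name` is hypothetical, and `spc_to_semitones` merely borrows the name of the UDF mentioned elsewhere in this document.

```python
NATURALS = "FCGDAEB"  # seven consecutive fifths; C sits at index 1


def spc_name(fifths: int) -> str:
    """Name of the pitch class `fifths` steps from C on the line of fifths."""
    accidentals, index = divmod(fifths + 1, 7)
    sign = "#" * accidentals if accidentals >= 0 else "b" * -accidentals
    return NATURALS[index] + sign


def spc_to_semitones(fifths: int) -> int:
    """Collapse a specific pitch class to 12 semitones: one fifth is 7 semitones."""
    return (fifths * 7) % 12
```

On the specific level, G# (8 fifths) and Ab (-4 fifths) are different values; both collapse to semitone 8 on the enharmonic level. Code that stores only `root % 12` cannot recover that distinction.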

Why This Matters

Computational musicology increasingly works with non-Western, microtonal, and spectral data. A library that hardcodes root % 12 or assumes IntervalClass ∈ {0..11} cannot serve these contexts. FlexOHR's paradigm-agnosticism is not future-proofing — it's a fundamental requirement driven by the diversity of the data the library must handle.


Immutability and Modification

OHR is @dataclass(frozen=True, slots=True). You never mutate an OHR; you create a new one via .with_():

new = ohr.with_(root=SPC("G"))                 # different root
new = ohr.with_(bass=ToneFunction.third)        # move bass to third
new = ohr.with_(                                # modify specific component
    components=Where(tone_function=ToneFunction.root).set(octave=4)
)

Note: .with_() is the general modification method. Dedicated mixin methods like .with_bass() exist for common operations but .with_() is always available.

Why Immutable?

Because harmony objects are values, not stateful entities. A "V7 in C major" is the same object wherever it appears. Immutability makes OHRs hashable (usable as dict keys and set members), safe to share across threads, and impossible to corrupt by accidental mutation. The .with_() pattern is explicit about what changes, producing a clear audit trail.


Chord Type Model: Dimensions, Not a Flat List

FlexOHR does not prescribe what property fields a particular paradigm stores with its components. What matters is having mappings between equivalent properties that may have different names in different theories. This cross-paradigm property network must be developed under systematic user consultation for each new paradigm.

Within the pitch-based paradigm, FlexOHR factors chord identity into separate dimensions:

Dimension | Type | Examples
--- | --- | ---
Quality | ChordQuality | major_triad, dominant_seventh, french_sixth
Inversion | Inversion | root_position, first, second, third
Modification | ToneModification | #11 added, b9 altered

Why Separate Dimensions?

Separate dimensions keep the number of definitions small. A flat enum of every quality-inversion-modification combination grows multiplicatively and can never be complete. With separate dimensions:

  • ChordQuality has ~20 members covering standard types
  • Inversion has 5 members
  • Modifications compose freely

A dominant_seventh in first inversion with an added #11 is three orthogonal facts, not one entry in a massive lookup table. The interval structure for each ChordQuality is defined once (as a tuple of (SIC, ToneFunction) pairs) and reused everywhere.
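
The factoring can be sketched with two small enums. This is a toy under stated assumptions: intervals are written as integers on the line of fifths (M3 = 4, P5 = 1, m7 = -2, m3 = -3), tone functions are plain strings, the enums are much smaller than the ones described above, and `chord_tones` and `bass_of` are hypothetical helpers.

```python
from enum import Enum


class ChordQuality(Enum):
    # Each member: (interval in fifths above the root, tone function) pairs.
    major_triad = ((4, "third"), (1, "fifth"))
    minor_triad = ((-3, "third"), (1, "fifth"))
    dominant_seventh = ((4, "third"), (1, "fifth"), (-2, "seventh"))


class Inversion(Enum):
    # Each member names the tone function that sits in the bass.
    root_position = "root"
    first = "third"
    second = "fifth"
    third = "seventh"


def chord_tones(root_fifths: int, quality: ChordQuality) -> dict[str, int]:
    """Map each tone function to its position on the line of fifths."""
    tones = {"root": root_fifths}
    for interval, function in quality.value:
        tones[function] = root_fifths + interval
    return tones


def bass_of(root_fifths: int, quality: ChordQuality, inversion: Inversion) -> int:
    """Combine two independent dimensions to find the bass note."""
    tones = chord_tones(root_fifths, quality)
    if inversion.value not in tones:
        raise ValueError(f"{quality.name} has no {inversion.value} to put in the bass")
    return tones[inversion.value]
```

Three quality definitions and four inversion definitions cover twelve quality-inversion pairs, and each further member adds to one dimension only.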


SQL and DuckDB Integration

OHRCollection exposes its five tables to DuckDB for analytical queries, with paradigm-specific UDFs and views:

coll.sql("SELECT root_fifths, SPC(root_fifths), COUNT(*) FROM ohr_tertian WHERE chord_quality = 'dominant_seventh' GROUP BY 1, 2")

Why DuckDB?

Because musicological research questions are naturally expressed as analytical queries ("what percentage of dominants resolve to the tonic?", "show me all augmented sixths in minor keys"). DuckDB operates directly on the five tables with sub-ms property filters, O(1) structural queries via the closure table, and paradigm-specific UDFs named after pitchspace types (SPC(), SIC(), spc_to_semitones(), etc.). No server infrastructure needed. OHRPattern queries compile to DuckDB SQL with CTEs for structural pattern matching.
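
How a paradigm view and a UDF combine in one query can be tried with the standard library. This sketch substitutes `sqlite3` for DuckDB so that it runs anywhere. The two-table schema, the idea that root_fifths and chord_quality live in the properties table, and the sample rows are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ohr_nodes (ohr_id INTEGER PRIMARY KEY, label TEXT)")
conn.execute("CREATE TABLE properties (ohr_id INTEGER, key TEXT, value TEXT)")
conn.executemany("INSERT INTO ohr_nodes VALUES (?, ?)",
                 [(1, "V7"), (2, "I"), (3, "V7/V")])
conn.executemany("INSERT INTO properties VALUES (?, ?, ?)", [
    (1, "root_fifths", "1"), (1, "chord_quality", "dominant_seventh"),
    (2, "root_fifths", "0"), (2, "chord_quality", "major_triad"),
    (3, "root_fifths", "2"), (3, "chord_quality", "dominant_seventh"),
])

# A paradigm view: pivots key-value rows into convenience columns, stores nothing.
conn.execute("""
    CREATE VIEW ohr_tertian AS
    SELECT n.ohr_id, n.label,
           CAST(MAX(CASE WHEN p.key = 'root_fifths' THEN p.value END) AS INTEGER)
               AS root_fifths,
           MAX(CASE WHEN p.key = 'chord_quality' THEN p.value END) AS chord_quality
    FROM ohr_nodes n JOIN properties p ON p.ohr_id = n.ohr_id
    GROUP BY n.ohr_id, n.label
""")

# A scalar UDF callable from SQL (one fifth is 7 semitones).
conn.create_function("spc_to_semitones", 1, lambda fifths: (fifths * 7) % 12)

rows = conn.execute(
    "SELECT root_fifths, spc_to_semitones(root_fifths), COUNT(*) "
    "FROM ohr_tertian WHERE chord_quality = 'dominant_seventh' "
    "GROUP BY root_fifths ORDER BY root_fifths"
).fetchall()
```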


Serialization: JSON Dicts, Not Grammar Strings

OHRs serialize to verbose, self-describing JSON dicts, not to annotation syntax strings. These mirror precisely the nested object structure that you interact with using the FlexOHR API:

{
  "type": "ohr",
  "reference_component": {"type": "component", "value": [1], "pitchspace_type": "spc"},
  "body": [
    {"type": "component", "value": [4], "pitchspace_type": "sic",
     "properties": {"tone_function": "third"}}
  ],
  "properties": {"chord_quality": "major_triad"}
}

Why JSON and Not the Grammar?

The FlexOHR grammar is an annotation syntax — designed for humans to write and read. It's compact but lossy (implicit defaults, context-dependent parsing). The JSON format is a storage format — designed for machines to serialize and deserialize without loss. It's verbose but unambiguous: every field is explicit, every type is tagged, every property is spelled out. These dicts are stored as pyarrow struct columns, survive Parquet round-trips, and are queryable via DuckDB's json_extract().
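
Because the format is plain nested dicts, a lossless round trip needs only the `json` module. The dict below is the example shown above; the `iter_components` walker is a hypothetical helper added to show that the explicit type tags make the structure traversable without any schema knowledge.

```python
import json
from typing import Iterator

ohr_dict = {
    "type": "ohr",
    "reference_component": {"type": "component", "value": [1], "pitchspace_type": "spc"},
    "body": [
        {"type": "component", "value": [4], "pitchspace_type": "sic",
         "properties": {"tone_function": "third"}}
    ],
    "properties": {"chord_quality": "major_triad"},
}


def iter_components(node: dict) -> Iterator[dict]:
    """Yield every dict tagged as a component, descending into nested OHRs."""
    if node.get("type") == "component":
        yield node
        return
    reference = node.get("reference_component")
    if reference is not None:
        yield reference
    for child in node.get("body", []):
        yield from iter_components(child)


# Serialize to text and back; nothing is implicit, so nothing is lost.
restored = json.loads(json.dumps(ohr_dict))
tags = [c["pitchspace_type"] for c in iter_components(restored)]
```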


Implementation Roadmap

Development proceeds in ordered work packages. No phase begins without explicit approval. Tests are developed from the start alongside implementation code. Each work package is executed by a team of agents: architect (design), coder (implementation), reviewer (quality), and documentor (discussion moderator who maintains links between high-level specs and low-level decisions in the documentation's "Explanation" sections).

Phase | Focus | Status
--- | --- | ---
1 | Paradigm-independent core (Component, OHR, FlexohrProtocol, types, errors) | COMPLETE
2a | Pitchspace paradigm (migrate types, pitch-specific enums, chord tables) | COMPLETE
2b | Storage paradigm research (resolved: five-table model + DuckDB) | COMPLETE
3a | OHRCollection core (five tables, DuckDB, auto-registration, OHRView, bulk insert) | NEXT
3b | Query infrastructure (UDFs, OHRPattern, SQL compiler, paradigm views) |
3c | Property annotations + transformations (Annotated, certainty, type hierarchy, edges) |
3d | Arrow/IO integration (ExtensionTypes, Fields, Parquet) |
4 | DCML codec |
5 | RomanText codec |
6 | Cross-standard + pitch arrays |
7 | Advanced (resolution, grammar compiler, paradigm migration, TraversableLike) |

Module Structure

flexohr/
  core/           OHR, Component, AbsentComponent, FlexohrProtocol, OHRCollection, OHRView,
                  OHRPattern, compiler, validators, transformations, sampler
  types/          FancyStrEnum (auto()+alias), core enums, JSON mappings
  storage/        OHR → five-table decomposition (flatten.py), DuckDB UDFs (udfs.py)
  arrow/          pyarrow ExtensionTypes, Field objects
  codecs/         StandardCodec protocol, DCML, RomanText, FlexOHR native
  errors/         Error hierarchy; paradigms register their warnings/errors here
  io/             Schema descriptors, Parquet, PitchArraySpec
  pitchspace/     Pitch type system (SPC, SIC, etc. — migrated from syntax/pitchspace/)
  paradigms/
    western_tertian/  TertianHarmonyProtocol, ChordQuality, DuckDB views (ohr_tertian), etc.
    tonfeld/          Tonfeld theory (DuckDB views, field generators)
    (future: spectral/, rhythmic/, ...)

Note on structure: pitchspace/ is the pitch type system. paradigms/western_tertian/ is the harmony paradigm that uses pitchspace types, registers DuckDB views and UDFs, and extends FlexohrProtocol with root(), bass(), chord_quality, etc. The core FlexohrProtocol knows only about components, references, body, properties, and iteration — no pitch-specific concepts.

Note on errors: The errors/ package provides the base error hierarchy. Each paradigm module registers its own warnings and errors there. Codecs use the validators defined by their paradigm.

Dependency Flow

core (Component, OHR, FlexohrProtocol, validators, transformations — paradigm-independent)
  → types (FancyStrEnum, core enums — abstract base)
    → errors (base error hierarchy)
    → storage (flatten: OHR → rows, udfs: DuckDB UDF registration)
      → core/collection (OHRCollection: five tables + DuckDB, auto-registration)
        → core/ohrview (cached proxy into collection)
        → core/pattern + core/compiler (OHRPattern → SQL)
    → arrow (ExtensionTypes, Fields)
      → io (schema, Parquet)
    → core/transformations (Transformation ABC + hierarchy, serialized to edges table)
  → pitchspace (pitch type system — stable, no deps outside core)
    → storage/udfs (registers SPC, SIC, EP, spc_to_semitones, etc.)
  → paradigms/
      western_tertian (uses pitchspace types)
        → types (ChordQuality, ToneFunction, etc. — registered into core types)
        → validators, transformations (paradigm-specific, mixin with core)
        → core/collection (registers paradigm-specific DuckDB views)
        → codecs (DCML, RomanText, FlexOHR native)
          → io (PitchArraySpec, convenience columns)
      tonfeld (uses pitchspace types)
        → field generators, paradigm-specific validators, DuckDB views

The abstract core is the top level. pitchspace is the pitch type system. Paradigms register DuckDB views and UDFs into OHRCollection at import time. Non-pitch paradigms follow the same structure without any pitch-specific assumptions leaking into the core.


Quick API Preview

from flexohr import OHR, OHRCollection, SPC, SIC, ToneFunction, ChordQuality, Source

# --- Scalar ---
# RomanText uses "-" for flats (e.g., "-VI" for bVI); demonstrate with a peculiar label:
ohr = OHR.from_label("-VI65", standard="romantext", key="C")
ohr.root()                           # SPC("Ab") — "-VI" = flat sixth degree
ohr.chord_quality                    # ChordQuality.dominant_seventh
ohr.to_format("dcml")               # "bVI65"

# --- Given vs. Inferred: with and without resolution ---
ohr = OHR.from_label("V65/vi", standard="dcml", globalkey="C", localkey="I")
ohr.bass(source=Source.given)        # None — "65" implies the bass but doesn't name it
ohr.bass(source=Source.inferred)     # SPC("G#"): V/vi in C major is built on E, and "65" puts its third in the bass
ohr.bass()                           # SPC("G#"): default is given if available, else inferred

# Access components without resolving (intervals only):
ohr.components(source=Source.given)  # [Component(SIC("M3")), Component(SIC("P5")), ...]
# Access components with resolution (absolute pitches):
ohr.components(resolved=True)        # [Component(SPC("G#")), Component(SPC("B")), ...]

# --- Collection ---
coll = OHRCollection.from_labels(df["label"], standard="dcml",
                                 globalkey=df["globalkey"], localkey=df["localkey"])
coll[0].label                        # scalar access via cached OHRView
coll.sql("SELECT root_fifths, SPC(root_fifths), COUNT(*) FROM ohr_tertian "
         "WHERE chord_quality = 'dominant_seventh' GROUP BY 1, 2")

# --- Pitch array generation ---
pitch_df = coll.to_pitch_array(DILEMMADATA_DLC_SPEC)

Development

# Clone the repository
git clone https://github.com/DCMLab/flexohr.git
cd flexohr

# Install in editable mode with dev dependencies
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install

# Run tests
tox -e py311
# or
pytest

# Run linters
tox -e lint

# Auto-format code
tox -e format

# Build package
tox -e build

Contributing

Contributions are welcome! Please ensure that:

  1. Code is formatted with black and imports sorted with isort
  2. All tests pass (pytest)
  3. New features include tests
  4. Pre-commit hooks pass

License

This project is released into the public domain under the Unlicense. See LICENSE for details.
