Flexible and Extensible Object for Harmony Representation — a library for representing, storing, querying, and converting musical harmony
FlexOHR
Flexible and Extensible Object for Harmony Representation
A Python library for representing, storing, querying, and converting musical harmony.
Installation
# Core package (no heavy dependencies)
pip install flexohr
# With database/storage support (numpy, pandas, duckdb, rustworkx)
pip install flexohr[database]
# With grammar parsing support
pip install flexohr[grammar]
# Full installation for development
pip install flexohr[dev]
Requirements: Python 3.11+
What Problem Does FlexOHR Solve?
Harmony annotations are trapped in format silos. DCML Roman numerals, RomanText labels, chord symbols, and pitch-class sets each encode overlapping information in incompatible syntaxes. Converting between them is lossy, ad-hoc, and breaks at edge cases. Meanwhile, every annotation standard bakes in assumptions about Western tonal music that silently exclude quintal harmony, microtonal systems, spectral data, and anything beyond 12 pitch classes.
FlexOHR introduces a single internal object — the OHR (Object for Harmony Representation) — that captures what any of these formats can express, plus structures none of them can. Annotations enter through format-specific codecs, live internally as OHR objects, and exit through any codec — including formats they were never written in.
Core Idea: The OHR as a Recursive Container
An OHR is not a chord type. It is a container of components with a reference point:
OHR = reference_component + reference_ohr + body[Component | OHR, ...]
- A Component wraps a value (absolute point, relative vector, or undefined) plus arbitrary properties (tone function, confidence, voice, ...). In pitch-based paradigms, values are pitchspace types; in other paradigms (spectral, rhythmic, etc.), they can be any type that the paradigm module defines.
- The reference_component anchors relative components in absolute pitch space.
- The reference_ohr provides broader context (e.g., a scale that gives meaning to scale degrees).
- The body is a list of Components and/or nested OHRs.
This recursion is the key insight. A chord is an OHR whose body contains interval-defined
components. A scale is an OHR whose body is an ordered sequence of intervals. A pedal
tone over a chord progression is an OHR whose body contains a pedal Component and a
nested progression-OHR. A Roman numeral like V65/vi is an OHR with a reference_ohr
(the tonicized key) containing a chord-OHR. The same structure handles all of these
without special cases.
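The recursion can be illustrated with a minimal sketch in plain Python. The names MiniComponent and MiniOHR are hypothetical stand-ins, not FlexOHR's actual classes, and plain strings take the place of pitchspace types. The point is only that one container shape covers a chord and a pedal tone over a progression.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class MiniComponent:
    """Hypothetical stand-in for a Component: a value plus free-form properties."""
    value: str
    properties: tuple[tuple[str, str], ...] = ()


@dataclass(frozen=True)
class MiniOHR:
    """Hypothetical stand-in for an OHR: a reference point and a recursive body."""
    reference_component: MiniComponent | None = None
    reference_ohr: MiniOHR | None = None
    body: tuple[MiniComponent | MiniOHR, ...] = ()


def count_components(ohr: MiniOHR) -> int:
    """Count leaf components, descending into nested OHRs."""
    return sum(
        count_components(item) if isinstance(item, MiniOHR) else 1
        for item in ohr.body
    )


def nesting_depth(ohr: MiniOHR) -> int:
    """Return 1 for a flat OHR, plus 1 for every further level of nesting."""
    nested = [nesting_depth(item) for item in ohr.body if isinstance(item, MiniOHR)]
    return 1 + max(nested, default=0)


# A chord: a root as reference point, intervals in the body.
triad = MiniOHR(
    reference_component=MiniComponent("G"),
    body=(MiniComponent("M3"), MiniComponent("P5")),
)
# A pedal tone over a progression: one Component next to a nested progression-OHR.
progression = MiniOHR(body=(triad, triad))
pedal = MiniOHR(body=(MiniComponent("C", (("role", "pedal"),)), progression))
```

Both helper functions treat chords, progressions, and pedal structures identically because they only ever inspect the body; no per-type branch is needed.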
Why Not a Flat Chord Enum?
A flat enumeration of chord types (major triad, dominant seventh, ...) works only for the ~20 chord types that Western tonal theory names. It cannot represent:
- Arbitrary interval stacks (quintal harmony, Messiaen modes)
- Partial voicings where only some tones are present
- Components with uncertainty (GNN predictions with confidence scores)
- Nested harmonic structures (pedal tones, applied chords, polychords)
- Non-12-EDO pitch systems
By making the OHR a container of components rather than a named type, FlexOHR handles
all of these with one mechanism. Named chord types become a convenience layer: a
ChordQuality.major_triad is shorthand for "body = root + M3 + P5", not a fundamental
data type.
The Three Shapes: OHR, OHRView, OHRCollection
FlexOHR represents harmony at three granularities, all sharing the same FlexohrProtocol
interface. The name is FlexohrProtocol (not HarmonyProtocol) because OHRs also
represent structures that are not traditionally called "harmony": Scales are ordered
horizontal OHRs (providing specific intervals first, interval classes second),
Collections are unordered horizontal/direction-less OHRs (interval classes first,
intervals second).
| Shape | What It Is | When to Use |
|---|---|---|
| OHR | Frozen dataclass. Single object. Carries _ohr_id. | Constructing, inspecting, or transforming one chord/scale/collection. |
| OHRCollection | Five DuckDB tables (ohr_nodes, components, closure, properties, edges). | Bulk operations: an entire piece, corpus queries, structural/property queries. |
| OHRView | Cached proxy into an OHRCollection. | Scalar access without materialization: coll[42].root(). |
Why Three and Not One?
Musical analysis operates at two scales simultaneously: you reason about individual chords (scalar) and you compute over thousands of annotations (columnar). A single object model forces a choice — either it's slow for bulk operations or awkward for inspection. The three-shape design means:
- OHR is the mental model. It's what you think about. Frozen, immutable, hashable. On construction, it auto-registers in the default OHRCollection (disable via _register=False or set_default_collection(None)).
- OHRCollection is the workhorse. Five DuckDB tables with O(1) structural queries (closure table), sub-ms property filters, paradigm-specific views, and SQL-native UDFs.
- OHRView bridges them. coll[i] returns a cached view (OHRs are immutable, so no live queries needed). Call .materialize() for a standalone OHR.
All three implement FlexohrProtocol, so code written for one works on any:
def dominant_roots(h: FlexohrProtocol) -> SPC | pd.Series:
    """Works on OHR, OHRView, or OHRCollection."""
    return h.root()[h.chord_quality == ChordQuality.dominant_seventh]
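The structural-typing idea behind a shared protocol can be sketched with typing.Protocol from the standard library. Everything here (SupportsRoots, ScalarChord, ChordTable, count_root) is a hypothetical illustration and not part of FlexOHR; it only shows how one function can accept a scalar object and a columnar container alike.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsRoots(Protocol):
    """Hypothetical mini-protocol: anything that can report its roots."""

    def roots(self) -> list[str]: ...


class ScalarChord:
    """Stand-in for a single object (the OHR shape)."""

    def __init__(self, root: str) -> None:
        self._root = root

    def roots(self) -> list[str]:
        return [self._root]


class ChordTable:
    """Stand-in for a columnar container (the OHRCollection shape)."""

    def __init__(self, roots: list[str]) -> None:
        self._roots = list(roots)

    def roots(self) -> list[str]:
        return list(self._roots)


def count_root(h: SupportsRoots, name: str) -> int:
    """Works on any object that satisfies SupportsRoots."""
    return h.roots().count(name)
```

Neither class inherits from the protocol; matching the method signature is enough, which is what lets independently implemented shapes share one interface.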
Given vs. Inferred: Provenance at Every Level
Every OHR distinguishes what was given (explicitly stated in the original annotation) from what was inferred (derived by FlexOHR).
ohr = OHR.from_label("V65", standard="dcml", globalkey="C", localkey="I")
ohr.root(source=Source.given) # SPC("G") — the label said "V" in C major
ohr.bass(source=Source.given) # None — "65" implies the bass but doesn't name it
ohr.bass(source=Source.inferred) # SPC("B") — inferred from quality + inversion
ohr.bass() # SPC("B") — default: given if available, else inferred
Why Track This?
Because downstream tasks need to know the difference. A music theorist comparing two analyses wants to see only what each analyst actually wrote, not what a parser filled in. A machine learning pipeline training on annotations needs to distinguish ground truth from derived features. Without provenance tracking, this information is silently lost the moment you parse a label.
The mechanism is lightweight: a source= parameter on accessor methods, backed by
optional {column}_source provenance columns in OHRCollection. No overhead when you
don't need it.
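A minimal sketch of the lookup rule described here, assuming each attribute keeps one slot per provenance. The Tracked class is hypothetical, and the local Source enum merely mirrors the two members used in the example above; FlexOHR's internals may differ.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    """Local stand-in with the two provenance values used in this README."""
    given = "given"
    inferred = "inferred"


@dataclass(frozen=True)
class Tracked:
    """Hypothetical holder for one attribute with separate provenance slots."""
    given: str | None = None
    inferred: str | None = None

    def get(self, source: Source | None = None) -> str | None:
        """Return the requested slot; by default prefer given over inferred."""
        if source is Source.given:
            return self.given
        if source is Source.inferred:
            return self.inferred
        return self.given if self.given is not None else self.inferred


root = Tracked(given="G")      # stated by the label
bass = Tracked(inferred="B")   # derived from quality + inversion
```

Calling bass.get(Source.given) yields None while bass.get() falls back to the inferred value, matching the default described above.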
Storage: Five-Table Model + DuckDB
OHRCollection stores data across five DuckDB tables:
- ohr_nodes: one row per OHR (label, standard, ordered, horizontal, refs).
- components: one row per component in any OHR body (value_raw, value_type, position, status). Paradigm-agnostic: handles pitch classes, frequency bins, spectra.
- closure: one row per ancestor-descendant pair (ancestor_id, descendant_id, depth). Enables O(1) structural queries.
- properties: key-value store for OHR-level and component-level properties, with an optional certainty column for probabilistic values.
- edges: transformation and relation edges between OHRs, with serialized transformation type + parameters.
Why Five Tables?
The closure table gives O(1) structural queries ("all descendants of X", "subtree rooted at Z") while keeping OHR decomposition into rows natural. DuckDB operates directly on these tables with sub-ms property filters and paradigm-specific UDFs.
Paradigm-specific convenience columns (root_fifths, chord_quality, etc.) are DuckDB views over the five tables, not separate storage. Each paradigm registers its views at import time. This keeps the core storage paradigm-agnostic.
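The closure-table pattern itself can be sketched in a few lines. This example deliberately uses the standard library's sqlite3 instead of DuckDB so that it runs without extra dependencies; the column names follow the closure table described above, while add_node and descendants are hypothetical helpers.

```python
from __future__ import annotations

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE closure (ancestor_id INTEGER, descendant_id INTEGER, depth INTEGER)"
)


def add_node(node_id: int, parent_id: int | None) -> None:
    """Insert the self-pair, then one pair for every ancestor of the parent."""
    conn.execute("INSERT INTO closure VALUES (?, ?, 0)", (node_id, node_id))
    if parent_id is not None:
        conn.execute(
            "INSERT INTO closure "
            "SELECT ancestor_id, ?, depth + 1 FROM closure WHERE descendant_id = ?",
            (node_id, parent_id),
        )


# Tree: 1 (pedal OHR) -> 2 (progression) -> 3 and 4 (chords)
add_node(1, None)
add_node(2, 1)
add_node(3, 2)
add_node(4, 2)


def descendants(node_id: int) -> list[tuple[int, int]]:
    """All strict descendants with their depth: one flat query, no recursive traversal."""
    rows = conn.execute(
        "SELECT descendant_id, depth FROM closure "
        "WHERE ancestor_id = ? AND depth > 0 ORDER BY depth, descendant_id",
        (node_id,),
    )
    return rows.fetchall()
```

The cost of the pattern is paid at insert time (one row per ancestor), which suits immutable objects that are written once and queried often.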
What Is Settled
- Five-table model: LOCKED (Phase 3).
- DuckDB as query engine: LOCKED.
- Paradigm-specific columns as DuckDB views: LOCKED.
- Property annotations via certainty column: LOCKED.
- Transformation type hierarchy serialized to edges table: LOCKED.
- Auto-registration in default collection: LOCKED.
- JSON dict format as serialization/interchange: this is the reference format.
Codec System: Annotations In, Annotations Out
FlexOHR doesn't invent a new annotation standard. It provides a common internal representation that annotation standards convert to and from:
DCML label ──→ DCMLCodec.parse() ──→ OHR ──→ RomanTextCodec.emit() ──→ RomanText label
↕
OHRCollection (internal)
↕
FlexOHR syntax ──→ FlexOHRCodec.parse() ──→ OHR ──→ DCMLCodec.emit() ──→ DCML label
Each codec implements a StandardCodec protocol: parse(label, context) → OHR and
emit(ohr) → label. A codec always integrates with its paradigm's module and uses
the appropriate validators according to a config file controlled by the user through
text or through a configuration API. Enum members map between formats via a
CodecRegistry populated from declarative JSON tables.
Why Codecs Instead of Direct Conversion?
Direct A→B conversion between N standards requires N*(N-1) converters. The codec pattern requires N codecs (one per standard) and gets all N*N conversions for free via the common internal representation. Adding a new standard (e.g., lead-sheet chord symbols) means writing one codec, not updating every existing converter.
Round-trip fidelity is a design goal: parse → emit in the same standard should
reproduce the original label. Cross-standard conversion is deterministic but inherently
lossy where standards differ in expressiveness.
Each codec must declare clear guardrails: which subtypes of OHRs it can convert to and from. For example, a Roman-numeral-based (and therefore tertian-harmony-based) codec must reject OHRs that are not pitch-based. This means paradigm modules must integrate into a taxonomy of theory types that reflects what each theory's first-class citizens are (pitch classes, frequency bins, rhythmic patterns, etc.).
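The hub-and-spoke idea can be sketched with two toy codecs. The label syntaxes ("G:min" and "Gm"), the Chord class, and the CODECS registry are invented for this illustration and are unrelated to the DCML or RomanText codecs; the sketch only shows how N parse/emit pairs yield every conversion, and how a codec can refuse what it cannot express.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Chord:
    """Toy internal representation: a root name and a quality name."""
    root: str
    quality: str


class Codec(Protocol):
    def parse(self, label: str) -> Chord: ...
    def emit(self, chord: Chord) -> str: ...


class ColonCodec:
    """Toy standard A, written as 'G:maj' or 'G:min'."""

    def parse(self, label: str) -> Chord:
        root, quality = label.split(":")
        return Chord(root, quality)

    def emit(self, chord: Chord) -> str:
        return f"{chord.root}:{chord.quality}"


class SuffixCodec:
    """Toy standard B, written as 'G' for major and 'Gm' for minor."""

    def parse(self, label: str) -> Chord:
        if label.endswith("m"):
            return Chord(label[:-1], "min")
        return Chord(label, "maj")

    def emit(self, chord: Chord) -> str:
        # Guardrail: reject anything this syntax cannot express.
        if chord.quality not in ("maj", "min"):
            raise ValueError(f"cannot express quality {chord.quality!r}")
        return chord.root + ("m" if chord.quality == "min" else "")


CODECS: dict[str, Codec] = {"colon": ColonCodec(), "suffix": SuffixCodec()}


def convert(label: str, source: str, target: str) -> str:
    """Any-to-any conversion through the shared internal object."""
    return CODECS[target].emit(CODECS[source].parse(label))
```

Adding a third toy standard would mean one more class and one more registry entry; convert() stays unchanged.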
Supported Standards
| Standard | Source Parser | Key Capability |
|---|---|---|
| DCML | ms3 | Figured bass inversions, changes, pedal, applied chords |
| RomanText | music21 | music21 quality vocabulary, offset-based timing |
| FlexOHR native | DHParser (EBNF grammar) | Recursive nesting, interval stacks, multi-system |
Planned for later: Harte standard, Forte sets, Jazz chords (e.g., MuseScore flavour, Realbook flavour, etc.), clusters, spectra — depending on the use cases that already have or still need an encoding standard.
Paradigm Agnosticism
FlexOHR's Component.value is not restricted to any particular value domain. The
core API must be as agnostic as possible — a spectral-harmony OHR's body will have
nothing to do with pitchspace, yet it should still follow the "nested collection of
components and OHRs" paradigm as closely as possible.
The pitch-based paradigm (the first and most developed) provides a 3x4 grid of 12 types
via the pitchspace module:
| Level | Pitch Class | Pitch | Interval Class | Interval |
|---|---|---|---|---|
| Enharmonic (semitones) | EPC | EP | EIC | EI |
| Specific (line of fifths) | SPC | SP | SIC | SI |
| Generic (diatonic steps) | GPC | GP | GIC | GI |
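The relation between the Specific and Enharmonic levels can be made concrete with a little arithmetic. The sketch below assumes the common convention that C sits at position 0 on the line of fifths (G = 1, F = -1); whether pitchspace uses exactly this origin is an assumption, and both function names are hypothetical.

```python
NATURALS = "FCGDAEB"  # the seven naturals in line-of-fifths order


def fifths_to_name(fifths: int) -> str:
    """Spell a line-of-fifths position (assumed convention: C = 0, G = 1, F = -1)."""
    letter = NATURALS[(fifths + 1) % 7]
    accidentals = (fifths + 1) // 7  # > 0: sharps, < 0: flats
    return letter + ("#" * accidentals if accidentals > 0 else "b" * -accidentals)


def fifths_to_semitones(fifths: int) -> int:
    """Project onto the 12-EDO enharmonic level: each fifth spans 7 semitones."""
    return (7 * fifths) % 12
```

Positions 6 and -6 are spelled F# and Gb but map to the same semitone class, which is the information the Specific level keeps and the Enharmonic level discards.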
But the architecture doesn't depend on this grid. A Component holding a FrequencyBin,
CentsValue, or a rhythmic-pattern value works identically — same OHR construction,
same FlexohrProtocol methods, same codec pipeline. Each paradigm module defines its
own value types and registers them. pyarrow ExtensionType subclasses carry type
semantics into the storage layer.
Why This Matters
Computational musicology increasingly works with non-Western, microtonal, and spectral
data. A library that hardcodes root % 12 or assumes IntervalClass ∈ {0..11} cannot
serve these contexts. FlexOHR's paradigm-agnosticism is not future-proofing — it's a
fundamental requirement driven by the diversity of the data the library must handle.
Immutability and Modification
OHR is @dataclass(frozen=True, slots=True). You never mutate an OHR; you create a new
one via .with_():
new = ohr.with_(root=SPC("G")) # different root
new = ohr.with_(bass=ToneFunction.third) # move bass to third
new = ohr.with_( # modify specific component
    components=Where(tone_function=ToneFunction.root).set(octave=4)
)
Note: .with_() is the general modification method. Dedicated mixin methods like
.with_bass() exist for common operations but .with_() is always available.
Why Immutable?
Because harmony objects are values, not stateful entities. A "V7 in C major" is the same
object wherever it appears. Immutability makes OHRs hashable (usable as dict keys and set
members), safe to share across threads, and impossible to corrupt by accidental mutation.
The .with_() pattern is explicit about what changes, producing a clear audit trail.
Chord Type Model: Dimensions, Not a Flat List
FlexOHR does not prescribe what property fields a particular paradigm stores with its components. What matters is having mappings between equivalent properties that may have different names in different theories. This cross-paradigm property network must be developed under systematic user consultation for each new paradigm.
Within the pitch-based paradigm, FlexOHR factors chord identity into separate dimensions:
| Dimension | Type | Examples |
|---|---|---|
| Quality | ChordQuality | major_triad, dominant_seventh, french_sixth |
| Inversion | Inversion | root_position, first, second, third |
| Modification | ToneModification | #11 added, b9 altered |
Why Separate Dimensions?
To keep the number of chord types manageable. A flat enum of every quality-inversion-modification combination grows combinatorially and can never be complete. Separate dimensions mean:
- ChordQuality has ~20 members covering standard types
- Inversion has 5 members
- Modifications compose freely
A dominant_seventh in first inversion with an added #11 is three orthogonal facts,
not one entry in a massive lookup table. The interval structure for each ChordQuality
is defined once (as a tuple of (SIC, ToneFunction) pairs) and reused everywhere.
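A minimal sketch of the factored model, under simplifying assumptions: interval structures are given in semitones rather than as (SIC, ToneFunction) pairs, only three qualities are listed, and the names Quality, ChordType, and bass_interval are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class Quality(Enum):
    """Interval structure above the root, in semitones (a simplification)."""
    major_triad = (4, 7)
    minor_triad = (3, 7)
    dominant_seventh = (4, 7, 10)


class Inversion(Enum):
    root_position = 0
    first = 1
    second = 2
    third = 3


@dataclass(frozen=True)
class ChordType:
    quality: Quality
    inversion: Inversion = Inversion.root_position
    modifications: tuple[str, ...] = ()

    def bass_interval(self) -> int:
        """Semitones from the root to the bass implied by the inversion."""
        tones = (0, *self.quality.value)
        if self.inversion.value >= len(tones):
            raise ValueError(
                f"{self.quality.name} has no {self.inversion.name} inversion"
            )
        return tones[self.inversion.value]


v65_sharp11 = ChordType(Quality.dominant_seventh, Inversion.first, ("#11",))
# The enums span the combinations on demand; nothing is listed by hand.
combinations = list(product(Quality, Inversion))
```

Each interval structure is written once on the Quality member, and every inversion or modification reuses it instead of duplicating it per combination.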
SQL and DuckDB Integration
OHRCollection exposes its five tables to DuckDB for analytical queries, with paradigm-specific UDFs and views:
coll.sql("SELECT root_fifths, SPC(root_fifths), COUNT(*) FROM ohr_tertian WHERE chord_quality = 'dominant_seventh' GROUP BY 1, 2")
Why DuckDB?
Because musicological research questions are naturally expressed as analytical queries
("what percentage of dominants resolve to the tonic?", "show me all augmented sixths in
minor keys"). DuckDB operates directly on the five tables with sub-ms property filters,
O(1) structural queries via the closure table, and paradigm-specific UDFs named after
pitchspace types (SPC(), SIC(), spc_to_semitones(), etc.). No server infrastructure
needed. OHRPattern queries compile to DuckDB SQL with CTEs for structural pattern matching.
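The combination of a scalar UDF with an aggregate query can be tried out without any FlexOHR code. The sketch below substitutes the standard library's sqlite3 for DuckDB so it runs anywhere; the chords table is a made-up stand-in for a paradigm view, and the UDF body assumes that root_fifths counts fifths from C = 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for a scalar UDF: map a line-of-fifths position to a semitone class.
conn.create_function("spc_to_semitones", 1, lambda fifths: (7 * fifths) % 12)

conn.execute("CREATE TABLE chords (root_fifths INTEGER, chord_quality TEXT)")
conn.executemany(
    "INSERT INTO chords VALUES (?, ?)",
    [
        (1, "dominant_seventh"),
        (1, "dominant_seventh"),
        (0, "major_triad"),
        (4, "dominant_seventh"),
    ],
)

# Count dominant sevenths per root, showing the root on both pitch levels.
rows = conn.execute(
    "SELECT root_fifths, spc_to_semitones(root_fifths), COUNT(*) "
    "FROM chords WHERE chord_quality = 'dominant_seventh' "
    "GROUP BY 1, 2 ORDER BY 1"
).fetchall()
```

The query has the same shape as the coll.sql() call above: filter on a quality, group by root, and let a UDF translate the stored value inside SQL.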
Serialization: JSON Dicts, Not Grammar Strings
OHRs serialize to verbose, self-describing JSON dicts, not to annotation syntax strings. These mirror precisely the nested object structure that you interact with using the FlexOHR API:
{
"type": "ohr",
"reference_component": {"type": "component", "value": [1], "pitchspace_type": "spc"},
"body": [
{"type": "component", "value": [4], "pitchspace_type": "sic",
"properties": {"tone_function": "third"}}
],
"properties": {"chord_quality": "major_triad"}
}
Why JSON and Not the Grammar?
The FlexOHR grammar is an annotation syntax — designed for humans to write and read.
It's compact but lossy (implicit defaults, context-dependent parsing). The JSON format is
a storage format — designed for machines to serialize and deserialize without loss. It's
verbose but unambiguous: every field is explicit, every type is tagged, every property is
spelled out. These dicts are stored as pyarrow struct columns, survive Parquet round-trips,
and are queryable via DuckDB's json_extract().
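Because every node carries an explicit "type" tag, generic tooling can process these dicts without knowing any annotation syntax. The sketch below round-trips the example dict through the standard json module and walks it recursively; count_components is a hypothetical helper, and the assumption that nested OHRs appear inside "body" follows from the container model described earlier.

```python
import json

doc = {
    "type": "ohr",
    "reference_component": {"type": "component", "value": [1], "pitchspace_type": "spc"},
    "body": [
        {"type": "component", "value": [4], "pitchspace_type": "sic",
         "properties": {"tone_function": "third"}}
    ],
    "properties": {"chord_quality": "major_triad"},
}


def count_components(node: dict) -> int:
    """Count body components, descending into nested 'ohr' dicts."""
    if node["type"] == "component":
        return 1
    return sum(count_components(child) for child in node.get("body", []))


# Serialize and parse again: nothing is lost, nothing depends on context.
restored = json.loads(json.dumps(doc))
```

The restored dict compares equal to the original, which is the lossless property the grammar strings cannot guarantee.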
Implementation Roadmap
Development proceeds in ordered work packages. No phase begins without explicit approval. Tests are developed from the start alongside implementation code. Each work package is executed by a team of agents: architect (design), coder (implementation), reviewer (quality), and documentor (discussion moderator who maintains links between high-level specs and low-level decisions in the documentation's "Explanation" sections).
| Phase | Focus | Status |
|---|---|---|
| 1 | Paradigm-independent core (Component, OHR, FlexohrProtocol, types, errors) | COMPLETE |
| 2a | Pitchspace paradigm (migrate types, pitch-specific enums, chord tables) | COMPLETE |
| 2b | Storage paradigm research (resolved: five-table model + DuckDB) | COMPLETE |
| 3a | OHRCollection core (five tables, DuckDB, auto-registration, OHRView, bulk insert) | NEXT |
| 3b | Query infrastructure (UDFs, OHRPattern, SQL compiler, paradigm views) | |
| 3c | Property annotations + transformations (Annotated, certainty, type hierarchy, edges) | |
| 3d | Arrow/IO integration (ExtensionTypes, Fields, Parquet) | |
| 4 | DCML codec | |
| 5 | RomanText codec | |
| 6 | Cross-standard + pitch arrays | |
| 7 | Advanced (resolution, grammar compiler, paradigm migration, TraversableLike) | |
Module Structure
flexohr/
core/ OHR, Component, AbsentComponent, FlexohrProtocol, OHRCollection, OHRView,
OHRPattern, compiler, validators, transformations, sampler
types/ FancyStrEnum (auto()+alias), core enums, JSON mappings
storage/ OHR → five-table decomposition (flatten.py), DuckDB UDFs (udfs.py)
arrow/ pyarrow ExtensionTypes, Field objects
codecs/ StandardCodec protocol, DCML, RomanText, FlexOHR native
errors/ Error hierarchy; paradigms register their warnings/errors here
io/ Schema descriptors, Parquet, PitchArraySpec
pitchspace/ Pitch type system (SPC, SIC, etc. — migrated from syntax/pitchspace/)
paradigms/
western_tertian/ TertianHarmonyProtocol, ChordQuality, DuckDB views (ohr_tertian), etc.
tonfeld/ Tonfeld theory (DuckDB views, field generators)
(future: spectral/, rhythmic/, ...)
Note on structure: pitchspace/ is the pitch type system. paradigms/western_tertian/
is the harmony paradigm that uses pitchspace types, registers DuckDB views and UDFs, and
extends FlexohrProtocol with root(), bass(), chord_quality, etc. The core
FlexohrProtocol knows only about components, references, body, properties, and
iteration — no pitch-specific concepts.
Note on errors: The errors/ package provides the base error hierarchy. Each paradigm
module registers its own warnings and errors there. Codecs use the validators defined
by their paradigm.
Dependency Flow
core (Component, OHR, FlexohrProtocol, validators, transformations — paradigm-independent)
→ types (FancyStrEnum, core enums — abstract base)
→ errors (base error hierarchy)
→ storage (flatten: OHR → rows, udfs: DuckDB UDF registration)
→ core/collection (OHRCollection: five tables + DuckDB, auto-registration)
→ core/ohrview (cached proxy into collection)
→ core/pattern + core/compiler (OHRPattern → SQL)
→ arrow (ExtensionTypes, Fields)
→ io (schema, Parquet)
→ core/transformations (Transformation ABC + hierarchy, serialized to edges table)
→ pitchspace (pitch type system — stable, no deps outside core)
→ storage/udfs (registers SPC, SIC, EP, spc_to_semitones, etc.)
→ paradigms/
western_tertian (uses pitchspace types)
→ types (ChordQuality, ToneFunction, etc. — registered into core types)
→ validators, transformations (paradigm-specific, mixin with core)
→ core/collection (registers paradigm-specific DuckDB views)
→ codecs (DCML, RomanText, FlexOHR native)
→ io (PitchArraySpec, convenience columns)
tonfeld (uses pitchspace types)
→ field generators, paradigm-specific validators, DuckDB views
The abstract core is the top level. pitchspace is the pitch type system.
Paradigms register DuckDB views and UDFs into OHRCollection at import time.
Non-pitch paradigms follow the same structure without any pitch-specific assumptions
leaking into the core.
Quick API Preview
from flexohr import OHR, OHRCollection, SPC, SIC, ToneFunction, ChordQuality, Source
# --- Scalar ---
# RomanText uses "-" for flats (e.g., "-VI" for bVI); demonstrate with a peculiar label:
ohr = OHR.from_label("-VI65", standard="romantext", key="C")
ohr.root() # SPC("Ab") — "-VI" = flat sixth degree
ohr.chord_quality # ChordQuality.dominant_seventh
ohr.to_format("dcml") # "bVI65"
# --- Given vs. Inferred: with and without resolution ---
ohr = OHR.from_label("V65/vi", standard="dcml", globalkey="C", localkey="I")
ohr.bass(source=Source.given) # None: "65" implies the bass but doesn't name it
ohr.bass(source=Source.inferred) # SPC("G#"): V/vi in C major is E7, so the first-inversion bass is its third
ohr.bass() # SPC("G#"): default is given if available, else inferred
# Access components without resolving (intervals only):
ohr.components(source=Source.given) # [Component(SIC("M3")), Component(SIC("P5")), ...]
# Access components with resolution (absolute pitches):
ohr.components(resolved=True) # [Component(SPC("G#")), Component(SPC("B")), ...]
# --- Collection ---
coll = OHRCollection.from_labels(df["label"], standard="dcml",
globalkey=df["globalkey"], localkey=df["localkey"])
coll[0].label # scalar access via cached OHRView
coll.sql("SELECT root_fifths, SPC(root_fifths), COUNT(*) FROM ohr_tertian "
"WHERE chord_quality = 'dominant_seventh' GROUP BY 1, 2")
# --- Pitch array generation ---
pitch_df = coll.to_pitch_array(DILEMMADATA_DLC_SPEC)
Development
# Clone the repository
git clone https://github.com/DCMLab/flexohr.git
cd flexohr
# Install in editable mode with dev dependencies
pip install -e ".[dev]"
# Install pre-commit hooks
pre-commit install
# Run tests
tox -e py311
# or
pytest
# Run linters
tox -e lint
# Auto-format code
tox -e format
# Build package
tox -e build
Contributing
Contributions are welcome! Please ensure that:
- Code is formatted with black and imports sorted with isort
- All tests pass (pytest)
- New features include tests
- Pre-commit hooks pass
License
This project is released into the public domain under the Unlicense. See LICENSE for details.