General Unified World Model — a typed causal ontology of civilization, built on canvas-engineering structured latent spaces.

general-unified-world-model

A typed causal ontology of civilization, built on canvas-engineering structured latent spaces.

Canvas engineering structures what a diffusion model thinks in. This repo declares an 857-field typed schema spanning planetary physics through individual psychology, compiles it onto a structured latent canvas, and trains it on heterogeneous real-world data without throwing out samples that are missing fields.

The idea

Every dataset in the world describes a slice of the same underlying reality. GDP data captures macroeconomic output. Market data captures prices. News captures narratives. Earnings calls capture firm strategy. But no single dataset captures everything.

Traditional approaches either:

(a) restrict to the intersection — throw out data missing any field
(b) impute missing values — introduce noise

General Unified World Model takes option (c): mask missing fields in the loss, train on what you have. Each dataset declares which fields it populates. The model learns the joint distribution across all modalities, even though no single dataset contains everything.
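
To make the contrast concrete, here is a minimal sketch in plain Python. The records and field names are toy values invented for illustration, and `presence_mask` is a hypothetical helper, not part of the package. It counts how many rows survive an intersection filter versus how many observed values remain usable under masking.

```python
# Toy records from two sources; None marks a field the source does not provide.
# Field names and values are illustrative only.
FIELDS = ["gdp", "cpi", "vix", "ust10y"]

records = [
    {"gdp": 2.1, "cpi": 3.1, "vix": None, "ust10y": None},    # macro-style row
    {"gdp": 1.8, "cpi": 2.9, "vix": None, "ust10y": None},    # macro-style row
    {"gdp": None, "cpi": None, "vix": 18.5, "ust10y": 4.25},  # market-style row
]

def presence_mask(record):
    """Return 1 for each observed field and 0 for each missing one."""
    return [0 if record[f] is None else 1 for f in FIELDS]

# Option (a): keep only rows where every field is present.
complete_rows = [r for r in records if all(presence_mask(r))]

# Option (c): keep every row and let the mask decide which fields count.
masks = [presence_mask(r) for r in records]
observed_values = sum(sum(m) for m in masks)

print(len(complete_rows))  # rows usable under the intersection approach
print(observed_values)     # field values usable under masking
```

With disjoint sources like these, the intersection approach keeps no rows at all, while masking uses every observed value.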

The key enabler is canvas-engineering — a type system for multimodal latent computation. Each field in the world model occupies specific positions on a 3D (T, H, W) canvas grid, with declared temporal frequency, loss weight, and connectivity. The topology is the compute graph.
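
As an illustration of what "positions on a (T, H, W) grid" can mean, the sketch below maps a flat position index to grid coordinates and back. Row-major ordering and the helper names `to_coords` and `to_index` are assumptions made for this example; canvas-engineering may lay positions out differently.

```python
def to_coords(index, T, H, W):
    """Convert a flat position index to (t, h, w), assuming row-major order."""
    if not 0 <= index < T * H * W:
        raise IndexError(f"index {index} outside canvas of size {T * H * W}")
    t, rest = divmod(index, H * W)
    h, w = divmod(rest, W)
    return t, h, w

def to_index(t, h, w, T, H, W):
    """Inverse of to_coords under the same row-major assumption."""
    return (t * H + h) * W + w

T, H, W = 1, 128, 128
num_positions = T * H * W         # total positions on a 1 x 128 x 128 canvas
coords = to_coords(300, T, H, W)  # grid coordinates of flat position 300
print(num_positions, coords)
```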

Quick start

pip install general-unified-world-model

Compile the full world model

from canvas_engineering import compile_schema, ConnectivityPolicy
from general_unified_world_model import World

world = World()
bound = compile_schema(
    world,
    T=1, H=128, W=128, d_model=64,
    connectivity=ConnectivityPolicy(
        intra="dense",
        parent_child="hub_spoke",
    ),
)

print(f"{len(bound.field_names)} fields, "
      f"{bound.layout.num_positions} positions, "
      f"{len(bound.topology.connections)} connections")
# 857 fields, 16384 positions, 11735 connections

Project to a subset

You don't need the full 857-field model. Declare what you care about:

from general_unified_world_model import WorldProjection, project

# Hedge fund: macro + financial + two firms
proj = WorldProjection(
    include=[
        "financial",
        "country_us.macro",
        "regime",
        "forecasts.macro",
        "forecasts.financial",
    ],
    firms=["AAPL", "NVDA"],
)

bound = project(proj, T=1, H=64, W=64, d_model=64)
# ~200 fields, focused on what matters
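
One way to picture what an include list does is prefix matching on dotted field paths. The sketch below is an assumption about the semantics, not the library's implementation; `select_fields` is a hypothetical helper, and the field paths are borrowed from examples elsewhere in this README.

```python
def select_fields(all_fields, include):
    """Keep fields equal to an include path or nested beneath one."""
    selected = []
    for name in all_fields:
        for prefix in include:
            # The trailing dot stops "regime" from matching "regime_other.x".
            if name == prefix or name.startswith(prefix + "."):
                selected.append(name)
                break
    return selected

all_fields = [
    "financial.equities.vix",
    "financial.yield_curves.ten_year",
    "country_us.macro.output.gdp_nowcast",
    "country_us.macro.inflation.headline_cpi",
    "regime.growth_regime",
    "forecasts.macro.recession_prob_3m",
    "forecasts.financial.credit_stress_3m",
]

subset = select_fields(
    all_fields,
    include=["financial.equities", "forecasts.macro", "regime"],
)
print(subset)
```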

Train on heterogeneous data

from general_unified_world_model import (
    WorldProjection, project, build_world_model,
    FieldEncoder, FieldDecoder, MaskedCanvasTrainer,
    DatasetSpec, FieldMapping, build_mixed_dataloader,
)

# Two data sources with different field coverage
macro_spec = DatasetSpec(
    name="FRED",
    mappings=[
        FieldMapping("gdp", "country_us.macro.output.gdp_nowcast"),
        FieldMapping("cpi", "country_us.macro.inflation.headline_cpi"),
    ],
)
market_spec = DatasetSpec(
    name="Yahoo",
    mappings=[
        FieldMapping("vix", "financial.equities.vix"),
        FieldMapping("ust10y", "financial.yield_curves.ten_year"),
    ],
)

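# `bound` is the projected canvas from the previous step. `macro_data` and
# `market_data` are the loaded datasets for each spec (loading not shown here).
# Both sources train the same canvas; missing fields are masked, not imputed.
loader = build_mixed_dataloader(
    bound,
    sources=[(macro_spec, macro_data), (market_spec, market_data)],
    batch_size=32,
)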

The schema

19 layers, 857 fields, 8 temporal frequency classes:

Layer	Fields	Frequency	What it captures
Planetary Physical	Climate, infrastructure, disasters	τ6–τ7 (annual–multi-year)	Slow structural constraints
Resources & Energy	Crude, metals, food, water, compute	τ1–τ4 (hourly–monthly)	Physical inputs to production
Global Financial	Yields, credit, FX, equities, crypto	τ0–τ2 (sub-minute–daily)	High-bandwidth reflexive core
Macroeconomy	GDP, inflation, labor, fiscal, trade, housing	τ3–τ5 (weekly–quarterly)	Real economy per country
Political	Executive, legislative, judicial, geopolitical	τ4–τ7 (monthly–multi-year)	Governance structures
Narrative & Belief	Media, elite consensus, public sentiment	τ0–τ4 (sub-minute–monthly)	Reflexivity layer
Technology	AI, biotech, quantum, robotics, productivity	τ5–τ7 (quarterly–multi-year)	Long-run structural drivers
Demographics	Population, dependency, urbanization	τ7 (multi-year)	Slowest structural force
Sector	Demand, supply, margins, disruption risk	τ3–τ5 (weekly–quarterly)	Per GICS sector
Supply Chain	Concentration, lead time, bottleneck severity	τ2–τ4 (daily–monthly)	Graph-structured nodes
Business	Financials, operations, strategy, market, risk	τ2–τ5 (daily–quarterly)	Per firm (sparse)
Individual	Cognitive, incentives, network, state	τ2–τ5 (daily–quarterly)	Key decision-makers (very sparse)
Event Tape	News, social, filings, policy, conflict	τ0–τ1 (sub-minute–hourly)	Real-time event stream
Data Channel Trust	Government, market, alternative, corporate	τ3–τ7 (weekly–multi-year)	Meta-epistemic calibration
Regime State	Growth, inflation, financial cycle, fragility	τ5–τ7 (quarterly–multi-year)	Compressed global latent
Intervention	Monetary, fiscal, regulatory, military + effects	τ2–τ5 (daily–quarterly)	Counterfactual analysis
Forecast Bundle	Recession prob, credit stress, conflict risk	output	Structured prediction heads
Country	Macro + politics + demographics per country	composite	Per major economy

Temporal frequency classes

τ0 = sub-minute   (period=1)      markets, breaking news
τ1 = hourly        (period=4)      grid load, commodities
τ2 = daily         (period=16)     commodity prices, port congestion
τ3 = weekly        (period=48)     claims, inventories, payroll
τ4 = monthly       (period=192)    CPI, PMI, company closes
τ5 = quarterly     (period=576)    earnings, GDP, capex
τ6 = annual        (period=2304)   demographics, infrastructure
τ7 = multi-year    (period=4608)   regime changes, tech diffusion
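
The periods above can be read as "one update every N base ticks". That reading is an assumption, since the unit of a period is not defined here. Under it, the sketch below lists which frequency classes would refresh at a given tick and how the classes nest. The ASCII names `tau0` to `tau7` and both helpers are made up for the example.

```python
# Periods copied from the list above. The rule "a class updates when
# tick % period == 0" is an assumed interpretation for illustration.
PERIODS = {
    "tau0": 1, "tau1": 4, "tau2": 16, "tau3": 48,
    "tau4": 192, "tau5": 576, "tau6": 2304, "tau7": 4608,
}

def classes_updating_at(tick):
    """Frequency classes whose period divides the given tick."""
    return [name for name, period in PERIODS.items() if tick % period == 0]

def ticks_per(slow, fast):
    """How many periods of the faster class fit in one period of the slower."""
    return PERIODS[slow] / PERIODS[fast]

print(classes_updating_at(192))   # classes that refresh at a monthly boundary
print(ticks_per("tau5", "tau4"))  # monthly periods per quarterly period
```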

Use cases

CEO: "Model my company in context"

proj = WorldProjection(
    include=[
        "country_us.macro",
        "sector_tech",
        "financial.yield_curves",
        "financial.equities",
        "regime",
        "forecasts",
    ],
    firms=["ACME", "RIVAL"],
    individuals=["ceo", "cfo", "cto"],
)

Government: "Model policy impact"

proj = WorldProjection(
    include=[
        "country_us",
        "country_cn.macro",
        "country_eu.macro",
        "financial",
        "interventions",
        "forecasts",
        "regime",
    ],
    countries=["jp", "uk"],
)

Computer use agent: "Model the user's world"

proj = WorldProjection(
    include=[
        "events",
        "regime.compressed_world_state",
        "forecasts.macro.recession_prob_3m",
    ],
    individuals=["user"],
    firms=["user_org"],
)

Training architecture

Phase 1: Independent domains (parallelizable)

Train each domain separately on small canvases. Financial markets, US macro, narratives, etc. each get their own backbone. This is fast because canvases are small.

Phase 2: Domain coupling

Merge causally adjacent domains (financial + macro, narratives + financial). Pretrained encoders/decoders transfer via matching field names. The shared regime latent begins learning cross-domain structure.
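
Transfer via matching field names can be pictured as a dictionary merge keyed by field path. The sketch below uses plain dicts as stand-ins for per-field encoder weights; `merge_pretrained` and the weight values are hypothetical and only illustrate the matching step.

```python
def merge_pretrained(target_fields, *pretrained):
    """Collect per-field weights for a merged canvas.

    Fields found in a pretrained domain reuse its weights. The rest are
    returned separately so the caller can initialize them from scratch.
    """
    merged, missing = {}, []
    for name in target_fields:
        for source in pretrained:
            if name in source:
                merged[name] = source[name]
                break
        else:  # no pretrained domain provides this field
            missing.append(name)
    return merged, missing

financial = {"financial.equities.vix": [0.1, 0.2]}
macro = {"country_us.macro.output.gdp_nowcast": [0.3, 0.4]}
coupled_fields = [
    "financial.equities.vix",
    "country_us.macro.output.gdp_nowcast",
    "regime.growth_regime",
]

merged, missing = merge_pretrained(coupled_fields, financial, macro)
print(sorted(merged), missing)
```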

Phase 3: Full integration

All domains on one canvas. The regime state gets gradient from everything. This is the most expensive phase but leverages all pretrained structure.

Phase 4: Task-specific fine-tuning

Freeze backbone. Train projection-specific heads (recession prediction, equity regime, conflict escalation).

Why this works

The semantic type system lets us use the semantic embedding distance between two modalities as a proxy for their generalization distance. GDP growth and industrial production are semantically close, so their latent dynamics should be correlated. GDP growth and seismic risk are semantically far apart and should be nearly independent. This guides curriculum design: couple close domains first and distant ones later.
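
A sketch of how embedding distance could order a curriculum. The three-dimensional vectors are made-up toy embeddings, not outputs of any real model, and cosine distance is one reasonable metric; which metric the project uses is not stated here.

```python
import math
from itertools import combinations

def cosine_distance(a, b):
    """1 - cosine similarity; near 0 for aligned vectors, near 1 for orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

# Toy embeddings, invented for illustration.
embeddings = {
    "gdp_growth": [0.9, 0.1, 0.0],
    "industrial_production": [0.8, 0.2, 0.0],
    "seismic_risk": [0.0, 0.1, 0.9],
}

# Sort field pairs from semantically closest to farthest.
pairs = sorted(
    combinations(embeddings, 2),
    key=lambda p: cosine_distance(embeddings[p[0]], embeddings[p[1]]),
)

for a, b in pairs:
    print(a, b, round(cosine_distance(embeddings[a], embeddings[b]), 3))
```

Pairs at the front of the list would be coupled early in the curriculum, and pairs at the back would be left for later phases.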

Heterogeneous data training

The key innovation: masked loss on a structured canvas.

Dataset A (FRED):     GDP ✓  CPI ✓  VIX ✗  Yields ✗
Dataset B (Yahoo):    GDP ✗  CPI ✗  VIX ✓  Yields ✓
Dataset C (News):     GDP ✗  CPI ✗  VIX ✗  Yields ✗  News ✓

Canvas loss:  L = Σ (prediction - target)² × presence × loss_weight
                      ↑ model predicts all    ↑ only active  ↑ from schema

Both A and B train the shared regime latent, even though their field coverage doesn't overlap. The regime latent learns to compress the joint distribution from partial observations.
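
The loss formula above can be sketched in a few lines. This version uses plain Python lists to stay dependency-free; a real trainer would operate on tensors. The function name `masked_canvas_loss`, the weights, and all numbers are illustrative assumptions.

```python
def masked_canvas_loss(prediction, target, presence, loss_weight):
    """Sum of (prediction - target)^2 * presence * loss_weight over fields."""
    total = 0.0
    for p, t, m, w in zip(prediction, target, presence, loss_weight):
        # presence is 0 for missing fields, so their error contributes nothing.
        total += (p - t) ** 2 * m * w
    return total

# Field order: GDP, CPI, VIX, Yields. The model predicts every field.
prediction = [2.0, 3.0, 20.0, 4.0]
loss_weight = [1.0, 1.0, 0.5, 2.0]

# A FRED-style sample observes GDP and CPI only (0.0 is a placeholder target).
fred_target, fred_presence = [2.5, 3.0, 0.0, 0.0], [1, 1, 0, 0]
# A Yahoo-style sample observes VIX and Yields only.
yahoo_target, yahoo_presence = [0.0, 0.0, 18.0, 4.5], [0, 0, 1, 1]

fred_loss = masked_canvas_loss(prediction, fred_target, fred_presence, loss_weight)
yahoo_loss = masked_canvas_loss(prediction, yahoo_target, yahoo_presence, loss_weight)
print(fred_loss, yahoo_loss)
```

Placeholder targets must be finite numbers: multiplying a NaN error by a zero mask still yields NaN, so missing values should be filled with a dummy value before the mask is applied.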

Data adapters

Built-in adapters for common data sources:

from general_unified_world_model.data.adapters import fred_adapter, yahoo_finance_adapter

# FRED: 50+ macro series mapped to world model fields
fred_spec, fred_data = fred_adapter(api_key="...", start_date="2010-01-01")

# Yahoo Finance: equities, FX, commodities, crypto
yahoo_spec, yahoo_data = yahoo_finance_adapter(
    include_equity=True, include_fx=True,
    firm_tickers={"AAPL": "firm_AAPL"},
)

# Generic CSV/Parquet
from general_unified_world_model.data.adapters import tabular_adapter
spec, data = tabular_adapter(
    "My Dataset", "data.csv",
    column_mappings={"gdp_growth": "country_us.macro.output.gdp_nowcast"},
    transforms={"gdp_growth": "z_score"},
)
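
For reference, a z-score transform rescales a series to zero mean and unit standard deviation. The sketch below uses the population standard deviation from Python's `statistics` module; whether the adapter uses the population or sample form is not stated here, so treat that choice as an assumption.

```python
import statistics

def z_score(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # A constant series has no spread; map every value to 0.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

gdp_growth = [1.0, 2.0, 3.0, 4.0, 5.0]  # toy values for illustration
scaled = z_score(gdp_growth)
print([round(v, 3) for v in scaled])
```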

Temporal entities

Entities can appear and disappear over time:

from general_unified_world_model import TemporalTopology
from general_unified_world_model.schema.business import Business

tt = TemporalTopology()
tt.add("firm_AAPL", Business(), start_tick=100)    # founded
tt.add("firm_ENRON", Business(), start_tick=0, end_tick=500)  # dissolved

# At tick 50: ENRON exists, AAPL doesn't yet
active = tt.active_at(50)

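# Generate an attention mask that blocks inactive entities.
# `bound_schema` is assumed to be a compiled schema, such as the `bound`
# object returned by compile_schema() or project() above.
mask = tt.generate_temporal_attention_mask((0, 1000), bound_schema)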
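
The lifetime logic can be sketched without the library. `Lifespan`, `active_at`, and `attention_allowed` below are hypothetical stand-ins, and treating `end_tick` as exclusive is an assumption; the package may define the boundary differently.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Lifespan:
    start_tick: int
    end_tick: Optional[int] = None  # None means the entity never ends

    def covers(self, tick: int) -> bool:
        """True if the entity exists at this tick (end_tick is exclusive)."""
        return self.start_tick <= tick and (
            self.end_tick is None or tick < self.end_tick
        )

lifespans = {
    "firm_AAPL": Lifespan(start_tick=100),
    "firm_ENRON": Lifespan(start_tick=0, end_tick=500),
}

def active_at(tick):
    """Names of all entities that exist at the given tick."""
    return sorted(name for name, span in lifespans.items() if span.covers(tick))

def attention_allowed(query, key, tick):
    """Allow attention between two entities only if both exist at this tick."""
    return lifespans[query].covers(tick) and lifespans[key].covers(tick)

print(active_at(50), active_at(200))
print(attention_allowed("firm_AAPL", "firm_ENRON", 50))
```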

Inference

from general_unified_world_model import WorldModel

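# `projection` is assumed to be the WorldProjection the checkpoint was
# trained with (for example, `proj` from the sections above).
model = WorldModel.load("checkpoint.pt", projection)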

# Observe what you know
model.observe("financial.yield_curves.ten_year", 4.25)
model.observe("country_us.macro.inflation.headline_cpi", 3.1)
model.observe("financial.equities.vix", 18.5)

# Predict everything else
predictions = model.predict(n_steps=50)

recession_prob = predictions["forecasts.macro.recession_prob_3m"]
regime = predictions["regime.growth_regime"]
credit_stress = predictions["forecasts.financial.credit_stress_3m"]

Installation

# Core
pip install general-unified-world-model

# With real data adapters (quotes keep shells like zsh from globbing the brackets)
pip install "general-unified-world-model[data]"

# With training infrastructure
pip install "general-unified-world-model[train]"

# Everything
pip install "general-unified-world-model[all]"

Requires Python 3.10+ and PyTorch 2.0+.

Examples

examples/
├── 01_quickstart.py           # Compile full world model, inspect fields
├── 02_ceo_company_model.py    # CEO use case: company + context
├── 03_government_policy.py    # Government: policy impact analysis
├── 04_computer_use_agent.py   # Agent: user psychology + world context
├── 05_train_financial.py      # Train on real FRED + Yahoo data
└── 06_curriculum_training.py  # Full 3-phase curriculum training

Development

git clone https://github.com/JacobFV/general-unified-world-modeling.git
cd general-unified-world-modeling
pip install -e ".[dev]"
pytest

Branch structure

develop — active development, PRs target here
release — stable releases, tagged commits trigger PyPI publish

Running tests

# Full suite (39 tests)
pytest

# With coverage
pytest --cov=general_unified_world_model --cov-report=term-missing

# Specific module
pytest tests/test_schema.py -v

Project layout

src/general_unified_world_model/
├── schema/           # 19 schema modules (physical → forecast)
│   ├── world.py      # Top-level World composition (857 fields)
│   ├── physical.py   # Planetary physical substrate
│   ├── resources.py  # Energy, metals, food, water, compute
│   ├── financial.py  # Global monetary & financial
│   ├── macro.py      # Macroeconomy (per country)
│   ├── political.py  # Political & institutional
│   ├── narrative.py  # Narrative, belief & expectations
│   ├── technology.py # Technology & innovation
│   ├── demographics.py
│   ├── sector.py     # Per GICS sector
│   ├── supply_chain.py
│   ├── business.py   # Per firm (sparse)
│   ├── individual.py # Key decision-makers (very sparse)
│   ├── events.py     # Real-time event tape
│   ├── trust.py      # Data channel trust (meta-epistemic)
│   ├── regime.py     # Privileged regime latent
│   ├── intervention.py
│   ├── forecast.py   # Structured output heads
│   ├── country.py    # Composite per country
│   └── observability.py  # Reusable epistemic bundles
├── projection/       # Subsetting & connectivity
│   ├── subset.py     # WorldProjection, project()
│   ├── temporal.py   # Temporal entity management
│   └── transfer.py   # Semantic transfer distance
├── training/         # Training infrastructure
│   ├── backbone.py   # Transformer backbone
│   ├── heterogeneous.py  # Masked canvas trainer
│   ├── diffusion.py  # Diffusion objective
│   └── curriculum.py # Multi-phase curriculum
├── data/             # Data adapters
│   └── adapters.py   # FRED, Yahoo, PMI, earnings, news, CSV
└── inference.py      # Observe/predict API

License

Apache 2.0

Release history

0.0.3 (Mar 8, 2026)
0.0.2 (Mar 8, 2026)
0.0.1 (Mar 8, 2026), this version
