Track how discrete representations evolve during training

reptimeline

Part of a wider research program. reptimeline is paper P3 in a program on prime-factorisation neurosymbolic AI (P1–P4) and quaternionic logic (P11–P13). For the full program — companion papers, computational substrate, and formal framework — see github.com/arturoornelasb.

Track how discrete representations evolve during neural network training.

reptimeline monitors lifecycle events in discrete representation systems: when concepts are "born" (first become distinguishable), when they "die" (collapse), when relationships form, and where phase transitions occur. It then discovers what each feature means, labels it, and tests causal effects.
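
To make the lifecycle vocabulary concrete, here is a minimal sketch in plain Python. It assumes a simplified rule that is not taken from reptimeline's source: a code element is "born" for a concept when it switches from 0 to 1 between two checkpoints, and "dies" when it switches from 1 to 0. The function name `lifecycle_events` and the input layout are hypothetical.

```python
from typing import Dict, List, Tuple

def lifecycle_events(
    history: List[Tuple[int, Dict[str, List[int]]]],
) -> List[Tuple[int, str, str, int]]:
    """Return (step, event, concept, bit_index) tuples.

    history is a list of (step, {concept: binary code}) pairs ordered by step.
    Assumed rule (illustrative only): 0 -> 1 is a birth, 1 -> 0 is a death.
    """
    events = []
    for (_, prev), (step, curr) in zip(history, history[1:]):
        for concept, code in curr.items():
            # A concept missing from the previous checkpoint starts all-zero.
            old = prev.get(concept, [0] * len(code))
            for i, (a, b) in enumerate(zip(old, code)):
                if a == 0 and b == 1:
                    events.append((step, "birth", concept, i))
                elif a == 1 and b == 0:
                    events.append((step, "death", concept, i))
    return events

history = [
    (0,   {"cat": [0, 0, 0], "dog": [0, 0, 0]}),
    (100, {"cat": [1, 0, 0], "dog": [1, 1, 0]}),
    (200, {"cat": [1, 0, 1], "dog": [0, 1, 0]}),
]
events = lifecycle_events(history)
for event in events:
    print(event)
```

The library's own definitions (for example, "first become distinguishable") may be stricter than this bit-flip rule; the sketch only shows the shape of the bookkeeping.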

Backend-agnostic: works with triadic bits, VQ-VAE codebooks, FSQ levels, sparse autoencoders, binary codes, or any discrete bottleneck.

Features

Lifecycle tracking -- birth, death, and connection events for individual code elements across training
Phase transition detection -- automatic discovery of training regime changes via metric discontinuities
Bottom-up ontology discovery -- duals, dependencies, 3-way interactions, and hierarchical structure without pre-defined primitives
Auto-labeling -- three strategies: embedding-based, contrastive, and LLM-based
Causal verification -- intervention testing with bootstrap CIs, permutation p-values, and BH-FDR correction
Theory reconciliation -- compare discovered structure against manually-defined domain primitives
Visualizations -- 5 static (matplotlib): swimlane, phase dashboard, churn heatmap, layer emergence, causal heatmap; 4 interactive (Plotly): swimlane, phase dashboard, churn heatmap, causal heatmap
Export -- JSON round-trip (save_json/load_json), CSV export (events, curves, codes, stability)
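
The phase transition idea, a discontinuity in a tracked metric, can be illustrated with a small NumPy sketch. This is one simple assumed approach (flag checkpoints where the change in the metric is an outlier relative to the median absolute deviation of all changes), not necessarily the detector reptimeline uses; `find_jumps` and its `threshold` are hypothetical.

```python
import numpy as np

def find_jumps(steps, values, threshold=5.0):
    """Return steps where the metric jumps unusually far from the previous checkpoint."""
    values = np.asarray(values, dtype=float)
    deltas = np.diff(values)                      # change between consecutive checkpoints
    median = np.median(deltas)
    mad = np.median(np.abs(deltas - median))      # robust spread of the changes
    scale = mad if mad > 0 else 1e-12             # avoid division by zero on flat curves
    scores = np.abs(deltas - median) / scale
    return [steps[i + 1] for i in np.flatnonzero(scores > threshold)]

steps = [0, 100, 200, 300, 400, 500, 600]
entropy = [0.10, 0.12, 0.13, 0.15, 0.60, 0.62, 0.63]
jumps = find_jumps(steps, entropy)
print(jumps)
```

A median-based score is used here so that one large jump does not inflate the spread estimate and hide itself.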

Tech Stack

Component	Details
Language	Python 3.10 -- 3.13
Core dependencies	numpy >= 1.24, matplotlib >= 3.7, tqdm >= 4.60
Optional	torch >= 2.0 (extractors), plotly >= 5.0 (interactive plots)
Testing	pytest, pytest-cov, 224 tests
Linting	ruff (zero warnings), mypy (zero errors)
CI	GitHub Actions (tests + lint + typecheck + coverage)
Docs	pdoc, auto-deployed to GitHub Pages
License	BUSL-1.1 (converts to AGPL-3.0 on 2030-03-21)

Installation

pip install reptimeline

From source (for development):

git clone https://github.com/arturoornelasb/reptimeline.git
cd reptimeline
pip install -e ".[dev]"

To run the examples (MNIST, Pythia SAE, causal experiments):

pip install -r requirements-examples.txt

Quick Start

1. Use a built-in extractor or implement your own

Three backends ship ready to use:

from reptimeline.extractors import SAEExtractor, VQVAEExtractor, FSQExtractor

# Sparse Autoencoder (top-k binarization, intervention support)
sae = SAEExtractor(n_features=32768, encode_fn=my_sae.encode,
                   decode_fn=my_sae.decode, feature_indices=selected)

# VQ-VAE (codebook index → binary indicator)
vqvae = VQVAEExtractor(n_codebook=512, encode_fn=my_vqvae.encode)

# FSQ (finite scalar quantization, nonzero or one-hot binarization)
fsq = FSQExtractor(n_levels=[3, 5, 3, 3], encode_fn=my_fsq.encode)
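
The comments above name three binarization schemes. A rough NumPy sketch of what each could look like follows. These helpers are illustrative assumptions, not the extractors' actual code: the function names are hypothetical, and the FSQ helpers assume levels are integer indices starting at 0.

```python
import numpy as np

def topk_binarize(activations, k):
    """SAE-style: mark the k largest activations as 1, everything else as 0 (k >= 1)."""
    activations = np.asarray(activations, dtype=float)
    code = np.zeros(activations.shape[0], dtype=int)
    code[np.argsort(activations)[-k:]] = 1
    return code

def codebook_indicator(index, n_codebook):
    """VQ-VAE-style: turn a codebook index into a binary indicator vector."""
    code = np.zeros(n_codebook, dtype=int)
    code[index] = 1
    return code

def fsq_nonzero(levels):
    """FSQ-style (nonzero variant): a dimension is active if its level is not 0."""
    return (np.asarray(levels) != 0).astype(int)

def fsq_one_hot(levels, n_levels):
    """FSQ-style (one-hot variant): concatenate one indicator vector per dimension."""
    parts = []
    for level, size in zip(levels, n_levels):
        part = np.zeros(size, dtype=int)
        part[level] = 1
        parts.append(part)
    return np.concatenate(parts)

sae_code = topk_binarize([0.1, 2.0, 0.0, 1.5], k=2)
vq_code = codebook_indicator(3, n_codebook=5)
fsq_a = fsq_nonzero([0, 2, 1, 0])
fsq_b = fsq_one_hot([0, 2, 1, 0], n_levels=[3, 5, 3, 3])
print(sae_code, vq_code, fsq_a, fsq_b)
```

All four outputs are flat binary vectors, which is what lets the rest of the pipeline treat different bottlenecks uniformly.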

Or implement RepresentationExtractor for any other discrete bottleneck:

from reptimeline.extractors.base import RepresentationExtractor
from reptimeline.core import ConceptSnapshot

class MyExtractor(RepresentationExtractor):
    def extract(self, checkpoint_path, concepts, device='cpu'):
        # load_model, get_discrete_code, and parse_step are placeholders
        # for your own checkpoint-loading and encoding code.
        model = load_model(checkpoint_path, device)
        codes = {}
        for concept in concepts:
            codes[concept] = get_discrete_code(model, concept)  # List[int]
        return ConceptSnapshot(step=parse_step(checkpoint_path), codes=codes)

    def similarity(self, code_a, code_b):
        ...  # Jaccard, Hamming, or domain-specific

    def shared_features(self, code_a, code_b):
        ...  # Indices where both codes are active

See examples/ for complete pipelines (MNIST binary AE, Pythia-70M SAE, triadic bits).
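
For binary codes, the two methods left as `...` above could be filled in along these lines. This is a minimal standalone sketch, assuming codes are equal-length lists of 0/1 values; it is written as plain functions rather than methods, and the Hamming variant is expressed as a similarity (fraction of agreeing positions) rather than a distance.

```python
from typing import List

def jaccard(code_a: List[int], code_b: List[int]) -> float:
    """Shared active bits divided by bits active in either code."""
    both = sum(1 for a, b in zip(code_a, code_b) if a and b)
    either = sum(1 for a, b in zip(code_a, code_b) if a or b)
    return both / either if either else 1.0  # two all-zero codes count as identical

def hamming_similarity(code_a: List[int], code_b: List[int]) -> float:
    """Fraction of positions where the two codes agree."""
    if not code_a:
        return 1.0
    agree = sum(1 for a, b in zip(code_a, code_b) if a == b)
    return agree / len(code_a)

def shared_features(code_a: List[int], code_b: List[int]) -> List[int]:
    """Indices where both codes are active."""
    return [i for i, (a, b) in enumerate(zip(code_a, code_b)) if a and b]

a = [1, 0, 1, 1, 0]
b = [1, 1, 0, 1, 0]
print(jaccard(a, b), hamming_similarity(a, b), shared_features(a, b))
```

Jaccard ignores positions where both codes are 0, so it tends to suit sparse codes; Hamming similarity counts shared zeros as agreement.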

2. Analyze representation evolution

from reptimeline import TimelineTracker

extractor = MyExtractor()
snapshots = extractor.extract_sequence("checkpoints/", concepts)
tracker = TimelineTracker(extractor)
timeline = tracker.analyze(snapshots)
timeline.print_summary()
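
One quantity a timeline like this can surface is per-concept churn. As an illustration only, the sketch below assumes churn means the number of code positions that change between consecutive checkpoints; the library may define it differently, and `churn_matrix` and its input layout are hypothetical.

```python
import numpy as np

def churn_matrix(codes_by_step, concepts):
    """Rows are concepts, columns are transitions between consecutive checkpoints."""
    steps = sorted(codes_by_step)
    matrix = np.zeros((len(concepts), len(steps) - 1), dtype=int)
    for col, (s0, s1) in enumerate(zip(steps, steps[1:])):
        for row, concept in enumerate(concepts):
            before = np.asarray(codes_by_step[s0][concept])
            after = np.asarray(codes_by_step[s1][concept])
            matrix[row, col] = int(np.sum(before != after))  # Hamming distance
    return matrix

codes_by_step = {
    0:   {"cat": [0, 0, 0, 0], "dog": [0, 0, 0, 0]},
    100: {"cat": [1, 0, 1, 0], "dog": [1, 0, 0, 0]},
    200: {"cat": [1, 0, 1, 0], "dog": [0, 1, 0, 0]},
}
churn = churn_matrix(codes_by_step, ["cat", "dog"])
print(churn)
```

In this toy data "cat" settles after the first transition while "dog" keeps changing, which is the kind of pattern a churn heatmap makes visible.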

3. Discover what each code element means

from reptimeline import BitDiscovery, AutoLabeler

discovery = BitDiscovery()
report = discovery.discover(snapshots[-1], timeline=timeline)
discovery.print_report(report)

# Auto-label with embeddings (no API needed)
labeler = AutoLabeler()
labels = labeler.label_by_embedding(report, embeddings)
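
To give a sense of what a "dual" (anti-correlated pair) is, here is a small NumPy sketch. It assumes a dual is a pair of code elements whose activations across concepts have a strongly negative Pearson correlation; the threshold of -0.8 and the name `find_dual_pairs` are hypothetical, and BitDiscovery's actual criteria may differ.

```python
import numpy as np

def find_dual_pairs(codes, threshold=-0.8):
    """Return (i, j, r) for bit pairs whose activations are strongly anti-correlated."""
    codes = np.asarray(codes, dtype=float)           # shape: (n_concepts, n_bits)
    varying = np.flatnonzero(codes.std(axis=0) > 0)  # constant bits have no correlation
    if len(varying) < 2:
        return []
    corr = np.corrcoef(codes[:, varying], rowvar=False)
    pairs = []
    for a in range(len(varying)):
        for b in range(a + 1, len(varying)):
            if corr[a, b] <= threshold:
                pairs.append((int(varying[a]), int(varying[b]), float(corr[a, b])))
    return pairs

codes = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
]
duals = find_dual_pairs(codes)
print(duals)
```

Here bits 0 and 1 are exact complements, so they are reported as a dual pair; bit 3 never fires and is skipped.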

4. Test causal effects

from reptimeline import CausalVerifier

verifier = CausalVerifier(intervene_fn=my_intervene_fn)
causal_report = verifier.verify(snapshots[-1])
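
The feature list names bootstrap CIs and permutation p-values as the statistics behind causal verification. The sketch below shows generic versions of both in NumPy, as a rough guide to what those numbers mean. It is not CausalVerifier's implementation; the function names, defaults, and toy data are assumptions.

```python
import numpy as np

def bootstrap_ci(effects, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean effect."""
    rng = np.random.default_rng(seed)
    effects = np.asarray(effects, dtype=float)
    # Resample the effects with replacement, n_boot times.
    idx = rng.integers(0, len(effects), size=(n_boot, len(effects)))
    means = effects[idx].mean(axis=1)
    lo, hi = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)

def permutation_p_value(treated, control, n_perm=2000, seed=0):
    """Two-sided p-value for a difference in means under label shuffling."""
    rng = np.random.default_rng(seed)
    treated = np.asarray(treated, dtype=float)
    control = np.asarray(control, dtype=float)
    observed = abs(treated.mean() - control.mean())
    pooled = np.concatenate([treated, control])
    hits = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(pooled)
        diff = abs(shuffled[: len(treated)].mean() - shuffled[len(treated):].mean())
        hits += int(diff >= observed)
    return (hits + 1) / (n_perm + 1)  # add-one correction keeps p above 0

# Toy data: output change with the feature intervened on vs. left alone.
treated = [0.9, 1.1, 1.0, 1.2, 0.8, 1.05, 0.95, 1.1]
control = [0.1, -0.1, 0.0, 0.2, -0.2, 0.05, 0.0, 0.1]
effects = np.asarray(treated) - np.asarray(control)

lo, hi = bootstrap_ci(effects)
p = permutation_p_value(treated, control)
print(f"mean effect CI: [{lo:.3f}, {hi:.3f}], p = {p:.4f}")
```

A confidence interval that excludes zero and a small permutation p-value together suggest the intervention effect is unlikely to be noise, before any multiple-testing correction.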

5. Export results

from reptimeline.core import Timeline

# JSON round-trip
timeline.save_json("results/timeline.json")
restored = Timeline.load_json("results/timeline.json")

# CSV export (events, curves, codes, stability)
timeline.to_csv("results/csv/")

6. Interactive plots (requires plotly)

from reptimeline.viz.interactive import plot_phase_dashboard_interactive

fig = plot_phase_dashboard_interactive(timeline, save_html="dashboard.html")

7. CLI

reptimeline --snapshots data.json --discover --plot
reptimeline --snapshots data.json --overlay primitivos.json --output result.json
reptimeline --snapshots data.json --causal effects.json --plot-dir plots/

Architecture

Your model checkpoints
        |
        v
RepresentationExtractor    (SAE, VQ-VAE, FSQ built-in, or your own)
        |  ConceptSnapshot objects
        v
TimelineTracker            (births, deaths, connections, phase transitions)
        |
        v
BitDiscovery               (duals, dependencies, 3-way interactions, hierarchy)
        |
        v
AutoLabeler                (embedding / contrastive / LLM labeling)
        |
        v
CausalVerifier             (intervention effects + statistical testing)
        |
        v
Reconciler                 (compare discovered vs. expected structure)
        |
        v
Visualizations             (swimlane, phase dashboard, churn, causal heatmap)

Validated Results

MNIST Binary Autoencoder (32-bit)

Metric	Value
Decoder determinism	100% (32-bit code fully determines output; n=100 swaps)
Dual pairs discovered	65 anti-correlated
Dependencies discovered	179
Phase transitions	0
Lifecycle tracking	297 births, 106 deaths, 45 connections
Training	10 epochs, 6 checkpoints, 10 digit classes

Pythia-70M Sparse Autoencoder (32K features)

Metric	Value
Causal selectivity (KL)	8 features with finite selectivity (1.96x--98.4x, mean 26.8x L2); 8 with zero cross-activation
Dual pairs discovered	34 anti-correlated
Lifecycle tracking	12 checkpoints (step 0 to 143K), 1835 births, 875 deaths

Figure: SAE causal intervention heatmap. Causal intervention on Pythia-70M SAE features. Yellow = no effect; dark red = strong effect.

Limitations

Prediction experiments did not improve over baseline. Using discovered SAE features for next-token prediction changed accuracy by -0.13% (embedding-based) and -4.20% (MLP-based) relative to baseline. Features are individually meaningful but do not yet translate into prediction improvements.
Sentinel features. 8 of 16 tested SAE features showed zero cross-activation, which may reflect SAE sparsity rather than proven causal selectivity. These are reported separately.
Statistical corrections. Discovery includes Bonferroni and BH-FDR correction. Use null_baseline() to estimate false positive rates for your data dimensions.
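
For readers unfamiliar with the two corrections named above, here is a generic NumPy sketch of Bonferroni and the Benjamini-Hochberg step-up procedure. It shows the standard textbook procedures, not reptimeline's `stats.py`; the function names and example p-values are assumptions.

```python
import numpy as np

def bonferroni(p_values, alpha=0.05):
    """Reject hypotheses whose p-value is at most alpha divided by the number of tests."""
    p = np.asarray(p_values, dtype=float)
    return p <= alpha / len(p)

def benjamini_hochberg(p_values, alpha=0.05):
    """Return a boolean mask of hypotheses rejected at FDR level alpha."""
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m   # rank-dependent cutoffs
    passed = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        cutoff = np.max(np.flatnonzero(passed))    # largest rank meeting its threshold
        reject[order[: cutoff + 1]] = True         # reject everything up to that rank
    return reject

p_values = [0.001, 0.008, 0.02, 0.024, 0.03, 0.06, 0.074, 0.205]
bonf = bonferroni(p_values)
bh = benjamini_hochberg(p_values)
print("Bonferroni rejects:", int(bonf.sum()))
print("BH-FDR rejects:", int(bh.sum()))
```

On these eight p-values Bonferroni keeps one discovery and BH-FDR keeps five, which illustrates why the choice of correction matters when many feature pairs are tested at once.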

Project Structure

reptimeline/
  __init__.py             # Public API
  __main__.py             # python -m reptimeline
  core.py                 # ConceptSnapshot, Timeline, lifecycle events
  tracker.py              # TimelineTracker
  discovery.py            # BitDiscovery: bottom-up ontology
  autolabel.py            # AutoLabeler: 3 labeling strategies
  reconcile.py            # Reconciler: discovered vs. theory
  causal.py               # CausalVerifier: intervention testing
  exceptions.py           # Domain-specific exception hierarchy
  stats.py                # Bootstrap, permutation tests, BH-FDR
  cli.py                  # Command-line interface
  extractors/
    base.py               # RepresentationExtractor ABC
    sae.py                # Sparse autoencoder extractor
    vqvae.py              # VQ-VAE extractor
    fsq.py                # FSQ extractor
  overlays/
    primitive_overlay.py  # Domain-specific primitive overlay
  viz/
    swimlane.py           # Concept activation swimlane
    phase_dashboard.py    # Metric trends + phase transitions
    churn_heatmap.py      # Per-concept code churn
    layer_emergence.py    # Layer stabilization order (dynamic colors)
    causal_heatmap.py     # Causal intervention effects
    interactive.py        # Plotly interactive versions (optional)
tests/                    # 18 test modules, 224 tests (pytest)
examples/                 # Reference pipelines and extractors
results/                  # Pre-computed results (MNIST, Pythia-70M)

Development

# Install with dev deps + pre-commit hooks
pip install -e ".[dev]"
pre-commit install

# Run tests with coverage
pytest tests/ -v --cov=reptimeline

# Lint + type check
ruff check reptimeline/ tests/
mypy reptimeline/

CI runs lint, typecheck, and tests on every push and PR (Python 3.10 -- 3.13).

See CONTRIBUTING.md for contribution guidelines.

Companion Repos

This library is part of a four-paper program on triadic neurosymbolic representations:

Triadic Neurosymbolic Engine (algebraic encoding):

triadic-microgpt (learnable projection head):

Triadic Emergent Duality (ontological framework):

License

Business Source License 1.1 (BUSL-1.1)

Free for research, education, evaluation, development, and personal use
Commercial production use requires a license -- contact arturoornelas62@gmail.com
Converts to AGPL-3.0 on 2030-03-21
All dependencies are commercially compatible (BSD, MIT, Apache-2.0, MPL-2.0 -- zero copyleft)

Citation

If you use reptimeline in your research, please cite the paper and/or software:

@article{ornelas2026reptimeline,
  author = {Ornelas Brand, J. Arturo},
  title = {reptimeline: Tracking Discrete Representation Evolution
           During Neural Network Training},
  year = {2026},
  doi = {10.5281/zenodo.19208672}
}

@software{ornelas2026reptimeline_software,
  author = {Ornelas Brand, J. Arturo},
  title = {reptimeline},
  year = {2026},
  url = {https://github.com/arturoornelasb/reptimeline},
  doi = {10.5281/zenodo.19208628}
}

Origin

Extracted from triadic-microgpt. Part of a three-repository research project by J. Arturo Ornelas Brand (2026):

Project	Paper	Repository
Triadic Neurosymbolic Engine (parent)
triadic-microgpt
reptimeline (this repo)

Coming from triadic-microgpt? See the migration guide.

Release history

Version	Released
0.2.2 (this version)	May 4, 2026
0.2.1	May 4, 2026
0.1.1	Mar 25, 2026
0.1.0	Mar 24, 2026
