
GPU forge — PyTorch training loop with tile framing and loss tracking

Project description

plato-torch — Self-Training Rooms

21 AI training methods as grab-and-go PLATO rooms.

Every method shares the same API:

room = ReinforceRoom("poker-room", ensign_dir="./ensigns", buffer_dir="./tiles")
room.feed(data)                    # Give it experience
room.train_step(batch)             # Learn from it
prediction = room.predict(input)   # Use the knowledge
model = room.export_model()        # Save it

Quick Start

import sys; sys.path.insert(0, "src")
from presets import PRESET_MAP

# See all 21 presets
for name, cls in sorted(PRESET_MAP.items()):
    print(name, cls.__name__)

# Pick one and use it
from presets import ReinforceRoom
room = ReinforceRoom("my-room")
room.observe("state-1", "action-a", "won")
room.observe("state-1", "action-b", "lost")
room.train_step(room._load_tiles())
print(room.predict("state-1"))

All 21 Presets

Classic ML

Preset           Class               Description
supervised       SupervisedRoom      Labeled input→output via frequency counting
contrastive      ContrastiveRoom     Cosine similarity, triplet margin learning
self_supervised  SelfSupervisedRoom  JEPA-style masked prediction (Welford online)
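
The cosine-similarity-with-margin idea behind the contrastive preset can be sketched in a few lines of pure Python. This is an illustration of the technique the table names, not ContrastiveRoom's actual code; the toy 2-D vectors and the margin value are made up for the example.

```python
import math

# Cosine similarity over plain lists of floats.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

anchor, positive, negative = [1.0, 0.0], [0.9, 0.1], [0.0, 1.0]
margin = 0.2

# Triplet condition: the anchor should be at least `margin` more similar
# to the positive than to the negative. A negative value means satisfied.
violation = cosine(anchor, negative) - cosine(anchor, positive) + margin
print(violation < 0)  # True: this triplet already satisfies the margin
```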

Reinforcement

Preset      Class          Description
reinforce   ReinforceRoom  Policy gradient, Monte Carlo returns
inverse_rl  InverseRLRoom  Observe expert, infer reward function
imitate     ImitateRoom    Clone expert behavior from demonstrations
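
"Monte Carlo returns" in the reinforce row refers to the discounted return G_t computed backwards over an episode. A textbook sketch of that computation, not the ReinforceRoom implementation:

```python
# Discounted Monte Carlo returns, as used by REINFORCE-style policy
# gradients: G_t = r_t + gamma * G_{t+1}, computed from the episode end.
def monte_carlo_returns(rewards, gamma=0.99):
    returns = []
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns

# Example: three-step episode with only a terminal reward of 1.0.
print(monte_carlo_returns([0.0, 0.0, 1.0], gamma=0.5))  # [0.25, 0.5, 1.0]
```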

Efficient Tuning

Preset  Class      Description
lora    LoRARoom   PEFT delta table simulation
qlora   QLoRARoom  4-bit quantized base + LoRA delta adapters

Population Methods

Preset         Class              Description
evolve         EvolveRoom         Genetic algorithm, tournament selection
adversarial    AdversarialRoom    Red team vs blue team attack tracking
collaborative  CollaborativeRoom  Multi-agent knowledge sharing, majority vote
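
Tournament selection, the operator named for the evolve preset, is simple to sketch: sample k individuals at random and keep the fittest. The string population and fitness function below are hypothetical, purely for illustration.

```python
import random

def tournament_select(population, fitness, k=3, rng=random):
    """Pick k random individuals and return the fittest of them."""
    contenders = rng.sample(population, k)
    return max(contenders, key=fitness)

pop = ["aaaa", "aabb", "abbb", "bbbb"]
# With k equal to the population size, the tournament is deterministic.
best = tournament_select(pop, fitness=lambda s: s.count("b"), k=4)
print(best)  # bbbb
```

Smaller k keeps selection pressure gentle; larger k makes it greedier.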

Meta / Federated

Preset      Class          Description
meta_learn  MetaLearnRoom  Nearest-task fast adaptation (1-3 shot)
federate    FederateRoom   Federated averaging across agents
multitask   MultitaskRoom  Shared backbone + task-specific heads
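
Federated averaging, as named for the federate preset, can be sketched as a dataset-size-weighted mean over per-agent parameter maps. The dict-of-floats "model" is an assumption for the example, not FederateRoom's actual format.

```python
def federated_average(models, sizes):
    """FedAvg-style: weight each agent's parameters by its local dataset size."""
    total = sum(sizes)
    keys = models[0].keys()
    return {k: sum(m[k] * n for m, n in zip(models, sizes)) / total for k in keys}

agents = [{"w": 1.0}, {"w": 3.0}]
print(federated_average(agents, sizes=[1, 1]))  # {'w': 2.0}
```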

Lifecycle

Preset      Class           Description
curriculum  CurriculumRoom  Easy first, then harder (dojo progression)
continual   ContinualRoom   Lifelong learning, EWC-inspired replay buffer
fewshot     FewshotRoom     Prototype matching from 1-5 examples
active      ActiveRoom      Model chooses what data to learn from
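
Prototype matching, the fewshot technique above, reduces to: average each class's few examples into a prototype, then classify by nearest prototype. A minimal sketch with toy 2-D features (not real tiles):

```python
def prototypes(examples):
    """examples: {label: [feature tuples]} -> per-class mean vector."""
    protos = {}
    for label, vecs in examples.items():
        n = len(vecs)
        protos[label] = tuple(sum(v[i] for v in vecs) / n for i in range(len(vecs[0])))
    return protos

def classify(x, protos):
    """Assign x to the nearest prototype by squared Euclidean distance."""
    return min(protos, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, protos[c])))

# Two shots per class.
shots = {"hot": [(0.9, 0.8), (1.0, 0.7)], "cold": [(0.1, 0.2), (0.0, 0.1)]}
print(classify((0.8, 0.9), prototypes(shots)))  # hot
```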

Generative

Preset    Class         Description
generate  GenerateRoom  N-gram data augmentation, synthetic state generation

Hybrid

Preset         Class              Description
neurosymbolic  NeurosymbolicRoom  Neural instinct + symbolic rules blend
distill        DistillRoom        Teacher→student with temperature scaling
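
Temperature scaling, named in the distill row, divides the teacher's logits by a temperature T before the softmax; higher T flattens the distribution so the student sees relative probabilities of the non-argmax classes. A generic sketch of the math, not DistillRoom's code; the logit values are invented:

```python
import math

def softmax_with_temperature(logits, T=1.0):
    """Softmax over logits / T, computed with max-subtraction for stability."""
    scaled = [z / T for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

teacher_logits = [4.0, 1.0, 0.5]
sharp = softmax_with_temperature(teacher_logits, T=1.0)
soft = softmax_with_temperature(teacher_logits, T=4.0)
# At T=4 the target is flatter: probability mass moves off the argmax
# class and onto the others, which is what the student trains against.
```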

Architecture

plato-torch/
├── src/
│   ├── room_base.py          # RoomBase abstract class (feed/train_step/predict/export)
│   ├── torch_room.py         # TorchRoom — the full room with sentiment + tiles
│   ├── room_sentiment.py     # 6-dimensional room mood (energy, flow, frustration...)
│   ├── tile_grabber.py       # Learned attention over tile space
│   ├── instinct_net.py       # Tiny instinct network
│   ├── room_presets.py       # Registry of all 21 presets
│   └── presets/
│       ├── __init__.py       # PRESET_MAP — all 21 classes
│       ├── reinforce.py      # RL policy gradient
│       ├── evolve.py         # Genetic algorithm
│       ├── distill.py        # Teacher→student
│       ├── supervised.py     # Label frequency
│       ├── contrastive.py    # Triplet similarity
│       ├── self_supervised.py # JEPA masked prediction
│       ├── lora_train.py     # PEFT delta table
│       ├── qlora.py          # 4-bit quantized LoRA
│       ├── meta_learn.py     # Fast task adaptation
│       ├── federate.py       # Federated averaging
│       ├── multitask.py      # Shared backbone, task heads
│       ├── active.py         # Uncertainty sampling
│       ├── curriculum.py     # Difficulty progression
│       ├── imitate.py        # Expert cloning
│       ├── neurosymbolic.py  # Neural + symbolic rules
│       ├── continual.py      # EWC replay buffer
│       ├── fewshot.py        # Prototype matching
│       ├── generate.py       # N-gram augmentation
│       ├── adversarial.py    # Red/blue team
│       └── collaborative.py  # Multi-agent knowledge
├── docs/
│   ├── training-rooms.md           # Room architecture overview
│   └── training-seed-synergy.md    # Training ↔ seed-programming synergy paper
├── tests/
│   └── test_torch_room.py          # Unit tests
├── README.md
└── ARCHITECTURE-PLAN.md

Key Concepts

RoomBase API

Every preset inherits from RoomBase and implements:

  • feed(data) — ingest experience
  • train_step(batch) — learn from a batch of tiles
  • predict(input) — use accumulated knowledge
  • export_model() — serialize for transport

Room Sentiment

Rooms have 6-dimensional mood: energy, flow, frustration, discovery, tension, confidence. The room reads its own vibe and steers randomness toward productive exploration.

Biased Randomness

When a room is frustrated → bias safe actions. Discovery mode → bias novel actions. The room is an active participant, not a passive arena.
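
One way to picture that bias is as weighted sampling over actions, with weights shifted by the room's mood. This is an illustrative sketch only: the weighting formula, the novelty scores, and the mood inputs are assumptions, not the room's actual API.

```python
import random

def biased_choice(actions, novelty, frustration, discovery, rng=random):
    """actions: list of names; novelty: per-action novelty score in [0, 1].
    Frustration boosts the weight of safe (low-novelty) actions;
    discovery boosts the weight of novel ones."""
    weights = []
    for a in actions:
        safe = 1.0 - novelty[a]
        weights.append(1.0 + frustration * safe + discovery * novelty[a])
    return rng.choices(actions, weights=weights, k=1)[0]

novelty = {"retry-known-path": 0.1, "try-weird-input": 0.9}
rng = random.Random(0)  # seeded for reproducibility
picks = [biased_choice(list(novelty), novelty, frustration=2.0,
                       discovery=0.0, rng=rng) for _ in range(200)]
# A frustrated room mostly replays the safe action.
print(picks.count("retry-known-path") > picks.count("try-weird-input"))  # True
```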

Live Tile Stream

Tiles are generated in lockstep with room events, and the stream feeds training in real time.

JEPA Context

context_for_jepa() emits sentiment vectors for consumption by JC1's JEPA model.

Design Decisions

  1. Pure Python, no torch — all presets use statistical methods (frequency, hashing, sampling)
  2. Same API everywhere — walk into any room, same interface
  3. Composable — rooms can feed into each other (evolve → distill → ensign)
  4. Agent-native — designed for AI agents to use autonomously, not just humans
  5. Statistical grounding — every method has real math under the hood (Monte Carlo, Welford, tournament selection, etc.)
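
Welford's online algorithm, named under point 5 (and used by the self_supervised preset), updates a running mean and variance one sample at a time without storing the data. The textbook version, not the package's internal tracker:

```python
class Welford:
    """Online mean/variance: numerically stable single-pass updates."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Sample variance (n - 1 denominator); 0.0 until two samples arrive.
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0

w = Welford()
for x in [2.0, 4.0, 6.0]:
    w.update(x)
print(w.mean, w.variance)  # 4.0 4.0
```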

For Agents Following This Work

How To Add A New Preset

  1. Create src/presets/your_preset.py
  2. Inherit from RoomBase (from room_base import RoomBase)
  3. Implement: feed(), train_step(), predict(), export_model()
  4. Constructor: def __init__(self, room_id: str, **kwargs) must call super().__init__(room_id, **kwargs)
  5. Add to src/presets/__init__.py PRESET_MAP
  6. Add to src/room_presets.py PRESET_REGISTRY
  7. Test: room = YourPreset("test", ensign_dir="/tmp/e", buffer_dir="/tmp/b")

How Training Relates To Ensigns

  • plato-torch rooms accumulate experience as tiles
  • plato-ensign exports room wisdom as a portable ensign (LoRA/GGUF/Interpreter)
  • The ensign loads instantly in any agent — "walk into room → load ensign → instant instinct"
  • See docs/training-seed-synergy.md for the full alignment philosophy

Fleet Integration

  • Oracle1 (cloud): runs training rooms, coordinates fleet learning
  • Forgemaster (RTX 4050): trains LoRA adapters from accumulated tiles
  • JC1 (Jetson Orin): deploys ensigns for edge inference

Download files

Download the file for your platform.

Source Distribution

plato_torch-0.5.0a1.tar.gz (73.6 kB)


Built Distribution


plato_torch-0.5.0a1-py3-none-any.whl (90.3 kB)


File details

Details for the file plato_torch-0.5.0a1.tar.gz.

File metadata

  • Download URL: plato_torch-0.5.0a1.tar.gz
  • Upload date:
  • Size: 73.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for plato_torch-0.5.0a1.tar.gz
Algorithm    Hash digest
SHA256       706c1df375680ef3ae53e823e38c1ce94fddf899ee1d0dfbc1420e3a992010f5
MD5          06a61785a8b159ee764512ceaacd8581
BLAKE2b-256  eca53e5e05426d0fac9b2131ef772c07bf46fafbd65593b5a1f35f89234bc6c8


File details

Details for the file plato_torch-0.5.0a1-py3-none-any.whl.

File metadata

  • Download URL: plato_torch-0.5.0a1-py3-none-any.whl
  • Upload date:
  • Size: 90.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for plato_torch-0.5.0a1-py3-none-any.whl
Algorithm    Hash digest
SHA256       f9d2f3e587b65ed4030b869cc8352275edd472796aff97dd811e02df06951a46
MD5          9b06dbeb4d1a59f4501cf76c20783c8e
BLAKE2b-256  ed7dfa4ce207b41135c4dbe6f17cac48a76ffc046dd6f70dceeb2e475314499b

