Remote candidate training worker for the JuniperCascor neural network service

Juniper: Dynamic Neural Network Research Platform

Juniper is an AI/ML research platform for investigating dynamic neural network architectures and novel learning paradigms. The project emphasizes ground-up implementations from primary literature, enabling a more transparent exploration of fundamental algorithms.

Juniper Cascor Worker

juniper-cascor-worker is the distributed candidate-training worker of the Juniper platform. A worker process connects outbound to a running juniper-cascor instance on its /ws/v1/workers WebSocket endpoint, receives candidate-unit training tasks from the cascor service, and returns trained candidates so that the cascor service can select the next unit to recruit. The package supports two operating modes — a default WebSocket mode (CascorWorkerAgent) and a deprecated legacy mode (CandidateTrainingWorker over multiprocessing.managers, retained for transitional deployments). The worker is managed by juniper-cascor rather than imported by it: there is no code-import dependency between the two repositories, only a wire-protocol contract.

Distribution

juniper-cascor-worker is published on PyPI as juniper-cascor-worker. The package is also surfaced through the platform meta-distribution juniper-ml, which installs the full client stack via pip install juniper-ml[all].

pip install juniper-cascor-worker

Ecosystem Compatibility

This package is part of the Juniper ecosystem. Verified compatible versions:

| juniper-data | juniper-cascor | juniper-canopy | data-client | cascor-client | cascor-worker |
|---|---|---|---|---|---|
| 0.6.x | 0.4.x | 0.4.x | >=0.4.1 | >=0.4.0 | >=0.3.0 |

For full-stack Docker deployment and integration tests, see juniper-deploy.

Architecture

juniper-cascor-worker is a long-running client process that connects outbound to juniper-cascor's worker WebSocket. The worker holds the candidate-training computation; the cascor service holds the scheduling, candidate selection, and network-growth logic.

┌─────────────────────────┐                  ┌──────────────────────┐
│  juniper-cascor-worker  │ ◄── X-API-Key ──►│   juniper-cascor     │
│  CascorWorkerAgent      │   over WSS/WS    │   Training Svc       │
│  (this package)         │ ──────────────►  │   /ws/v1/workers     │
│                         │   tensor frames  │   Port 8200          │
└─────────────────────────┘                  └──────────────────────┘

The worker authenticates with the X-API-Key header on connection, exchanges structured JSON control frames plus binary tensor frames for candidate-unit training, and reports task progress through periodic heartbeats. Multiple worker instances may be connected concurrently; juniper-cascor distributes candidate-pool training across them.
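The split between JSON control frames and binary tensor frames can be sketched as below. This is a minimal illustration, not the actual juniper-cascor-protocol wire format: the message fields (`type`, `task_id`, `progress`), the helper names, and the raw little-endian float32 payload layout are all assumptions made for the example.

```python
import json
import struct
from typing import Union


def make_heartbeat(task_id: str, progress: float) -> str:
    """Build a JSON control frame (hypothetical shape, not the real protocol)."""
    return json.dumps({"type": "heartbeat", "task_id": task_id, "progress": progress})


def pack_tensor(values: list[float]) -> bytes:
    """Pack floats as little-endian float32 (assumed layout, for illustration only)."""
    return struct.pack(f"<{len(values)}f", *values)


def dispatch(frame: Union[str, bytes]) -> tuple[str, object]:
    """Route a WebSocket frame by kind: text frames are JSON control, binary frames are tensors."""
    if isinstance(frame, bytes):
        if len(frame) % 4 != 0:
            raise ValueError("binary frame is not a whole number of float32 values")
        count = len(frame) // 4
        return "tensor", list(struct.unpack(f"<{count}f", frame))
    return "control", json.loads(frame)
```

Dispatching on the Python type mirrors how WebSocket client libraries commonly surface text frames as `str` and binary frames as `bytes`, so the control path and the tensor path never need to inspect each other's payloads.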

Related Services

| Service | Relationship | Notes |
|---|---|---|
| juniper-cascor | Worker's upstream service; manages candidate-pool scheduling | Default URL ws://juniper-cascor:8200/ws/v1/workers |
| juniper-deploy | Provides the orchestrated juniper-cascor-worker Docker service | See juniper-deploy/docker-compose.yml |

Service Configuration

Environment variables are read by juniper_cascor_worker/config.py and are grouped by mode. The default WebSocket mode reads the WebSocket variables and --legacy mode reads the legacy variables; neither mode reads the other's group. The shared variables (logging, health-probe surface) apply to both modes.

CFG-06 (>= 0.4.0): canonical env-var names are JUNIPER_CASCOR_WORKER_*. Legacy CASCOR_* (and CASCOR_WORKER_*) names still work but emit a DeprecationWarning per process. The full legacy → canonical mapping lives in AGENTS.md § Legacy env-var names.
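The canonical-first lookup with a deprecated fallback can be sketched as below. This is an illustration only: the helper `read_setting` is hypothetical, and it assumes every legacy name is a plain prefix swap, whereas the authoritative mapping is the one in AGENTS.md.

```python
import os
import warnings
from typing import Mapping, Optional

CANONICAL_PREFIX = "JUNIPER_CASCOR_WORKER_"
# Assumption: legacy names differ from canonical names only by prefix.
LEGACY_PREFIXES = ("CASCOR_WORKER_", "CASCOR_")


def read_setting(suffix: str, env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the canonical variable if set, else a legacy one with a DeprecationWarning."""
    env = os.environ if env is None else env
    canonical = CANONICAL_PREFIX + suffix
    if canonical in env:
        return env[canonical]
    for prefix in LEGACY_PREFIXES:
        legacy = prefix + suffix
        if legacy in env:
            warnings.warn(
                f"{legacy} is deprecated; use {canonical}",
                DeprecationWarning,
                stacklevel=2,
            )
            return env[legacy]
    return None
```

Checking the canonical name first means a deployment that sets both names gets the new value silently, which makes a gradual migration safe.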

WebSocket mode

| Variable | Required | Default | Description |
|---|---|---|---|
| JUNIPER_CASCOR_WORKER_SERVER_URL | Yes | | Worker endpoint URL (ws:// or wss://) |
| JUNIPER_CASCOR_WORKER_AUTH_TOKEN | No | empty | Token sent as the X-API-Key header |
| JUNIPER_CASCOR_WORKER_HEARTBEAT_INTERVAL | No | 10.0 | Seconds between heartbeat messages |
| JUNIPER_CASCOR_WORKER_TASK_TIMEOUT | No | 3600.0 | Maximum seconds for a single training task |
| JUNIPER_CASCOR_WORKER_TLS_CERT | No | unset | Client certificate path for mTLS |
| JUNIPER_CASCOR_WORKER_TLS_KEY | No | unset | Client private key path for mTLS |
| JUNIPER_CASCOR_WORKER_TLS_CA | No | unset | Custom CA bundle for TLS verification |
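As a rough sketch of how these variables could be turned into runtime settings and an mTLS-capable TLS context, consider the following. The `WebSocketSettings` container and both helper functions are hypothetical and are not the package's `WorkerConfig`; only the variable names and defaults come from the table above.

```python
import ssl
from dataclasses import dataclass
from typing import Mapping, Optional

PREFIX = "JUNIPER_CASCOR_WORKER_"


@dataclass(frozen=True)
class WebSocketSettings:  # hypothetical container, not the package's WorkerConfig
    server_url: str
    auth_token: str = ""
    heartbeat_interval: float = 10.0
    task_timeout: float = 3600.0
    tls_cert: Optional[str] = None
    tls_key: Optional[str] = None
    tls_ca: Optional[str] = None


def load_settings(env: Mapping[str, str]) -> WebSocketSettings:
    """Read the WebSocket-mode variables, applying the documented defaults."""
    url = env.get(PREFIX + "SERVER_URL")
    if not url:
        raise ValueError(PREFIX + "SERVER_URL is required")
    if not url.startswith(("ws://", "wss://")):
        raise ValueError("server URL must start with ws:// or wss://")
    return WebSocketSettings(
        server_url=url,
        auth_token=env.get(PREFIX + "AUTH_TOKEN", ""),
        heartbeat_interval=float(env.get(PREFIX + "HEARTBEAT_INTERVAL", "10.0")),
        task_timeout=float(env.get(PREFIX + "TASK_TIMEOUT", "3600.0")),
        tls_cert=env.get(PREFIX + "TLS_CERT"),
        tls_key=env.get(PREFIX + "TLS_KEY"),
        tls_ca=env.get(PREFIX + "TLS_CA"),
    )


def build_ssl_context(settings: WebSocketSettings) -> Optional[ssl.SSLContext]:
    """Return a TLS context for wss:// URLs, or None for plain ws://."""
    if not settings.server_url.startswith("wss://"):
        return None
    # A custom CA bundle replaces the system trust store when provided.
    context = ssl.create_default_context(cafile=settings.tls_ca)
    if settings.tls_cert:
        # Presenting a client certificate is what makes the connection mutual TLS.
        context.load_cert_chain(certfile=settings.tls_cert, keyfile=settings.tls_key)
    return context
```

Taking the environment as a mapping argument rather than reading `os.environ` directly keeps the loader easy to test; in real use it would be called as `load_settings(os.environ)`.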

Legacy mode (--legacy)

| Variable | Required | Default | Description |
|---|---|---|---|
| JUNIPER_CASCOR_WORKER_MANAGER_HOST | No | 127.0.0.1 | Manager hostname |
| JUNIPER_CASCOR_WORKER_MANAGER_PORT | No | 50000 | Manager port |
| JUNIPER_CASCOR_WORKER_AUTHKEY | Yes | | Manager authentication key |
| JUNIPER_CASCOR_WORKER_NUM_WORKERS | No | 1 | Worker process count |
| JUNIPER_CASCOR_WORKER_MP_CONTEXT | No | forkserver | Multiprocessing context (forkserver, spawn, fork) |
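For orientation, the legacy settings map onto the standard library's `multiprocessing.managers` roughly as sketched here. `WorkerManager`, `make_manager`, and `worker_context` are hypothetical names; the real CandidateTrainingWorker registers whatever shared objects the cascor manager exposes, which this sketch does not attempt to reproduce.

```python
import multiprocessing
from multiprocessing.managers import BaseManager


class WorkerManager(BaseManager):
    """Hypothetical client-side manager; shared-object registration is omitted."""


def make_manager(host: str = "127.0.0.1", port: int = 50000, authkey: str = "") -> WorkerManager:
    """Build a manager client from the host, port and authkey settings."""
    if not authkey:
        raise ValueError("an authkey is required in legacy mode")
    # multiprocessing expects the authentication key as bytes.
    return WorkerManager(address=(host, port), authkey=authkey.encode("utf-8"))


def worker_context(method: str = "forkserver"):
    """Return the multiprocessing context used to start worker processes."""
    return multiprocessing.get_context(method)
```

Calling `make_manager(...).connect()` would attach to an already-running manager at that address; the sketch stops before that step because it needs a live server.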

Shared

| Variable | Required | Default | Description |
|---|---|---|---|
| CASCOR_WORKER_HEALTH_PORT | No | 8210 | Health-check HTTP server port (R1.3 health-probe surface) |
| CASCOR_WORKER_HEALTH_BIND | No | 127.0.0.1 | Health-check bind address; set to 0.0.0.0 when running under Kubernetes |
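To make the bind-address and port settings concrete, here is a minimal sketch of a health-check HTTP server running on its own daemon thread. It is not the worker's actual health server: the `/health` path, the JSON body, and the function names are assumptions chosen for the example.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path == "/health":  # hypothetical path, not confirmed by the package
            body = b'{"status": "ok"}'
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format: str, *args: object) -> None:
        pass  # keep probe traffic out of the worker log


def start_health_server(bind: str = "127.0.0.1", port: int = 8210) -> ThreadingHTTPServer:
    """Serve health checks on a daemon thread and return the server handle."""
    server = ThreadingHTTPServer((bind, port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Binding to 127.0.0.1 keeps the probe reachable only from the same host; a Kubernetes probe arrives over the pod network, which is why the table recommends 0.0.0.0 there.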

Docker Deployment

# Full stack (recommended) — see juniper-deploy:
git clone https://github.com/pcalnon/juniper-deploy.git  # (private repository)
cd juniper-deploy && docker compose up --build

# Standalone:
docker build -t juniper-cascor-worker:latest .
docker run --rm \
  -e JUNIPER_CASCOR_WORKER_SERVER_URL=ws://<cascor-host>:8200/ws/v1/workers \
  -e JUNIPER_CASCOR_WORKER_AUTH_TOKEN=<worker-token> \
  juniper-cascor-worker:latest

The Dockerfile defaults to juniper-cascor-worker --server-url ws://juniper-cascor:8200/ws/v1/workers, which resolves the cascor service by name on the juniper-deploy Docker network. Container liveness is probed by kill -0 1 (PID-1 liveness) rather than an HTTP endpoint to avoid PyTorch initialization races on the dedicated health-server thread.
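The `kill -0 1` probe sends signal 0, which delivers nothing and only reports whether the target process exists. The same check can be expressed in Python; the helper name `pid_alive` is made up for this illustration and the function is POSIX-only.

```python
import os


def pid_alive(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0: existence and permission check, nothing is delivered
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

Inside the container, `pid_alive(1)` corresponds to the probe described above: it succeeds as long as the worker running as PID 1 has not exited, and it never touches the HTTP health server.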

Dependency Lockfile

Two lockfiles ship with this package. Both are regenerated with uv pip compile, using the invocations shown after the table.

| File | Purpose |
|---|---|
| requirements.lock | Default lockfile; pins the full GPU-capable dependency surface (includes CUDA-enabled PyTorch wheels) for non-Docker developer installs |
| requirements-cpu.lock | CPU-only lockfile (Phase 4E, CW-02); used by the Dockerfile to keep the runtime image slim by excluding the ~2–4 GB NVIDIA/CUDA transitive stack |

Regenerate the default lock:

uv pip compile pyproject.toml --no-emit-package torch -o requirements.lock

Regenerate the CPU-only lock (PyTorch installed separately from the official PyTorch CPU index in the Dockerfile):

echo "torch==2.9.1+cpu" > /tmp/torch-cpu-override
uv pip compile pyproject.toml \
  --constraint /tmp/torch-cpu-override \
  --extra-index-url https://download.pytorch.org/whl/cpu \
  --index-strategy unsafe-best-match \
  --no-emit-package torch \
  -o requirements-cpu.lock

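The ecosystem-wide lockfile-freshness gate enforces regeneration on every PR that touches pyproject.toml. Writing the compiled output to a temporary path under /tmp and then moving it into place with mv avoids the self-pin trap, in which uv pip compile -o <file> reads the existing file and carries its pins forward.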
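The compile-to-a-temporary-path-then-move idea can be sketched in Python as follows. The function names are hypothetical, and the `uv` arguments are copied from the default-lock command shown earlier, with only the output path changed.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import Callable


def regenerate_lock(target: Path, compile_to: Callable[[Path], None]) -> None:
    """Compile into a scratch file, then move it over the real lockfile."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        tmp_lock = Path(tmp_dir) / target.name
        compile_to(tmp_lock)  # the compiler never sees the existing `target`
        shutil.move(str(tmp_lock), str(target))  # the "mv" step


def uv_compile_default(output: Path) -> None:
    """Run the default-lock compile, writing to `output` (requires uv on PATH)."""
    subprocess.run(
        ["uv", "pip", "compile", "pyproject.toml",
         "--no-emit-package", "torch", "-o", str(output)],
        check=True,
    )
```

Usage would be `regenerate_lock(Path("requirements.lock"), uv_compile_default)` from the repository root. `shutil.move` is used rather than `os.replace` because /tmp and the checkout may sit on different filesystems.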

Active Research Components

juniper-cascor-worker contributes the distributed candidate-pool training research component to the Juniper platform: a wire-protocol-defined parallelisation of Cascade-Correlation's candidate-unit selection step across an arbitrary number of worker hosts, coordinated by juniper-cascor over a WebSocket worker protocol (/ws/v1/workers) with mTLS support, structured heartbeats, and reassignment of tasks from workers that have exceeded the heartbeat timeout. The protocol itself — defined by juniper-cascor-protocol — is the research artifact; this package is its reference implementation on the worker side.
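The reassignment rule mentioned above, under which tasks held by a worker that has missed its heartbeat deadline go back to the pool, can be illustrated with a small bookkeeping class. This is a conceptual sketch of the coordinator side, not code from juniper-cascor: the class name, its methods, and the timeout value used in the example are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatTracker:
    """Tracks the last heartbeat per worker and which worker holds each task."""
    timeout: float
    last_seen: dict[str, float] = field(default_factory=dict)
    assignments: dict[str, str] = field(default_factory=dict)  # task_id -> worker_id

    def heartbeat(self, worker_id: str, now: float) -> None:
        self.last_seen[worker_id] = now

    def assign(self, task_id: str, worker_id: str) -> None:
        self.assignments[task_id] = worker_id

    def reclaim_expired(self, now: float) -> list[str]:
        """Drop workers past the timeout and return their task ids for reassignment."""
        expired = {w for w, seen in self.last_seen.items() if now - seen > self.timeout}
        reclaimed = sorted(t for t, w in self.assignments.items() if w in expired)
        for task_id in reclaimed:
            del self.assignments[task_id]
        for worker_id in expired:
            del self.last_seen[worker_id]
        return reclaimed
```

Passing `now` explicitly instead of calling a clock inside the class keeps the logic deterministic; a real coordinator would feed it `time.monotonic()` from its scheduling loop.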

Quick Start Guide

Prerequisites

  • Python ≥ 3.12
  • A running juniper-cascor instance reachable at the URL passed via --server-url or JUNIPER_CASCOR_WORKER_SERVER_URL
  • A worker auth token issued by juniper-cascor (JUNIPER_CASCOR_API_KEYS); the same token is passed to the worker via --auth-token or JUNIPER_CASCOR_WORKER_AUTH_TOKEN
  • The JuniperCascor source code importable on the worker machine — the worker runs CasCor's candidate-training code locally rather than depending on a published CasCor library

Installation

pip install juniper-cascor-worker

Verification — WebSocket mode

juniper-cascor-worker \
  --server-url ws://<cascor-host>:8200/ws/v1/workers \
  --auth-token <worker-token>

A successful start logs Connected to ws://<cascor-host>:8200/ws/v1/workers. Optional settings (heartbeat interval, mTLS, task timeout) are documented under Service Configuration. The worker can also be embedded in Python:

import asyncio
from juniper_cascor_worker import CascorWorkerAgent, WorkerConfig

config = WorkerConfig(
    server_url="ws://<cascor-host>:8200/ws/v1/workers",
    auth_token="<worker-token>",
)

agent = CascorWorkerAgent(config)
asyncio.run(agent.run())

Verification — Legacy mode

Legacy mode is retained only for transitional deployments and is deprecated. New deployments should use WebSocket mode.

juniper-cascor-worker --legacy \
  --manager-host <manager-host> \
  --manager-port 50000 \
  --authkey <legacy-authkey> \
  --workers 4

Next Steps

Research Philosophy

The Juniper platform exists to study learning algorithms whose network architecture is not fixed in advance. Its initial anchor is the Cascade-Correlation algorithm of Fahlman and Lebiere (1990), implemented from the primary literature without recourse to higher-level abstractions that elide the algorithm's operational detail. The organising commitment is that algorithm implementations remain inspectable at the level at which they were originally specified: candidate units, correlation objectives, weight-freezing semantics, and the structural events that grow the network are first-class artifacts of the codebase rather than internal details of a library wrapper. This permits comparative work — across algorithms, datasets, and hyperparameter regimes — to be conducted on a known and reproducible substrate.

The current platform comprises a Cascade-Correlation training service exposing a REST and WebSocket interface, a dataset-generation service with a named-version registry that includes the ARC-AGI families, a real-time monitoring dashboard for inspecting training dynamics as they occur, and a distributed worker that parallelises candidate-unit training across hosts. Near-term work extends the architectural-growth catalogue beyond Cascade-Correlation, introduces multi-network orchestration for comparative experiments at the level of network populations rather than individual runs, and tightens the dataset–training–monitoring loop into a reproducible research workbench. The longer-term direction is the systematic empirical study of constructive and architecture-growing learning algorithms, with first-class infrastructure for the ablation, comparison, and replication that such a study requires.

Documentation

| Document | Purpose |
|---|---|
| docs/DOCUMENTATION_OVERVIEW.md | Navigation index for all juniper-cascor-worker documentation |
| docs/QUICK_START.md | Complete installation and verification guide |
| docs/REFERENCE.md | Full configuration, CLI, and environment-variable reference |
| docs/DEVELOPER_CHEATSHEET.md | Quick-reference card for development tasks |
| CHANGELOG.md | Version history |

License

MIT License — see LICENSE for details.
