coreason-model-foundry

Industrial Automation Engine for Training Specialized "Student Models"

License: Prosperity 3.0. Code style: Ruff.

coreason-model-foundry is the "Refinery" of the CoReason AI ecosystem. It is an orchestrator for post-training optimization that selects the training strategy suited to the task (DoRA, ORPO, or QLoRA), prunes data for maximum information density, and distributes the resulting artifacts safely.

It implements a Select-Prune-Train-Merge-Distribute loop, using unsloth for accelerated training and mergekit for model merging.

Features

  • Polymorphic Training Architecture: Dynamically loads the training kernel based on the goal:
    • DoRA: Logic & Math (via UnslothSFTTrainer).
    • ORPO: Alignment & Safety (via UnslothORPOTrainer).
    • QLoRA: Memory Efficiency (via 4-bit quantization).
  • Data Curator: Maximizes "Information Density" by using Semantic Deduplication (SemDeDup) to remove near-duplicate samples whose similarity is 95% or higher.
  • Hardware Safety: A "Fail Fast" mechanism prevents OOM crashes by validating VRAM requirements before training starts (for example, it enforces 24 GB for full ORPO).
  • The Alchemist (Merging): Integrates mergekit to combine adapters using the DARE-TIES algorithm.
  • Artifact Distribution: Automatically pushes trained models to the coreason-publisher registry.
  • GxP Compliance: Calculates provenance hashes (Lot Numbers) for datasets and manifests.
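
To make the Data Curator's 95% threshold concrete, here is a minimal sketch of similarity-based deduplication. It is not the package's implementation: the function names are hypothetical, the embeddings are assumed to be precomputed vectors, and the greedy pairwise comparison skips the clustering step that makes SemDeDup scale to large datasets.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_dedup(embeddings, threshold=0.95):
    """Return the indices of samples to keep.

    A sample is dropped when its similarity to any already-kept sample
    is at or above `threshold` (hypothetical helper, O(n^2) comparisons).
    """
    kept = []
    for i, vec in enumerate(embeddings):
        if all(cosine_similarity(vec, embeddings[j]) < threshold for j in kept):
            kept.append(i)
    return kept


# Toy embeddings: the second vector is almost identical to the first.
samples = [[1.0, 0.0], [0.999, 0.01], [0.0, 1.0]]
print(semantic_dedup(samples))  # [0, 2]
```

The first occurrence of each near-duplicate group is the one that survives, so the order of the input determines which sample is kept.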

Installation

pip install -r requirements.txt

Note: This library relies on unsloth and torch (CUDA). Make sure the builds installed in your environment match your hardware.
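
Before launching a job, it can help to confirm that torch sees a CUDA device with enough memory. The sketch below illustrates a fail-fast check in that spirit. The helper names, the exception class, and the requirements table are assumptions; only the 24 GB figure for full ORPO comes from this README.

```python
import importlib.util

# Assumed requirements table; 24 GB for full ORPO is the only documented value.
REQUIRED_VRAM_GB = {"orpo": 24.0}


class InsufficientVRAMError(RuntimeError):
    """Raised when the detected GPU memory is below the requirement."""


def detect_vram_gb():
    """Return total memory of GPU 0 in GiB, or None if torch or CUDA is unavailable."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch

    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory / 1024**3


def check_vram(method, available_gb, strict=True):
    """Fail fast if `available_gb` is below the requirement for `method`."""
    required = REQUIRED_VRAM_GB.get(method)
    if required is None:
        return True  # no known requirement for this method
    if available_gb is None or available_gb < required:
        if strict:
            raise InsufficientVRAMError(
                f"{method} needs {required} GB of VRAM, found {available_gb}"
            )
        return False
    return True


if __name__ == "__main__":
    # Report only; does not raise when no GPU is present.
    print(check_vram("orpo", detect_vram_gb(), strict=False))
```

Drivers usually report slightly less memory than a card's nominal size, so a real check would probably need a small tolerance.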

Usage

Python API

from coreason_model_foundry import orchestrate_training

# Run the full training pipeline with a manifest file
orchestrate_training("manifest.yaml")

Example Manifest

job_id: "train-prod-2025-01-15"
base_model: "unsloth/llama-3-8b-bnb-4bit"

method_config:
  type: "orpo"
  rank: 64
  strict_hardware_check: true

dataset:
  ref: "synthesis://batch_clinical_reasoning"
  sem_dedup: true

compute:
  batch_size: 4
  grad_accum: 4
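
Two quantities can be derived from this manifest. The sketch below computes the effective batch size implied by the compute section, and a content hash of the kind a provenance "Lot Number" might be based on. The manifest is written as a Python dict to avoid a YAML dependency. The function names and the hashing scheme (SHA-256 over canonical JSON) are assumptions, since the README does not specify how the package calculates its hashes.

```python
import hashlib
import json

# The example manifest above, expressed as a Python dict.
manifest = {
    "job_id": "train-prod-2025-01-15",
    "base_model": "unsloth/llama-3-8b-bnb-4bit",
    "method_config": {"type": "orpo", "rank": 64, "strict_hardware_check": True},
    "dataset": {"ref": "synthesis://batch_clinical_reasoning", "sem_dedup": True},
    "compute": {"batch_size": 4, "grad_accum": 4},
}


def effective_batch_size(compute):
    """Samples per optimizer step: per-step batch times accumulation steps."""
    return compute["batch_size"] * compute["grad_accum"]


def lot_number(obj):
    """Hypothetical provenance hash: SHA-256 of a canonical JSON encoding."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


print(effective_batch_size(manifest["compute"]))  # 16
print(lot_number(manifest))
```

Sorting the keys before hashing means two manifests with the same content but different key order produce the same digest, which a provenance identifier generally needs.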
