coreason-model-foundry

Industrial Automation Engine for Training Specialized "Student Models"

License: Prosperity 3.0 | Code Style: Ruff

The coreason-model-foundry serves as the "Refinery" in the CoReason AI ecosystem. It is an orchestrator for post-training optimization, designed to select the right mathematical strategy (DoRA, ORPO, QLoRA) for the task, prune data for maximum information density, and distribute the resulting artifacts safely.

It implements a Select-Prune-Train-Merge-Distribute loop, using unsloth for accelerated training and mergekit for model merging.

Features

  • Polymorphic Training Architecture: Dynamically loads the training kernel based on the goal:
    • DoRA: Logic & Math (via UnslothSFTTrainer).
    • ORPO: Alignment & Safety (via UnslothORPOTrainer).
    • QLoRA: Memory Efficiency (via 4-bit quantization).
  • Data Curator: Maximizes "Information Density" using Semantic Deduplication (SemDeDup), which removes near-duplicate samples that are 95% or more similar to one already kept.
  • Hardware Safety: "Fail Fast" mechanism prevents OOM crashes by validating VRAM requirements (e.g., enforces 24GB for full ORPO).
  • The Alchemist (Merging): Integrates mergekit to combine adapters using the DARE-TIES algorithm.
  • Artifact Distribution: Automatically pushes trained models to the coreason-publisher registry.
  • GxP Compliance: Calculates provenance hashes (Lot Numbers) for datasets and manifests.
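The Data Curator's deduplication step can be illustrated with a small sketch. This is not the library's implementation: it is a simplified greedy variant that compares every sample against the ones already kept, whereas SemDeDup as usually described clusters the embeddings first. The function names, the toy embeddings, and the choice of cosine similarity as the metric are all assumptions made for illustration; only the 95% threshold comes from the feature list above.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_dedup(embeddings: Sequence[Sequence[float]], threshold: float = 0.95) -> List[int]:
    """Return indices of samples to keep (hypothetical helper).

    A sample is kept only if its similarity to every previously kept
    sample is below `threshold`. This greedy pass is O(n^2) and is meant
    to show the idea, not to scale.
    """
    kept: List[int] = []
    for idx, emb in enumerate(embeddings):
        if all(cosine_similarity(emb, embeddings[k]) < threshold for k in kept):
            kept.append(idx)
    return kept


# Toy embeddings (assumed); in practice these would come from a sentence encoder.
embeddings = [
    [1.0, 0.0, 0.0],
    [0.99, 0.05, 0.0],  # near-duplicate of sample 0
    [0.0, 1.0, 0.0],
    [0.0, 0.98, 0.1],   # near-duplicate of sample 2
]

kept_indices = semantic_dedup(embeddings)
print(kept_indices)  # [0, 2]
```

Keeping the first occurrence and dropping later near-duplicates makes the result depend on input order, so a real pipeline would likely want a deterministic ordering of the dataset before this step.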

Installation

pip install -r requirements.txt

Note: This library relies on unsloth and torch (CUDA). Ensure that the versions installed in your environment match your hardware (GPU and CUDA version).

Usage

Python API

from coreason_model_foundry import orchestrate_training

# Run the full training pipeline with a manifest file
orchestrate_training("manifest.yaml")

Example Manifest

job_id: "train-prod-2025-01-15"
base_model: "unsloth/llama-3-8b-bnb-4bit"

method_config:
  type: "orpo"
  rank: 64
  strict_hardware_check: true

dataset:
  ref: "synthesis://batch_clinical_reasoning"
  sem_dedup: true

compute:
  batch_size: 4
  grad_accum: 4
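To make the manifest fields concrete, here is a minimal sketch of how strict_hardware_check and the compute section might be interpreted. It is an illustration under stated assumptions, not the library's code: the helper names, the MIN_VRAM_GB table, and the exception class are hypothetical; batch_size is assumed to be the per-device batch size and grad_accum the number of gradient accumulation steps. The only figure taken from this README is the 24 GB requirement for ORPO.

```python
from typing import Any, Dict

# Hypothetical requirement table. Only the ORPO figure (24 GB) is stated
# in the README; methods without an entry are treated as unconstrained.
MIN_VRAM_GB: Dict[str, float] = {"orpo": 24.0}


class InsufficientVRAMError(RuntimeError):
    """Raised before training starts when the GPU is too small (hypothetical)."""


def effective_batch_size(compute: Dict[str, Any]) -> int:
    """Samples per optimizer step on one device, assuming
    batch_size is per-device and grad_accum counts accumulation steps."""
    return compute["batch_size"] * compute["grad_accum"]


def check_hardware(method_config: Dict[str, Any], available_vram_gb: float) -> None:
    """Fail fast if strict checking is enabled and VRAM is below the requirement."""
    if not method_config.get("strict_hardware_check", False):
        return
    required = MIN_VRAM_GB.get(method_config["type"])
    if required is not None and available_vram_gb < required:
        raise InsufficientVRAMError(
            f"{method_config['type']} needs {required:g} GB of VRAM, "
            f"but only {available_vram_gb:g} GB is available"
        )


# The example manifest above, written as a plain dict to avoid a YAML dependency.
manifest = {
    "job_id": "train-prod-2025-01-15",
    "base_model": "unsloth/llama-3-8b-bnb-4bit",
    "method_config": {"type": "orpo", "rank": 64, "strict_hardware_check": True},
    "dataset": {"ref": "synthesis://batch_clinical_reasoning", "sem_dedup": True},
    "compute": {"batch_size": 4, "grad_accum": 4},
}

print(effective_batch_size(manifest["compute"]))  # 16

check_hardware(manifest["method_config"], available_vram_gb=24.0)  # passes silently

try:
    check_hardware(manifest["method_config"], available_vram_gb=16.0)
except InsufficientVRAMError as exc:
    print(f"Aborting before training: {exc}")
```

The available VRAM is passed in as a parameter here so that the sketch runs without a GPU; in a real environment it could be read from the CUDA device properties exposed by torch.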
