coreason-model-foundry

Industrial Automation Engine for Training Specialized "Student Models"

License: Prosperity 3.0 | CI | Code Style: Ruff | Documentation

The coreason-model-foundry serves as the "Refinery" in the CoReason AI ecosystem. It is an orchestrator for post-training optimization, designed to select the right mathematical strategy (DoRA, ORPO, QLoRA) for the task, prune data for maximum information density, and distribute the resulting artifacts safely.

It implements a Select-Prune-Train-Merge-Distribute loop, using unsloth for accelerated training and mergekit for model merging.

Features

  • Polymorphic Training Architecture: Dynamically loads the training kernel based on the goal:
    • DoRA: Logic & Math (via UnslothSFTTrainer).
    • ORPO: Alignment & Safety (via UnslothORPOTrainer).
    • QLoRA: Memory Efficiency (via 4-bit quantization).
  • Data Curator: Maximizes "Information Density" by using Semantic Deduplication (SemDeDup) to remove near-duplicates with 95% or higher similarity.
  • Hardware Safety: A "Fail Fast" mechanism prevents OOM crashes by validating VRAM requirements up front (e.g., it enforces 24 GB for full ORPO).
  • The Alchemist (Merging): Integrates mergekit to combine adapters using the DARE-TIES algorithm.
  • Artifact Distribution: Automatically pushes trained models to the coreason-publisher registry.
  • GxP Compliance: Calculates provenance hashes (Lot Numbers) for datasets and manifests.
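
The Data Curator's 95% similarity cut-off can be illustrated with a small sketch. This is not the package's implementation: the function names `cosine_similarity` and `dedup_indices` are hypothetical, the embeddings are toy values, and the choice of cosine similarity as the metric is an assumption. The published SemDeDup method clusters embeddings first, whereas this sketch simply runs a greedy pairwise pass.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def dedup_indices(embeddings: Sequence[Sequence[float]], threshold: float = 0.95) -> List[int]:
    """Greedily keep a sample only if it is below `threshold` similarity
    to every sample that has already been kept."""
    kept: List[int] = []
    for i, emb in enumerate(embeddings):
        if all(cosine_similarity(emb, embeddings[j]) < threshold for j in kept):
            kept.append(i)
    return kept


# Toy embeddings (assumed values): the second vector is almost parallel
# to the first, the third is orthogonal to it.
embeddings = [
    [1.0, 0.0],
    [0.99, 0.05],
    [0.0, 1.0],
]
kept = dedup_indices(embeddings)
print(kept)  # [0, 2]
```

The greedy pass is quadratic in the number of kept samples, which is one reason a clustering step is normally used on real datasets.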

Installation

pip install -r requirements.txt

Note: This library relies on unsloth and torch (CUDA). Ensure that versions suitable for your hardware are installed in your environment.

Usage

Python API

from coreason_model_foundry import orchestrate_training

# Run the full training pipeline with a manifest file
orchestrate_training("manifest.yaml")
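
Since the feature list mentions provenance hashes (Lot Numbers) for datasets and manifests, the following sketch shows one way such a hash could be derived for a manifest that has already been parsed. It is an illustration under assumptions: the helper `lot_number` is hypothetical, and the source does not state which hash algorithm or serialization the package uses, so SHA-256 over canonical JSON is only a guess.

```python
import hashlib
import json
from typing import Any


def lot_number(payload: Any) -> str:
    """Return a SHA-256 hex digest of a canonical JSON encoding of `payload`.

    Sorting keys and fixing separators makes the digest independent of
    key order and whitespace in the original file.
    """
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two dicts with the same content but different key order (values taken
# from the example manifest in this README).
manifest_a = {"job_id": "train-prod-2025-01-15", "method_config": {"type": "orpo", "rank": 64}}
manifest_b = {"method_config": {"rank": 64, "type": "orpo"}, "job_id": "train-prod-2025-01-15"}

same = lot_number(manifest_a) == lot_number(manifest_b)
print(same)  # True: key order does not affect the digest
```

Hashing a canonical form rather than the raw file means that cosmetic edits (reordering keys, reformatting) do not produce a new lot number, while any change to a value does.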

Example Manifest

job_id: "train-prod-2025-01-15"
base_model: "unsloth/llama-3-8b-bnb-4bit"

method_config:
  type: "orpo"
  rank: 64
  strict_hardware_check: true

dataset:
  ref: "synthesis://batch_clinical_reasoning"
  sem_dedup: true

compute:
  batch_size: 4
  grad_accum: 4
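
To make the `compute` and `strict_hardware_check` fields concrete, here is a minimal sketch that mirrors the manifest above as a plain dict (so no YAML parser is needed). The names `MIN_VRAM_GB`, `InsufficientVRAMError`, `effective_batch_size`, and `check_hardware` are hypothetical and not part of the package's API. Only the 24 GB figure for ORPO comes from the feature list, and reading `grad_accum` as the number of gradient accumulation steps is an assumption.

```python
from typing import Any, Dict

# Hypothetical minimum-VRAM table in GB. Only the ORPO figure is taken
# from the README; other methods are left unconstrained in this sketch.
MIN_VRAM_GB: Dict[str, float] = {"orpo": 24.0}


class InsufficientVRAMError(RuntimeError):
    """Raised before training starts when the GPU is too small."""


def effective_batch_size(compute: Dict[str, int]) -> int:
    """Samples per optimizer step on one device, assuming `grad_accum`
    counts gradient accumulation steps."""
    return compute["batch_size"] * compute["grad_accum"]


def check_hardware(method_config: Dict[str, Any], available_vram_gb: float) -> None:
    """Fail fast if strict checking is enabled and VRAM is below the minimum."""
    if not method_config.get("strict_hardware_check", False):
        return
    required = MIN_VRAM_GB.get(method_config["type"])
    if required is not None and available_vram_gb < required:
        raise InsufficientVRAMError(
            f"{method_config['type']} needs {required} GB VRAM, "
            f"found {available_vram_gb} GB"
        )


manifest = {
    "method_config": {"type": "orpo", "rank": 64, "strict_hardware_check": True},
    "compute": {"batch_size": 4, "grad_accum": 4},
}

print(effective_batch_size(manifest["compute"]))  # 16

# The available VRAM is passed in explicitly here; with torch installed it
# could be read from torch.cuda.get_device_properties(0).total_memory (bytes).
check_hardware(manifest["method_config"], available_vram_gb=24.0)  # passes silently
```

Passing the available VRAM as an argument keeps the check testable on machines without a GPU.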

Download files

Source Distribution

coreason_model_foundry-0.2.0.tar.gz (22.9 kB)

Uploaded Source

Built Distribution

coreason_model_foundry-0.2.0-py3-none-any.whl (38.1 kB)

Uploaded Python 3

File details

Details for the file coreason_model_foundry-0.2.0.tar.gz.

File metadata

  • Download URL: coreason_model_foundry-0.2.0.tar.gz
  • Size: 22.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for coreason_model_foundry-0.2.0.tar.gz
Algorithm    Hash digest
SHA256       05eac1634a4bfd497a55c8ace614ff6e58960580cf74c741f27de647b91b2d0c
MD5          1d61952e69f6ba8162705cf546911dc4
BLAKE2b-256  ed97f29eb00986d3cb1d0ab6a45031a56b48b893e52db10121daaa593e94856c
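
A downloaded archive can be checked against the SHA256 digest listed above with the standard library alone. The sketch below is illustrative: the helper names `sha256_of_file` and `matches_digest` are hypothetical, and the demonstration hashes a temporary file rather than the real distribution.

```python
import hashlib
import hmac
import os
import tempfile


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: str, expected_hex: str) -> bool:
    """Compare the file's SHA-256 digest with a published hex digest."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())


# Demonstration on a temporary file with known content.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example payload")
    demo_path = tmp.name

expected = hashlib.sha256(b"example payload").hexdigest()
ok = matches_digest(demo_path, expected)
bad = matches_digest(demo_path, "0" * 64)
os.remove(demo_path)
print(ok, bad)  # True False
```

For the real file, one would call `matches_digest("coreason_model_foundry-0.2.0.tar.gz", "<SHA256 value from the table above>")` after downloading it.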

Provenance

The following attestation bundles were made for coreason_model_foundry-0.2.0.tar.gz:

Publisher: publish.yml on CoReason-AI/coreason-model-foundry

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file coreason_model_foundry-0.2.0-py3-none-any.whl.

File hashes

Hashes for coreason_model_foundry-0.2.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       342f800ee4a4b7b05fbeb954648edce971d848ae0d849b659b819afe40160195
MD5          1f2018dc419e870ebafeb2e62bda6579
BLAKE2b-256  4207d23e54e2b5c2b3a5e5b0855f27967151abd02bf0f9d444063955230bbf73

Provenance

The following attestation bundles were made for coreason_model_foundry-0.2.0-py3-none-any.whl:

Publisher: publish.yml on CoReason-AI/coreason-model-foundry
