
A configuration-driven LLM compression framework using low-rank factorization and layer removal


Goldcrest


An LLM compression framework focused on low-rank factorization (of matrices and tensors) that manages the whole experiment pipeline: compression, evaluation, profiling, and logging.

Features

| Category | Options |
| --- | --- |
| Compression algorithms | SVD, Tucker, CP (CANDECOMP/PARAFAC), Tensor Train, weight pruning (layer removal) |
| Rank selection criteria | Activation-informed algorithms such as ASVD and SVD-LLM, entropy measures, Fisher information, stable rank, and the metrics in `information_flow` (requires an external download) |
| Algebra backends | CoLA (Lanczos, LOBPCG), PyTorch, TensorLy |
| Target layers | MLP, attention layers (Q/K/V/O), embedding layers |
| Evaluation | lm-eval-harness integration, user-defined language-task evaluation metrics |
| Profiling | Memory (RSS, VMS, GPU allocated/reserved, peak), per-phase latency, subprocess isolation |
| Logging | CSV logs for experiments (models, evaluations, and compression runs) with cross-experiment comparison |

Quick Start

1. Install

pip install goldcrest

With optional dependencies:

pip install goldcrest[eval]    # lm-eval-harness integration
pip install goldcrest[cola]    # CoLA algebra backend

2. Configure

Create a YAML config file specifying your compression pipeline. For example:

model:
  name: Qwen/Qwen3-14B
  source: hf
  device: cuda

compression:
  svd:
    backend: cola
    cola:
      algorithm: lanczos

factorization:
  objects:
    - "model.layers[*].mlp.gate_proj"
    - "model.layers[*].mlp.up_proj"
    - "model.layers[*].self_attn.q_proj"
  func_name: svd
  rank: 64

evaluation:
  type: lm_eval         #   lm-eval-harness
  tasks: [arc_easy]

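The `objects` patterns use `[*]` as a wildcard over layer indices. As a rough sketch of how such patterns can be expanded against a model's module names (this is an illustrative guess at the semantics, not Goldcrest's matcher, and the module names below are hypothetical):

```python
import re

def expand_pattern(pattern, names):
    """Expand a config pattern like 'model.layers[*].mlp.gate_proj'
    against a list of module names, treating '[*]' as 'any index'.
    Illustrative only; Goldcrest's own matching logic may differ."""
    regex = re.escape(pattern).replace(re.escape("[*]"), r"\[\d+\]")
    rx = re.compile(regex + r"$")
    return [name for name in names if rx.match(name)]

# Hypothetical module names for a two-layer model.
names = [
    "model.layers[0].mlp.gate_proj",
    "model.layers[1].mlp.gate_proj",
    "model.layers[0].self_attn.q_proj",
    "model.layers[0].mlp.up_proj",
]

print(expand_pattern("model.layers[*].mlp.gate_proj", names))
# -> ['model.layers[0].mlp.gate_proj', 'model.layers[1].mlp.gate_proj']
```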
More example configurations are in config/.

3. Run

python scripts/examples/gpu/h100_activation_svd.py --config config/h100_activation_svd.yaml

Configuration Options

| Section | Key | Description |
| --- | --- | --- |
| model | name | HuggingFace model ID or local path |
| model | device | `cuda` or `cpu` |
| analysis | type | `activation_metrics`, `fisher_information`, `weight_metrics` |
| compression.svd | backend | `cola`, `torch`, `tensorly` |
| factorization | objects | Layer patterns to compress (supports wildcards) |
| factorization | func_name | `svd`, `tucker`, `cp`, `tensor_train` |
| evaluation | type | `lm_eval` for benchmark tasks |

More detailed documentation covers the configuration settings for each compression algorithm and the layer types that can be compressed.
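One of the rank-selection criteria listed above, stable rank, is cheap to state: `sr(W) = ||W||_F^2 / sigma_max(W)^2`, a robust lower bound on the matrix rank. The sketch below (illustrative only; Goldcrest's criteria may be computed differently) shows how it separates a near-low-rank matrix from a full-rank one:

```python
import numpy as np

def stable_rank(W):
    """Stable rank: squared Frobenius norm over squared spectral norm.
    Always between 1 and rank(W), so it never exceeds the true rank."""
    s = np.linalg.svd(W, compute_uv=False)
    return float((s ** 2).sum() / s[0] ** 2)

rng = np.random.default_rng(0)
near_lowrank = rng.standard_normal((128, 8)) @ rng.standard_normal((8, 128))
full = rng.standard_normal((128, 128))

print(f"rank-8 product:      {stable_rank(near_lowrank):.1f}")  # at most 8
print(f"full-rank Gaussian:  {stable_rank(full):.1f}")          # much larger
```

A layer with a low stable rank is a natural candidate for an aggressive factorization rank, while a layer whose stable rank is close to its dimension has little low-rank structure to exploit.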

Example Scripts

CPU Examples (scripts/examples/cpu/)

Lightweight scripts for development and testing.

| Script | Description |
| --- | --- |
| svd_gemma3.py | Single-pass SVD compression for Gemma-3 |
| svd_qwen3.py | Single-pass SVD compression for Qwen-3 |
| loop_svd_gemma3.py | Iterative SVD compression for Gemma-3 |
| loop_svd_qwen3.py | Iterative SVD compression for Qwen-3 |
| asvd_svdllm_pipeline.py | Combined ASVD and SVD-LLM pipeline |
| benchmark_reproducibility.py | Reproducibility benchmarks |
| perplexity_evaluation.py | Perplexity evaluation utilities |
| memory_profiling.py | Memory usage profiling |
| profile_compressed.py | Profile compressed model performance |
| comparison_baseline_compressed.py | Compare baseline vs compressed models |

GPU Examples (scripts/examples/gpu/)

GPU scripts for H100 and H200 workloads:

| Script | Description |
| --- | --- |
| h100_activation_svd.py | Activation-guided SVD compression (H100) |
| h100_entropy_svd.py | Entropy-based rank selection (H100) |
| h100_hybrid_mi.py | Mutual information hybrid compression (H100) |
| h100_full_pipeline.py | Complete analysis + compression + evaluation (H100) |
| h100_tucker.py | Tucker decomposition compression (H100) |
| h100_cp.py | CP (CANDECOMP/PARAFAC) decomposition (H100) |
| h100_tensor_train.py | Tensor-Train decomposition (H100) |
| h100_weight_pruning.py | Weight pruning compression (H100) |
| h200_activation_svd.py | Activation-guided SVD compression (H200) |
| h200_entropy_svd.py | Entropy-based rank selection (H200) |
| h200_hybrid_mi.py | Mutual information hybrid compression (H200) |
| h200_full_pipeline.py | Complete analysis + compression + evaluation (H200) |
| h200_tucker.py | Tucker decomposition compression (H200) |
| h200_cp.py | CP (CANDECOMP/PARAFAC) decomposition (H200) |
| h200_tensor_train.py | Tensor-Train decomposition (H200) |
| h200_weight_pruning.py | Weight pruning compression (H200) |
| asvd_svdllm_pipeline.py | Combined ASVD and SVD-LLM pipeline |
| benchmark_reproducibility.py | Reproducibility benchmarks |
| perplexity_evaluation.py | Perplexity evaluation utilities |
| memory_profiling.py | Memory usage profiling |
| profile_compressed.py | Profile compressed model performance |
| comparison_baseline_compressed.py | Compare baseline vs compressed models |

Third-Party Integrations (scripts/examples/third-party/)

| Script | Description |
| --- | --- |
| cola_vs_torch_benchmark.py | CoLA vs PyTorch SVD performance comparison |
| lanczos_truncated_svd.py | Lanczos algorithm for truncated SVD |
| lobpcg_ill_conditioned.py | LOBPCG for ill-conditioned matrices |
| info_flow_metrics_loading.py | Information flow metric computation |
| layer_sensitivity_correlation.py | Layer sensitivity analysis |
| metric_guided_compression.py | Metric-driven compression selection |
| mi_hybrid_compression.py | Mutual information hybrid approach |
| third_party_full_pipeline.py | Full pipeline with third-party backends |

Project Structure

goldcrest/
├── config/                # Experiment configs plus base/ and profiles/
│   ├── base/
│   └── profiles/
├── docs/                  # Documentation and changelogs
├── logs/                  # Runtime logs
├── scripts/
│   ├── bash/              # Shell scripts (cpu/, gpu/, third-party/)
│   ├── examples/          # Python examples (cpu/, gpu/, third-party/)
│   ├── logs/              # Script execution logs
│   └── utils/             # Utility scripts
├── tests/                 # Pytest regression and integration suite
└── goldcrest/
    ├── config/            # Config loader
    ├── framework/         # Core framework (layers, state, IO)
    ├── orchestration/     # Pipeline orchestration
    └── plugins/
        ├── analysis/      # Metrics and layer selection
        ├── compression/   # SVD, Tucker, CP, pruning
        ├── evaluation/    # lm-eval integration
        └── models/        # Model utilities

Documentation

  • Architecture — Plugin-based design, EventBus, and workflow system
  • Changelogs — Notes on recent fixes
  • Successful Runs — H200 GPU benchmark results (7 compression strategies, third-party tests)

Download files

Download the file for your platform.

Source Distribution

goldcrest-0.1.0.tar.gz (164.5 kB)


Built Distribution


goldcrest-0.1.0-py3-none-any.whl (193.8 kB)


File details

Details for the file goldcrest-0.1.0.tar.gz.

File metadata

  • Download URL: goldcrest-0.1.0.tar.gz
  • Upload date:
  • Size: 164.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for goldcrest-0.1.0.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `9741272eb65e87bee13cbe3aed1d1f0def971c6f381c3286bdb9bdc1bd2a665b` |
| MD5 | `1bfe4811eb6dfb3e25dffe7a6c545410` |
| BLAKE2b-256 | `8d5eeb39e4e8ed53205f2cb9a0b736e3b9bf4478ed74126ff2e80b0caf05f6a1` |
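To verify a downloaded archive against the SHA256 digest above, a small streaming helper with the standard library's `hashlib` suffices (the assertion on the real file is left commented out, since it requires downloading the sdist first):

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, streaming in chunks
    so large archives never need to fit in memory at once."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Published digest for goldcrest-0.1.0.tar.gz (from the table above):
expected = "9741272eb65e87bee13cbe3aed1d1f0def971c6f381c3286bdb9bdc1bd2a665b"
# After downloading the sdist, verify it:
# assert sha256_of("goldcrest-0.1.0.tar.gz") == expected, "hash mismatch!"
```

Alternatively, pinning the hash in a requirements file (`goldcrest==0.1.0 --hash=sha256:9741...`) lets pip perform the same check automatically.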


File details

Details for the file goldcrest-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: goldcrest-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 193.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for goldcrest-0.1.0-py3-none-any.whl

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `1801ca784fad826d255657aa6dd3404e2530fadee09caccb2543e5c6c3f07fe0` |
| MD5 | `4c5aa4d02460c291cac31378fdfd6a16` |
| BLAKE2b-256 | `f35ae9c12c5c962c209ce42e48013b4b0871c6431906de217e6de9b1518a1394` |

