Fuse
Train small LLMs and deploy them for fast structured extraction on CPU.
Fuse lets you pull any GGUF model from HuggingFace, run zero-shot structured extraction with dynamic schemas, fine-tune with LoRA via Unsloth/HuggingFace, and export to GGUF for fast CPU inference. No predefined Pydantic models required.
Install
With uv (recommended)
uv add fusellm
With training support:
uv add "fusellm[training]"
Run without installing
# One-shot extraction — no install needed
uvx fusellm extract "Sarah Chen is a 34-year-old architect at Stripe" \
--model bartowski/Llama-3.2-1B-Instruct-GGUF \
--fields "name:str,age:int,job_title:str"
# Or with a config file
uvx fusellm extract "SpaceX was founded in 2002" \
--config extract_company.yaml
With pip
pip install fusellm
pip install "fusellm[training]"
Quick Start
Pull a model from HuggingFace and extract
import fuse
# Auto-downloads the best Q4 GGUF from HuggingFace Hub
backend = fuse.LlamaCppBackend(model_name="bartowski/Llama-3.2-1B-Instruct-GGUF")
extractor = fuse.Extractor(backend)
# Zero-shot structured extraction — no Pydantic model needed
result = extractor.extract_from_fields(
    "Sarah Chen is a 34-year-old software architect at Stripe.",
    {"name": str, "age": int, "job_title": str, "company": str}
)
# {'name': 'Sarah Chen', 'age': 34, 'job_title': 'software architect', 'company': 'Stripe'}
Use a local GGUF model
backend = fuse.LlamaCppBackend(model_path="./models/llama-3.2-1b-q4.gguf")
extractor = fuse.Extractor(backend)
result = extractor.extract_from_fields(
    "John is 30 years old and knows Python and Rust",
    {"name": str, "age": int, "skills": list[str]}
)
# {'name': 'John', 'age': 30, 'skills': ['Python', 'Rust']}
Config-driven extraction
config = fuse.InferenceConfig(
    model_name="bartowski/Phi-4-mini-instruct-GGUF",
    n_ctx=4096,
    n_threads=8,
    temperature=0.0,
)
backend = fuse.LlamaCppBackend.from_config(config)
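A config-built backend plugs into the same Extractor API shown above; the text and output below are illustrative only:
extractor = fuse.Extractor(backend)
result = extractor.extract_from_fields(
    "Acme Corp reported $4.2M in revenue for Q3 2024.",
    {"company": str, "revenue": str, "quarter": str}
)
# e.g. {'company': 'Acme Corp', 'revenue': '$4.2M', 'quarter': 'Q3 2024'}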
Extract from a JSON schema
schema = fuse.SchemaBuilder.from_json_schema({
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "skills": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["name", "age"],
})
result = extractor.extract("John is 30 and knows Rust", schema)
Let the LLM infer the schema
result = extractor.extract_from_description(
    "The Series A raised $15M from Sequoia, following a $2.5M seed from YC.",
    "Extract monetary amounts, funding round type, and investor names"
)
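Because the model infers the field names itself, the exact keys vary by model and run; an illustrative (not guaranteed) result shape:
# Illustrative only; field names are chosen by the model and will differ:
# {'amounts': ['$15M', '$2.5M'], 'round_type': 'Series A', 'investors': ['Sequoia', 'YC']}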
CLI
Extract with a config file
fuse extract "Sarah Chen is a 34-year-old architect at Stripe" \
--config examples/extract_person.yaml
extract_person.yaml:
model:
  model_name: "bartowski/Llama-3.2-1B-Instruct-GGUF"
  n_ctx: 2048
  temperature: 0.0
fields:
  name: str
  age: int
  job_title: str
  company: str
prompt_format: llama
max_tokens: 256
Extract with inline flags
# HuggingFace model — auto-downloads
fuse extract "SpaceX was founded in 2002" \
--model bartowski/Phi-4-mini-instruct-GGUF \
--fields "company:str,year:int,industry:str"
# Local GGUF model
fuse extract "John is 30" \
--model ./model.gguf \
--fields "name:str,age:int"
# Using a JSON schema file
fuse extract "John is 30 and knows Python" \
--model bartowski/Llama-3.2-1B-Instruct-GGUF \
--schema schema.json
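The schema.json file mirrors the JSON Schema shape accepted by SchemaBuilder.from_json_schema in the Python API; a minimal sketch of writing one, assuming the CLI accepts the same format:
import json

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "skills": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["name", "age"],
}

# Write the schema to disk for use with `fuse extract --schema schema.json`
with open("schema.json", "w") as f:
    json.dump(schema, f, indent=2)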
Train
fuse train --config examples/train_extraction.yaml
Quantize to GGUF
fuse quantize --model ./output --output model.gguf --method q4_0
Supported Models
Any GGUF model on HuggingFace works. Some good small models for CPU extraction:
| Model | Size | HuggingFace Repo |
|---|---|---|
| Llama 3.2 1B Instruct | ~1GB Q4 | bartowski/Llama-3.2-1B-Instruct-GGUF |
| Llama 3.2 3B Instruct | ~2GB Q4 | bartowski/Llama-3.2-3B-Instruct-GGUF |
| Qwen 2.5 1.5B Instruct | ~1GB Q4 | bartowski/Qwen2.5-1.5B-Instruct-GGUF |
| Phi-4 Mini Instruct | ~2.5GB Q4 | bartowski/Phi-4-mini-instruct-GGUF |
Models are auto-downloaded and cached to ~/.cache/fuse/models/.
Development
uv sync --extra dev
uv run nox # All CI checks
uv run nox -s lint # Ruff lint + format
uv run nox -s typecheck # ty type check
uv run nox -s tests # Pytest across Python 3.11-3.13
License
MIT