
Fuse

Train small LLMs and deploy them for fast structured extraction on CPU.

Fuse lets you pull any GGUF model from HuggingFace, run zero-shot structured extraction with dynamic schemas, fine-tune with LoRA via Unsloth/HuggingFace, and export to GGUF for fast CPU inference. No predefined Pydantic models required.

Install

With uv (recommended)

uv add fusellm

With training support:

uv add "fusellm[training]"

Run without installing

# One-shot extraction — no install needed
uvx fusellm extract "Sarah Chen is a 34-year-old architect at Stripe" \
  --model bartowski/Llama-3.2-1B-Instruct-GGUF \
  --fields "name:str,age:int,job_title:str"

# Or with a config file
uvx fusellm extract "SpaceX was founded in 2002" \
  --config extract_company.yaml

With pip

pip install fusellm
pip install "fusellm[training]"

Quick Start

Pull a model from HuggingFace and extract

import fuse

# Auto-downloads the best Q4 GGUF from HuggingFace Hub
backend = fuse.LlamaCppBackend(model_name="bartowski/Llama-3.2-1B-Instruct-GGUF")
extractor = fuse.Extractor(backend)

# Zero-shot structured extraction — no Pydantic model needed
result = extractor.extract_from_fields(
    "Sarah Chen is a 34-year-old software architect at Stripe.",
    {"name": str, "age": int, "job_title": str, "company": str}
)
# {'name': 'Sarah Chen', 'age': 34, 'job_title': 'software architect', 'company': 'Stripe'}
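
The backend loads once and the extractor is reusable, so batching over many inputs is just a loop. A minimal sketch with made-up example strings, using the same extract_from_fields call:

records = [
    "Alice Park is a 29-year-old data engineer at Shopify.",
    "Marcus Lee, 41, leads design research at Figma.",
]
fields = {"name": str, "age": int, "job_title": str, "company": str}

# Reuse the already-loaded extractor for every record
results = [extractor.extract_from_fields(text, fields) for text in records]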

Use a local GGUF model

backend = fuse.LlamaCppBackend(model_path="./models/llama-3.2-1b-q4.gguf")
extractor = fuse.Extractor(backend)

result = extractor.extract_from_fields(
    "John is 30 years old and knows Python and Rust",
    {"name": str, "age": int, "skills": list[str]}
)
# {'name': 'John', 'age': 30, 'skills': ['Python', 'Rust']}

Config-driven extraction

config = fuse.InferenceConfig(
    model_name="bartowski/Phi-4-mini-instruct-GGUF",
    n_ctx=4096,
    n_threads=8,
    temperature=0.0,
)
backend = fuse.LlamaCppBackend.from_config(config)
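
The configured backend plugs into an Extractor exactly as above; a short sketch reusing the earlier calls (the input text and field names here are illustrative):

extractor = fuse.Extractor(backend)
result = extractor.extract_from_fields(
    "Maria Lopez is a 52-year-old pilot at Delta.",
    {"name": str, "age": int, "job_title": str, "company": str},
)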

Extract from a JSON schema

schema = fuse.SchemaBuilder.from_json_schema({
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "skills": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["name", "age"],
})
result = extractor.extract("John is 30 and knows Rust", schema)
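
Since from_json_schema takes a plain dict, a schema stored on disk can be loaded with the standard library first. A minimal sketch, assuming schema.json contains the object above:

import json

with open("schema.json") as f:
    schema = fuse.SchemaBuilder.from_json_schema(json.load(f))

result = extractor.extract("John is 30 and knows Rust", schema)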

Let the LLM infer the schema

result = extractor.extract_from_description(
    "The Series A raised $15M from Sequoia, following a $2.5M seed from YC.",
    "Extract monetary amounts, funding round type, and investor names"
)

CLI

Extract with a config file

fuse extract "Sarah Chen is a 34-year-old architect at Stripe" \
  --config examples/extract_person.yaml

extract_person.yaml:

model:
  model_name: "bartowski/Llama-3.2-1B-Instruct-GGUF"
  n_ctx: 2048
  temperature: 0.0

fields:
  name: str
  age: int
  job_title: str
  company: str

prompt_format: llama
max_tokens: 256
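
The model block mirrors fuse.InferenceConfig from the Python API; a rough code equivalent of this file (a sketch, assuming the CLI passes these keys through one-to-one):

config = fuse.InferenceConfig(
    model_name="bartowski/Llama-3.2-1B-Instruct-GGUF",
    n_ctx=2048,
    temperature=0.0,
)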

Extract with inline flags

# HuggingFace model — auto-downloads
fuse extract "SpaceX was founded in 2002" \
  --model bartowski/Phi-4-mini-instruct-GGUF \
  --fields "company:str,year:int,industry:str"

# Local GGUF model
fuse extract "John is 30" \
  --model ./model.gguf \
  --fields "name:str,age:int"

# Using a JSON schema file
fuse extract "John is 30 and knows Python" \
  --model bartowski/Llama-3.2-1B-Instruct-GGUF \
  --schema schema.json
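
Here schema.json is a standard JSON Schema object, the same shape SchemaBuilder.from_json_schema accepts. For the command above it might look like:

{
  "type": "object",
  "properties": {
    "name": {"type": "string"},
    "age": {"type": "integer"},
    "skills": {"type": "array", "items": {"type": "string"}}
  },
  "required": ["name", "age"]
}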

Train

fuse train --config examples/train_extraction.yaml

Quantize to GGUF

fuse quantize --model ./output --output model.gguf --method q4_0
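
The quantized file can then be loaded straight back through the Python API via model_path, as in the Quick Start:

backend = fuse.LlamaCppBackend(model_path="./model.gguf")
extractor = fuse.Extractor(backend)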

Supported Models

Any GGUF model on HuggingFace works. Some good small models for CPU extraction:

Model                    Size        HuggingFace Repo
Llama 3.2 1B Instruct    ~1GB Q4     bartowski/Llama-3.2-1B-Instruct-GGUF
Llama 3.2 3B Instruct    ~2GB Q4     bartowski/Llama-3.2-3B-Instruct-GGUF
Qwen 2.5 1.5B Instruct   ~1GB Q4     bartowski/Qwen2.5-1.5B-Instruct-GGUF
Phi-4 Mini Instruct      ~2.5GB Q4   bartowski/Phi-4-mini-instruct-GGUF

Models are auto-downloaded and cached to ~/.cache/fuse/models/.

Development

uv sync --extra dev
uv run nox                    # All CI checks
uv run nox -s lint            # Ruff lint + format
uv run nox -s typecheck       # ty type check
uv run nox -s tests           # Pytest across Python 3.11-3.13

License

MIT
