Skip to main content

A lightweight pandas accessor for batch OpenAI-compatible LLM extraction

Project description

SilkLoom Core

SilkLoom Core is a small pandas accessor for batch LLM extraction.

DataFrame rows -> Jinja prompt render -> OpenAI-compatible chat call -> repaired JSON -> result DataFrame

Install

pip install silkloom-core

Quick Start

Importing silkloom_core registers df.llm.

import pandas as pd
import silkloom_core

df = pd.DataFrame(
    {
        "title": ["A clear experiment", "A weak evaluation"],
        "abstract": ["Reliable and reproducible.", "Too small to conclude much."],
    }
)

results = df.llm.setup(
    api_key="...",
    base_url="https://api.openai.com/v1",
    cache_path=".llm_cache.db",
).extract(
    "Title: {{ title }}\nAbstract: {{ abstract }}\nReturn JSON with keys label and summary.",
    model="gpt-4o-mini",
    max_workers=8,
    json_mode=True,
)

results contains only the parsed model output columns and keeps the original index, so you can join it back when needed:

df = df.join(results)

Client Setup

You can let SilkLoom create an OpenAI client:

df.llm.setup(api_key="...", base_url="...")

Or pass any OpenAI-compatible client with client.chat.completions.create(...):

from openai import OpenAI

client = OpenAI(api_key="...", base_url="...")
df.llm.setup(client=client)

Extraction

Use Jinja placeholders that match DataFrame columns. Literal JSON braces can stay as normal braces.

out = df.llm.extract(
    'Classify {{ text }} and return JSON like {"label": "positive", "score": 0.9}',
    model="gpt-4o-mini",
    temperature=0.1,
    max_workers=4,
    max_retries=2,
    verbose=True,
)

Malformed JSON is parsed with json_repair. If the model returns a JSON object, its keys become columns. If it returns another JSON value, the value is placed in _llm_raw. Parse or request failures are returned in _llm_error.

Cache

Successful raw responses are cached in SQLite. The cache key includes the model, rendered messages, JSON mode, and request options.

df.llm.setup(cache_path="cache/llm.sqlite").extract(...)

Use a new cache path or delete the SQLite file when you want a fresh run.

Images

Pass image_column for local image paths, HTTP(S) image URLs, or existing data:image/... URLs. Local files are encoded as base64 data URLs.

out = df.llm.extract(
    "Extract fields from this receipt and return JSON.",
    image_column="receipt_path",
    model="gpt-4o-mini",
)

Rows with missing image values fall back to text-only prompts.

Progress And Cancel

Use progress_callback for UI integration:

def progress(done, total):
    print(done, total)

out = df.llm.extract("Analyze {{ text }}", progress_callback=progress)

From another thread or UI event, call:

df.llm.cancel()

Queued work is cancelled where possible, and running rows stop before the next retry.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

silkloom_core-6.0.1.tar.gz (8.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

silkloom_core-6.0.1-py3-none-any.whl (7.3 kB view details)

Uploaded Python 3

File details

Details for the file silkloom_core-6.0.1.tar.gz.

File metadata

  • Download URL: silkloom_core-6.0.1.tar.gz
  • Upload date:
  • Size: 8.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for silkloom_core-6.0.1.tar.gz
Algorithm Hash digest
SHA256 e4a452c9d9168916aea746cc4be44d7dd04fd24d35c0d61354991bb41d285f13
MD5 573c1546dc9135952c19c60bb917fb0f
BLAKE2b-256 ac02cf12d92b3469b2c675870992e0fb729ada644be0c0004f04f1d0ad4e3553

See more details on using hashes here.

Provenance

The following attestation bundles were made for silkloom_core-6.0.1.tar.gz:

Publisher: publish.yml on LeLiu-GeoAI/silkloom-core

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file silkloom_core-6.0.1-py3-none-any.whl.

File metadata

  • Download URL: silkloom_core-6.0.1-py3-none-any.whl
  • Upload date:
  • Size: 7.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for silkloom_core-6.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 f5cbccc673b0a932085cf2dbb7b2d260b0ef20cd9e816d5cc785438634167d42
MD5 828588fa069cbdf44f22cd922b8f11a1
BLAKE2b-256 88722385195783128790601be50ac4b35db465e71b9208de30ac7de22b7e78e9

See more details on using hashes here.

Provenance

The following attestation bundles were made for silkloom_core-6.0.1-py3-none-any.whl:

Publisher: publish.yml on LeLiu-GeoAI/silkloom-core

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page