Skip to main content

A lightweight pandas accessor for batch OpenAI-compatible LLM extraction

Project description

SilkLoom Core

SilkLoom Core is a small pandas accessor for batch LLM extraction.

DataFrame rows -> prompt.format(row) -> OpenAI-compatible chat call -> repaired JSON -> result DataFrame

Install

pip install silkloom-core

Quick Start

Importing silkloom_core registers df.llm.

import pandas as pd
import silkloom_core

df = pd.DataFrame(
    {
        "title": ["A clear experiment", "A weak evaluation"],
        "abstract": ["Reliable and reproducible.", "Too small to conclude much."],
    }
)

results = df.llm.setup(
    api_key="...",
    base_url="https://api.openai.com/v1",
    cache_path=".llm_cache.db",
).extract(
    "Title: {title}\nAbstract: {abstract}\nReturn JSON with keys label and summary.",
    model="gpt-4o-mini",
    max_workers=8,
    json_mode=True,
)

results contains only the parsed model output columns and keeps the original index, so you can join it back when needed:

df = df.join(results)

Client Setup

You can let SilkLoom create an OpenAI client:

df.llm.setup(api_key="...", base_url="...")

Or pass any OpenAI-compatible client with client.chat.completions.create(...):

from openai import OpenAI

client = OpenAI(api_key="...", base_url="...")
df.llm.setup(client=client)

Extraction

Use Python format placeholders that match DataFrame columns.

out = df.llm.extract(
    "Classify this text and return JSON: {text}",
    model="gpt-4o-mini",
    temperature=0.1,
    max_workers=4,
    max_retries=2,
    verbose=True,
)

Malformed JSON is parsed with json_repair. If the model returns a JSON object, its keys become columns. If it returns another JSON value, the value is placed in _llm_raw. Parse or request failures are returned in _llm_error.

Cache

Successful raw responses are cached in SQLite. The cache key includes the model, rendered messages, JSON mode, and request options.

df.llm.setup(cache_path="cache/llm.sqlite").extract(...)

Use a new cache path or delete the SQLite file when you want a fresh run.

Images

Pass image_column for local image paths, HTTP(S) image URLs, or existing data:image/... URLs. Local files are encoded as base64 data URLs.

out = df.llm.extract(
    "Extract fields from this receipt and return JSON.",
    image_column="receipt_path",
    model="gpt-4o-mini",
)

Rows with missing image values fall back to text-only prompts.

Progress And Cancel

Use progress_callback for UI integration:

def progress(done, total):
    print(done, total)

out = df.llm.extract("Analyze {text}", progress_callback=progress)

From another thread or UI event, call:

df.llm.cancel()

Queued work is cancelled where possible, and running rows stop before the next retry.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

silkloom_core-6.0.0.tar.gz (8.1 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

silkloom_core-6.0.0-py3-none-any.whl (7.2 kB view details)

Uploaded Python 3

File details

Details for the file silkloom_core-6.0.0.tar.gz.

File metadata

  • Download URL: silkloom_core-6.0.0.tar.gz
  • Upload date:
  • Size: 8.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for silkloom_core-6.0.0.tar.gz
Algorithm Hash digest
SHA256 756d70becb3efa70486aef6d7e640b4d13438704e2ec7743b59b9375ed767a40
MD5 20baf74ce2fbe167211870cba9f2b88b
BLAKE2b-256 991621bca316c73174d42223a359594080771bd2d2cbb23fd004fa66f10ff2c9

See more details on using hashes here.

Provenance

The following attestation bundles were made for silkloom_core-6.0.0.tar.gz:

Publisher: publish.yml on LeLiu-GeoAI/silkloom-core

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file silkloom_core-6.0.0-py3-none-any.whl.

File metadata

  • Download URL: silkloom_core-6.0.0-py3-none-any.whl
  • Upload date:
  • Size: 7.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for silkloom_core-6.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 0eb1c04cf07ad6d86b873bab001ad868e0d3a628b390eb27237c18eda309c2ed
MD5 05d58310a413746a0872c89e0479f6cd
BLAKE2b-256 e850df5215f5cf8ba1e705ede8dd4b1dec553e431b5c8645f5d4d35dd19d4601

See more details on using hashes here.

Provenance

The following attestation bundles were made for silkloom_core-6.0.0-py3-none-any.whl:

Publisher: publish.yml on LeLiu-GeoAI/silkloom-core

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page