A lightweight pandas accessor for batch OpenAI-compatible LLM extraction
Project description
SilkLoom Core
SilkLoom Core is a small pandas accessor for batch LLM extraction.
DataFrame rows -> prompt.format(row) -> OpenAI-compatible chat call -> repaired JSON -> result DataFrame
Install
pip install silkloom-core
Quick Start
Importing silkloom_core registers df.llm.
import pandas as pd
import silkloom_core
df = pd.DataFrame(
{
"title": ["A clear experiment", "A weak evaluation"],
"abstract": ["Reliable and reproducible.", "Too small to conclude much."],
}
)
results = df.llm.setup(
api_key="...",
base_url="https://api.openai.com/v1",
cache_path=".llm_cache.db",
).extract(
"Title: {title}\nAbstract: {abstract}\nReturn JSON with keys label and summary.",
model="gpt-4o-mini",
max_workers=8,
json_mode=True,
)
results contains only the parsed model output columns and keeps the original index, so you can join it back when needed:
df = df.join(results)
Client Setup
You can let SilkLoom create an OpenAI client:
df.llm.setup(api_key="...", base_url="...")
Or pass any OpenAI-compatible client with client.chat.completions.create(...):
from openai import OpenAI
client = OpenAI(api_key="...", base_url="...")
df.llm.setup(client=client)
Extraction
Use Python format placeholders that match DataFrame columns.
out = df.llm.extract(
"Classify this text and return JSON: {text}",
model="gpt-4o-mini",
temperature=0.1,
max_workers=4,
max_retries=2,
verbose=True,
)
Malformed JSON is parsed with json_repair. If the model returns a JSON object, its keys become columns. If it returns another JSON value, the value is placed in _llm_raw. Parse or request failures are returned in _llm_error.
Cache
Successful raw responses are cached in SQLite. The cache key includes the model, rendered messages, JSON mode, and request options.
df.llm.setup(cache_path="cache/llm.sqlite").extract(...)
Use a new cache path or delete the SQLite file when you want a fresh run.
Images
Pass image_column for local image paths, HTTP(S) image URLs, or existing data:image/... URLs. Local files are encoded as base64 data URLs.
out = df.llm.extract(
"Extract fields from this receipt and return JSON.",
image_column="receipt_path",
model="gpt-4o-mini",
)
Rows with missing image values fall back to text-only prompts.
Progress And Cancel
Use progress_callback for UI integration:
def progress(done, total):
print(done, total)
out = df.llm.extract("Analyze {text}", progress_callback=progress)
From another thread or UI event, call:
df.llm.cancel()
Queued work is cancelled where possible, and running rows stop before the next retry.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file silkloom_core-6.0.0.tar.gz.
File metadata
- Download URL: silkloom_core-6.0.0.tar.gz
- Upload date:
- Size: 8.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
756d70becb3efa70486aef6d7e640b4d13438704e2ec7743b59b9375ed767a40
|
|
| MD5 |
20baf74ce2fbe167211870cba9f2b88b
|
|
| BLAKE2b-256 |
991621bca316c73174d42223a359594080771bd2d2cbb23fd004fa66f10ff2c9
|
Provenance
The following attestation bundles were made for silkloom_core-6.0.0.tar.gz:
Publisher:
publish.yml on LeLiu-GeoAI/silkloom-core
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
silkloom_core-6.0.0.tar.gz -
Subject digest:
756d70becb3efa70486aef6d7e640b4d13438704e2ec7743b59b9375ed767a40 - Sigstore transparency entry: 1821791467
- Sigstore integration time:
-
Permalink:
LeLiu-GeoAI/silkloom-core@81f79c9cf4499397919805fbb28dd841b8289d20 -
Branch / Tag:
refs/tags/v6.0.0 - Owner: https://github.com/LeLiu-GeoAI
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@81f79c9cf4499397919805fbb28dd841b8289d20 -
Trigger Event:
push
-
Statement type:
File details
Details for the file silkloom_core-6.0.0-py3-none-any.whl.
File metadata
- Download URL: silkloom_core-6.0.0-py3-none-any.whl
- Upload date:
- Size: 7.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
0eb1c04cf07ad6d86b873bab001ad868e0d3a628b390eb27237c18eda309c2ed
|
|
| MD5 |
05d58310a413746a0872c89e0479f6cd
|
|
| BLAKE2b-256 |
e850df5215f5cf8ba1e705ede8dd4b1dec553e431b5c8645f5d4d35dd19d4601
|
Provenance
The following attestation bundles were made for silkloom_core-6.0.0-py3-none-any.whl:
Publisher:
publish.yml on LeLiu-GeoAI/silkloom-core
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
silkloom_core-6.0.0-py3-none-any.whl -
Subject digest:
0eb1c04cf07ad6d86b873bab001ad868e0d3a628b390eb27237c18eda309c2ed - Sigstore transparency entry: 1821791522
- Sigstore integration time:
-
Permalink:
LeLiu-GeoAI/silkloom-core@81f79c9cf4499397919805fbb28dd841b8289d20 -
Branch / Tag:
refs/tags/v6.0.0 - Owner: https://github.com/LeLiu-GeoAI
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
publish.yml@81f79c9cf4499397919805fbb28dd841b8289d20 -
Trigger Event:
push
-
Statement type: