Verified-on-Spark patterns lifted from the ai-field-notes blog into one importable Python package.

fieldkit

Every essay in ai-field-notes ends with evidence/ — a folder of working code that produced the article's numbers. After 30+ articles the same patterns kept reappearing: the same NIM client wrapper, the same chunk-embed-store dance, the same bench harness, the same verifier-loop math. fieldkit is what those evidence/ folders look like once the boilerplate is lifted into a real package.

The blog stays the long-form rationale. fieldkit is the pip install-able surface so you can reproduce — and extend — the work without re-pasting 80 lines of NIM-client setup per article.

Install

pip install fieldkit

To install straight from the repository instead of PyPI, point pip at a git tag (this example pins the v0.2.0 tag):

pip install "git+https://github.com/manavsehgal/ai-field-notes.git@fieldkit/v0.2.0#subdirectory=fieldkit"

Quickstart

from fieldkit.nim import NIMClient

client = NIMClient(base_url="http://localhost:8000/v1", model="meta/llama-3.1-8b-instruct")
print(client.chat([{"role": "user", "content": "Hello, Spark."}]))

What's in v0.2.0

Module	Purpose	Source articles
`fieldkit.capabilities`	Typed Python facade over `spark-capabilities.json` — KV cache math, weight bytes, inference envelope.	`kv-cache-arithmetic-at-inference`, `gpu-sizing-math-for-fine-tuning`
`fieldkit.nim`	OpenAI-compatible NIM client wrapper with retry, chunking, and the 8192-token context guard.	`nim-first-inference-dgx-spark` and friends
`fieldkit.rag`	`Pipeline(embed_url, rerank_url, pgvector_dsn, generator)` — ingest → retrieve → rerank → fuse.	`naive-rag-on-spark` and friends
`fieldkit.eval`	`Bench`, `Judge`, `Trajectory` — plus v0.2's `AssertionGrader`, `PassAtK`, `AgentRun`, `MatchedBaseComparison`.	every article with a `bench.py` or `benchmark.py`, plus `clawgym-on-spark`, `autoresearchbench-on-spark`, `pass-at-k-after-the-seventh-patch`
`fieldkit.training` (new in v0.2)	`LoraReferenceSnapshot` (sidesteps peft 0.19's offloader bug), `WeightDeltaTracker` — for any RL or SFT loop. Lazy `torch` import; pure-inference envs don't pay.	`clawgym-on-spark-grpo`
`fieldkit.cli`	`fieldkit bench rag`, `fieldkit feasibility <id>`, `fieldkit envelope <size>`.	discoverability
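The KV cache and weight-byte math that fieldkit.capabilities wraps can be sketched in a few lines of plain Python. This is a minimal illustration, not fieldkit's API: the function names are hypothetical, and the model shape (32 layers, 8 KV heads, head dimension 128, 2-byte elements) is an assumed Llama 3.1 8B configuration.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_elem=2):
    """Bytes needed to cache keys and values for every layer.

    The leading factor of 2 covers the separate K and V tensors.
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * bytes_per_elem * seq_len * batch_size)


def weight_bytes(num_params, bytes_per_param=2):
    """Bytes needed to hold the model weights at a given precision."""
    return num_params * bytes_per_param


# Assumed shape: 32 layers, 8 KV heads (grouped-query attention), head_dim 128, fp16.
per_token = kv_cache_bytes(32, 8, 128, seq_len=1)
full_context = kv_cache_bytes(32, 8, 128, seq_len=8192)

print(per_token)                # 131072 bytes, i.e. 128 KiB per token
print(full_context / 2**30)     # 1.0 GiB at the 8192-token context limit
print(weight_bytes(8_000_000_000) / 1e9)  # 16.0 GB for a nominal 8B params in fp16
```

Under these assumptions a single full-length 8192-token sequence costs about 1 GiB of cache on top of the weights, which is the kind of figure an inference envelope has to budget for.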

What v0.2 adds

fieldkit.training: new module. LoraReferenceSnapshot is a CPU-resident snapshot of a peft adapter's LoRA tensors, plus a context manager that swaps the snapshot in for one no-grad forward pass and restores the trainable weights on exit. It works around a peft 0.19 bug: model.load_adapter(adapter_name="reference", is_trainable=False) crashes with a KeyError under device_map="auto" whenever the GPU has anything else resident, because peft's offload detection over-triggers on Spark unified memory. WeightDeltaTracker takes pre/post snapshots of the trainable parameters and reports the L2 norm and max|Δ| of the change, a sanity check that a fine-tuning step actually moved the weights.
fieldkit.eval.AssertionGrader: pure-function grader over five file-system assertion primitives (file_exists, file_not_exists, file_contents_contain, file_contents_match_regex, file_unchanged). Lifted from clawgym-on-spark's deterministic grader; no LLM, no fuzzy matching.
fieldkit.eval.PassAtK + pass_at_k_estimator: verifier loop built on the unbiased pass@k estimator from Chen et al. (2021). The naive plug-in estimate 1 - (1-p)^k, with p the empirical pass rate over n samples, is biased for finite n; the combinatorial estimator is not.
fieldkit.eval.AgentRun + TurnDetail + summarize_agent_runs: per-question agent-bench schema with overridable field-name path tuples for non-AutoResearchBench layouts.
fieldkit.eval.MatchedBaseComparison + GroupStats: two-rollout B−A driver with per-group and per-assertion-kind deltas and a markdown .report(). Reusable for any LoRA / adapter ablation, fine-tuned-vs-base, or system-prompt-A-vs-B comparison.
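For reference, the Chen et al. (2021) estimator mentioned in the list above is 1 - C(n-c, k) / C(n, k), where n is the number of samples drawn and c the number that passed. The sketch below is a standalone illustration; the function names are hypothetical and are not fieldkit's pass_at_k_estimator signature.

```python
import math


def unbiased_pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples,
    drawn without replacement from n attempts with c passes, is a pass."""
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failures exist, so every k-subset contains a pass.
        return 1.0
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)


def naive_pass_at_k(n: int, c: int, k: int) -> float:
    """Plug-in estimate using the empirical pass rate p = c / n."""
    p = c / n
    return 1.0 - (1.0 - p) ** k


# 10 attempts, 3 passes, k = 5
print(round(unbiased_pass_at_k(10, 3, 5), 4))  # 0.9167
print(round(naive_pass_at_k(10, 3, 5), 4))     # 0.8319
```

With only 10 samples the plug-in estimate comes out noticeably lower than the combinatorial one, which illustrates why the combinatorial form is preferred when n is small.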

Deferred to v0.3+: fieldkit.agents (Persona / WorkspaceSeed / SynthTask / TaskAuthor / Sandbox / RolloutDriver / Trajectory + TurnRecord — 7 symbols), fieldkit.inference.VLLMClient, and replay_messages_from_trajectory. Each needs a second consuming article before its public API locks.

Hardware

Every code path is verified on a DGX Spark (GB10, 128 GB unified memory, NIM 8B + embed NIM + pgvector co-resident). fieldkit.training's torch + safetensors imports are lazy, so the package costs nothing on inference-only boxes — install torch and safetensors yourself in the training environment when you need the training primitives. NeMo / Triton / pytorch-base containers ship them; pure-inference envs don't.
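The lazy-import behaviour described above follows a common Python pattern: defer the heavy import until a function that needs it is first called. The sketch below shows that general pattern with a hypothetical helper named require; it is not fieldkit's actual implementation, and the demonstration uses the standard-library json module as a stand-in for torch.

```python
import importlib


def require(module_name: str, hint: str = ""):
    """Import a module on first use; raise a helpful error if it is missing.

    importlib.import_module consults sys.modules, so repeated calls are cheap.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        message = f"{module_name} is required for this feature. {hint}".strip()
        raise ImportError(message) from exc


def dump_report(payload):
    # The dependency is resolved only when this function runs, so importing
    # the enclosing package never pays for it up front.
    json = require("json")  # stand-in for a heavy optional dependency
    return json.dumps(payload)


print(dump_report({"ok": True}))  # {"ok": true}

try:
    require("not_a_real_module_xyz", hint="Install it in the training environment.")
except ImportError as err:
    print(err)
```

An inference-only environment never triggers the import, while a training environment gets a clear message about what to install if the dependency is absent.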

Portability to non-Spark CUDA 12.x boxes lands when there's demand.

License

Apache-2.0. See LICENSE.

Release history

Version	Release date
0.5.0	May 22, 2026
0.4.3	May 17, 2026
0.4.2	May 16, 2026
0.4.1	May 14, 2026
0.4.0	May 14, 2026
0.3.0	May 11, 2026
0.2.0.post1 (this version)	May 6, 2026
0.2.0	May 6, 2026
0.1.0	May 3, 2026

Download files

Source Distribution

fieldkit-0.2.0.post1.tar.gz (91.8 kB)

Uploaded May 6, 2026 Source

Built Distribution

fieldkit-0.2.0.post1-py3-none-any.whl (54.5 kB)

Uploaded May 6, 2026 Python 3

File details

Details for the file fieldkit-0.2.0.post1.tar.gz.

File metadata

Download URL: fieldkit-0.2.0.post1.tar.gz
Upload date: May 6, 2026
Size: 91.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for fieldkit-0.2.0.post1.tar.gz
Algorithm	Hash digest
SHA256	`9af04ae5f21af6f259ed49d1bf79d480df1b568eace518e63da535fcfc51f704`
MD5	`27356691a641323604a38ddf85fcbfcc`
BLAKE2b-256	`b17ee7c18b35ec5011b450074018bd3ca75e73f352596ead72866e8a0b029e4a`

File details

Details for the file fieldkit-0.2.0.post1-py3-none-any.whl.

File metadata

Download URL: fieldkit-0.2.0.post1-py3-none-any.whl
Upload date: May 6, 2026
Size: 54.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for fieldkit-0.2.0.post1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`4fbf6a6515ee3d44d3f665444e77f794335622bce3b1e0c88d164b2bb9af04c4`
MD5	`2d0db06a3977331db75b592c089e8c83`
BLAKE2b-256	`b75575a97e02560b7e66d8d9d9b9a0d7d925bc194458db557d4881a666c9e280`
