saklas

Activation steering and trait monitoring for HuggingFace transformers — Python library, OpenAI-compatible server, and terminal UI.

Activation steering and trait monitoring for HuggingFace transformer models. Extract steering vectors from contrastive pairs, apply them during generation with per-call alpha control, and watch how activations shift across behavioral probes — all without touching model weights.
Three frontends over one engine:
- saklas <model> — interactive terminal UI for exploring vectors, live probe readings, and A/B comparison
- saklas serve <model> — OpenAI-compatible HTTP server; drop-in for the OpenAI SDK, LangChain, curl
- SaklasSession — Python library for scripted experiments, batch sweeps, and embedding steering into your pipelines
53 architectures supported out of the box. Steering vectors compose. Alphas are per-call — no persistent hooks, no model mutation. Probe history accumulates across generations.
Quick start
pip install saklas
saklas google/gemma-3-4b-it
That's the whole thing. The first run downloads the model, extracts the 21 bundled probes against it (a one-time cost, cached to disk), and drops you into the TUI. Hit /steer angry 0.3 — saklas resolves that to the bundled angry.calm axis with α=+0.3 so the model leans toward the angry pole. Type /steer calm 0.3 and you get the same vector at α=−0.3. Or [ / ] to nudge temperature, or Ctrl+A to A/B compare the steered output against the unsteered baseline.
Want to try it as an API server instead?
pip install saklas[serve]
saklas serve google/gemma-3-4b-it --steer cheerful:0.2
Or from Python:
from saklas import SaklasSession
with SaklasSession("google/gemma-3-4b-it") as s:
    name, profile = s.extract("angry.calm")  # bundled bipolar pack
    s.steer(name, profile)
    print(s.generate("What makes a good day?", alphas={name: 0.3}).text)
Install
pip install saklas # library + TUI
pip install saklas[serve] # + FastAPI/uvicorn for the API server
pip install saklas[research] # + datasets/pandas for dataset loading and DataFrame export
Requires Python 3.11+ and PyTorch 2.2+. Runs on Linux, macOS, and Windows. CPU works but is slow — CUDA or Apple Silicon MPS is recommended for anything interactive.
Quantization (experimental). The bnb and cuda extras pull in bitsandbytes and flash-attn for 4-bit/8-bit loading and fused attention. These depend on platform-specific CUDA toolchains and don't build cleanly everywhere; only the vanilla install is officially supported.
pip install saklas[bnb] # bitsandbytes only
pip install saklas[cuda] # bitsandbytes + flash-attn (Linux + CUDA_HOME required)
From source.
git clone https://github.com/a9lim/saklas
cd saklas
pip install -e ".[dev]" # + pytest
How it works
Steering vectors
Saklas uses Representation Engineering (Zou et al., 2023): for each contrastive pair, capture the last-content-token hidden state at every layer, diff the positive and negative sides, and take the first principal component per layer via SVD. Every layer gets a direction and a score (explained variance ratio); there is no manual layer selection.
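The per-layer math is small enough to sketch. The following is an illustrative pure-Python version (not the saklas implementation, which operates on real hidden states): mean-center the positive-minus-negative diffs, recover the first principal component by power iteration, and report its explained-variance ratio as the score.

```python
def first_principal_component(diffs, iters=200):
    """First PC of a list of diff vectors, plus explained-variance ratio."""
    dim = len(diffs[0])
    n = len(diffs)
    # Mean-center the diffs before extracting the dominant direction
    mean = [sum(v[i] for v in diffs) / n for i in range(dim)]
    centered = [[v[i] - mean[i] for i in range(dim)] for v in diffs]
    # Power iteration on the covariance: w <- X^T (X w), renormalized
    w = [1.0] * dim
    for _ in range(iters):
        proj = [sum(x[i] * w[i] for i in range(dim)) for x in centered]
        w = [sum(p * x[i] for p, x in zip(proj, centered)) for i in range(dim)]
        norm = sum(c * c for c in w) ** 0.5
        w = [c / norm for c in w]
    # Score = variance captured along w / total variance
    var_along = sum(sum(x[i] * w[i] for i in range(dim)) ** 2 for x in centered)
    total = sum(sum(c * c for c in x) for x in centered)
    return w, var_along / total

# Toy diffs whose variation is mostly along the first axis
diffs = [[2.0, 0.1], [0.5, -0.1], [1.5, 0.05], [1.0, 0.0]]
direction, score = first_principal_component(diffs)
```

Because every layer gets its own (direction, score) pair, downstream code can weight layers by score instead of hand-picking one.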
Alpha is normalized per-profile so the same numeric value means the same intensity across backbones: α≈0.5 sits in the coherent-nuanced band on every bundled architecture, α≈1.0 is past the collapse cliff. Vectors are registered without alphas and applied per-call, so nothing persists on the model between generations.
Multiple vectors compose naturally — they register into a single manager that, per generation, installs a single in-place hidden-state hook per active layer (co-layer directions sum). Hooks are transient: composed before generation, removed after.
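The composition rule can be sketched in a few lines. The names below (profiles, alphas) are hypothetical illustrations, not the saklas internals: per layer, each active vector contributes alpha times its direction, and co-layer contributions sum into the single delta the hook adds in place.

```python
def compose_deltas(profiles, alphas):
    """profiles: {name: {layer: direction}}, alphas: {name: float}
    -> {layer: summed delta} for the active vectors."""
    deltas = {}
    for name, alpha in alphas.items():
        if alpha == 0:
            continue  # zero alpha means the vector is inert this call
        for layer, direction in profiles[name].items():
            acc = deltas.setdefault(layer, [0.0] * len(direction))
            for i, d in enumerate(direction):
                acc[i] += alpha * d
    return deltas

profiles = {
    "angry.calm":    {10: [1.0, 0.0], 11: [0.0, 1.0]},
    "formal.casual": {10: [0.0, 2.0]},
}
deltas = compose_deltas(profiles, {"angry.calm": 0.5, "formal.casual": 0.1})
```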
Custom concepts
When you steer on a concept that isn't in the curated probe library, the loaded model writes its own contrastive pairs. Generation is seeded across multiple "specificity lenses" (unique facts, physical traits, social dynamics, inner life, routines) and the prompt explicitly rejects generic pairs that could apply to anything similar. Pairs cache at ~/.saklas/vectors/local/<concept>/statements.json and are model-independent, so they're reused across models.
This means /steer "anything" works — religions, animals, fictional characters, "man who ate too much spaghetti." The vector captures what's distinctive about the concept, not generic associations.
Trait monitor
After each generation, saklas runs a separate forward pass over the generated text, pools hidden states from the last content token (matching probe extraction), mean-centers them against a cached per-layer baseline (computed from 45 neutral prompts), and scores against each probe via score-weighted cosine similarity. History accumulates across generations, enabling sparklines and running statistics.
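The scoring step reduces to mean-centered cosine similarity. A minimal sketch under the description above (score-weighting across layers omitted for brevity; illustrative only):

```python
import math

def probe_score(hidden, baseline, direction):
    """Cosine similarity between a baseline-centered hidden state and a
    probe direction. Returns a value in [-1, 1]."""
    centered = [h - b for h, b in zip(hidden, baseline)]
    dot = sum(c * d for c, d in zip(centered, direction))
    norm_c = math.sqrt(sum(c * c for c in centered))
    norm_d = math.sqrt(sum(d * d for d in direction))
    if norm_c == 0 or norm_d == 0:
        return 0.0
    return dot / (norm_c * norm_d)

# A state shifted from baseline exactly along the probe direction scores +1
score = probe_score(hidden=[1.5, 2.0], baseline=[0.5, 2.0], direction=[1.0, 0.0])
```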
Layer means are cached at ~/.saklas/models/<safe_model_id>/layer_means.safetensors and auto-invalidate when ~/.saklas/neutral_statements.json changes hash.
Probe library
21 probes across 6 categories, each backed by 45 curated contrastive pairs (topically disjoint, not minimal-word-swap — see CLAUDE.md for the generation discipline). Most probes are bipolar: the name carries both poles (angry.calm, masculine.feminine), the positive pole activates on α>0 and the negative pole on α<0. Monopolar probes have no named opposite.
| Category | Probes |
|---|---|
| Affect | angry.calm, fearful.brave, happy.sad |
| Epistemic | confident.uncertain, honest.deceptive, hallucinating.grounded |
| Alignment | agentic, refusal.compliant, sycophantic.blunt, manipulative |
| Register | formal.casual, direct.indirect, verbose.concise, creative.conventional |
| Social stance | authoritative.submissive, hierarchical.egalitarian, high_context.low_context |
| Cultural | masculine.feminine, western.eastern, religious.secular, traditional.progressive |
Bipolar probes are extracted from Speaker A IS X / Speaker B IS Y contrastive pairs, so the negative direction is a real coherent pole rather than "absence of X." /steer angry - calm and /steer angry.calm resolve to the same vector — each pole is slugged independently (non-alphanumerics collapse to _), joined by the bipolar separator ..
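A minimal sketch of the naming convention just described — each pole slugged independently (runs of non-alphanumerics collapse to _), joined with the bipolar separator `.` (the exact saklas slugger may differ in edge cases):

```python
import re

def slug(pole: str) -> str:
    # Collapse any run of non-alphanumerics to a single underscore
    return re.sub(r"[^0-9a-zA-Z]+", "_", pole.strip()).strip("_").lower()

def canonical_name(pos: str, neg: str) -> str:
    # Poles are slugged independently, then joined by "."
    return f"{slug(pos)}.{slug(neg)}"
```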
The bundled pairs are generated by saklas itself. scripts/regenerate_bundled_statements.py loads a capable instruct model (gemma-4-31b-it by default) and calls the same SaklasSession.generate_pairs pipeline the TUI uses when you /steer a novel concept — same system prompt, same five-domain seeds, same parser. Shipping the pack this way is both a calibration target and an end-to-end demonstration: the on-model generation path is robust enough that it's what populates saklas/data/vectors/ in the first place. Regenerate with python scripts/regenerate_bundled_statements.py --purge and the whole bundled pack is rebuilt from scratch by the same code you run every day.
Pole aliasing. Typing a single-pole name resolves to the full composite with the correct sign flip: /steer angry 0.5 is an alias for /steer angry.calm 0.5, and /steer calm 0.5 is an alias for /steer angry.calm -0.5. This works for any installed bipolar pack — bundled, HF-pulled, or user-authored — so /steer bob/wolf 0.4 resolves to bob/deer.wolf at α=-0.4 if that's what's installed. Collisions (e.g. alice/angry exists alongside default/angry.calm) raise the same ambiguity error as any namespace collision; disambiguate with ns/name.
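The resolution logic can be sketched as a lookup over installed bipolar names (hypothetical helper, not the saklas internals): a positive-pole match keeps the sign, a negative-pole match flips it, and anything other than exactly one match is an error.

```python
def resolve_pole(pole, alpha, installed):
    """installed: list of 'pos.neg' names -> (full_name, signed_alpha)."""
    matches = []
    for name in installed:
        pos, _, neg = name.partition(".")
        if pole == pos:
            matches.append((name, alpha))    # positive pole: sign unchanged
        elif pole == neg:
            matches.append((name, -alpha))   # negative pole: sign flipped
    if len(matches) != 1:
        raise ValueError(f"ambiguous or unknown pole: {pole!r}")
    return matches[0]

installed = ["angry.calm", "formal.casual"]
```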
Probes extract on first run against a new model and cache to ~/.saklas/vectors/default/<concept>/<safe_model_id>.safetensors.
Supported architectures
53 families via model.py:_LAYER_ACCESSORS. Adding a new one = one function entry. See CONTRIBUTING.md.
Llama (1–4), Mistral, Ministral, Mixtral, Gemma (1–4), Phi (1–3), PhiMoE, Qwen (1–3.5), Qwen-MoE variants, Cohere (1–2), DeepSeek (V2–V3), StarCoder2, OLMo (1–3), OLMoE, GLM (3–4), Granite, GraniteMoE, Nemotron, StableLM, GPT-2/Neo/J/BigCode/NeoX/OSS, Bloom, Falcon, Falcon-H1, MPT, DBRX, OPT, RecurrentGemma.
Terminal UI
saklas google/gemma-2-9b-it
saklas mistralai/Mistral-7B-Instruct-v0.3 -q 4bit
saklas meta-llama/Llama-3.1-8B-Instruct -p affect register
Layout
+-------------------------+----------------------------+------------------------+
| VECTORS | | TRAIT MONITOR |
| > angry.calm +0.30 | Chat | Affect |
| formal_cas +0.10 | | angry.calm #### .42 |
| | | happy.sad ##- -.15 |
| CONFIG | | Epistemic |
| temp ####- 0.7 | | honest_dec ### .31 |
| top-p #### 0.9 | | |
| | Type a message... | |
+-------------------------+----------------------------+------------------------+
Three panels: the vector registry on the left (with live alpha knobs), the chat in the center, and the trait monitor on the right (sparklines per probe, sorted by current magnitude or delta). Tab cycles focus; arrow keys navigate and adjust.
TUI flags
| Flag | Description |
|---|---|
| model | HuggingFace ID or local path (optional if supplied by -c) |
| -q, --quantize | 4bit or 8bit (CUDA only) |
| -d, --device | auto (default), cuda, mps, cpu |
| -p, --probes | Categories: all, none, affect, epistemic, alignment, register, social_stance, cultural |
| -c, --config | Load setup YAML (repeatable; later files override earlier) |
| -s, --strict | With -c: fail on missing vectors instead of warning |
System prompt, temperature, top-p, and max tokens are set interactively via slash commands — see below.
Keybindings
| Key | Action |
|---|---|
| Tab / Shift+Tab | Cycle panel focus |
| Left / Right | Adjust alpha |
| Up / Down | Navigate vectors / probes |
| Enter | Toggle vector on/off |
| Backspace / Delete | Remove selected vector or probe |
| Ctrl+T | Toggle thinking mode (for models that support it) |
| Ctrl+A | A/B compare (steered vs unsteered) |
| Ctrl+R | Regenerate last response |
| Ctrl+S | Cycle trait sort mode |
| Ctrl+Y | Toggle per-token probe highlighting (uses current trait selection) |
| [ / ] | Adjust temperature |
| { / } | Adjust top-p |
| Escape | Stop generation |
| Ctrl+Q | Quit |
Chat commands
| Command | Description |
|---|---|
| /steer "concept" [alpha] | Extract and register a steering vector |
| /steer "concept" - "baseline" [alpha] | Contrastive steering against a baseline concept |
| /probe "concept" | Add a monitoring probe |
| /probe "concept" - "baseline" | Contrastive probe |
| /clear | Clear conversation history |
| /rewind | Undo last exchange |
| /sys <prompt> | Set system prompt |
| /temp <value> | Set temperature |
| /top-p <value> | Set top-p |
| /max <value> | Set max tokens per generation |
Commands that touch the model or modify history (/steer, /probe, /clear, /rewind) interrupt any in-progress generation and execute once it stops. Sending a new message mid-generation also interrupts and submits immediately.
Python API
from saklas import SaklasSession, DataSource, ResultCollector
with SaklasSession("google/gemma-3-4b-it", device="auto") as session:
    # Load the bundled angry.calm bipolar pack
    name, profile = session.extract("angry.calm")
    session.steer(name, profile)  # register (no alpha yet)

    # Generate with steering (positive α = angry pole, negative α = calm pole)
    result = session.generate(
        "What makes a good day?",
        alphas={name: 0.2},
    )
    print(result.text)
    print(result.readings)  # probe monitor data

    # A/B comparison — omit alphas to get the unsteered baseline
    baseline = session.generate("What makes a good day?")

    # Alpha sweep across both poles
    collector = ResultCollector()
    for alpha in [-0.2, -0.1, 0, 0.1, 0.2]:
        session.clear_history()
        result = session.generate("Describe a sunset.", alphas={name: alpha})
        collector.add(result, alpha=alpha)
    collector.to_csv("sweep.csv")
Runnable examples in examples/:
- sweep_alpha.py — sweep one vector's alpha and dump probe readings
- ab_compare.py — A/B a prompt with and without steering
Key concepts
Registration is state, alphas are per-call. session.steer("name", profile) stores the vector in the registry. session.generate(input, alphas={"name": 0.5}) applies it for that generation only. No persistent hooks live on the model between calls.
Composition is native. Pass multiple names in alphas={}; co-layer directions sum into a single in-place hook per layer.
Thinking mode is per-call. For models that support it (Qwen 3.5, QwQ, Gemma 4, gpt-oss, etc.), session.generate(input, thinking=True) enables the reasoning trace. Delimiters are detected automatically from the chat template — no hardcoded tokens. result.text contains only the final answer; streaming yields TokenEvent objects with thinking=True for the reasoning trace.
Alphas are backbone-normalized. The same numeric value means the same intensity across architectures. Start at 0.1–0.3 for subtle nudges, 0.4–0.6 for clear shifts, and treat anything past 0.8 as a coherence experiment.
_, ac = session.extract("angry.calm")
_, fc = session.extract("formal.casual")
session.steer("angry.calm", ac)
session.steer("formal.casual", fc)
session.generate("Hello.", alphas={"angry.calm": 0.2, "formal.casual": 0.1}) # both
session.generate("Hello.", alphas={"angry.calm": -0.2}) # steer toward calm
session.generate("Hello.") # neither
SaklasSession reference
session = SaklasSession(
    model_id,              # HuggingFace ID or local path
    device="auto",         # "auto", "cuda", "mps", "cpu"
    quantize=None,         # "4bit", "8bit", or None
    probes=None,           # list of categories, or None for all
    system_prompt=None,
    max_tokens=1024,
)
# Vector extraction — returns (canonical_name, profile). For bipolar
# extraction the canonical name is f"{pos}.{neg}"; each pole is slugged
# (hyphens and whitespace collapsed to underscores) and joined with the
# bipolar separator `.`.
name, profile = session.extract("curiosity") # fresh monopolar (generates pairs)
name, profile = session.extract("angry.calm") # bundled bipolar pack
name, profile = session.extract("happy", baseline="sad") # explicit bipolar → "happy.sad"
name, profile = session.extract([("pos", "neg"), ...]) # raw pairs
name, profile = session.extract(DataSource.csv("pairs.csv"))
session.save_profile(profile, "out.safetensors")
profile = session.load_profile("out.safetensors")
pairs = session.generate_pairs("curiosity") # list[(str, str)]
# Registry
session.steer("name", profile)
session.unsteer("name")
session.vectors # dict of registered profiles
# Generation (blocking)
result = session.generate(
    "prompt",
    alphas={"name": 0.5},
    thinking=False,
    seed=None,
    stop=None,
    logprobs=None,
)
# Streaming
for tok in session.generate_stream("prompt", alphas={"name": 0.5}):
    print(f"[think] {tok.text}" if tok.thinking else tok.text, end="", flush=True)
# Monitor
session.monitor("honest")
session.monitor("custom", custom_profile)
session.unmonitor("honest")
# State
session.config.temperature = 0.8 # also top_p, max_new_tokens, system_prompt
session.history # conversation messages
session.last_result # most recent GenerationResult
session.stop() # interrupt generation
session.rewind() # drop last exchange
session.clear_history()
GenerationResult
result.text # decoded output (response only — thinking is separate)
result.tokens # token IDs
result.token_count
result.tok_per_sec
result.elapsed
result.finish_reason # "stop" | "length" | "stop_sequence"
result.vectors # {"angry.calm": 0.2} — snapshot of alphas used
result.readings # {"probe_name": ProbeReadings} if probes active
result.to_dict() # JSON-serializable
DataSource formats
from saklas import DataSource
DataSource.curated("angry.calm") # bundled
DataSource.json("pairs.json") # saklas schema
DataSource.csv("pairs.csv", positive_col="pos", negative_col="neg")
DataSource.huggingface("user/dataset", split="train[:100]") # needs datasets
DataSource(pairs=[("positive", "negative")])
ResultCollector
collector = ResultCollector()
collector.add(result, concept="angry.calm", alpha=0.2, run_id=1)
collector.to_dicts()
collector.to_jsonl("results.jsonl")
collector.to_csv("results.csv")
collector.to_dataframe() # needs pandas
Probe readings flatten to columns: probe_honest.deceptive_mean, probe_honest.deceptive_std, etc. Vector alphas flatten to vector_angry.calm_alpha.
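The flattening convention above can be sketched as follows (assumed shapes; ProbeReadings simplified here to a mean/std dict, which is a hypothetical simplification of the real object):

```python
def flatten_row(text, readings, alphas):
    """Flatten one result into a flat dict of CSV-ready columns."""
    row = {"text": text}
    for probe, stats in readings.items():
        row[f"probe_{probe}_mean"] = stats["mean"]
        row[f"probe_{probe}_std"] = stats["std"]
    for name, alpha in alphas.items():
        row[f"vector_{name}_alpha"] = alpha
    return row

row = flatten_row(
    "hi",
    {"honest.deceptive": {"mean": 0.31, "std": 0.05}},
    {"angry.calm": 0.2},
)
```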
OpenAI- and Ollama-compatible API server
Serve a steered model as an HTTP endpoint speaking both the OpenAI /v1/* protocol and the Ollama /api/* protocol on the same port. Works with the OpenAI Python/JS SDKs, LangChain, LlamaIndex, curl, Open WebUI, Enchanted, Msty, ollama-python, LangChain's ChatOllama, or anything that speaks either wire format.
pip install saklas[serve]
saklas serve google/gemma-2-9b-it --steer cheerful:0.2 --port 8000
With the OpenAI SDK
from openai import OpenAI
client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")
# Server-default steering (--steer cheerful:0.2)
resp = client.chat.completions.create(
    model="google/gemma-2-9b-it",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)

# Override steering per-request via extra_body
resp = client.chat.completions.create(
    model="google/gemma-2-9b-it",
    messages=[{"role": "user", "content": "Hello!"}],
    extra_body={
        "steer": {
            "alphas": {"cheerful": 0.4},
            "thinking": True,
        }
    },
)

# Streaming
for chunk in client.chat.completions.create(
    model="google/gemma-2-9b-it",
    messages=[{"role": "user", "content": "Tell me a story."}],
    stream=True,
):
    print(chunk.choices[0].delta.content or "", end="", flush=True)
Serve flags
| Flag | Default | Description |
|---|---|---|
| model | required | HuggingFace ID or local path |
| -H, --host | 0.0.0.0 | Bind address |
| -P, --port | 8000 | Bind port |
| -q, --quantize | None | 4bit or 8bit |
| -d, --device | auto | auto, cuda, mps, cpu |
| -p, --probes | all | Probe categories to bootstrap |
| -S, --steer | — | Pre-load a vector, repeatable. name:alpha or name |
| -C, --cors | — | CORS origin, repeatable |
| -k, --api-key | None | Bearer auth token. Falls back to $SAKLAS_API_KEY. Unset = open. |
Endpoints
OpenAI-compatible
- GET /v1/models, GET /v1/models/{id}
- POST /v1/chat/completions (streaming + non-streaming)
- POST /v1/completions (streaming + non-streaming)
Vector management
- GET /v1/saklas/vectors
- POST /v1/saklas/vectors/extract (streams progress via SSE)
- POST /v1/saklas/vectors/load
- DELETE /v1/saklas/vectors/{name}
Probe management
- GET /v1/saklas/probes
- GET /v1/saklas/probes/defaults
- POST /v1/saklas/probes/{name}
- DELETE /v1/saklas/probes/{name}
Session management
- GET /v1/saklas/session
- PATCH /v1/saklas/session — update temperature, top_p, max_tokens, system_prompt
- POST /v1/saklas/session/clear
- POST /v1/saklas/session/rewind
Full interactive docs at http://localhost:8000/docs while the server is running.
OpenAI parity and limits
Chat/completions accept the full standard parameter surface: stop (string or list), seed, logit_bias, presence_penalty, frequency_penalty, logprobs + top_logprobs, stream_options.include_usage, max_completion_tokens, plus accept-and-ignore for user, n, response_format, and messages[].name. Responses include real usage counts, accurate finish_reason, and the first streaming chunk emits {role: "assistant"} per OpenAI convention. Error responses follow the OpenAI shape with type/param/code fields.
Probe readings piggyback as an extra probe_readings field in generation responses — standard clients ignore it, aware clients get inline monitoring data.
The server is stateless by default — each request carries its full message list, and neither conversation history nor probe accumulators persist across requests. The /v1/saklas/session/* routes are stateful by design for single-user workflows. Concurrent requests queue FIFO against a single generation lock.
Not supported: tool calling, strict JSON/json_schema mode, /v1/embeddings.
Ollama protocol (/api/*)
Point any Ollama client at http://localhost:8000 and it just works — no config shim, no proxy. saklas serve mounts the full Ollama route surface alongside the OpenAI routes on the same port, sharing one generation lock, one bearer-auth dependency, and one underlying session.
saklas serve google/gemma-2-9b-it --steer cheerful:0.2 --port 8000
# Open WebUI / Enchanted / any Ollama client: point at http://localhost:8000
# The loaded model appears under both its HF id (google/gemma-2-9b-it) and
# its Ollama alias (gemma2, gemma2:latest, gemma2:9b) in /api/tags.
# Raw curl — NDJSON streaming, matches Ollama wire format exactly
curl -N http://localhost:8000/api/chat -d '{
  "model": "gemma2",
  "messages": [{"role": "user", "content": "Write me a haiku."}],
  "options": {
    "temperature": 0.8,
    "top_k": 50,
    "repeat_penalty": 1.1,
    "steer": {"cheerful": 0.3, "formal.casual": -0.2}
  }
}'
Steering through Ollama clients. The non-standard steer field inside options carries saklas alphas — clients that don't know about it leave it alone, clients that want it get per-request control. Both flat ({"steer": {"name": alpha}}) and nested ({"steer": {"alphas": {...}, "thinking": true}}) forms are accepted. Merged over any server-side --steer defaults; zero-alphas are stripped.
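The normalization described above can be sketched as a small helper (hypothetical, not the server's actual code): detect nested vs flat form, merge over the server defaults with the request winning, and strip zero alphas.

```python
def normalize_steer(options, defaults):
    """options: the Ollama-style 'options' dict; defaults: server --steer
    alphas. Returns (merged_alphas, thinking)."""
    raw = options.get("steer") or {}
    if "alphas" in raw:  # nested form: {"alphas": {...}, "thinking": true}
        alphas = dict(raw["alphas"])
        thinking = bool(raw.get("thinking", False))
    else:                # flat form: {"name": alpha}
        alphas = dict(raw)
        thinking = False
    merged = {**defaults, **alphas}                        # request wins
    merged = {k: v for k, v in merged.items() if v != 0}   # strip zeros
    return merged, thinking

merged, thinking = normalize_steer(
    {"steer": {"cheerful": 0.3, "formal.casual": 0}},
    defaults={"cheerful": 0.2, "calm": 0.1},
)
```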
Advertised endpoints: /api/version, /api/tags, /api/ps, /api/show, /api/chat, /api/generate, /api/pull (no-op success for the loaded model, 404 otherwise), HEAD / for liveness.
Option translation. temperature, top_p, top_k, seed, num_predict, stop, presence_penalty, frequency_penalty, repeat_penalty, and think all pipe through to the underlying session. repeat_penalty maps to saklas's presence_penalty via ln(repeat_penalty) — exact for positive logits, matching Ollama's "divide by penalty" semantics without the unbounded count weighting that plain frequency_penalty would introduce. Unrecognized options (min_p, mirostat*, num_ctx, typical_p, etc.) are logged at debug level and silently dropped.
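The ln mapping can be sanity-checked numerically: subtracting ln(p) from a token's logit divides that token's unnormalized softmax weight by exactly p, which is the "divide by penalty" effect expressed in probability space. A minimal check (hypothetical helper name):

```python
import math

def penalized_logit(logit, repeat_penalty):
    # repeat_penalty p becomes a flat presence-penalty of ln(p)
    return logit - math.log(repeat_penalty)

z, p = 2.0, 1.1
# exp(z - ln p) == exp(z) / p: the token's softmax weight shrinks by p
before = math.exp(z)
after = math.exp(penalized_logit(z, p))
```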
Model aliasing. A saklas server hosts exactly one model. /api/tags advertises it under its HF id plus a hybrid alias set: an authoritative override table for popular families (where Ollama's catalogue rounds sizes differently — Gemma-2-2b is 2.6B params but Ollama calls it gemma2:2b), with <family>:<size> inference from model_info as a fallback for new architectures. By default the model field on incoming requests is accepted regardless of match, so clients with stale dropdowns don't 404 — set SAKLAS_OLLAMA_STRICT=1 to reject mismatches with a 404 instead.
Thinking. Streams as message.thinking on /api/chat and top-level thinking on /api/generate, matching Ollama's current schema. Open WebUI renders it as a collapsible reasoning panel automatically.
Not supported (Ollama protocol): /api/push, /api/create, /api/copy, /api/delete, /api/embeddings, /api/embed (all return 501). Saklas doesn't manage models the Ollama way — it loads one HF model at startup and serves it. The context field on /api/generate responses is omitted (not an empty list) because saklas can't round-trip Ollama's tokenized continuation state honestly.
The server is designed for trusted networks — see SECURITY.md for the threat model before exposing it beyond your local machine.
Managing concept packs
Saklas stores all state under ~/.saklas/ (override via SAKLAS_HOME):
~/.saklas/
  neutral_statements.json                        # user-editable (copy-on-miss from package)
  vectors/
    default/<concept>/                           # bundled probes
    local/<concept>/                             # user-authored + merged
    <hf_owner>/<concept>/                        # HF-pulled
  models/<safe_model_id>/layer_means.safetensors
Each concept is a folder with pack.json (metadata + file hashes), statements.json (the contrastive pairs), and zero or more <safe_model_id>.safetensors tensor files (one per model the concept has been extracted against). Tensors are extracted lazily — a pack without tensors is fine; it'll extract on first use.
Packs are distributed as HuggingFace model repos (not datasets — safetensors is model-hub-native, and base_model frontmatter gives reverse-link discoverability from the base model's hub page). Pin any install to a git tag, branch, or commit SHA with @revision; pinned installs are preserved on refresh — pinning means pinning.
Commands
saklas install <target> [-s] [-a NS/NAME] [-f] # from HF coord (ns/name[@rev]) or folder path
saklas refresh <selector> [-m MODEL] # re-pull from source
saklas refresh neutrals # reserved: rewrite neutral_statements.json
saklas clear <selector> [-m MODEL] [-y] # delete per-model tensors, keep statements
saklas uninstall <selector> [-y] # fully remove concept folder
saklas list [selector] [-i] [-j] [-v] # includes HF hub by default
saklas merge <name> <components> [-m] [-f] [-s] # merge: saklas merge bard default/angry.calm:0.3,user/arch:0.4
Selectors (shared grammar): <name>, <ns>/<name>, tag:<tag>, namespace:<ns>, default, all. Bare names resolve across namespaces and error on ambiguity.
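The grammar is small enough to sketch as a parser (hypothetical illustration, not saklas.cli_selectors):

```python
def parse_selector(s):
    """Parse one selector string into a tagged tuple."""
    if s in ("default", "all"):
        return ("keyword", s)
    if s.startswith("tag:"):
        return ("tag", s[len("tag:"):])
    if s.startswith("namespace:"):
        return ("namespace", s[len("namespace:"):])
    if "/" in s:
        ns, name = s.split("/", 1)
        return ("qualified", ns, name)
    # Bare names resolve across namespaces; ambiguity is the caller's error
    return ("bare", s)
```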
install -s / --statements-only keeps only statements.json and drops any tensors that arrived with the pack. The concept folder stays a legitimate standalone pack — tensors re-extract on first use against whatever model you load. Useful when you want the pairs but prefer to extract locally.
refresh neutrals is a reserved form that overwrites ~/.saklas/neutral_statements.json with the bundled package copy. Run this after upgrading across a release that changes the bundled neutrals — materialize_bundled is copy-on-miss so existing users keep their old file by default. Layer means auto-recompute on next session init via the hash check.
clear vs uninstall: clear deletes tensors but keeps statements.json and pack.json (so the concept remains selectable and will re-extract on demand). uninstall removes the whole folder. Uninstalling a bundled concept is allowed — it respawns on the next session init via materialize_bundled. Broad selectors (all, namespace:) require -y on both commands.
list queries the HF hub by default and merges results with local installs. Pass -i for installed-only, -j for JSON output, -v to include descriptions inline.
Python library
All of the above is also available programmatically:
from saklas import cache_ops
from saklas.cli_selectors import parse as sel_parse
cache_ops.install("a9lim/angry.calm@v1.2", as_=None, force=False, statements_only=False)
cache_ops.refresh(sel_parse("tag:affect"), model_scope="google/gemma-2-9b-it")
cache_ops.delete_tensors(sel_parse("angry.calm"), model_scope=None)
cache_ops.uninstall(sel_parse("angry.calm"), yes=False)
cache_ops.list_concepts(sel_parse("tag:affect"), hf=True, installed_only=False)
Tests
pytest tests/ # everything
pytest tests/test_server.py tests/test_results.py tests/test_datasource.py # CPU-only
pytest tests/test_smoke.py # GPU required
GPU tests (test_smoke.py, test_session.py) download google/gemma-3-4b-it (~8 GB) on first run and accept either CUDA or Apple Silicon MPS. Everything else runs anywhere.
Contributing
See CONTRIBUTING.md for dev setup, test layout, and the walkthrough for adding a new architecture. Security issues: SECURITY.md.