Conversational AI agent that drives scikit-rec via tool use

scikit-rec-agent

Conversational AI agent that uses scikit-rec as its tool belt. The agent reasons about the user's data and goals, then calls scikit-rec APIs via structured tool use to build, evaluate, and compare recommendation systems.

Install

# Quote the extras so shells such as zsh do not treat the brackets as a glob.
pip install "scikit-rec-agent[anthropic]"        # with Claude
pip install "scikit-rec-agent[openai]"           # with GPT-4
pip install scikit-rec-agent                     # bring your own LLM
pip install "scikit-rec-agent[anthropic,torch]"  # adds deep-learning models

CLI

export ANTHROPIC_API_KEY=...
scikit-rec-agent chat

The CLI auto-detects the provider from environment variables. If credentials for both providers are set, pass --provider {anthropic,openai} to choose one.
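The detection step can be pictured with a small sketch. This is not the package's implementation: the function name detect_provider is hypothetical, OPENAI_API_KEY is an assumed variable name (only ANTHROPIC_API_KEY appears above), and raising an error when both keys are present is an assumption about how ambiguity might be handled.

```python
from typing import Mapping, Optional

# ANTHROPIC_API_KEY comes from the CLI example; OPENAI_API_KEY is an assumption.
PROVIDER_ENV_VARS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
}


def detect_provider(env: Mapping[str, str], explicit: Optional[str] = None) -> str:
    """Return the explicit provider if given, else infer it from set API keys."""
    if explicit is not None:
        if explicit not in PROVIDER_ENV_VARS:
            raise ValueError(f"unknown provider: {explicit!r}")
        return explicit

    # A provider counts as available when its key is present and non-empty.
    available = [name for name, var in PROVIDER_ENV_VARS.items() if env.get(var)]
    if len(available) == 1:
        return available[0]
    if not available:
        raise RuntimeError("no provider API key found in the environment")
    # Assumed behavior: refuse to guess when more than one key is set.
    raise RuntimeError("multiple providers configured; pass --provider to choose one")
```

Calling detect_provider(os.environ) would mirror the no-flag case, and passing explicit="openai" mirrors --provider openai.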

Library

import anthropic
from scikit_rec_agent import Agent
from scikit_rec_agent.llm.anthropic import AnthropicAdapter

# anthropic.Anthropic() reads ANTHROPIC_API_KEY from the environment by default.
agent = Agent(llm=AnthropicAdapter(anthropic.Anthropic()))
for event in agent.chat_turn("I have click data at /data/interactions.csv — help me build a ranker"):
    ...

See examples/ for:

  • custom_tool.py — register a user-defined tool
  • custom_prompt.py — extend or replace the system prompt
  • custom_llm.py — plug in your company's internal LLM via the BaseLLM protocol
  • custom_frontend.py — drive the agent from Jupyter / Slack / web
  • movielens_session.md — annotated end-to-end transcript

What it does

Fourteen tools cover the full scikit-rec workflow — from raw data to a saved, tuned model:

| Tool | What it does |
| --- | --- |
| profile_data | Loads a CSV/parquet and reports shape, dtypes, sparsity, target type, and temporal range. Heuristic role detection for USER_ID / ITEM_ID / OUTCOME / TIMESTAMP. |
| validate_data | Checks a file against scikit-rec's required schema. Suggests column-rename mappings when names are close. |
| transform_data | Reshapes a raw file into one of nine scikit-rec contracts (long, long-with-timestamp, long-multi-reward, wide multi-output, multiclass, prebuilt sequences, sessions, users features, items features). Auto-detects source shape; applies pivot, melt, aggregate, dedupe, and cast as needed. |
| create_datasets | Builds scikit-rec Dataset handles from file paths. Auto-generates schemas from dtypes; auto-dispatches to InteractionsDataset / InteractionMultiOutputDataset / InteractionMultiClassDataset. |
| split_data | Splits a bundle into train/valid/test using temporal, leave-last-n-per-user, random-split-per-user, leave-n-users-out, or random-split. Errors loudly on degenerate splits (e.g. per-user split on one-row-per-user data). |
| train_model | Trains a recommender from a RecommenderConfig dict via scikit-rec's factory. Failure envelopes carry a category from the diagnose registry plus a one-line hint. |
| sweep_methods | Trains and evaluates multiple methods on the same bundle and returns a ranked leaderboard. Modes: list (menu only), auto (data-aware filter + hyperparameter resize), all (every entry), broad (every capability-compatible triple), or explicit method dicts / short_names. Idempotent across re-runs. |
| diagnose_training_failure | Pattern-matches a failed train_model envelope against a 14-pattern registry and returns ranked candidate fixes with structured actions. Auto-retries the top safe fix; bounded by max_retries to prevent loops. |
| evaluate_model | Runs offline evaluation on a trained model with any of 7 evaluator types × 9 metrics at multiple k values. Auto-builds eval_kwargs from the bundle's validation interactions for the simple evaluator. |
| compare_models | Renders a markdown leaderboard across all (or a chosen subset of) trained models in the session, sorted by a primary metric. |
| run_hpo | Optuna-driven hyperparameter search over a user-specified search_space. Persists the best config and writes the tuned model into the session. |
| save_model | Persists a trained model to the local file-based registry with optional tags. |
| list_models | Lists saved models in the registry with their metadata and tags. |
| load_model | Restores a saved model into the current session for further use. |
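To make the split_data row concrete, here is a minimal, dependency-free sketch of a leave-last-n-per-user split that fails loudly on one-row-per-user data. It illustrates the idea only and is not the tool's code: the function name, the list-of-dicts row format, and the error message are assumptions, while the USER_ID and TIMESTAMP column names follow the roles listed for profile_data.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Dict[str, object]


def leave_last_n_per_user(rows: List[Row], n: int = 1) -> Tuple[List[Row], List[Row]]:
    """Hold out each user's last n interactions (by TIMESTAMP) as the test set."""
    if n < 1:
        raise ValueError("n must be at least 1")

    # Group interactions by user.
    by_user: Dict[object, List[Row]] = defaultdict(list)
    for row in rows:
        by_user[row["USER_ID"]].append(row)

    # Degenerate case: no user has more than n rows, so nothing could be held
    # out without emptying that user's training history.
    if all(len(history) <= n for history in by_user.values()):
        raise ValueError(
            f"degenerate split: no user has more than {n} interaction(s); "
            "a per-user split needs multiple rows per user"
        )

    train: List[Row] = []
    test: List[Row] = []
    for history in by_user.values():
        history.sort(key=lambda r: r["TIMESTAMP"])
        if len(history) <= n:
            train.extend(history)  # too short to split; keep everything in train
        else:
            train.extend(history[:-n])
            test.extend(history[-n:])
    return train, test
```

Raising instead of returning an empty test set is the design point: a silent empty split would only surface later as meaningless evaluation metrics.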

The system prompt is built at import time from scikit-rec's live enum maps, so new recommender / scorer / estimator types get picked up automatically.
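The pattern of deriving prompt text from an enum at import time can be sketched as follows. The enum and its members are hypothetical stand-ins, not scikit-rec's real types, and the prompt wording is invented for illustration.

```python
from enum import Enum
from typing import Type


class ExampleRecommenderType(Enum):
    # Hypothetical members; the real values come from scikit-rec's enum maps.
    EXAMPLE_RANKER = "example_ranker"
    EXAMPLE_BANDIT = "example_bandit"


def describe_enum(title: str, enum_cls: Type[Enum]) -> str:
    """Render one prompt line listing every member value of an enum."""
    values = ", ".join(str(member.value) for member in enum_cls)
    return f"{title}: {values}"


def build_system_prompt() -> str:
    """Assemble the prompt from whatever members the enum has right now."""
    header = "You can build the following recommender types."
    return "\n".join([header, describe_enum("recommender types", ExampleRecommenderType)])


# Evaluated once, when the module is imported.
SYSTEM_PROMPT = build_system_prompt()
```

Because the prompt is generated from the enum rather than written by hand, adding a member to the enum changes the prompt on the next import with no edit to the prompt code.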

Hallucination safeguards

The agent runs two deterministic detectors on every turn's output:

  • URL echo check: flags https://... links that the model introduces and that the user did not supply during the current session. The shipped adapters have no web retrieval, so a URL the model introduces on its own is likely fabricated.
  • Foreign-reference check: scans fenced Python blocks for imports and bare-alias usage outside {skrec, scikit_rec, scikit_rec_agent, stdlib}. Library APIs we own have a runtime backstop via the scikit-rec factory; external libraries have none.
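As a rough illustration of the URL echo check, the sketch below compares URLs in a model reply against URLs the user has supplied. It is an assumption-laden sketch rather than the code in safeguards.py: the regular expression, the trailing-punctuation trimming, and the function names are all hypothetical.

```python
import re
from typing import Iterable, List

# Simplified URL matcher for illustration; real-world URL parsing is messier.
URL_PATTERN = re.compile(r"https?://[^\s<>()\[\]'\"]+")


def extract_urls(text: str) -> List[str]:
    """Find URLs and trim sentence punctuation stuck to the end."""
    return [match.rstrip(".,;:!?") for match in URL_PATTERN.findall(text)]


def find_unsupplied_urls(model_output: str, user_messages: Iterable[str]) -> List[str]:
    """Return URLs in the model output that no user message contained."""
    supplied = set()
    for message in user_messages:
        supplied.update(extract_urls(message))

    flagged: List[str] = []
    for url in extract_urls(model_output):
        if url not in supplied and url not in flagged:
            flagged.append(url)
    return flagged
```

The check is deterministic: the same reply and session history always yield the same flagged list, which keeps the warning behavior predictable.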

Warnings are emitted as AgentEvent(type="warning") and never enter conversation history. Opt out with Agent(..., enable_safeguards=False).
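One way a frontend might keep warnings out of the stored conversation is sketched below. The AgentEvent dataclass here is a hypothetical stand-in: only the type field and the "warning" value come from the text above, while the text field, the "message" type, and the partition_events helper are assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class AgentEvent:
    # Stand-in for the library's event class; `text` is an assumed field.
    type: str
    text: str


def partition_events(events: Iterable[AgentEvent]) -> Tuple[List[AgentEvent], List[AgentEvent]]:
    """Separate warning events from events that belong in conversation history."""
    history: List[AgentEvent] = []
    warnings: List[AgentEvent] = []
    for event in events:
        if event.type == "warning":
            warnings.append(event)  # shown to the user, never stored
        else:
            history.append(event)
    return history, warnings
```

Keeping warnings out of history means a flagged URL or import is never fed back to the model as if it were part of the conversation.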

Scope and limitations

The detectors are deliberately narrow. They target the common case of a confident, plausible-looking fabrication and aim for near-zero false positives; they do not try to catch every possible hallucination. What they do not catch:

  • Semantic errors inside trusted APIs (wrong RecommenderConfig shape, poor metric choice). The scikit-rec factory catches bad configs at train_model; the rest is on the user.
  • Invented keyword arguments for external libraries. The check flags pandas as unverified as a whole; it does not single out a specific invented argument such as make_up_kwarg=True.
  • Fabricated dataset names, paper citations, or prose claims. We only inspect URLs and Python code blocks.
  • Adversarial evasion (aliased importlib, f-string import args, triple-backticks inside docstrings, ast.parse-rejecting blocks).

See scikit_rec_agent/safeguards.py for the full contract.

Architecture

See agentic_design.md for the authoritative spec.

Contributing

Contributions welcome — see CONTRIBUTING.md for dev setup, test commands, and where new work fits best.

License

Apache-2.0

Download files

Source Distribution

scikit_rec_agent-0.2.0.tar.gz (140.9 kB)

Uploaded Source

Built Distribution

scikit_rec_agent-0.2.0-py3-none-any.whl (83.6 kB)

Uploaded Python 3

File details

Details for the file scikit_rec_agent-0.2.0.tar.gz.

File metadata

  • Download URL: scikit_rec_agent-0.2.0.tar.gz
  • Upload date:
  • Size: 140.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for scikit_rec_agent-0.2.0.tar.gz
Algorithm Hash digest
SHA256 558c0f84b9fb9ed290acb227d36b99abfbeccf524bec4f2977c385c8b4ae9b9f
MD5 330ac7350d4f18bea24395c9d73a2c3a
BLAKE2b-256 0d197ed922d82e1a841ee1f44129485ad6ac8c68a896edbec271b861c2235e70

Provenance

The following attestation bundles were made for scikit_rec_agent-0.2.0.tar.gz:

Publisher: publish.yml on intuit/scikit-rec-agent

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file scikit_rec_agent-0.2.0-py3-none-any.whl.

File hashes

Hashes for scikit_rec_agent-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 e405b8a5fbfba50946da68efbe3e8ffc8d1ff63fd35e85b7353a1099cb3dd766
MD5 8420b6a047eea48f789fc05b60a619d3
BLAKE2b-256 1c17ff6e720d95795062a25eb09407ee78ec2d167dc52b6256c9436346e48a70

Provenance

The following attestation bundles were made for scikit_rec_agent-0.2.0-py3-none-any.whl:

Publisher: publish.yml on intuit/scikit-rec-agent
