Infer a full skill profile from a few observations. Few-shot capability estimation for AI agents and humans.

skillinfer

Observe a few skills, predict the rest — with calibrated uncertainty. skillinfer learns how capabilities co-vary across a population and uses that structure to infer a full profile from partial observations.

The update is closed-form Bayesian: no training loop and no GPU. A single matrix operation yields the exact Gaussian posterior, each update takes under 1 ms, and the method scales to 1000+ skills.

Install

pip install skillinfer

Quick start

import skillinfer

pop = skillinfer.datasets.onet()          # 894 occupations x 120 skills
profile = pop.profile()                   # new entity, unknown
profile.observe("Skill:Programming", 0.92)
print(profile.predict())                  # predict all 120 skills
                           feature   mean    std  ci_lower  ci_upper
    Skill:Complex Problem Solving   0.81   0.17      0.47      1.00
          Skill:Critical Thinking   0.73   0.15      0.43      1.00
                Skill:Programming   0.92   0.01      0.90      0.93  ← observed
                Skill:Mathematics   0.67   0.12      0.43      0.91
         Ability:Static Strength   0.10   0.23      0.00      0.55  ← anti-correlated
...
[120 rows x 5 columns]

How it works

When you observe one skill, the Kalman update propagates that information to every other skill through the learned covariance:

  • Skills with positive covariance move in the same direction (observe high Programming → predict high Analytical Reasoning)
  • Skills with negative covariance move in the opposite direction (observe high Programming → predict low Static Strength)
  • Independent skills are unaffected

The update is the standard closed-form Gaussian conditioning rule, and reported predictions are clipped to [0, 1] to match the population's natural scale.

Each observe() call is O(K²) in the number of skills K: one matrix-vector product, with no iteration and no convergence check.
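The conditioning rule described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not skillinfer's actual implementation: the function names (estimate_prior, observe, clip01), the noise parameter, and the toy population are all hypothetical, and the prior here is a plain sample mean and covariance, which may differ from how the library estimates it.

```python
def estimate_prior(rows):
    """Sample mean and covariance of an entity-by-skill matrix (list of rows)."""
    n, k = len(rows), len(rows[0])
    mean = [sum(row[j] for row in rows) / n for j in range(k)]
    cov = [
        [sum((row[a] - mean[a]) * (row[b] - mean[b]) for row in rows) / (n - 1)
         for b in range(k)]
        for a in range(k)
    ]
    return mean, cov


def observe(mean, cov, i, value, noise=1e-4):
    """Condition N(mean, cov) on a noisy reading `value` of skill i.

    `noise` is an assumed observation variance, not a documented default.
    """
    k = len(mean)
    s = cov[i][i] + noise                     # variance of the predicted reading
    gain = [cov[j][i] / s for j in range(k)]  # Kalman gain for a scalar observation
    resid = value - mean[i]                   # how far the reading is from the prior
    new_mean = [mean[j] + gain[j] * resid for j in range(k)]
    new_cov = [[cov[a][b] - gain[a] * cov[i][b] for b in range(k)]
               for a in range(k)]
    return new_mean, new_cov


def clip01(x):
    """Clip a reported value to the [0, 1] scale."""
    return min(1.0, max(0.0, x))


# Hypothetical toy population: one row per entity, one column per skill.
skills = ["programming", "reasoning", "strength", "typing"]
population = [
    [0.9, 0.8, 0.2, 0.5],
    [0.7, 0.7, 0.4, 0.3],
    [0.3, 0.4, 0.8, 0.3],
    [0.1, 0.3, 0.9, 0.5],
]

prior_mean, prior_cov = estimate_prior(population)
post_mean, post_cov = observe(prior_mean, prior_cov,
                              skills.index("programming"), 0.92)

for j, name in enumerate(skills):
    std = max(post_cov[j][j], 0.0) ** 0.5
    print(f"{name:12s} prior={prior_mean[j]:.2f} "
          f"posterior={clip01(post_mean[j]):.2f} std={std:.2f}")
```

In this toy population, reasoning co-varies positively with programming, strength co-varies negatively, and typing has zero sample covariance with it, so the three cases in the list above each appear once. With NumPy the same update is a column slice, a scalar division, and an outer product.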

Core API

import skillinfer
from skillinfer import Skill, Task

# Build a population from any entity-feature matrix
pop = skillinfer.Population.from_dataframe(df)

# Create a profile and observe
profile = pop.profile()
profile.observe("math", 0.95)
profile.observe_many({"code": 0.89, "writing": 0.70})

# Predict with uncertainty
profile.predict()                          # all skills, with CIs
profile.predict("reasoning")               # single skill
profile.most_uncertain(k=3)                # what to assess next

# Match agents to tasks
task = Task({"math": 1.0, "reasoning": 0.5})
result = profile.match_score(task, threshold=0.8)

# Rank a pool of agents
ranking = skillinfer.rank_agents(task, profiles, threshold=0.8)

# Summary statistics
profile.summary(true_vector=ground_truth)  # MAE, RMSE, coverage, etc.
pop.summary()                              # condition number, sparsity, etc.
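To make the threshold and most_uncertain ideas concrete, here is one plausible way to turn per-skill means and standard deviations into a task score and an assessment order. It is a sketch only: the README does not define how match_score or rank_agents compute their results, so the scoring rule below (a weighted average of Gaussian exceedance probabilities) and the names prob_above, threshold_match, and pick_most_uncertain are assumptions made for illustration.

```python
import math


def prob_above(mean, std, threshold):
    """P(skill >= threshold) if the skill is modelled as N(mean, std**2)."""
    if std <= 0:
        return 1.0 if mean >= threshold else 0.0
    z = (threshold - mean) / std
    return 0.5 * math.erfc(z / math.sqrt(2))


def threshold_match(prediction, task_weights, threshold=0.8):
    """Weighted average of per-skill exceedance probabilities (assumed rule)."""
    total = sum(task_weights.values())
    score = 0.0
    for skill, weight in task_weights.items():
        mean, std = prediction[skill]
        score += weight * prob_above(mean, std, threshold)
    return score / total


def pick_most_uncertain(prediction, k=3):
    """Skills with the largest posterior std, i.e. candidates to assess next."""
    return sorted(prediction, key=lambda s: prediction[s][1], reverse=True)[:k]


# Hypothetical posterior summaries: skill -> (mean, std)
agent_a = {"math": (0.95, 0.01), "reasoning": (0.85, 0.10), "writing": (0.60, 0.20)}
agent_b = {"math": (0.82, 0.15), "reasoning": (0.70, 0.12), "writing": (0.90, 0.05)}
task = {"math": 1.0, "reasoning": 0.5}

scores = {"a": threshold_match(agent_a, task), "b": threshold_match(agent_b, task)}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking, pick_most_uncertain(agent_a, k=1))
```

Scoring on exceedance probability rather than the mean alone penalises agents whose required skills are only inferred with wide uncertainty, which is the main reason to carry the standard deviation through to the matching step.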

Built-in datasets

Two preprocessed datasets ship with the package (~440 KB total):

# O*NET 30.2 — U.S. Department of Labor
# 894 occupations x 120 features (skills, knowledge, abilities)
pop = skillinfer.datasets.onet()

# ESCO v1.2.1 — European Commission
# 2,999 occupations x 134 skill groups (binary)
pop = skillinfer.datasets.esco()

Dataset  Entities           Features                                     Scale              Source
O*NET    894 occupations    120 (35 skills, 33 knowledge, 52 abilities)  Continuous [0, 1]  O*NET 30.2, CC BY 4.0
ESCO     2,999 occupations  134 Level-2 skill groups                     Binary {0, 1}      ESCO v1.2.1

Use cases

Domain                  Observe                  Predict
AI model selection      1-2 benchmark scores     All benchmarks + best model for a task
Human skill profiling   A few task observations  Full occupational profile (120 skills)
Human-AI orchestration  Partial evals for both   Who handles which subtask
Worker-task matching    Known competencies       Fit for new roles and tasks

LLM orchestration

skillinfer profiles are structured context that you feed to an LLM orchestrator alongside cost, latency, and business constraints. The LLM can then reason about which skills were observed and which were inferred, and apply natural-language constraints that are difficult to encode in a scoring function:

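from openai import OpenAI

# Assumes `pop` is a Population whose features include "math", "reasoning",
# "code", and "writing" (for example, one built with Population.from_dataframe).

# Build profiles from partial evaluations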
agents = {
    "gpt-4o":     {"reasoning": 0.92, "code": 0.89},
    "claude-3.5": {"reasoning": 0.90, "writing": 0.95},
    "gemini-pro": {"math": 0.88, "code": 0.82},
}
profiles = {
    name: pop.profile().observe_many(obs)
    for name, obs in agents.items()
}

# Format as context for the orchestrator
agent_context = ""
for name, profile in profiles.items():
    agent_context += f"\n{name}:\n"
    for skill in ["math", "reasoning", "code"]:
        pred = profile.predict(skill)
        source = "observed" if pred["std"] < 0.01 else "inferred"
        agent_context += f"  {skill}: {pred['mean']:.2f} ± {pred['std']:.2f} ({source})\n"

# The LLM decides — not a scoring function
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": f"Pick an agent for this math task.\n{agent_context}"}],
)
print(response.choices[0].message.content)   # the orchestrator's choice

Export / import

# Population
pop.to_csv("population.csv")
pop.to_parquet("population.parquet")
pop = skillinfer.Population.from_csv("population.csv")
pop = skillinfer.Population.from_parquet("population.parquet")

# Profile
profile.to_json("profile.json")
restored = skillinfer.Profile.from_json("profile.json")

d = profile.to_dict()   # plain dict, JSON-serialisable
restored = skillinfer.Profile.from_dict(d)

Visualization

Requires pip install skillinfer[viz].

import skillinfer

pop = skillinfer.datasets.onet()
profile = pop.profile()
profile.observe("Skill:Programming", 0.92)

# Population charts
skillinfer.visualization.correlation_heatmap(pop)     # clustered correlation matrix
skillinfer.visualization.scree_plot(pop)               # PCA variance explained
skillinfer.visualization.feature_distributions(pop)    # box plots by variance
skillinfer.visualization.skill_embedding(pop)          # 2D PCA feature map
skillinfer.visualization.convergence_curve(pop)        # MAE vs. observations

# Profile charts
skillinfer.visualization.posterior_profile(profile)    # predicted skills + uncertainty
skillinfer.visualization.prediction_scatter(profile, true_vec)  # predicted vs. true
skillinfer.visualization.uncertainty_waterfall(pop, observations)  # uncertainty per observation
skillinfer.visualization.compare_profiles({"dev": dev, "nurse": nurse})  # side-by-side
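The accuracy-oriented charts (prediction_scatter, convergence_curve) and profile.summary(true_vector=...) revolve around a few standard quantities: MAE, RMSE, and interval coverage. A minimal sketch of how they can be computed by hand is shown below. The evaluate function, its input layout, and the numbers are hypothetical and are not taken from the library or from any real profile.

```python
import math


def evaluate(pred, truth):
    """Compare predictions with ground truth.

    pred:  skill -> (mean, ci_lower, ci_upper)
    truth: skill -> true value
    """
    errors = [pred[s][0] - truth[s] for s in truth]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    # Coverage: share of true values that fall inside their interval.
    covered = sum(pred[s][1] <= truth[s] <= pred[s][2] for s in truth)
    return {"mae": mae, "rmse": rmse, "coverage": covered / len(errors)}


# Made-up example values, for illustration only.
pred = {"math": (0.80, 0.60, 1.00), "writing": (0.50, 0.40, 0.60)}
truth = {"math": 0.70, "writing": 0.70}

metrics = evaluate(pred, truth)
print({name: round(value, 3) for name, value in metrics.items()})
```

Coverage is the check on calibration: for well-calibrated 95% intervals it should sit near 0.95 when averaged over many skills and entities.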

Documentation

Full documentation is at kostadindev.github.io/skillinfer.

License

MIT
