Infer a full skill profile from a few observations. Few-shot capability estimation for AI agents and humans.

skillinfer

Observe a few skills, predict the rest — with calibrated uncertainty. skillinfer learns how capabilities co-vary across a population and uses that structure to infer a full profile from partial observations.

A closed-form Bayesian update: no training loop, no GPU. One matrix operation gives you the exact posterior under the Gaussian model. Each update takes under 1 ms and scales to 1000+ skills.

Install

pip install skillinfer

Quick start

import skillinfer

pop = skillinfer.datasets.onet()          # 894 occupations x 120 features
profile = pop.profile()                   # new entity, nothing observed yet
profile.observe("Skill:Programming", 0.92)
print(profile.predict())                  # predict all 120 features
                          feature   mean    std  ci_lower  ci_upper
    Skill:Complex Problem Solving   0.81   0.17      0.47      1.00
          Skill:Critical Thinking   0.73   0.15      0.43      1.00
                Skill:Programming   0.92   0.01      0.90      0.93  ← observed
                Skill:Mathematics   0.67   0.12      0.43      0.91
          Ability:Static Strength   0.10   0.23      0.00      0.55  ← anti-correlated
...
[120 rows x 5 columns]

How it works

When you observe one skill, the Kalman update propagates to every other skill via the learned covariance:

  • Skills with positive covariance move in the same direction (observe high Programming → predict high Analytical Reasoning)
  • Skills with negative covariance move opposite (observe high Programming → predict low Static Strength)
  • Independent skills are unaffected
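
The covariance these bullets rely on has to come from the population itself. As a minimal sketch, assuming the prior is simply the sample mean and sample covariance of an entity-by-feature table (the package may well use a regularised estimator instead), it could be computed as follows. The function name `population_prior` and the toy rows are hypothetical and not part of skillinfer.

```python
def population_prior(rows):
    """Return (mean vector, sample covariance matrix) for an entity-by-feature table."""
    n, k = len(rows), len(rows[0])
    mean = [sum(row[j] for row in rows) / n for j in range(k)]
    cov = [
        [
            sum((row[a] - mean[a]) * (row[b] - mean[b]) for row in rows) / (n - 1)
            for b in range(k)
        ]
        for a in range(k)
    ]
    return mean, cov


# Hypothetical toy population; columns: programming, mathematics, static strength
rows = [
    [0.9, 0.8, 0.1],
    [0.7, 0.6, 0.3],
    [0.2, 0.3, 0.8],
    [0.4, 0.5, 0.6],
]
prior_mean, prior_cov = population_prior(rows)

# Programming and mathematics co-vary positively in this toy data,
# while programming and static strength co-vary negatively.
print(prior_cov[0][1] > 0, prior_cov[0][2] < 0)
```

In this toy table the first two columns rise and fall together and the third moves against them, which is the kind of structure the update exploits.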

The update is the standard closed-form Gaussian conditioning rule, and reported predictions are clipped to [0, 1] to match the population's natural scale.

Each observe() call is O(K²) — one matrix-vector product. No iteration, no convergence.
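
To make the update concrete, here is a minimal sketch in plain Python of conditioning a Gaussian on one noisy scalar observation, followed by clipping to [0, 1]. It illustrates the standard rule under stated assumptions and is not skillinfer's actual code: the names `observe_scalar` and `clip01`, the `noise_var` parameter, and the three-skill toy prior are all hypothetical.

```python
def observe_scalar(mean, cov, index, value, noise_var=1e-4):
    """Condition a Gaussian (mean, cov) on a noisy reading of coordinate `index`."""
    k = len(mean)
    column = [cov[j][index] for j in range(k)]        # Sigma[:, i]
    innovation_var = cov[index][index] + noise_var    # S = Sigma_ii + r
    gain = [c / innovation_var for c in column]       # Kalman gain: Sigma[:, i] / S
    residual = value - mean[index]
    new_mean = [m + g * residual for m, g in zip(mean, gain)]
    # Rank-one covariance update: Sigma - gain * Sigma[i, :]
    new_cov = [
        [cov[a][b] - gain[a] * column[b] for b in range(k)]
        for a in range(k)
    ]
    return new_mean, new_cov


def clip01(x):
    """Clip a prediction to the [0, 1] scale."""
    return min(1.0, max(0.0, x))


# Hypothetical prior over three skills: 0 and 1 co-vary positively, 0 and 2 negatively
mean = [0.5, 0.5, 0.5]
cov = [
    [0.04, 0.03, -0.02],
    [0.03, 0.04, 0.00],
    [-0.02, 0.00, 0.04],
]
post_mean, post_cov = observe_scalar(mean, cov, index=0, value=0.92)
predictions = [clip01(m) for m in post_mean]
print(predictions)
```

The observed coordinate moves almost all the way to 0.92, the positively correlated skill rises above its prior mean, and the negatively correlated one falls below it. Plain lists keep the sketch dependency-free; with an array library the same steps are one column slice and one outer product.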

Core API

import skillinfer
from skillinfer import Skill, Task

# Build a population from any entity-feature matrix
pop = skillinfer.Population.from_dataframe(df)

# Create a profile and observe
profile = pop.profile()
profile.observe("math", 0.95)
profile.observe_many({"code": 0.89, "writing": 0.70})

# Predict with uncertainty
profile.predict()                          # all skills, with CIs
profile.predict("reasoning")               # single skill
profile.most_uncertain(k=3)                # what to assess next

# Match agents to tasks
task = Task({"math": 1.0, "reasoning": 0.5})
result = profile.match_score(task, threshold=0.8)

# Rank a pool of agents
ranking = skillinfer.rank_agents(task, profiles, threshold=0.8)

# Summary statistics
profile.summary(true_vector=ground_truth)  # MAE, RMSE, coverage, etc.
pop.summary()                              # condition number, sparsity, etc.
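
How `match_score` uses `threshold` is not spelled out here. One plausible reading, offered only as an assumption, is that each required skill contributes the posterior probability that it meets the threshold, weighted by the task's importance for that skill. The sketch below shows that calculation with the (mean, std) pairs from the quick-start output; `prob_at_least` and `weighted_match` are hypothetical helpers, not skillinfer functions.

```python
import math


def prob_at_least(mean, std, threshold):
    """P(X >= threshold) for X ~ Normal(mean, std**2)."""
    z = (threshold - mean) / std
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def weighted_match(predictions, weights, threshold):
    """Importance-weighted average of per-skill threshold probabilities."""
    total = sum(weights.values())
    return sum(
        w * prob_at_least(*predictions[skill], threshold)
        for skill, w in weights.items()
    ) / total


# (mean, std) pairs copied from the quick-start output above
predictions = {
    "Skill:Complex Problem Solving": (0.81, 0.17),
    "Skill:Mathematics": (0.67, 0.12),
}
# Hypothetical task weights
weights = {"Skill:Complex Problem Solving": 1.0, "Skill:Mathematics": 0.5}

score = weighted_match(predictions, weights, threshold=0.8)
print(round(score, 2))
```

A skill whose mean sits right at the threshold contributes a probability near 0.5, so wide posteriors pull the score toward the middle and further observations sharpen it.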

Built-in datasets

Two preprocessed datasets ship with the package (~440 KB total):

# O*NET 30.2 — U.S. Department of Labor
# 894 occupations x 120 features (skills, knowledge, abilities)
pop = skillinfer.datasets.onet()

# ESCO v1.2.1 — European Commission
# 2,999 occupations x 134 skill groups (binary)
pop = skillinfer.datasets.esco()

Dataset   Entities            Features                                      Scale               Source
O*NET     894 occupations     120 (35 skills, 33 knowledge, 52 abilities)   Continuous [0, 1]   O*NET 30.2, CC BY 4.0
ESCO      2,999 occupations   134 Level-2 skill groups                      Binary {0, 1}       ESCO v1.2.1

Use cases

Domain                   Observe                   Predict
AI model selection       1-2 benchmark scores      All benchmarks + best model for a task
Human skill profiling    A few task observations   Full occupational profile (120 skills)
Human-AI orchestration   Partial evals for both    Who handles which subtask
Worker-task matching     Known competencies        Fit for new roles and tasks

LLM orchestration

skillinfer profiles are structured context that you feed to an LLM orchestrator alongside cost, latency, and business constraints. The LLM can weigh observed against inferred skills and apply natural-language constraints that are hard to encode in a fixed scoring function:

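import skillinfer
from openai import OpenAI

# `pop` is a skillinfer.Population whose features include the skills used below
# (for example, one built with skillinfer.Population.from_dataframe)

# Build profiles from partial evaluations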
agents = {
    "gpt-4o":     {"reasoning": 0.92, "code": 0.89},
    "claude-3.5": {"reasoning": 0.90, "writing": 0.95},
    "gemini-pro": {"math": 0.88, "code": 0.82},
}
profiles = {
    name: pop.profile().observe_many(obs)
    for name, obs in agents.items()
}

# Format as context for the orchestrator
agent_context = ""
for name, profile in profiles.items():
    agent_context += f"\n{name}:\n"
    for skill in ["math", "reasoning", "code"]:
        pred = profile.predict(skill)
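        # Label by what was actually observed rather than by a std cutoff,
        # since an observed skill can still report a std around 0.01
        source = "observed" if skill in agents[name] else "inferred"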
        agent_context += f"  {skill}: {pred['mean']:.2f} ± {pred['std']:.2f} ({source})\n"

# The LLM decides — not a scoring function
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": f"Pick an agent for this math task.\n{agent_context}"}],
)

Export / import

# Population
pop.to_csv("population.csv")
pop.to_parquet("population.parquet")
pop = skillinfer.Population.from_csv("population.csv")
pop = skillinfer.Population.from_parquet("population.parquet")

# Profile
profile.to_json("profile.json")
restored = skillinfer.Profile.from_json("profile.json")

d = profile.to_dict()   # plain dict, JSON-serialisable
restored = skillinfer.Profile.from_dict(d)

Visualization

Requires pip install skillinfer[viz].

import skillinfer

pop = skillinfer.datasets.onet()
profile = pop.profile()
profile.observe("Skill:Programming", 0.92)

# Population charts
skillinfer.visualization.correlation_heatmap(pop)     # clustered correlation matrix
skillinfer.visualization.scree_plot(pop)               # PCA variance explained
skillinfer.visualization.feature_distributions(pop)    # box plots by variance
skillinfer.visualization.skill_embedding(pop)          # 2D PCA feature map
skillinfer.visualization.convergence_curve(pop)        # MAE vs. observations

# Profile charts
skillinfer.visualization.posterior_profile(profile)    # predicted skills + uncertainty
skillinfer.visualization.prediction_scatter(profile, true_vec)  # predicted vs. true
skillinfer.visualization.uncertainty_waterfall(pop, observations)  # uncertainty per observation
skillinfer.visualization.compare_profiles({"dev": dev, "nurse": nurse})  # side-by-side

Documentation

Full documentation is available at kostadindev.github.io/skillinfer.

License

MIT
