interview-eval

An automated interview evaluation system that simulates technical interviews using AI agents. The system consists of an AI interviewer and interviewee, conducting structured conversations based on predefined rubrics and strategies.

📦 Installation

pip install interview-eval

🌟 Features

  • AI-powered interviewer and interviewee agents
  • Configurable interview parameters and evaluation rubrics
  • Real-time conversation display with rich formatting
  • Detailed scoring and feedback system
  • Progress tracking and maximum question limits
  • Customizable OpenAI client configuration

🚀 Quick Start

from interview_eval import InterviewRunner, Interviewer, Interviewee
from interview_eval.utils import console, load_config, setup_logging

# Load configuration
config = load_config("config.yaml")

# Set up logging (the shared console is imported from interview_eval.utils)
logger = setup_logging("interview.log", verbose=True)

# Initialize agents
interviewer = Interviewer(config)
interviewee = Interviewee(config)

# Create and run interview
runner = InterviewRunner(interviewer, interviewee, config, logger, console)
results = runner.run()

⚙️ Configuration

Create a YAML configuration file with the following structure:

interviewer:
  name: "Technical Interviewer"
  instructions: "Your interview guidelines..."
  rubric: "Evaluation criteria..."
  strategy:
    key_areas: [...]
    scoring_criteria: [...]
  client:  # Optional OpenAI client configuration
    api_key: "your-api-key"

interviewee:
  name: "Candidate"
  instructions: "Interviewee behavior guidelines..."

session:
  initial_message: "Welcome to the interview..."
  max_questions: 10
  max_retries: 2
  initial_context: {}
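
Before starting a run, it can help to check that the loaded configuration has the expected fields. The sketch below is not part of interview-eval: `REQUIRED_FIELDS` and `find_missing_fields` are hypothetical names, and treating every key from the example structure above as required (except the optional `client` and `initial_context`) is an assumption.

```python
from typing import Any

# Assumed required keys per section, taken from the example structure above.
# The library itself may accept fewer fields; this table is illustrative.
REQUIRED_FIELDS = {
    "interviewer": ["name", "instructions", "rubric", "strategy"],
    "interviewee": ["name", "instructions"],
    "session": ["initial_message", "max_questions", "max_retries"],
}


def find_missing_fields(config: dict[str, Any]) -> list[str]:
    """Return dotted paths of assumed-required fields absent from the config."""
    missing = []
    for section, keys in REQUIRED_FIELDS.items():
        section_data = config.get(section)
        if not isinstance(section_data, dict):
            # The whole section is absent or is not a mapping.
            missing.append(section)
            continue
        for key in keys:
            if key not in section_data:
                missing.append(f"{section}.{key}")
    return missing


# A config dict as it might look after parsing an incomplete YAML file.
example = {
    "interviewer": {"name": "Technical Interviewer", "instructions": "...", "rubric": "..."},
    "interviewee": {"name": "Candidate", "instructions": "..."},
}
print(find_missing_fields(example))  # ['interviewer.strategy', 'session']
```

An empty list means every assumed-required field is present; anything else can be reported to the user before agents are created.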

🎯 Advanced Interview

Custom Interview Flows

Note: This feature is still under development and will be available in future releases.

interview-eval lets you define custom interview flows as directed graphs, where each node represents an interview state (like asking questions or evaluating responses) and edges represent possible transitions between states. With this feature, you can create complex interview scenarios with branching logic, follow-up questions, and adaptive feedback based on the interviewee's responses.

Below is an example interview flow in which the Interviewer evaluates the Interviewee's response and then takes one of the following actions:

  • Ask a follow-up question (Deep Dive)
  • Move on to the next topic (Next Topic)
  • Ask for clarification (Challenge)
  • End the interview (Conclude)

[Figure: Interview Flow]
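
Since this feature is not yet released, the following is only a sketch of how such a directed graph could be represented and traversed. The state names, the `FLOW` table, and the `walk_flow` and `scripted_choice` helpers are all hypothetical and are not part of the interview-eval API.

```python
# Hypothetical flow: each key is a state, each value lists the states it
# may transition to. These names are illustrative, not interview-eval's.
FLOW = {
    "ask_question": ["evaluate"],
    "evaluate": ["deep_dive", "next_topic", "challenge", "conclude"],
    "deep_dive": ["evaluate"],
    "next_topic": ["ask_question"],
    "challenge": ["evaluate"],
    "conclude": [],
}


def walk_flow(flow, start, choose, max_steps=20):
    """Follow edges from `start` until a terminal state or the step limit."""
    path = [start]
    state = start
    for _ in range(max_steps):
        options = flow[state]
        if not options:
            break  # terminal state: no outgoing edges
        next_state = choose(state, options)
        if next_state not in options:
            raise ValueError(f"invalid transition {state!r} -> {next_state!r}")
        path.append(next_state)
        state = next_state
    return path


def scripted_choice(script):
    """Build a chooser that replays `script` at branching states."""
    decisions = iter(script)

    def choose(state, options):
        # States with a single outgoing edge need no decision.
        if len(options) == 1:
            return options[0]
        return next(decisions)

    return choose


path = walk_flow(FLOW, "ask_question", scripted_choice(["deep_dive", "conclude"]))
print(" -> ".join(path))
# ask_question -> evaluate -> deep_dive -> evaluate -> conclude
```

In a real interview the chooser would be the Interviewer's decision after evaluating a response; a scripted chooser is used here only to keep the sketch deterministic.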

Question Decontamination

For users conducting benchmark-based interviews (using GSM8K, MMLU, and similar datasets), interview-eval provides functions that reduce test set contamination by transforming each question with one of three strategies:

from interview_eval import decontaminate_question

# Choose from three decontamination methods
question = decontaminate_question(
    question="What is 15% of 200?",
    reference_answer="30",
    method="modifying"  # or "unclarifying" or "paraphrasing"
)

  1. Unclarifying (method="unclarifying")
    • Removes key information while maintaining grammar
    • Forces interviewee to ask clarifying questions
    • Evaluates information-gathering skills

  2. Paraphrasing (method="paraphrasing")
    • Preserves exact meaning with different wording
    • Changes sentence structure
    • Maintains problem complexity

  3. Modifying (method="modifying")
    • Creates new but related questions
    • Keeps similar domain and difficulty
    • Tests same knowledge areas
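
To make the difference between the three methods concrete, here is a minimal sketch of how a method name could be mapped to a rewriting instruction for a language model. The instruction texts are paraphrased from the descriptions above, and `METHOD_INSTRUCTIONS` and `build_decontamination_prompt` are hypothetical names; the prompts that interview-eval actually uses may differ.

```python
# Hypothetical instruction templates, paraphrased from the method
# descriptions above. They are not the library's actual prompts.
METHOD_INSTRUCTIONS = {
    "unclarifying": (
        "Remove key information from the question while keeping it grammatical."
    ),
    "paraphrasing": (
        "Reword the question with a different sentence structure, "
        "preserving its exact meaning and complexity."
    ),
    "modifying": (
        "Write a new, related question in the same domain "
        "and at a similar difficulty."
    ),
}


def build_decontamination_prompt(question: str, reference_answer: str, method: str) -> str:
    """Assemble a rewriting prompt for the chosen method."""
    if method not in METHOD_INSTRUCTIONS:
        valid = ", ".join(sorted(METHOD_INSTRUCTIONS))
        raise ValueError(f"unknown method {method!r}; expected one of: {valid}")
    return (
        f"{METHOD_INSTRUCTIONS[method]}\n"
        f"Question: {question}\n"
        f"Reference answer: {reference_answer}"
    )


prompt = build_decontamination_prompt("What is 15% of 200?", "30", "paraphrasing")
print(prompt)
```

Rejecting unknown method names up front avoids spending a model call on a misspelled option.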

Batch processing of questions is also supported:

from interview_eval import batch_decontaminate

questions = [
    {"question": "Q1...", "reference_answer": "A1..."},
    {"question": "Q2...", "reference_answer": "A2..."}
]

decontaminated = batch_decontaminate(
    questions,
    method="modifying",
    model="gpt-4"
)

Requirements & TODOs

  • Modifying problem ✔️

    • Python function to modify the problem: modify_problem
    • Supported strategies: Unclarifying, Paraphrasing, and Modifying (given a seed question, create a new question)
  • Feedback & Editing Loop

    • Proceed to the next question if the response is graded as Good
    • Do not repeat feedback that was already given
  • Followup Questions

    • Problem, Response, Feedback, Followup Question, Response, Feedback, Followup Question, ...
  • Report Card

    • One per seed question pool
    • Include information about the student's performance on each question that received a different score
  • More strict loading of config.yaml (e.g. check if all required fields are present)

  • Add documentation for the code

  • Support interview_type: "base", "adaptive"

  • Fix the code organization for CLI support

  • Add tests

  • Hide logging inside the Runner

  • Add support for seed questions

  • Release to PyPI
