educhateval

A pipeline and package to implement and evaluate LLM chat bot tutors in education.

🚀 Overview

This package provides an evaluation framework for analyzing interactions between students and LLM-based tutors through classification, simulation, and visualization tools.

The package is designed to:

Provide a customized framework for classification, evaluation, and fine-tuning
Simulate student–tutor interactions using role-based prompts and seed messages when real data is unavailable
Initiate an interface with locally hosted, open-source models (e.g., via LM Studio or Hugging Face)
Log interactions in structured formats (JSON/CSV) for downstream analysis
Train and apply classifiers to predict customized interaction classes and visualize patterns across conversations
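As a rough illustration of the structured logging mentioned in the list above, the sketch below writes the same dialogue rows to both JSON and CSV using only the standard library. The column names `student_msg` and `tutor_msg` mirror the usage example further down; the `turn` column, the file names, and the `save_dialogue` helper are assumptions made for illustration and are not part of the package's documented API.

```python
import csv
import json
import tempfile
from pathlib import Path

# Hypothetical log schema: one row per turn. "turn" is an assumed column name.
FIELDNAMES = ["turn", "student_msg", "tutor_msg"]


def save_dialogue(rows, out_dir, stem="dialogue"):
    """Write the same dialogue rows to <stem>.json and <stem>.csv."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    json_path = out_dir / f"{stem}.json"
    json_path.write_text(
        json.dumps(rows, indent=2, ensure_ascii=False), encoding="utf-8"
    )

    csv_path = out_dir / f"{stem}.csv"
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(rows)

    return json_path, csv_path


rows = [
    {
        "turn": 1,
        "student_msg": "Hi, can you please help me with my English course?",
        "tutor_msg": "Of course. What are you working on?",
    },
    {
        "turn": 2,
        "student_msg": "I need feedback on an essay.",
        "tutor_msg": "Great, please share the first paragraph.",
    },
]

# Write to a temporary directory and read both files back.
with tempfile.TemporaryDirectory() as tmp:
    json_path, csv_path = save_dialogue(rows, tmp)
    reloaded = json.loads(json_path.read_text(encoding="utf-8"))
    with csv_path.open(newline="", encoding="utf-8") as handle:
        csv_rows = list(csv.DictReader(handle))
```

JSON keeps the original types (the turn number stays an integer), while CSV reads everything back as strings, which is worth keeping in mind for downstream analysis.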

Overview of the system architecture:

[Figure: flowchart of the system architecture]

⚙️ Installation

pip install educhateval
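To confirm that the installation succeeded, one option is to query the installed version through the standard library. This is a minimal sketch; the helper name `installed_version` is hypothetical and not something the package provides.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version("educhateval"))
```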

🤗 Integration

Note that framework generation and dialogue generation are integrated with LM Studio, while the wrapper and classifiers are integrated with Hugging Face.

📖 Documentation

Documentation	Description
📚 User Guide	Instructions on how to run the entire pipeline provided in the package
💡 Prompt Templates	Overview of system prompts, role behaviors, and instructional strategies
🧠 API References	Full reference for the `educhateval` API: classes, methods, and usage
🤔 About	Learn more about the thesis project, context, and contributors

⚙️ Usage

from pathlib import Path
from educhateval import (
    FrameworkGenerator,
    DialogueSimulator,
    PredictLabels,
    Visualizer,
)

1. Generate Label Framework

generator = FrameworkGenerator(
    model_name="llama-3.2-3b-instruct",
    api_url="http://localhost:1234/v1/completions"
)
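The `api_url` above has the shape of an OpenAI-style completions endpoint, which LM Studio's local server can expose. The sketch below builds such a request by hand, mainly to show what needs to be running before the generator is used. Whether `FrameworkGenerator` sends exactly this payload is an assumption; the `build_completion_request` helper, the prompt text, and the sampling values are illustrative only.

```python
import json
import urllib.request


def build_completion_request(api_url, model_name, prompt, max_tokens=64):
    """Build (but do not send) a POST request for an OpenAI-style completions endpoint."""
    payload = {
        "model": model_name,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.7,
    }
    return urllib.request.Request(
        api_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_completion_request(
    api_url="http://localhost:1234/v1/completions",
    model_name="llama-3.2-3b-instruct",
    prompt="Write one sentence a student might send to a tutor.",
)

# Sending requires a running local server, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     text = json.loads(response.read())["choices"][0]["text"]
```

If the commented-out call fails with a connection error, the local server is most likely not running or is listening on a different port.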

df_4 = generator.generate_framework(
    prompt_path="outline_prompts/prompt_default_4types.py",
    num_samples=200
)

filtered_df = generator.filter_with_classifier(
    train_data="data/tiny_labeled_default.csv",
    synth_data=df_4
)
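One plausible reading of the filtering step is that a classifier trained on the small labeled set re-labels each synthetic sample, and only samples whose prediction agrees with the intended category are kept. The sketch below illustrates that idea with a toy rule-based classifier. The agreement criterion, the helper names, and the category names are assumptions; the `text` and `category` column names are taken from the classification step later in this section.

```python
def filter_by_agreement(rows, classify):
    """Keep rows whose intended category matches the classifier's prediction."""
    return [row for row in rows if classify(row["text"]) == row["category"]]


def toy_classifier(text):
    # Stand-in for a classifier trained on the small labeled set.
    return "Question" if text.rstrip().endswith("?") else "Statement"


synthetic_rows = [
    {"text": "What does this word mean?", "category": "Question"},
    {"text": "I finished the exercise.", "category": "Statement"},
    # Intended as a question, but the toy classifier disagrees, so it is dropped.
    {"text": "Please explain it again.", "category": "Question"},
]

kept = filter_by_agreement(synthetic_rows, toy_classifier)
```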

2. Synthesize Interaction

simulator = DialogueSimulator(
    backend="mlx",
    model_id="mlx-community/Qwen2.5-7B-Instruct-1M-4bit"
)

seed_message = "Hi, can you please help me with my English course?"

# Simulate a single student-tutor dialogue with a custom YAML file
df_single = simulator.simulate_dialogue(
    mode="general_task_solving",
    turns=10,
    seed_message_input=seed_message,
    custom_prompt_file=Path("prompts/my_custom_prompts.yaml")
)
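Conceptually, a simulated dialogue is a loop in which two role-prompted agents take turns, starting from the seed message. The sketch below shows that loop with stub agents in place of language models. It is not the package's implementation: the function signature, the row layout, and the `turn` column are assumptions, and only the `student_msg` and `tutor_msg` names come from the usage example.

```python
def simulate_dialogue(seed_message, turns, student_reply, tutor_reply):
    """Alternate tutor and student replies, returning one row per turn."""
    rows = []
    student_msg = seed_message
    for turn in range(1, turns + 1):
        tutor_msg = tutor_reply(student_msg)
        rows.append(
            {"turn": turn, "student_msg": student_msg, "tutor_msg": tutor_msg}
        )
        # The student's next message responds to the tutor's latest message.
        if turn < turns:
            student_msg = student_reply(tutor_msg)
    return rows


# Stub agents; in a real run each would call a language model with its role prompt.
def stub_tutor(message):
    return f"Tutor reply to: {message}"


def stub_student(message):
    return f"Student reply to: {message}"


dialogue = simulate_dialogue(
    seed_message="Hi, can you please help me with my English course?",
    turns=3,
    student_reply=stub_student,
    tutor_reply=stub_tutor,
)
```

Passing the agents in as functions keeps the loop independent of the backend, so the same structure would apply whichever model serves the replies.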

3. Classify and Predict

predictor = PredictLabels(model_name="distilbert/distilroberta-base")

annotated_df = predictor.run_pipeline(
    train_data=filtered_df,
    new_data=df_single,
    text_column="text",
    label_column="category",
    columns_to_classify=["student_msg", "tutor_msg"],
    split_ratio=0.2
)

4. Visualize

viz = Visualizer()

summary = viz.create_summary_table(
    df=annotated_df,
    label_columns=["predicted_labels_student_msg", "predicted_labels_tutor_msg"]
)

viz.plot_category_bars(
    df=annotated_df,
    label_columns=["predicted_labels_student_msg", "predicted_labels_tutor_msg"],
    use_percent=True,
    title="Distribution of Predicted Classes"
)

viz.plot_turn_trends(
    df=annotated_df,
    student_col="predicted_labels_student_msg",
    tutor_col="predicted_labels_tutor_msg",
    title="Category Distribution over Turns"
)

viz.plot_history_interaction(
    df=annotated_df,
    student_col="predicted_labels_student_msg",
    tutor_col="predicted_labels_tutor_msg",
    focus_agent="student",
    use_percent=True
)
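The column names in steps 3 and 4 suggest that each classified column gets a matching `predicted_labels_<column>` column, which the visualizer then summarizes. The sketch below reproduces that data flow on plain dictionaries: it adds the prediction fields and computes each label's percentage share. The helper names, the toy classifier, and the label names are assumptions for illustration; the package's own classifier and plots are not reproduced here.

```python
from collections import Counter


def add_predictions(rows, columns, classify):
    """Add a predicted_labels_<column> field for each classified column."""
    return [
        {**row, **{f"predicted_labels_{col}": classify(row[col]) for col in columns}}
        for row in rows
    ]


def label_shares(rows, label_column):
    """Return each label's share of the rows as a percentage."""
    counts = Counter(row[label_column] for row in rows)
    total = sum(counts.values())
    return {label: 100 * count / total for label, count in counts.items()}


def toy_classifier(text):
    # Stand-in for the fine-tuned model; the label names are made up.
    return "Question" if text.rstrip().endswith("?") else "Statement"


rows = [
    {"turn": 1, "student_msg": "Can you help me?", "tutor_msg": "Sure, what is the task?"},
    {"turn": 2, "student_msg": "It is an essay.", "tutor_msg": "Please share it."},
]

annotated = add_predictions(rows, ["student_msg", "tutor_msg"], toy_classifier)
student_shares = label_shares(annotated, "predicted_labels_student_msg")
```

Percentages rather than raw counts make conversations of different lengths comparable, which is presumably what the `use_percent=True` option is for.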

🫶🏼 Acknowledgements

This project builds on existing tools and ideas from the open-source community. While specific references are provided within the relevant scripts throughout the repository, the key sources of inspiration are also acknowledged here to highlight the contributions that have shaped the development of this package.

Constraint-Based Data Generation – Outlines Package: Willard, Brandon T. & Louf, Rémi (2023). Efficient Guided Generation for LLMs.
Chat Interface and Wrapper – Textual: McGugan, W. (2024, Sep). Anatomy of a Textual User Interface.
Package Design Inspiration: Thea Rolskov Sloth & Astrid Sletten Rybner
Code Debugging and Conceptual Feedback: Mina Almasi and Ross Deans Kristensen-McLachlan

📬 Contact

Made by Laura Wulff Paaby
Feel free to reach out via:

Complete overview:

├── data/                                  
│   ├── generated_dialogue_data/           # Generated dialogue samples
│   ├── generated_tuning_data/             # Generated framework data for fine-tuning 
│   ├── logged_dialogue_data/              # Logged real dialogue data
│   ├── Final_output/                      # Final classified data 
│
├── Models/                                # Folder for trained models and checkpoints (ignored)
│
├── src/educhateval/                       # Main source code for all components
│   ├── chat_ui.py                         # CLI interface for wrapping interactions
│   ├── descriptive_results/               # Scripts and tools for result analysis
│   ├── dialogue_classification/           # Tools and models for dialogue classification
│   ├── dialogue_generation/               
│   │   ├── agents/                        # Agent definitions and role behaviors
│   │   ├── models/                        # Model classes and loading mechanisms
│   │   ├── txt_llm_inputs/               # System prompts and structured inputs for LLMs
│   │   ├── chat_instructions.py          # System prompt templates and role definitions
│   │   ├── chat_model_interface.py       # Interface layer for model communication
│   │   ├── chat.py                       # Main script for orchestrating chat logic
│   │   └── simulate_dialogue.py          # Script to simulate full dialogues between agents
│   ├── framework_generation/            
│   │   ├── outline_prompts/              # Prompt templates for outlines
│   │   ├── outline_synth_LMSRIPT.py      # Synthetic outline generation pipeline
│   │   └── train_tinylabel_classifier.py # Training classifier on manually made true data
│
├── .python-version                       # Python version file (for Poetry)
├── poetry.lock                           # Locked dependency versions (Poetry)
├── pyproject.toml                        # Main project config and dependencies

Release history

Version	Release date
0.1.10	Jun 1, 2025
0.1.9	Jun 1, 2025
0.1.8	May 7, 2025
0.1.7	Apr 29, 2025
0.1.6 (this version)	Apr 24, 2025
0.1.5	Apr 23, 2025
0.1.4	Apr 22, 2025
0.1.3	Apr 16, 2025
0.1.2	Apr 16, 2025
0.1.1	Apr 16, 2025
0.1.0	Apr 16, 2025

Download files

Source Distribution

educhateval-0.1.6.tar.gz (37.4 kB)

Uploaded Apr 24, 2025 Source

Built Distribution

educhateval-0.1.6-py3-none-any.whl (45.6 kB)

Uploaded Apr 24, 2025 Python 3

File details

Details for the file educhateval-0.1.6.tar.gz.

File metadata

Download URL: educhateval-0.1.6.tar.gz
Upload date: Apr 24, 2025
Size: 37.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.2

File hashes

Hashes for educhateval-0.1.6.tar.gz
Algorithm	Hash digest
SHA256	`3afdd899c265ada608efa337eab5ff0996b96a1e9108c64aeb385da5a9a159f3`
MD5	`eb9183cdf49e5f9b16f2d1d0d55b22a6`
BLAKE2b-256	`1d7d3824df136655c138019e79dcd5baef2c0d210443cc50bd99a94ed6f4c78f`

File details

Details for the file educhateval-0.1.6-py3-none-any.whl.

File metadata

Download URL: educhateval-0.1.6-py3-none-any.whl
Upload date: Apr 24, 2025
Size: 45.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.2

File hashes

Hashes for educhateval-0.1.6-py3-none-any.whl
Algorithm	Hash digest
SHA256	`0c4f8c696bff9014ecc688c4062f7f210179e6fa5a6adf59ed9961ce85af21b6`
MD5	`fedf6ae03486e8befe8fbf92cf44a85d`
BLAKE2b-256	`871dc6d8cee7a2f72d6a85a2dd1ed724529a32120a44d124a976b1f529b985c3`
