A pipeline and package to implement and evaluate LLM chat bot tutors in education.
Project description
๐ Overview
This package provides an evaluation framework for analyzing interactions between students and LLM-based tutors through classification, simulation, and visualization tools.
The package is designed to:
- Provide a customized framework for classification, evaluation, and fine-tuning
- Simulate studentโtutor interactions using role-based prompts and seed messages when real data is unavailable
- Initiate an interface with locally hosted, open-source models (e.g., via LM Studio or Hugging Face)
- Log interactions in structured formats (JSON/CSV) for downstream analysis
- Train and applu classifiers to predict customized interaction classes and visualize patterns across conversations
Overview of the system architecture:
โ๏ธ Installation
pip install educhateval
โ๏ธ Usage
from educhateval import FrameworkGenerator,
DialogueSimulator,
PredictLabels,
Visualizer
๐ค Integration
Note that the framework and dialogue generation is integrated with LM Studio, and the wrapper and classifiers with Hugging Face
๐ Documentation
| Documentation | Description |
|---|---|
| ๐ User Guide | Instructions on how to run the entire pipeline provided in the package |
| ๐ก Prompt Templates | Overview of system prompts, role behaviors, and instructional strategies |
| ๐ง API References | Full reference for the educhateval API: classes, methods, and usage |
| ๐ค About | Learn more about the thesis project, context, and contributors |
๐ซถ๐ผ Acknowdledgement
This project builds on existing tools and ideas from the open-source community. While specific references are provided within the relevant scripts throughout the repository, the key sources of inspiration are also acknowledged here to highlight the contributions that have shaped the development of this package.
-
Constraint-Based Data Generation โ Outlines Package: Willard, Brandon T. & Louf, Rรฉmi (2023). Efficient Guided Generation for LLMs.
-
Chat Interface and Wrapper โ Textual: McGugan, W. (2024, Sep). Anatomy of a Textual User Interface.
-
Package Design Inspiration: Thea Rolskov Sloth & Astrid Sletten Rybner
-
Code Debugging and Conceptual Feedback: Mina Almasi and Ross Deans Kristensen-McLachlan
๐ฌ Contact
Made by Laura Wulff Paaby
Feel free to reach out via:
- ๐ LinkedIn
- ๐ง laurapaaby18@gmail.com
- ๐ GitHub
Complete overview:
โโโ data/
โ โโโ generated_dialogue_data/ # Generated dialogue samples
โ โโโ generated_tuning_data/ # Generated framework data for fine-tuning
โ โโโ logged_dialogue_data/ # Logged real dialogue data
โ โโโ Final_output/ # Final classified data
โ
โโโ Models/ # Folder for trained models and checkpoints (ignored)
โ
โโโ src/educhateval/ # Main source code for all components
โ โโโ chat_ui.py # CLI interface for wrapping interactions
โ โโโ descriptive_results/ # Scripts and tools for result analysis
โ โโโ dialogue_classification/ # Tools and models for dialogue classification
โ โโโ dialogue_generation/
โ โ โโโ agents/ # Agent definitions and role behaviors
โ โ โโโ models/ # Model classes and loading mechanisms
โ โ โโโ txt_llm_inputs/ # System prompts and structured inputs for LLMs
โ โ โโโ chat_instructions.py # System prompt templates and role definitions
โ โ โโโ chat_model_interface.py # Interface layer for model communication
โ โ โโโ chat.py # Main script for orchestrating chat logic
โ โ โโโ simulate_dialogue.py # Script to simulate full dialogues between agents
โ โโโ framework_generation/
โ โ โโโ outline_prompts/ # Prompt templates for outlines
โ โ โโโ outline_synth_LMSRIPT.py # Synthetic outline generation pipeline
โ โ โโโ train_tinylabel_classifier.py # Training classifier on manually made true data
โ
โโโ .python-version # Python version file for (Poetry)
โโโ poetry.lock # Locked dependency versions (Poetry)
โโโ pyproject.toml # Main project config and dependencies
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file educhateval-0.1.5.tar.gz.
File metadata
- Download URL: educhateval-0.1.5.tar.gz
- Upload date:
- Size: 38.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
9af2a52443f57938c121d13e05498a4a67f5888e1e5465a0eb2a7ff4341df52d
|
|
| MD5 |
0fe29aa6a3ea1a5a303eeb46728d9829
|
|
| BLAKE2b-256 |
462b4350e6b4fcab58716930223f94875c6728d591169d6d5046e97e331b7d81
|
File details
Details for the file educhateval-0.1.5-py3-none-any.whl.
File metadata
- Download URL: educhateval-0.1.5-py3-none-any.whl
- Upload date:
- Size: 48.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.2
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
9667ce6bebc1db685d2fe7807c4ca9f1b631057e6f2537cb8dc8fe105cf22c38
|
|
| MD5 |
8545b6cc05f7faa35331018dc570cc30
|
|
| BLAKE2b-256 |
23de2c223df66d2428b6e5cc7db74910b35cfc3868c35177c201c7285a7c9b87
|