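elmes - Education Scenario Evaluation System

Education Language Model Evaluation System (elmes) is a Python framework that provides agent orchestration and automated evaluation for LLM tasks across different scenarios. Its architecture is modular and driven by YAML configuration, and its extensible entities make it suitable for building, configuring, and evaluating complex agent-based workflows. The codebase follows modern Python programming practices.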

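Overview

elmes lets users configure, manage, and evaluate AI agents and workflows through a flexible YAML configuration system. Its core features are:

Configurable components: describe agents, models, tasks, evaluations, and workflow directions.
Dynamic evaluation and tooling: export results and process them with tools and databases.
Modular, extensible entities: support rapid prototyping and experimentation with new evaluation strategies.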

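Installation

Requirements

Python 3.10+
Support for mainstream AI model providers such as OpenAI and Anthropic
uv package manager (recommended) or pip

Quick Start

Clone the project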

git clone <repository-url>
cd elmes

Install dependencies

# Using uv (recommended)
uv sync

# Or using pip
pip install -e .

Configure the education scenario

cp config.yaml.example config.yaml

Set up the AI model

Edit config.yaml to configure your AI model:

models:
  math_teacher:
    type: openai
    api_key: your_openai_api_key
    api_base: https://api.openai.com/v1
    model: gpt-4o-mini
    kargs:
      temperature: 0.7
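
To make the shape of a `models` entry concrete, here is a minimal sketch that checks the same data once it has been parsed into a Python dict. `ModelEntry` and `parse_models` are hypothetical names used for illustration, not part of elmes, and treating `type`, `api_key`, `api_base`, and `model` as required keys is an assumption based only on the example above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class ModelEntry:
    """Hypothetical stand-in for one entry under `models:` (not the elmes class)."""

    type: str
    api_key: str
    api_base: str
    model: str
    # Extra provider options such as temperature; optional in this sketch.
    kargs: Dict[str, Any] = field(default_factory=dict)


REQUIRED_KEYS = {"type", "api_key", "api_base", "model"}  # assumed, see lead-in


def parse_models(raw: Dict[str, Dict[str, Any]]) -> Dict[str, ModelEntry]:
    """Turn the parsed `models` mapping into typed entries, failing early on gaps."""
    parsed: Dict[str, ModelEntry] = {}
    for name, entry in raw.items():
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            raise ValueError(f"model {name!r} is missing keys: {sorted(missing)}")
        parsed[name] = ModelEntry(**entry)
    return parsed


# The YAML example above, written as the dict a YAML parser would produce.
raw_models = {
    "math_teacher": {
        "type": "openai",
        "api_key": "your_openai_api_key",
        "api_base": "https://api.openai.com/v1",
        "model": "gpt-4o-mini",
        "kargs": {"temperature": 0.7},
    }
}

models = parse_models(raw_models)
print(models["math_teacher"].model, models["math_teacher"].kargs)
```

Failing at load time on a missing key keeps configuration mistakes away from the first model call, which is the main reason to type the entries at all.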

Command-Line Interface

elmes provides a CLI tool, registered in pyproject.toml:

[project.scripts]
elmes = "elmes.cli:main"

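The CLI provides the following commands:

generate(config: str, debug: bool): initializes and runs the workflow from a YAML configuration file.
export_json(input_dir: str, debug: bool): exports the generated results as JSON.
eval(config: str, debug: bool): evaluates the generated results and exports them as JSON and CSV.
pipeline(config: str, debug: bool): runs all of the steps above.

Example usage: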

elmes generate --config config.yaml --debug
elmes export_json --config config.yaml --debug
elmes eval --config config.yaml --debug 
elmes pipeline --config config.yaml --debug
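
The relationship between the four commands can be sketched with the standard-library `argparse` module. This is an illustration only: the README does not show how `elmes.cli` is implemented, the step bodies below are placeholders, and the sketch gives every subcommand `--config` and `--debug` as in the usage example above. The order in which `pipeline` runs the steps is assumed to follow the order in which the commands are listed.

```python
import argparse
from typing import Callable, Dict, List, Optional


def _step(name: str) -> Callable[[str, bool], str]:
    """Build a placeholder handler; a real step would do the actual work."""

    def run(config: str, debug: bool) -> str:
        return f"{name} config={config} debug={debug}"

    return run


# Assumed order for `pipeline`: generate, then export_json, then eval.
PIPELINE_ORDER = ["generate", "export_json", "eval"]
HANDLERS: Dict[str, Callable[[str, bool], str]] = {
    name: _step(name) for name in PIPELINE_ORDER
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="elmes")
    sub = parser.add_subparsers(dest="command", required=True)
    for name in [*PIPELINE_ORDER, "pipeline"]:
        cmd = sub.add_parser(name)
        cmd.add_argument("--config", required=True)
        cmd.add_argument("--debug", action="store_true")
    return parser


def main(argv: Optional[List[str]] = None) -> List[str]:
    """Dispatch one command, or every step in order for `pipeline`."""
    args = build_parser().parse_args(argv)
    steps = PIPELINE_ORDER if args.command == "pipeline" else [args.command]
    return [HANDLERS[step](args.config, args.debug) for step in steps]


for line in main(["pipeline", "--config", "config.yaml", "--debug"]):
    print(line)
```

Expressing `pipeline` as a list of the other step names means a new step only has to be added in one place.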

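Core Components

Entity Definitions

The entities (src/elmes/entity.py) give the system's main concepts strong typing and a modular structure. The key entities are:

ElmesConfig: the root configuration, which wraps all workflow, task, agent, model, and evaluation settings.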

class ElmesConfig(BaseModel):
    globals: GlobalConfig
    models: Dict[str, ModelConfig]
    agents: Dict[str, AgentConfig]
    directions: List[str]
    tasks: TaskConfig
    evaluation: Optional[EvalConfig] = None
    context: ElmesContext = ElmesContext(conns=[])

EvalConfig: describes how evaluation is performed, with support for dynamic schema generation and multiple evaluation modes.

class EvalConfig(BaseModel):
    model: str
    prompt: List[Prompt]
    format: List[FormatField]
    format_mode: Literal["tool", "prompt"] = "tool"
    def format_to_json_schema(self) -> str: ...
    def get_prompts(self) -> Tuple[str, List[Prompt]]: ...
    def format_to_pydantic(self) -> BaseModel: ...
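
As a rough illustration of what dynamic schema generation from `format` could look like, the sketch below turns a list of field descriptions into a JSON Schema string. The README does not show the definition of `FormatField`, so the attributes used here (`name`, `type`, `description`) and the type mapping are assumptions, and this standalone function is not the actual `EvalConfig.format_to_json_schema` method.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class FormatField:
    """Hypothetical field description; the real elmes FormatField may differ."""

    name: str
    type: str  # one of the keys of _JSON_TYPES below
    description: str = ""


# Assumed mapping from short type names to JSON Schema types.
_JSON_TYPES = {"str": "string", "int": "integer", "float": "number", "bool": "boolean"}


def format_to_json_schema(fields: List[FormatField]) -> str:
    """Build an object schema in which every listed field is required."""
    properties: Dict[str, Any] = {}
    for f in fields:
        if f.type not in _JSON_TYPES:
            raise ValueError(f"unsupported type {f.type!r} for field {f.name!r}")
        properties[f.name] = {
            "type": _JSON_TYPES[f.type],
            "description": f.description,
        }
    schema = {
        "type": "object",
        "properties": properties,
        "required": [f.name for f in fields],
    }
    return json.dumps(schema, ensure_ascii=False, indent=2)


example_fields = [
    FormatField("score", "int", "Overall score for the answer"),
    FormatField("comment", "str", "Short justification"),
]
print(format_to_json_schema(example_fields))
```

A schema string of this kind could be embedded in a prompt when `format_mode` is "prompt", while the "tool" mode would more naturally use a Pydantic model; both uses are inferred from the method names rather than confirmed by the source.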

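Evaluation Framework

The evaluation toolkit lives in src/elmes/evaluation.py. Its core function, generate_evaluation_tool, dynamically builds the tool used to process and store evaluation results.

Dynamic tool generation: defines the tool schema at runtime from the current evaluation configuration, which supports extensible evaluation formats.
Persistence support: designed to serialize evaluation results (as Pydantic models) and save them to a database or storage backend (the default logic can be extended to support other backends).

Example evaluation tool: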

def generate_evaluation_tool() -> BaseTool:
    ...
    @tool(
        name_or_callable="save_result_to_database",
        description="Save the evaluation results to a database.",
        return_direct=True,
        args_schema=CONFIG.evaluation.format_to_pydantic(),
    )
    def save_to_db(**kwargs):
        ...
        return kwargs

    return save_to_db
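
The excerpt above elides the body of `save_to_db`. The following sketch shows one way the same pattern (a tool built at runtime from the configured fields, which persists whatever it receives) could work using only the standard library. It swaps the `@tool` decorator for a plain closure and uses an in-memory SQLite table; `make_save_tool`, the `results` table, and the field names are all hypothetical and do not describe elmes's actual storage logic.

```python
import json
import sqlite3
from typing import Any, Callable, Dict, List


def make_save_tool(
    conn: sqlite3.Connection, field_names: List[str]
) -> Callable[..., Dict[str, Any]]:
    """Build a save function whose accepted arguments are fixed at runtime."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "id INTEGER PRIMARY KEY, payload TEXT NOT NULL)"
    )
    expected = set(field_names)

    def save_to_db(**kwargs: Any) -> Dict[str, Any]:
        # Reject calls that do not match the configured evaluation format.
        if set(kwargs) != expected:
            mismatch = sorted(set(kwargs) ^ expected)
            raise ValueError(f"unexpected or missing fields: {mismatch}")
        conn.execute(
            "INSERT INTO results (payload) VALUES (?)",
            (json.dumps(kwargs, ensure_ascii=False),),
        )
        conn.commit()
        return kwargs  # mirrors the excerpt, which returns the arguments

    return save_to_db


conn = sqlite3.connect(":memory:")
save = make_save_tool(conn, ["score", "comment"])
save(score=4, comment="clear explanation")
rows = conn.execute("SELECT payload FROM results").fetchall()
print(rows)
```

Storing each result as one JSON payload keeps the table layout independent of the evaluation format, so changing the configured fields does not require a schema migration in this sketch.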

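Extending elmes

Add new agent or model types: extend the entity classes and update the YAML configuration.
Integrate new evaluation formats: use the EvalConfig schema and the tool-generation utilities.
Customize workflows: edit the configuration file and combine agents and workflows as needed.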

Release history

1.0.0: Mar 15, 2026
0.1.13: Jul 28, 2025
0.1.12: Jul 25, 2025
0.1.11: Jul 4, 2025
0.1.10: Jun 23, 2025
0.1.9: Jun 20, 2025
0.1.8: Jun 18, 2025 (this version)
0.1.7: Jun 18, 2025

Download files

Source Distribution: elmes-0.1.8.tar.gz (11.8 MB), uploaded Jun 18, 2025, Source
Built Distribution: elmes-0.1.8-py3-none-any.whl (11.7 MB), uploaded Jun 18, 2025, Python 3

File details

Details for the file elmes-0.1.8.tar.gz.

File metadata

Download URL: elmes-0.1.8.tar.gz
Upload date: Jun 18, 2025
Size: 11.8 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.7.9

File hashes

Hashes for elmes-0.1.8.tar.gz
Algorithm	Hash digest
SHA256	`630a28909ac2dd52a0aa464d5e018e0a4ec208e409140de6b7ad3ad6f191ff91`
MD5	`bb19ffb40182c7443747bcde2f6f290d`
BLAKE2b-256	`251e8a0ea7c625c312b815bd9097e9e85f569f22fda75744838bddecef4c2390`


File details

Details for the file elmes-0.1.8-py3-none-any.whl.

File metadata

Download URL: elmes-0.1.8-py3-none-any.whl
Upload date: Jun 18, 2025
Size: 11.7 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.7.9

File hashes

Hashes for elmes-0.1.8-py3-none-any.whl
Algorithm	Hash digest
SHA256	`0e4ab4246aab8a4e7050ab6568b5db8c8aec45f0a7539c82e2d4bc15d391f424`
MD5	`1f49797a75f8e70e871aad7a40d71bf4`
BLAKE2b-256	`71457bb2655214a67dd5b248c15107ba4516c6688ce298e3b59263249d3aada4`

