An evaluation tool for Chinese legal large language models

UniLawBench

UniLawBench · PyPI edition


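🗂️ Overview

UniLawBench is an evaluation tool for large language models in Chinese legal scenarios.
The PyPI edition bundles the data for all 20 tasks and registers the main program as the unilawbench command through console_scripts, so the tool works out of the box: pip install, then run.
This edition also integrates a PyQt5 graphical interface, which can be launched directly from the command line to select a model, convert data, and perform related operations.

⚠️ The complete dataset is included; no separate download is needed.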


📦 Installation

python -m pip install --upgrade pip
pip install unilawbench

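Installation automatically pulls in evalscope[all] and its dependencies (including backends such as OpenCompass and VLMEvalKit). PyQt5 is also installed automatically as a dependency.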
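
To confirm that the install worked, one option is to look up the installed distribution and the console command from Python. This is a minimal sketch using only the standard library; the helper name installed_version is hypothetical and not part of UniLawBench.

```python
import shutil
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Both values depend on the local environment, so no output is shown here.
print("unilawbench version:", installed_version("unilawbench"))
print("unilawbench command:", shutil.which("unilawbench"))
```

If either line prints None, the package or its console script is not visible in the active environment.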


🚀 Quick start

1. Evaluation (eval)

# Evaluate all multiple-choice (mcq) datasets
unilawbench eval -form mcq --run-all -model ./weights

# Evaluate only the two QA datasets 1-1 and 2-5
unilawbench eval -form qa -set 1-1 2-5 -model ./weights
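
For scripted runs, the two invocations above can be assembled programmatically. The sketch below is illustrative only: build_eval_command is a hypothetical helper, it uses just the flags shown in the commands above, and it assumes that -set and --run-all are alternatives (the README shows them only in separate commands).

```python
import shlex


def build_eval_command(form, model, sets=None):
    """Build the argv list for `unilawbench eval` from the flags shown above."""
    cmd = ["unilawbench", "eval", "-form", form]
    if sets:
        # Evaluate only the listed dataset IDs.
        cmd += ["-set", *sets]
    else:
        # No IDs given: fall back to evaluating every dataset of this form.
        cmd.append("--run-all")
    cmd += ["-model", model]
    return cmd


for cmd in (
    build_eval_command("mcq", "./weights"),
    build_eval_command("qa", "./weights", sets=["1-1", "2-5"]),
):
    print(shlex.join(cmd))
```

Each list can be passed to subprocess.run(cmd, check=True) once the package is installed.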

2. Data conversion (convert)

# JSON ↦ CSV (multi-choice)
unilawbench convert --type mcq data/2-2.json data/2-2.csv

# JSON ↦ JSONL (dispute focus)
unilawbench convert --type focus data/focus.json data/focus.jsonl
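
To show what a JSON to JSONL conversion involves in general, here is a small standalone sketch. It is not the converter shipped with UniLawBench: the function names are hypothetical, and it assumes the input file holds a top-level JSON array of records, which the README does not specify.

```python
import json
from pathlib import Path


def records_to_jsonl(records):
    """Serialize each record as one JSON object per line."""
    # ensure_ascii=False keeps Chinese text readable instead of \u escapes.
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)


def json_to_jsonl(src, dst):
    """Convert a JSON array file into a JSONL file; return the record count."""
    records = json.loads(Path(src).read_text(encoding="utf-8"))
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")  # assumed input shape
    Path(dst).write_text(records_to_jsonl(records), encoding="utf-8")
    return len(records)
```

JSONL is convenient for evaluation pipelines because each line can be read and processed independently.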

3. Launch the GUI (gui)

unilawbench gui

Running this command opens a tabbed PyQt5 window in which you can choose the model path, convert files, and perform related operations.


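📚 Task list (all 20)

| Cognitive level | ID | Task | Data source | Metric | Type |
|---|---|---|---|---|---|
| Legal knowledge memorization | 1-1 | Article recitation | FLK | ROUGE-L | Generation |
| | 1-2 | Knowledge question answering | JEC_QA | Accuracy | Single-choice |
| Legal knowledge understanding | 2-1 | Document proofreading | CAIL2022 | F0.5 | Generation |
| | 2-2 | Dispute focus identification | LAIC2021 | F1 | Multi-choice |
| | 2-3 | Marital dispute identification | AIStudio | F1 | Multi-choice |
| | 2-4 | Issue topic identification | CrimeKgAssitant | Accuracy | Single-choice |
| | 2-5 | Reading comprehension | CAIL2019 | rc-F1 | Extraction |
| | 2-6 | Named entity recognition | CAIL2021 | soft-F1 | Extraction |
| | 2-7 | Public opinion summarization | CAIL2022 | ROUGE-L | Generation |
| | 2-8 | Argument mining | CAIL2022 | Accuracy | Single-choice |
| | 2-9 | Event detection | LEVEN | F1 | Multi-choice |
| | 2-10 | Trigger word extraction | LEVEN | soft-F1 | Extraction |
| Legal knowledge application | 3-1 | Article prediction (fact-based) | CAIL2018 | F1 | Multi-choice |
| | 3-2 | Article prediction (scenario-based) | LawGPT_zh Project | ROUGE-L | Generation |
| | 3-3 | Charge prediction | CAIL2018 | F1 | Multi-choice |
| | 3-4 | Prison term prediction (without article text) | CAIL2018 | Normalized log-distance | Regression |
| | 3-5 | Prison term prediction (with article text) | CAIL2018 | Normalized log-distance | Regression |
| | 3-6 | Case analysis | JEC_QA | Accuracy | Single-choice |
| | 3-7 | Criminal amount calculation | LAIC2021 | Accuracy | Regression |
| | 3-8 | Consultation | hualv.com | ROUGE-L | Generation |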
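
Several metrics in the list above belong to the F-score family (F1, F0.5) or to ROUGE-L. As a rough illustration of what these scores measure, the sketch below computes the general F-beta score and an ROUGE-L F1 based on the longest common subsequence (LCS). It is not UniLawBench's scoring code: the function names are hypothetical, and comparing strings character by character is an assumption that suits unsegmented Chinese text.

```python
def f_beta(precision, recall, beta=1.0):
    """F-beta score: (1 + beta^2) * P * R / (beta^2 * P + R)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def lcs_length(a, b):
    """Length of the longest common subsequence, using one rolling DP row."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L reported as F1 over LCS-based precision and recall."""
    if not candidate or not reference:
        return 0.0
    lcs = lcs_length(candidate, reference)
    precision = lcs / len(candidate)
    recall = lcs / len(reference)
    return f_beta(precision, recall, beta=1.0)
```

With beta below 1, as in F0.5, precision counts for more than recall, which fits a proofreading task where a wrong correction is costlier than a missed one.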

🛠️ Directory structure

unilawbench/
├─ cli.py                 # main entry point
├─ dataset/               # bundled datasets (mcq / qa)
├─ utils/                 # data conversion and other utilities
├─ gui/                   # PyQt5 graphical interface (window.py)
└─ ...
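
To illustrate how a single entry point such as cli.py can serve the eval, convert, and gui subcommands, here is a hypothetical argparse sketch. It is not the actual contents of cli.py: the function names, the choices lists, and which flags are required are all assumptions inferred from the example commands in this README.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the subcommands shown in this README."""
    parser = argparse.ArgumentParser(prog="unilawbench")
    sub = parser.add_subparsers(dest="command", required=True)

    ev = sub.add_parser("eval", help="run an evaluation")
    ev.add_argument("-form", choices=["mcq", "qa"], required=True)  # assumed choices
    ev.add_argument("-set", nargs="+", default=[], metavar="ID")
    ev.add_argument("--run-all", action="store_true")
    ev.add_argument("-model", required=True)

    cv = sub.add_parser("convert", help="convert a dataset file")
    cv.add_argument("--type", choices=["mcq", "focus"], required=True)  # assumed choices
    cv.add_argument("src")
    cv.add_argument("dst")

    sub.add_parser("gui", help="launch the PyQt5 window")
    return parser


def main(argv=None):
    """Entry point that a console_scripts entry could reference (name assumed)."""
    args = build_parser().parse_args(argv)
    # A real implementation would call the matching handler here.
    print(f"would dispatch to: {args.command}")
    return 0
```

Calling main(["gui"]) exercises the dispatch path without starting any window, since the sketch only prints the selected subcommand.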

📜 License

  • Code: Apache-2.0
  • Data: follows the license of each upstream dataset; see dataset/README.md for details

📑 Citation

@article{fei2023lawbench,
  title   = {LawBench: Benchmarking Legal Knowledge of Large Language Models},
  author  = {Fei, Zhiwei and Shen, Xiaoyu and others},
  journal = {arXiv preprint arXiv:2309.16289},
  year    = {2023}
}

If UniLawBench helps your research or business, please cite the paper above and note in your text that you used this tool 🙏

Download files

Source Distribution

unilawbench-1.5.1.tar.gz (64.0 MB)

Uploaded Source

Built Distribution

unilawbench-1.5.1-py3-none-any.whl (64.6 MB)

Uploaded Python 3

File details

Details for the file unilawbench-1.5.1.tar.gz.

File metadata

  • Download URL: unilawbench-1.5.1.tar.gz
  • Upload date:
  • Size: 64.0 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for unilawbench-1.5.1.tar.gz
Algorithm Hash digest
SHA256 bf84ad989a44017f0e33736b57ba9cd3c66c37bdd4602eb009a833bc84a0ba31
MD5 cc09c661aa405cc0c7db8bd0f3a01d84
BLAKE2b-256 c11f5ea7eed1c8551aa0c54bfa8f3ae4ba32abfbd5e4ca5f9744615a9d20234d

File details

Details for the file unilawbench-1.5.1-py3-none-any.whl.

File metadata

  • Download URL: unilawbench-1.5.1-py3-none-any.whl
  • Upload date:
  • Size: 64.6 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for unilawbench-1.5.1-py3-none-any.whl
Algorithm Hash digest
SHA256 fbb222992de0fee8a6f5cac6c48b1464bba28fc89ee044e9ef95483381aacbbd
MD5 691797c63cfe13663fc4d31973460995
BLAKE2b-256 7458cdd6d91f427c7d5822e687990185a287971113d1f9be35fd7a3c28683765
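
A downloaded file can be checked against the SHA256 digests listed above. The sketch below uses only the standard library; the helper names are hypothetical, and the local path passed to verify is whatever location you downloaded the file to.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file details above.
EXPECTED_SHA256 = {
    "unilawbench-1.5.1.tar.gz":
        "bf84ad989a44017f0e33736b57ba9cd3c66c37bdd4602eb009a833bc84a0ba31",
    "unilawbench-1.5.1-py3-none-any.whl":
        "fbb222992de0fee8a6f5cac6c48b1464bba28fc89ee044e9ef95483381aacbbd",
}


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so that large archives are not loaded at once."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path):
    """Return True only if the file name is known and its digest matches."""
    path = Path(path)
    expected = EXPECTED_SHA256.get(path.name)
    return expected is not None and sha256_of(path) == expected
```

For example, verify("unilawbench-1.5.1.tar.gz") returns True only when the file in the current directory matches the published digest.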
