LLM Chinese Couplet Generation Evaluation Framework

DuiZhang (对仗)

DuiZhang is an evaluation framework for LLM Chinese couplet generation. It evaluates and compares large language models' Chinese language capabilities through traditional couplet (对联) generation tasks.

Features

Rule-based Evaluation: 6 core metrics (POS matching, structure, rhythm, tone, content relevance, imagery correspondence) + 2 base metrics (length match, no duplicate)
LLM Self-Evaluation: Model self-assessment across all 6 dimensions for meta-cognitive analysis
PDF RAG Pipeline: Build knowledge bases from academic PDFs and generate literature reviews
Multi-Model Comparison: Evaluate multiple Ollama models side by side with radar chart visualization
CLI & Python API: Full command-line interface and programmatic access
Offline Design: All processing runs locally via Ollama, so no external API calls are required
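
To make the PDF RAG Pipeline feature more concrete, here is a minimal sketch of the retrieval step: ranking stored text chunks by cosine similarity to a query embedding. The project structure lists a FAISS knowledge base builder; this sketch swaps in a brute-force pure-Python search for illustration only. The names `cosine` and `top_k` and the toy three-dimensional vectors are assumptions, not part of DuiZhang's API.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query_vec, chunks, k=2):
    """Return the k (score, text) pairs most similar to query_vec.

    chunks is a list of (text, vector) tuples; in a real pipeline the
    vectors would come from an embedding model such as nomic-embed-text.
    """
    scored = [(cosine(query_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Toy knowledge base with hand-written vectors (hypothetical data).
chunks = [
    ("Tonal patterns in couplets", [1.0, 0.0, 0.0]),
    ("Parallelism", [0.0, 1.0, 0.0]),
    ("Tone and parallelism", [0.7, 0.7, 0.0]),
]
results = top_k([1.0, 0.1, 0.0], chunks, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

A vector index such as FAISS serves the same purpose at scale; the brute-force loop is only practical for small collections.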

Requirements

Python 3.10+
Ollama running locally
Chat model (e.g., qwen2.5:7b)
Embedding model (e.g., nomic-embed-text)
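
Before running an evaluation, it can help to confirm that Ollama is reachable and that both models are pulled. The sketch below is not part of DuiZhang; it assumes Ollama's default address (`http://localhost:11434`) and its `/api/tags` endpoint, which lists locally available models. The helper names `missing_models`, `fetch_tags`, and `check_local_ollama` are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumed default local address

def missing_models(tags_payload, required):
    """Return the required model names absent from an /api/tags payload."""
    installed = {m.get("name", "") for m in tags_payload.get("models", [])}
    missing = []
    for name in required:
        # Models pulled without an explicit tag are listed as "<name>:latest".
        candidates = {name} if ":" in name else {name, f"{name}:latest"}
        if not candidates & installed:
            missing.append(name)
    return missing

def fetch_tags(base_url=OLLAMA_URL, timeout=5):
    """Fetch the list of local models from a running Ollama server."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)

def check_local_ollama(required=("qwen2.5:7b", "nomic-embed-text")):
    """Print which required models still need `ollama pull`."""
    try:
        payload = fetch_tags()
    except OSError as exc:  # connection refused, timeout, and similar
        print(f"Ollama not reachable: {exc}")
        return
    missing = missing_models(payload, required)
    print("All models present" if not missing else f"Missing: {missing}")
```

Calling `check_local_ollama()` with the server running reports any model that still needs to be pulled.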

Installation

# From PyPI
pip install duizhang

# From source
git clone https://github.com/cycleuser/DuiZhang.git
cd DuiZhang
pip install -e .

Quick Start

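# Start the Ollama server (it runs in the foreground, so use a separate terminal for the commands below)
ollama serve

# Pull the chat and embedding models
ollama pull qwen2.5:7b
ollama pull nomic-embed-text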

# Evaluate a model
duizhang eval --model qwen2.5:7b --samples 10

# Quick generation
duizhang run --model qwen2.5:7b --input "春风送暖入屠苏"

# Process PDFs
duizhang pdf --force-rebuild

# List available models
duizhang models

# Show configuration
duizhang config show

Python API

from duizhang import evaluate_couplet, index_documents, ToolResult

# Evaluate a single couplet
result = evaluate_couplet(
    input_line="春风送暖",
    expected="冬雪飘香",
    generated="秋月寒江",
)
print(result.metrics)

# Index PDF documents
result = index_documents(["paper1.pdf", "paper2.pdf"])
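
When several couplets are evaluated, the per-sample metrics usually need to be averaged into one summary. The following sketch assumes that `result.metrics` is a plain dict mapping metric names to floats in the 0-1 range; the source does not confirm that shape, and `average_metrics` and the metric keys shown are hypothetical.

```python
from statistics import mean

def average_metrics(per_sample):
    """Average each metric across samples.

    per_sample is a list of dicts (metric name -> score). A metric that
    is missing from a sample is skipped for that sample.
    """
    names = sorted({name for sample in per_sample for name in sample})
    return {
        name: mean(s[name] for s in per_sample if name in s)
        for name in names
    }

# Hypothetical metric dicts, standing in for result.metrics values.
samples = [
    {"pos_match": 1.0, "tone_match": 0.5},
    {"pos_match": 0.5, "tone_match": 0.75},
]
summary = average_metrics(samples)
print(summary)
```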

Evaluation Metrics

Rule-based Metrics

Metric	Range	Description
POS Match	0-1	Part-of-speech correspondence between lines
Structure Match	0-1	Punctuation position alignment
Rhythm Match	0-1	Word segmentation length pattern matching
Tone Match	0-1	Ping/Ze (平仄, level/oblique) tone opposition
Content Relevance	0-1	TF-IDF semantic similarity
Imagery Correspondence	0-1	Noun quantity and reflection matching
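
As an illustration of how a rule-based metric of this kind can be computed, the sketch below scores Ping/Ze opposition position by position. It is not DuiZhang's implementation. It assumes a simplification in which modern Mandarin tones 1 and 2 count as ping and tones 3 and 4 count as ze, and it uses a tiny hand-written tone table (`TONES`) that covers only the example characters. A real implementation would need a full pronunciation dictionary.

```python
# Hypothetical tone table: Mandarin tone numbers for the example characters only.
TONES = {
    "春": 1, "风": 1, "送": 4, "暖": 3,
    "秋": 1, "月": 4, "寒": 2, "江": 1,
}

def tone_class(char):
    """Return 'P' (ping, tones 1-2), 'Z' (ze, tones 3-4), or None if unknown."""
    tone = TONES.get(char)
    if tone is None:
        return None
    return "P" if tone in (1, 2) else "Z"

def tone_match(upper, lower):
    """Fraction of aligned positions whose tone classes are opposed.

    Positions with an unknown character count as not opposed.
    """
    pairs = list(zip(upper, lower))
    if not pairs:
        return 0.0
    opposed = 0
    for a, b in pairs:
        class_a, class_b = tone_class(a), tone_class(b)
        if class_a is not None and class_b is not None and class_a != class_b:
            opposed += 1
    return opposed / len(pairs)

score = tone_match("春风送暖", "秋月寒江")
print(score)  # 0.75: three of the four positions oppose (春/秋 are both ping)
```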

Base Metrics

Metric	Range	Description
Length Match	0-1	Generated vs expected length alignment
No Duplicate	0-1	Character non-repetition between lines
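
The two base metrics are simple enough to sketch directly. The formulas below are assumptions chosen to fit the 0-1 range in the table, not the project's confirmed definitions: length match falls linearly with the length gap, and no duplicate is the share of generated characters that do not reuse a character from the input line.

```python
def length_match(generated, expected):
    """1.0 when lengths are equal, decreasing linearly with the length gap."""
    longest = max(len(generated), len(expected))
    if longest == 0:
        return 1.0
    return 1.0 - abs(len(generated) - len(expected)) / longest

def no_duplicate(input_line, generated):
    """Fraction of generated characters that do not appear in the input line."""
    if not generated:
        return 0.0
    seen = set(input_line)
    reused = sum(1 for ch in generated if ch in seen)
    return 1.0 - reused / len(generated)

print(length_match("秋月寒江", "冬雪飘香"))   # 1.0
print(length_match("秋月", "冬雪飘香"))       # 0.5
print(no_duplicate("春风送暖", "秋月寒江"))   # 1.0
print(no_duplicate("春风送暖", "春月寒江"))   # 0.75
```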

LLM Self-Evaluation

The model rates its own output on the same 6 rule-based dimensions. Comparing the rule-based scores with the self-assessed scores enables meta-cognitive analysis: it shows where the model overestimates or underestimates its own output.
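
One simple way to carry out that comparison is to compute, per dimension, the difference between the self-assessed score and the rule-based score. The sketch below assumes both score sets are dicts on the same 0-1 scale with matching keys; `self_assessment_gap` and the dimension names are hypothetical.

```python
def self_assessment_gap(rule_scores, self_scores):
    """Compare self-assessed scores with rule-based scores.

    Returns (gaps, mean_abs_gap). gaps maps each shared dimension to
    self score minus rule score, so a positive value means the model
    rated itself higher than the rules did.
    """
    shared = sorted(set(rule_scores) & set(self_scores))
    gaps = {dim: self_scores[dim] - rule_scores[dim] for dim in shared}
    mean_abs = sum(abs(g) for g in gaps.values()) / len(gaps) if gaps else 0.0
    return gaps, mean_abs

# Hypothetical scores for three of the six dimensions.
rule = {"pos": 0.5, "tone": 0.75, "rhythm": 1.0}
self_rated = {"pos": 1.0, "tone": 0.75, "rhythm": 0.5}
gaps, mean_abs = self_assessment_gap(rule, self_rated)
print(gaps, round(mean_abs, 3))
```

Here the model overrates its part-of-speech matching, underrates its rhythm, and agrees with the rules on tone.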

Project Structure

DuiZhang/
├── duizhang/
│   ├── __init__.py          # Version & public API
│   ├── __main__.py          # python -m duizhang entry
│   ├── api.py               # Unified Python API
│   ├── cli.py               # Entry point routing
│   ├── cli_app.py           # CLI application
│   ├── tools.py             # OpenAI function-calling tools
│   ├── core/
│   │   ├── __init__.py      # Core module exports
│   │   ├── config.py        # Configuration dataclass
│   │   ├── constants.py     # Application constants
│   │   ├── errors.py        # Custom exception hierarchy
│   │   ├── ollama_client.py # Ollama API client
│   │   ├── pdf_processor.py # PDF text extraction
│   │   ├── kb_builder.py    # FAISS knowledge base builder
│   │   └── summarizer.py    # Document summarization
│   ├── evaluator/
│   │   ├── __init__.py
│   │   ├── metrics.py       # Rule-based metrics
│   │   ├── llm_metrics.py   # LLM self-evaluation
│   │   ├── data_loader.py   # Couplet data loading
│   │   ├── evaluator.py     # Main evaluation engine
│   │   └── visualizer.py    # Radar chart visualization
│   ├── pdf_rag/
│   │   ├── __init__.py
│   │   ├── pipeline.py      # PDF processing pipeline
│   │   └── report_generator.py
│   ├── templates/           # Web UI templates
│   └── static/              # CSS/JS assets
├── tests/                   # Test suite
├── data/                    # Runtime data (auto-created)
└── images/                  # Screenshots

Testing

pip install -e ".[dev]"
pytest -v
pytest -v --cov=duizhang

License

GPL-3.0-or-later. See LICENSE for details.

This version: 0.1.0 (released Apr 23, 2026)

Download files

Source Distribution

duizhang-0.1.0.tar.gz (46.9 kB), uploaded Apr 23, 2026
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

Algorithm	Hash digest
SHA256	`d851319583acdb70bf603379c70c9cf56a218248453e8d468daa2e061eca6b2f`
MD5	`fa95c37d08f34ec0bc528cf9d0389ee7`
BLAKE2b-256	`c719adeaddf038b525462c79c727a98953c99f916510169e80fabf7c39952086`

Built Distribution

duizhang-0.1.0-py3-none-any.whl (46.9 kB), uploaded Apr 23, 2026
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.12.12

Algorithm	Hash digest
SHA256	`01a09e6efdd652cd08d3cb7bfc8d80f991ce778710c6f66ab34046bcb87a2a75`
MD5	`fa8b4fe1f1222e4091698882337d5b1e`
BLAKE2b-256	`7813ea65cf7448aa34ec22558959c7adb6f920b44a55ffd98e0704329fc36ed7`
