LLM-based evaluation of multiple-choice items against item-writing guidelines

itemwise

LLM-based evaluation of multiple-choice items against the 43 item-writing rules from Haladyna & Downing (1989). Works with any LLM provider via litellm.

Installation

pip install itemwise

Requires Python 3.12+.

Quick Start

from itemwise import evaluate

result = evaluate(
    item={
        "stem": "Which of the following is NOT a characteristic of mammals?",
        "options": [
            "They are warm-blooded",
            "They lay eggs",
            "They have hair or fur",
            "They produce milk",
        ],
        "correct": 1,  # zero-based index into "options" ("They lay eggs")
    },
    model="azure/gpt-5.1-chat",
)

print(result.score)            # fraction of rules passed
print(result.violations)       # list of failed RuleResult
print(result.usage.cost)       # LLM cost in USD

Usage

from itemwise import evaluate, evaluate_batch, async_evaluate_batch

# `item` is a dict shaped like the one in Quick Start; `items` is a list of such dicts.

# Select specific rules
evaluate(item=item, model="azure/gpt-5.1-chat", rules=[22, 28, 37])

# Batch with progress bar (disable via progress=False)
evaluate_batch(items=items, model="azure/gpt-5.1-chat")

# Async / parallel (await from inside an async function)
await async_evaluate_batch(items=items, model="azure/gpt-5.1-chat")

# Extra kwargs are forwarded to litellm
evaluate(item=item, model="azure/gpt-5.1-chat", reasoning_effort="low")

CLI

itemwise evaluate questions.json --model azure/gpt-5.1-chat
itemwise evaluate questions.json --model azure/gpt-5.1-chat --rules 22,28,37 --param reasoning_effort=low

Input JSON format:

[{"stem": "...", "options": ["A", "B", "C", "D"], "correct": 0}]
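
Before sending a file to the CLI, it can help to check that each item has the three keys shown above. The following is a minimal sketch, not part of itemwise: `validate_items` is a hypothetical helper, the requirement of at least two options is an assumption, and treating `correct` as a zero-based index is inferred from the examples in this README.

```python
import json


def validate_items(items):
    """Return a list of problems; an empty list means the structure looks fine."""
    problems = []
    for i, item in enumerate(items):
        if not isinstance(item, dict):
            problems.append(f"item {i}: must be a JSON object")
            continue
        stem = item.get("stem")
        options = item.get("options")
        correct = item.get("correct")
        if not isinstance(stem, str) or not stem.strip():
            problems.append(f"item {i}: 'stem' must be a non-empty string")
        # Assumption: a multiple-choice item needs at least two options.
        if not isinstance(options, list) or len(options) < 2:
            problems.append(f"item {i}: 'options' must be a list with at least two entries")
            continue
        # Assumption: 'correct' is a zero-based index into 'options'.
        is_index = isinstance(correct, int) and not isinstance(correct, bool)
        if not is_index or not 0 <= correct < len(options):
            problems.append(f"item {i}: 'correct' must be an index into 'options'")
    return problems


items = [
    {
        "stem": "Which of the following is NOT a characteristic of mammals?",
        "options": [
            "They are warm-blooded",
            "They lay eggs",
            "They have hair or fur",
            "They produce milk",
        ],
        "correct": 1,
    }
]

print(validate_items(items))        # structural problems, if any
print(json.dumps(items, indent=2))  # text that could be saved as questions.json
```

This only checks structure; it says nothing about whether an item follows the item-writing rules, which is what the LLM evaluation is for.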

LLM Configuration

Model names and parameters follow litellm conventions. For Azure OpenAI:

export AZURE_API_KEY=...
export AZURE_API_BASE=https://your-resource.cognitiveservices.azure.com/
export AZURE_API_VERSION=2024-12-01-preview
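
A missing environment variable usually surfaces only as a provider error at call time, so a quick check up front can save a round trip. This sketch uses only the standard library; `missing_azure_vars` is a hypothetical helper, and the list of required names is simply the three variables exported above.

```python
import os

# The three variables exported in the shell snippet above.
REQUIRED_AZURE_VARS = ("AZURE_API_KEY", "AZURE_API_BASE", "AZURE_API_VERSION")


def missing_azure_vars(environ=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_AZURE_VARS if not env.get(name)]


missing = missing_azure_vars()
if missing:
    print("Set these before running an evaluation:", ", ".join(missing))
else:
    print("Azure OpenAI environment looks complete.")
```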

Item-Writing Rules

43 rules from Haladyna & Downing (1989) across 6 categories:

Category	Rules	Description
General (Procedural)	1-7	Format, grammar, readability
General (Content)	8-17	Objectives, vocabulary, higher-order thinking
Stem Construction	18-23	Clarity, positive wording
General Option	24-35	Count, order, homogeneity, length
Correct Option	36-37	Position distribution, uniqueness
Distractor	38-43	Plausibility, common errors

Rules 11 (item independence) and 36 (answer position distribution) require cross-item analysis and are excluded by default. Pass them explicitly via rules=[11, 36] to include them.
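
The table and the default exclusions can be written down as plain data, which is handy when building a `rules=[...]` list by category. This is an illustrative sketch only: `CATEGORY_RANGES`, `category_of`, and `default_rules` are hypothetical names, not part of the itemwise API, and the numbers are copied from the table above.

```python
# Rule number ranges per category, copied from the table above.
CATEGORY_RANGES = {
    "General (Procedural)": range(1, 8),
    "General (Content)": range(8, 18),
    "Stem Construction": range(18, 24),
    "General Option": range(24, 36),
    "Correct Option": range(36, 38),
    "Distractor": range(38, 44),
}

# Rules that need cross-item analysis and are excluded by default.
CROSS_ITEM_RULES = {11, 36}


def category_of(rule):
    """Return the category name for a rule number between 1 and 43."""
    for name, numbers in CATEGORY_RANGES.items():
        if rule in numbers:
            return name
    raise ValueError(f"unknown rule number: {rule}")


def default_rules():
    """All rule numbers except the cross-item ones."""
    return [r for r in range(1, 44) if r not in CROSS_ITEM_RULES]


# Example: single-item rules plus the two cross-item rules, in numeric order.
all_rules = sorted(default_rules() + list(CROSS_ITEM_RULES))
stem_rules = list(CATEGORY_RANGES["Stem Construction"])
print(len(default_rules()), len(all_rules), stem_rules)
```

A list such as `stem_rules` could then be passed as the `rules` argument shown in the Usage section.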

References

Haladyna, T. M., & Downing, S. M. (1989). A taxonomy of multiple-choice item-writing rules. Applied Measurement in Education, 2(1), 37-50.
Haladyna, T. M., Downing, S. M., & Rodriguez, M. C. (2002). A review of multiple-choice item-writing guidelines for classroom assessment. Applied Measurement in Education, 15(3), 309-333.

License

MIT

Project details

Maintainers

mathbullet

Release history

0.1.1 (this version): Apr 24, 2026
0.1.0: Apr 24, 2026

Download files

Source Distribution

itemwise-0.1.1.tar.gz (8.4 kB)

Uploaded Apr 24, 2026 Source

Built Distribution

itemwise-0.1.1-py3-none-any.whl (10.7 kB)

Uploaded Apr 24, 2026 Python 3

File details

Details for the file itemwise-0.1.1.tar.gz.

File metadata

Download URL: itemwise-0.1.1.tar.gz
Upload date: Apr 24, 2026
Size: 8.4 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.11.7 {"installer":{"name":"uv","version":"0.11.7","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for itemwise-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`0ee46284e0e3955ee419580806d31f676638abd59a0ea2ebcd25aed588f5a340`
MD5	`69d0835c4e5cf306653baeac60d90b19`
BLAKE2b-256	`2f9ad3802458a592969b6738c6ba88daf4f0a8bff01ce2774f46b19d44339b51`

File details

Details for the file itemwise-0.1.1-py3-none-any.whl.

File metadata

Download URL: itemwise-0.1.1-py3-none-any.whl
Upload date: Apr 24, 2026
Size: 10.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: uv/0.11.7 {"installer":{"name":"uv","version":"0.11.7","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}

File hashes

Hashes for itemwise-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`678d3c09091ab7237b019cc5e368e1e5c9c4613170903c9a22ea55122008ae30`
MD5	`057e3880b5743d957fc80e3d9e69632a`
BLAKE2b-256	`0f774fc495e73b72fe0e81d11e9da5df9994fd0b43c5c6169abae51dbcc49cf7`
