Automated mutation-based differential testing for Python type checkers
Pytifex
Automated mutation-based differential testing for Python type checkers. Pytifex discovers disagreements between type checkers by mining historical bug reports, generating targeted test cases with an LLM, and establishing ground truth through runtime validation.
For more information, see the Pytifex documentation.
Note: Pytifex implements a bug-seeded mutation methodology for proactively finding type checker bugs before users encounter them.
Type Checkers
Pytifex tests the following type checkers:
| Checker | Version | Repository |
|---|---|---|
| mypy | 1.19.0 | python/mypy |
| pyrefly | 0.44.2 | facebook/pyrefly |
| zuban | 0.3.0 | zubanls/zuban |
| ty | 0.0.1-alpha.32 | astral-sh/ty |
Divergence Patterns
| Pattern | Description | PEPs |
|---|---|---|
| `protocol-defaults` | Protocol methods with different default argument values | 544 |
| `typed-dict-total` | TypedDict with mixed total/Required/NotRequired inheritance | 589, 655 |
| `typeguard-narrowing` | TypeGuard/TypeIs with generic type parameters | 647, 742 |
| `param-spec-decorator` | ParamSpec decorators on classmethods/staticmethods | 612 |
| `self-generic` | Self type in generic classes with abstract methods | 673 |
| `newtype-containers` | NewType in containers (covariance/contravariance) | 484 |
| `overload-literals` | Overloaded functions with Literal type discrimination | 484, 586 |
| `final-override` | Final attributes overridden by properties | 591 |
| `keyword-vs-positional` | Protocol callables with keyword-only parameters | 544, 570 |
| `bounded-typevars` | TypeVar bounds with nested generics | 484 |
Installation
Prerequisites: Python 3.12+, uv
git clone https://github.com/benedekaibas/pytifex-demo.git
cd pytifex-demo/src/tc_disagreement
export GEMINI_API_KEY=your_key # Required
export GITHUB_TOKEN=ghp_your_token # Optional
Note: Type checkers are automatically installed by uv when you run the tool.
Usage
# Run the full pipeline: mine → generate → filter → evaluate
uv run main.py
# Generate until N disagreements are found
uv run main.py --num-examples 10
# Use a different model
uv run main.py --model gemini-2.5-pro
# Skip GitHub seed fetching
uv run main.py --no-github
Commands
| Command | Description |
|---|---|
| `uv run main.py` | Full pipeline (generate + evaluate) |
| `uv run main.py generate` | Generate disagreements only |
| `uv run main.py check` | Run type checkers on existing examples |
| `uv run main.py eval` | Evaluate existing results |
Options
| Option | Default | Description |
|---|---|---|
| `--num-examples N` | 5 | Target disagreement count |
| `--batch-size N` | 15 | Examples per LLM batch |
| `--max-attempts N` | 5 | Max generation attempts |
| `--max-refinements N` | 2 | Refinement attempts per example |
| `--model MODEL` | `gemini-2.5-flash` | Gemini model |
| `--eval-method METHOD` | `comprehensive` | Evaluation method |
| `--no-github` | n/a | Skip GitHub seed fetching |
| `-v, --verbose` | n/a | Show all examples |
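One plausible reading of how `--num-examples`, `--batch-size`, and `--max-attempts` interact is a bounded generate-and-filter loop. The sketch below is an assumption about that control flow, not the actual implementation: `collect_disagreements`, `generate_batch`, and `is_disagreement` are hypothetical names, and the two callables stand in for the LLM call and the checker comparison.

```python
from typing import Callable


def collect_disagreements(
    generate_batch: Callable[[int], list[str]],
    is_disagreement: Callable[[str], bool],
    num_examples: int = 5,   # mirrors --num-examples
    batch_size: int = 15,    # mirrors --batch-size
    max_attempts: int = 5,   # mirrors --max-attempts
) -> list[str]:
    """Request batches until enough disagreements are found or attempts run out."""
    found: list[str] = []
    for _ in range(max_attempts):
        for example in generate_batch(batch_size):
            if is_disagreement(example):
                found.append(example)
        if len(found) >= num_examples:
            break
    # May return fewer than num_examples if every attempt was used up.
    return found[:num_examples]
```

Under this reading, raising `--batch-size` trades more LLM output per call for fewer calls, while `--max-attempts` caps total cost when disagreements are rare.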
Evaluation
Pytifex uses a multi-phase evaluation oracle to determine which checker is correct:
| Phase | Method | Confidence |
|---|---|---|
| 0 | AST-based PEP specification oracle | 0.85–0.95 |
| 1 | Runtime crash detection | 0.95–1.0 |
| 2 | Hypothesis property-based testing | 0.85 |
| 3 | PEP specification compliance matching | 0.80 |
| 4 | Static flow analysis | 0.80 |
Key insight: Runtime behavior is the ultimate ground truth. If code raises TypeError at runtime, any checker that reported "OK" is definitively wrong.
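The runtime phase can be pictured as follows: execute the example, and if it raises `TypeError`, mark every checker that accepted the code as incorrect. This is a simplified sketch under stated assumptions. The function names and the `"ok"`/`"error"` labels are hypothetical, and it uses in-process `exec` for brevity, whereas a real tool would more likely run untrusted generated code in an isolated subprocess.

```python
def raises_type_error(source: str) -> bool:
    """Run `source` in a fresh namespace and report whether it raised TypeError."""
    namespace: dict[str, object] = {"__name__": "__main__"}
    try:
        exec(compile(source, "<example>", "exec"), namespace)
    except TypeError:
        return True
    except Exception:
        # Other failures say nothing about type correctness in this sketch.
        return False
    return False


def judge(source: str, reports: dict[str, str]) -> dict[str, str]:
    """Label each checker given its report ("ok" or "error") for `source`.

    Returns an empty dict when the code runs without TypeError, because a
    clean run alone does not prove that the code is well typed.
    """
    if not raises_type_error(source):
        return {}
    return {
        checker: "correct" if report == "error" else "incorrect"
        for checker, report in reports.items()
    }


if __name__ == "__main__":
    print(judge('"1" + 1', {"mypy": "error", "ty": "ok"}))
```

Note the asymmetry: a crash can convict a checker that said "OK", but a clean run cannot by itself convict a checker that reported an error, which is presumably why the other phases exist.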