dto-strict
AST-based linter for Python DTO discipline and facade-ban enforcement — pluggable, framework-agnostic.
Why dto-strict?
Data Transfer Objects (DTOs) provide a critical boundary between services and prevent the fragmentation of business-logic definitions across codebases. However, when function signatures leak Dict[str, Any] or when services build dict literals inline instead of using structured DTOs, code becomes:
- Loosely typed: Shape mismatches only surface at runtime.
- Duplicated: The same business object gets redefined wherever it's used.
- Hard to evolve: Changing a field requires updating dicts in 10+ places.
Facade functions (module-level helpers that wrap framework machinery) similarly tend to proliferate and obscure intent when unmarked. Marking them with an exception tag such as "facade — celery schedule" makes the intent explicit.
Why in healthcare? Healthcare systems (HIPAA-regulated platforms that handle PHI) benefit from strong DTO boundaries because they force explicit thinking about what data is structured, typed, and auditable. When handling patient records, medical documents, and compliance reports, untyped dicts create liability: a field can be added silently or change shape unpredictably, and no type checker catches missing PII handling.
dto-strict enforces DTO and facade discipline via static AST analysis, with 6 focused rules:
- R001 (HIGH): Detect Dict[str, Any] or bare dict/list/tuple in service-layer function signatures (strict mode optional).
- R002 (MEDIUM): Flag inline dict literals with 3+ string keys; exception tags can require justification.
- R003 (MEDIUM): Flag repr=False in dataclasses (v0.2 canonical: plain @dataclass(frozen=True, slots=True) without repr=False; legacy mode available).
- R004 (HIGH): Demand exception tags on module-level functions (e.g., # facade — celery schedule).
- R005 (LOW): Encourage validators to use the DTO.from_dict() pattern.
- R006 (HIGH): Detect typing.Any in function signatures (parameters and return types).
All rules are configurable; violations can be disabled, severity overridden, or paths scoped.
Install
pip install dto-strict
Quick Start
Basic CLI Usage
# Lint a single file
dto-strict apps/compliance/services.py
# Lint a directory
dto-strict apps/
# Output as GitHub Actions annotations
dto-strict apps/ --format github
# Output as JSON
dto-strict apps/ --format json
Configuration (pyproject.toml)
[tool.dto-strict]
service_paths = [
"apps/*/services/*.py",
"**/services/*.py",
]
dto_paths = [
"**/dtos.py",
"**/dtos/*.py",
]
exception_tags = [
"facade — celery schedule",
"FRAMEWORK",
]
disabled_rules = ["R005"] # Disable low-priority rules if desired
severity_overrides = { "R002" = "LOW" } # Downgrade specific rules (valid values: "HIGH", "MEDIUM", "LOW")
Strict Mode (v0.2)
v0.2 introduces canonical mode alignment with modern DTO practices and strict collection detection:
[tool.dto-strict]
# R001: Catch bare dict/list/tuple without type parameters
strict_collections = true # Default: false. When true, bare collections trigger violations.
# R002: Require justification on exception tags + configurable dict key threshold
exception_tag_requires_justification = true # Default: false.
# Tags must now use format: "tag: explanation" (e.g., "facade — celery schedule: transient event payload")
min_dict_keys = 3 # NEW in v0.2: Threshold for R002 dict literal flagging (default: 3)
# Limit reuse of exception tags in a single file
max_exception_tags_per_file = 3 # Default: null (no limit)
# R003: Canonical mode (v0.2 default) flags repr=False as anti-canonical + strict/relaxed modes
r003_mode = "canonical" # Default: "canonical" (v0.2). Use "legacy" for v0.1 behavior.
# In canonical mode: @dataclass(frozen=True, slots=True) is correct; repr=False is flagged.
# In legacy mode: @dataclass must include frozen=True, slots=True, AND repr=False (v0.1 requirement).
r003_strict_repr = true # NEW in v0.2: In canonical mode, flag repr=False (default: true)
# Set to false for relaxed mode: only checks frozen+slots, ignores repr=False
# R004: NEW auto-detect class-method-wrapping pattern
# Module-level functions that delegate to class methods are now auto-detected
# (no exception tag needed; reduces false positives)
# R006: Scope typing.Any detection to specific paths
r006_paths = [
"apps/*/services/*.py",
"**/services/*.py",
]
Baseline Ratchet Mode (v0.2)
Accept current violations as "baseline" debt and track only new violations:
# Generate baseline from current state
dto-strict apps/ --generate-baseline > .dto-strict-baseline.json
# Subsequent runs accept baseline violations; new ones trigger failure
dto-strict apps/ --baseline .dto-strict-baseline.json
The baseline tracks violations by file, line, and rule ID. Once violations are fixed and removed from the codebase, the baseline can be regenerated (exit code 0 plus a notice on removal).
Why canonical mode? Per 2026-05-09 DTO-strict pivot, the canonical pattern is:
- @dataclass(frozen=True, slots=True): immutability + memory efficiency
- NO repr=False: let repr work normally; custom __repr__ not needed
- Store values, don't override output; if a field is PII-sensitive, use external redaction tools
GitHub Actions
Create .github/workflows/dto-strict.yml:
name: dto-strict
on:
  pull_request:
    paths: ['apps/**.py']
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - run: pip install dto-strict
      - run: dto-strict apps/ --format github
Pre-commit Hook
Add to .pre-commit-config.yaml:
- repo: local
  hooks:
    - id: dto-strict
      name: dto-strict
      entry: dto-strict
      language: python
      types: [python]
      additional_dependencies: ['dto-strict']
      stages: [commit]
Rules
R001: Dict[str, Any] and Bare Collections in Service Signatures (HIGH)
Service-layer functions should not accept or return Dict[str, Any]. With strict_collections=true, bare dict, list, and tuple without type parameters are also flagged.
Fail (always):
def process_user(config: Dict[str, Any]) -> Dict[str, Any]:
    return {"status": "ok"}
Fail (with strict_collections=true):
def fetch_users() -> list:  # Bare list
    return []

def merge_configs(base: dict, overrides: dict) -> dict:  # Bare dicts
    return {**base, **overrides}
Pass:
from dataclasses import dataclass
from typing import Any, Dict

@dataclass(frozen=True, slots=True)
class UserConfigDTO:
    timeout: int
    retries: int

def process_user(config: UserConfigDTO) -> Dict[str, str]:
    return {"status": "ok"}

def fetch_users() -> list[UserDTO]:  # Typed list
    return []

# Typed dicts pass R001; note that R006 may still report the Any here
def merge_configs(base: dict[str, Any], overrides: dict[str, Any]) -> dict[str, Any]:
    return {**base, **overrides}
Rationale: Typed parameters enable IDE completion and catch shape mismatches early. Bare collections hide shape from static checkers and readers.
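For readers curious how a check like R001 can work on the AST, the following is a minimal sketch, not dto-strict's actual implementation. The function name find_dict_any_signatures and the BANNED set are hypothetical, and matching on the unparsed annotation text is a simplification chosen for brevity.

```python
import ast

# Annotation spellings this sketch treats as violations (an assumption).
BANNED = {"Dict[str, Any]", "typing.Dict[str, Any]"}

SOURCE = '''
from typing import Any, Dict

def process_user(config: Dict[str, Any]) -> Dict[str, Any]:
    return {"status": "ok"}

def fetch_users() -> list[str]:
    return []
'''


def find_dict_any_signatures(source: str) -> list[str]:
    """Return names of functions whose parameters or return type are banned."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        args = node.args
        annotations = [a.annotation for a in args.posonlyargs + args.args + args.kwonlyargs]
        annotations.append(node.returns)
        # ast.unparse (Python 3.9+) turns an annotation node back into source text.
        if any(a is not None and ast.unparse(a) in BANNED for a in annotations):
            flagged.append(node.name)
    return sorted(flagged)


print(find_dict_any_signatures(SOURCE))  # ['process_user']
```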
R002: Inline Dict Literals (MEDIUM)
Service files with inline dict literals containing 3+ string keys should define a DTO instead. Exception tags allow one-off inline dicts; with exception_tag_requires_justification=true, tags must include a colon-delimited explanation.
Fail (no tag):
def build_response(user_id: int) -> dict:
    return {
        "user_id": user_id,
        "status": "active",
        "timestamp": "2025-01-01",
    }
Fail (tag without justification, if required):
def build_response(user_id: int) -> dict:  # facade — celery schedule
    return {
        "user_id": user_id,
        "status": "active",
        "timestamp": "2025-01-01",
    }
Pass (with justified tag):
def build_response(user_id: int) -> dict:  # facade — celery schedule: SNS event envelope (transient)
    return {
        "user_id": user_id,
        "status": "active",
        "timestamp": "2025-01-01",
    }
Pass (define DTO instead):
@dataclass(frozen=True, slots=True)
class ResponseDTO:
    user_id: int
    status: str
    timestamp: str

def build_response(user_id: int) -> ResponseDTO:
    return ResponseDTO(user_id, "active", "2025-01-01")
Rationale: Shared shapes should live in DTOs. Inline dicts make duplication invisible. Exception tags are for rare transient payloads; they should explain why.
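The key-count threshold can be illustrated with a short AST walk. This is a sketch under assumptions, not the linter's code: find_wide_dict_literals is a hypothetical name, and only literal string keys are counted.

```python
import ast

SOURCE = '''
small = {"a": 1, "b": 2}
payload = {
    "user_id": 1,
    "status": "active",
    "timestamp": "2025-01-01",
}
'''


def find_wide_dict_literals(source: str, min_dict_keys: int = 3) -> list[tuple[int, int]]:
    """Return (line, string-key count) for dict literals at or above the threshold."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Dict):
            continue
        # node.keys holds None for "**other" unpacking; those entries are skipped.
        string_keys = [
            key for key in node.keys
            if isinstance(key, ast.Constant) and isinstance(key.value, str)
        ]
        if len(string_keys) >= min_dict_keys:
            findings.append((node.lineno, len(string_keys)))
    return sorted(findings)


print(find_wide_dict_literals(SOURCE))  # [(3, 3)]
```

Raising min_dict_keys to 4 in this sketch would let the three-key payload through, which mirrors what the min_dict_keys setting is described as controlling.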
R003: Dataclass Canonical Form (MEDIUM)
Canonical mode (v0.2 default): Dataclasses must use frozen=True, slots=True WITHOUT repr=False.
Legacy mode (v0.1): Requires frozen=True, slots=True, repr=False.
Canonical Mode (v0.2)
Fail (canonical mode):
@dataclass(frozen=True, slots=True, repr=False)  # Anti-canonical: has repr=False
class UserDTO:
    user_id: int
Pass (canonical mode):
@dataclass(frozen=True, slots=True)
class UserDTO:
    user_id: int

@dataclass(frozen=True, slots=True)  # Both params present
class ConfigDTO:
    timeout: int
Rationale (canonical):
- frozen=True: Immutability enforces single-source-of-truth.
- slots=True: Memory efficiency and prevents attribute typos.
- NO repr=False: Default repr is fine; if a field is sensitive, use external redaction (logging mixin, etc.)
Legacy Mode (v0.1)
Use r003_mode = "legacy" in pyproject.toml if your codebase still requires repr=False:
@dataclass(frozen=True, slots=True, repr=False)
class UserDTO:
    user_id: int
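As an illustration of the canonical-mode check, the sketch below inspects the keyword arguments of a @dataclass decorator. It is an approximation under stated assumptions: check_dataclass_form and its message strings are hypothetical, and only literal True/False keyword values are understood.

```python
import ast

SOURCE = '''
@dataclass(frozen=True, slots=True)
class GoodDTO:
    user_id: int

@dataclass(frozen=True, slots=True, repr=False)
class HiddenDTO:
    user_id: int

@dataclass
class LooseDTO:
    user_id: int
'''

DATACLASS_NAMES = {"dataclass", "dataclasses.dataclass"}


def check_dataclass_form(source: str) -> list[tuple[str, str]]:
    """Return (class name, problem) pairs for non-canonical dataclasses."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        for deco in node.decorator_list:
            target = deco.func if isinstance(deco, ast.Call) else deco
            if ast.unparse(target) not in DATACLASS_NAMES:
                continue
            keywords = {}
            if isinstance(deco, ast.Call):
                keywords = {
                    kw.arg: kw.value.value
                    for kw in deco.keywords
                    if kw.arg is not None and isinstance(kw.value, ast.Constant)
                }
            for required in ("frozen", "slots"):
                if keywords.get(required) is not True:
                    problems.append((node.name, f"missing {required}=True"))
            if keywords.get("repr") is False:
                problems.append((node.name, "repr=False is anti-canonical"))
    return sorted(problems)


for name, problem in check_dataclass_form(SOURCE):
    print(name, "->", problem)
```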
R004: Module-Level Functions (HIGH)
Bare module-level functions (facades, framework hooks) must carry an exception tag in a comment or docstring.
Fail:
def process_user(user_id: int):
    pass

def send_notification(message: str):
    pass
Pass:
def process_user(user_id: int):  # facade — celery schedule
    pass

def send_notification(message: str):  # FRAMEWORK
    """Send via SNS."""
    pass

class UserService:
    def process(self, user_id: int):
        # Class methods don't need tags
        pass
Exception Tags: Configurable via pyproject.toml exception_tags list.
Rationale: Facades blur intent. Tags make intent explicit and signal "this is framework-specific, not business logic."
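A tag check of this kind needs comments, which the AST does not keep, so the sketch below pairs ast with tokenize. It is only an approximation of the rule as described: untagged_module_functions is a hypothetical name, and it looks only at a comment on the def line and at the docstring.

```python
import ast
import io
import tokenize

EXCEPTION_TAGS = ("facade — celery schedule", "FRAMEWORK")

SOURCE = '''
def process_user(user_id):  # facade — celery schedule
    pass

def send_notification(message):
    """FRAMEWORK hook that sends via SNS."""
    pass

def orphan():
    pass

class UserService:
    def process(self, user_id):
        pass
'''


def untagged_module_functions(source: str, tags=EXCEPTION_TAGS) -> list[str]:
    """Return module-level function names that carry none of the tags."""
    # Map line number -> comment text, since comments are absent from the AST.
    comments = {
        tok.start[0]: tok.string
        for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.type == tokenize.COMMENT
    }
    untagged = []
    for node in ast.parse(source).body:  # top-level statements only
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        text = comments.get(node.lineno, "") + (ast.get_docstring(node) or "")
        if not any(tag in text for tag in tags):
            untagged.append(node.name)
    return untagged


print(untagged_module_functions(SOURCE))  # ['orphan']
```

Iterating over the module body rather than walking the whole tree is what keeps methods such as UserService.process out of scope.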
R005: Validator Pattern (LOW)
validate_*() functions should use DTO.from_dict() or raise ValidationError to enforce payload shape.
Fail:
def validate_user_payload(payload: dict) -> bool:
    return "user_id" in payload and "email" in payload
Pass:
def validate_user_payload(payload: dict) -> UserDTO:
    try:
        user = UserDTO(
            user_id=payload["user_id"],
            email=payload["email"],
        )
        return user
    except (KeyError, TypeError) as e:
        # Chain the original error so the traceback keeps the root cause
        raise ValidationError(f"Invalid shape: {e}") from e
Rationale: Validators should enforce structure, not just presence.
R006: typing.Any in Signatures (HIGH)
Function signatures in service files should not use typing.Any. Build a proper DTO or use narrow type protocols instead.
Fail:
from typing import Any, Optional

def process(data: Any) -> Any:  # Bad: loses all type info
    pass

def fetch_config() -> Optional[Any]:  # Bad: Any defeats Optional
    return None
Pass:
from typing import Optional, Protocol

class Readable(Protocol):
    def read(self) -> bytes:
        ...

def process(data: dict[str, str]) -> dict[str, int]:  # Properly typed
    pass

def fetch_config() -> Optional[ConfigDTO]:  # Specific type
    return None

def read_file(f: Readable) -> bytes:  # Protocol for file-like objects
    return f.read()
Rationale: Any defeats static type checking and IDE completion. It hides shape assumptions and makes refactoring dangerous. Use protocols for file-like or callback types; use DTOs for business shapes.
PHI / Sensitive Data Handling (Pattern 1)
Why R003 removed blanket repr=False: The v0.2 canonical pivot intentionally moves away from blanket repr=False as a PHI masking mechanism. Instead, use explicit __repr__ overrides on DTOs containing sensitive fields.
Pattern 1: Explicit __repr__ on Sensitive DTOs
from dataclasses import dataclass

@dataclass(frozen=True, slots=True)
class Patient:
    """Patient DTO with sensitive fields."""
    patient_id: str
    name: str
    ssn: str  # Sensitive
    date_of_birth: str  # Sensitive

    def __repr__(self) -> str:
        """Mask PHI fields in repr."""
        return f"Patient(patient_id={self.patient_id!r}, name=<redacted>, ssn=<redacted>, date_of_birth=<redacted>)"
When a Patient DTO is logged or printed, only non-sensitive fields appear:
>>> p = Patient(patient_id="P123", name="Alice", ssn="123-45-6789", date_of_birth="1990-01-01")
>>> print(p)
Patient(patient_id='P123', name=<redacted>, ssn=<redacted>, date_of_birth=<redacted>)
Why explicit over blanket?
- Auditable: Developers explicitly decide which fields are sensitive and how to mask them.
- Flexible: Different DTOs can have different masking strategies (redact, hash, truncate, etc.).
- Future-proof: External tools (e.g., AWS Comprehend Medical) can be layered on top for dynamic PHI detection.
- Healthcare / HIPAA: The combination of explicit DTOs + selective __repr__ overrides is a standard privacy-by-design pattern in regulated systems.
Suppressing Violations
Violations can be suppressed using # noqa comments. The linter recognizes:
- # noqa suppresses all rules on this line.
- # noqa: dto-strict suppresses all dto-strict rules on this line.
- # noqa: dto-strict-R001 suppresses rule R001 only.
- # noqa: dto-strict-R001, dto-strict-R002 suppresses multiple rules.
Examples:
# Suppress a Dict[str, Any] violation on a specific function
def legacy_callback(config: Dict[str, Any]) -> None:  # noqa: dto-strict-R001
    """Old API we can't change."""
    pass

# Suppress all rules on a line
def process() -> dict:  # noqa
    return {}

# Suppress just R002 (inline dict literal) violation
error_response = {  # noqa: dto-strict-R002
    "status": "error",
    "code": 500,
    "message": "Internal server error",
}
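The four noqa forms listed above can be matched with a single regular expression. The sketch below is one way to do it and is not taken from dto-strict: NOQA_RE and is_suppressed are hypothetical, and ignoring codes that belong to other tools (such as E501) is an assumption.

```python
import re

# Matches "# noqa" with an optional ": code, code" list after it.
NOQA_RE = re.compile(r"#\s*noqa(?::\s*(?P<codes>[\w\-, ]+))?", re.IGNORECASE)


def is_suppressed(line: str, rule_id: str) -> bool:
    """Return True if the line's noqa comment covers the given rule."""
    match = NOQA_RE.search(line)
    if match is None:
        return False
    codes = match.group("codes")
    if codes is None:
        return True  # bare "# noqa" covers every rule
    names = {code.strip() for code in codes.split(",")}
    return "dto-strict" in names or f"dto-strict-{rule_id}" in names


print(is_suppressed("def process() -> dict:  # noqa", "R001"))                  # True
print(is_suppressed("error_response = {  # noqa: dto-strict-R002", "R001"))     # False
print(is_suppressed("x = f()  # noqa: dto-strict-R001, dto-strict-R002", "R002"))  # True
```

A production version would read comments through tokenize rather than searching raw lines, so that a "# noqa" inside a string literal is not mistaken for a suppression.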
Output Formats
Text (default)
app.py:10: R001 Dict[str, Any] in signature: process_user
service.py:20: R002 Inline dict literal with 4 keys
GitHub Actions
::error file=app.py,line=10,col=5::R001 Dict[str, Any] in signature: process_user
::warning file=service.py,line=20,col=0::R002 Inline dict literal with 4 keys
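The annotation lines above follow GitHub's workflow-command syntax, and producing them from a violation record is a one-line format. In this sketch, to_github_annotation is a hypothetical helper that takes a record with the same fields as the JSON output; HIGH to error and MEDIUM to warning follow the sample output, while mapping LOW to notice is an assumption.

```python
LEVELS = {"HIGH": "error", "MEDIUM": "warning"}  # LOW falls back to "notice" (assumed)


def to_github_annotation(violation: dict) -> str:
    """Format one violation record as a GitHub Actions workflow command."""
    level = LEVELS.get(violation["severity"], "notice")
    return (
        f"::{level} file={violation['file']},line={violation['line']},"
        f"col={violation['col']}::{violation['rule_id']} {violation['message']}"
    )


record = {
    "rule_id": "R001",
    "severity": "HIGH",
    "file": "app.py",
    "line": 10,
    "col": 5,
    "message": "Dict[str, Any] in signature: process_user",
}
print(to_github_annotation(record))
# ::error file=app.py,line=10,col=5::R001 Dict[str, Any] in signature: process_user
```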
JSON
[
  {
    "rule_id": "R001",
    "severity": "HIGH",
    "file": "app.py",
    "line": 10,
    "col": 5,
    "message": "Dict[str, Any] in signature: process_user"
  }
]
Exit Codes
| Code | Meaning |
|---|---|
| 0 | No violations |
| 1 | HIGH severity violations present |
| 2 | MEDIUM severity violations only |
| 3 | LOW severity violations only |
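Read as "the highest severity present decides the code", the table can be expressed in a few lines. This is an interpretation, not the tool's source: exit_code is a hypothetical helper, and the table does not say what happens when MEDIUM and LOW violations occur together, so this sketch assumes the more severe one wins.

```python
def exit_code(severities: list[str]) -> int:
    """Map the severities found in a run to a process exit code."""
    present = {severity.upper() for severity in severities}
    if "HIGH" in present:
        return 1
    if "MEDIUM" in present:
        return 2  # assumption: MEDIUM outranks LOW when both are present
    if "LOW" in present:
        return 3
    return 0


print(exit_code([]))                 # 0
print(exit_code(["LOW", "HIGH"]))    # 1
print(exit_code(["MEDIUM"]))         # 2
print(exit_code(["LOW"]))            # 3
```

A CI job that should only block on HIGH findings could compare the exit status against 1 instead of treating every non-zero code as a failure.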
Configuration Reference
[tool.dto-strict]
# Paths to check for service-layer violations (R001, R002, R004, R006)
# Default: ["apps/*/services/*.py", "**/services/*.py"]
service_paths = [
"apps/*/services/*.py",
"**/services/*.py",
]
# Paths to check for DTO definitions (R003)
# Default: ["**/dtos.py", "**/dtos/*.py"]
dto_paths = [
"**/dtos.py",
"**/dtos/*.py",
]
# Paths for R006 (typing.Any detection)
# Default: ["apps/*/services/*.py", "**/services/*.py"]
r006_paths = [
"apps/*/services/*.py",
"**/services/*.py",
]
# Allowed exception tags for R004 (module-level facades)
# Default: ["facade — celery schedule", "FRAMEWORK"]
exception_tags = [
"facade — celery schedule",
"FRAMEWORK",
"CUSTOM_TAG",
]
# (v0.2) Bare dict/list/tuple without type parameters flagged as violations
# Default: false
strict_collections = true
# (v0.2) Exception tags must include colon-delimited justification
# Default: false
exception_tag_requires_justification = true
# (v0.2) Maximum exception tags per file (null = unlimited)
# Default: null
max_exception_tags_per_file = 3
# (v0.2) R003 mode: "canonical" (v0.2 default) or "legacy" (v0.1)
# In canonical: repr=False is anti-canonical and flagged
# In legacy: frozen=True, slots=True, repr=False all required
# Default: "canonical"
r003_mode = "canonical"
# Disable specific rules entirely
# Default: []
disabled_rules = ["R005"]
# Override severity for specific rules
# Valid values: "HIGH", "MEDIUM", "LOW"
# Default: {}
severity_overrides = { "R002" = "LOW" }
Design Philosophy
Pluggable, not opinionated. Every rule is:
- Configurable: Path patterns, exception tags, severity levels.
- Disable-able: Set disabled_rules = ["R001"] to skip it entirely.
- Framework-agnostic: No Django/FastAPI/Flask assumptions; adapters for each framework are opt-in extras.
Defaults bundled, not imposed. Out-of-the-box rules target Django + DRF + Celery patterns, but you can customize for your stack.
Development
git clone https://github.com/jekhator/dto-strict.git
cd dto-strict
python3 -m venv .venv && source .venv/bin/activate
pip install -e .[dev]
# Run tests
pytest tests/ -v
# Run linter on itself
dto-strict src/ --format github
License
Apache License 2.0. See LICENSE.
Contributing
Issues and PRs welcome. Please include fixtures (good + bad examples) for new rules.
See Also
- pii-aware-mixin — Auto-hide PII in dataclass repr/logging.
- logging-mixin — Class-bound structured logging with correlation IDs.