RDF-based reasoner and metamodel for multi-framework, context-dependent data compliance assessments

Parajudica

A metamodel and fixed-point inference system for context-dependent data classification and multi-framework compliance.

Overview

Parajudica enables context-dependent classification, allowing the same data to receive different compliance statuses depending on governance scope and available context. It provides multi-framework reasoning through a uniform representation of divergent regulatory requirements, including GDPR, HIPAA, EMA, and the Italian DPA.

The system uses fixed-point inference to propagate compliance labels through container hierarchies and joinable relationships. Framework-specific rules are expressed declaratively for implication, conditional implication, and propagation. Built-in k-anonymity analysis provides statistical de-identification checks with framework-specific thresholds.

Project Structure

parajudica/
├── src/parajudica/          # Python package (the `parajudica` import package)
│   └── metamodel/           # Packaged ontology data, shipped in the wheel
│       ├── pj/              # Core ontology, rules, SPARQL constructs
│       └── sdc/             # Structured Data Containers extension
├── examples/
│   ├── frameworks/
│   │   ├── base/            # Common facets and labels
│   │   ├── hipaa/           # HIPAA Privacy Rule
│   │   ├── gdpr/            # GDPR
│   │   ├── ema/             # European Medicines Agency
│   │   └── italy/           # Italian DPA
│   ├── db/                  # Sample data (medical, employee, research)
│   └── challenges/          # Paper validation scenarios (1-5)
└── paper/                   # LaTeX source

Core Concepts

The metamodel consists of a small set of components that work together. Data Containers represent hierarchical structures such as databases, tables, and fields, and are linked through containment relations. Governance Scopes describe the operational contexts in which compliance must be evaluated, such as a research or human resources environment. Facets capture properties of data that regulatory frameworks reason about, such as whether the data refers to individuals, healthcare information, or identifiable values. Labels are classifications that frameworks apply within scopes, including designations like PHI, PersonalData, or SpecialCategoryData. Frameworks themselves are represented as rule systems that declare how facets imply or propagate these labels.
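As a rough illustration of how these components relate, the following Python sketch models containers and scoped label assertions as plain dataclasses. It is not Parajudica's API (the package represents these concepts in RDF). The class names `Container` and `Assertion`, the helper `ancestors`, and the field name `PatientInfo.patient_id` are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Container:
    """A node in the containment hierarchy (database, table, or field)."""
    name: str
    parent: "Container | None" = None


@dataclass(frozen=True)
class Assertion:
    """A label that a framework applies to a container within a scope."""
    scope: str
    container: str
    framework: str
    label: str


def ancestors(container):
    """Yield the containers that enclose `container`, nearest first."""
    while container.parent is not None:
        container = container.parent
        yield container


# A database > table > field chain (the field name is hypothetical).
medical_db = Container("MedicalDB")
patient_info = Container("PatientInfo", parent=medical_db)
patient_id = Container("PatientInfo.patient_id", parent=patient_info)

# The same container can carry different assertions per scope and framework.
phi = Assertion("ResearchScope", "PatientInfo", "HIPAAFramework",
                "ProtectedHealthInformation")
personal = Assertion("ResearchScope", "PatientInfo", "GDPRFramework",
                     "PersonalData")

print([c.name for c in ancestors(patient_id)])
```

Keeping scope and framework as part of each assertion is what lets one container hold several classifications side by side.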

Rules take several forms. Simple implication rules derive labels from the presence of certain facets; for example, data that is both individual and healthcare-related is classified as PHI. Conditional implication rules require additional field-level checks before assigning a label, which makes them suitable for nuanced cases. Propagation rules specify how labels spread across relationships: downward from a parent container to its children, upward from children to their parent, horizontally between sibling containers, or across joinable relationships between tables.
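The three rule forms can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the declarative rule syntax used by Parajudica: the function names, the `direct_identifier` field flag, the field names, and the label `ExampleLabel` are all hypothetical, and the conditional rule here assumes that one matching field is enough.

```python
def simple_implication(facets, required, label):
    """Return {label} when every required facet is present, else an empty set."""
    return {label} if required <= facets else set()


def conditional_implication(facets, required, label, fields, check):
    """Like simple_implication, but at least one field must also pass `check`."""
    if required <= facets and any(check(f) for f in fields):
        return {label}
    return set()


def propagate_down(labels, children):
    """Copy each parent's labels onto its direct children (one downward step)."""
    out = {container: set(ls) for container, ls in labels.items()}
    for parent, kids in children.items():
        for kid in kids:
            out.setdefault(kid, set()).update(labels.get(parent, set()))
    return out


# Simple implication: Individual + Healthcare implies PHI.
phi = simple_implication({"Individual", "Healthcare"},
                         {"Individual", "Healthcare"}, "PHI")

# Conditional implication: the label also depends on a field-level check.
fields = [{"name": "ssn", "direct_identifier": True},      # hypothetical fields
          {"name": "visit_date", "direct_identifier": False}]
cond = conditional_implication({"Individual"}, {"Individual"}, "ExampleLabel",
                               fields, lambda f: f["direct_identifier"])

# Downward propagation: a table's labels reach its fields.
spread = propagate_down({"PatientInfo": {"PHI"}},
                        {"PatientInfo": ["PatientInfo.ssn"]})
print(phi, cond, spread)
```

Upward, sibling, and joinable propagation follow the same pattern with a different relation in place of `children`.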

The inference engine computes a fixed-point result starting from initial assertions. Jena rules are compiled into SPARQL CONSTRUCT queries and blank nodes are skolemized to ensure unique identifiers. The engine iteratively applies rules until no new assertions are produced. The number of rounds is bounded by the size of the derivable assertion set, and is small in practice on the shallow schemas typical of real data. The procedure is deterministic and well-defined across multiple frameworks and scopes: frameworks may diverge, and their differing classifications are retained as parallel outputs rather than reconciled. Data complexity is polynomial in the size of the system.
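The iteration described above can be sketched as a naive fixed-point loop. The real engine evaluates SPARQL CONSTRUCT queries over an RDF graph; this sketch swaps in Python sets of tuples and two hand-written rule functions, so `fixed_point`, `phi_rule`, `downward_rule`, and the field name `PatientInfo.name` are assumptions for illustration only.

```python
def fixed_point(initial, rules):
    """Apply every rule to the assertion set until no new assertion appears."""
    facts = set(initial)
    rounds = 0
    while True:
        derived = set()
        for rule in rules:
            derived |= rule(facts)
        new = derived - facts
        if not new:
            return facts, rounds
        facts |= new      # every productive round adds at least one assertion
        rounds += 1


def phi_rule(facts):
    """Individual + Healthcare facets on a container imply the PHI label."""
    containers = {c for c, kind, _ in facts if kind == "facet"}
    return {(c, "label", "PHI") for c in containers
            if (c, "facet", "Individual") in facts
            and (c, "facet", "Healthcare") in facts}


def downward_rule(facts):
    """Labels on a parent container propagate to the containers it contains."""
    out = set()
    for parent, kind, child in facts:
        if kind == "contains":
            for c, k, label in facts:
                if c == parent and k == "label":
                    out.add((child, "label", label))
    return out


initial = {
    ("PatientInfo", "facet", "Individual"),
    ("PatientInfo", "facet", "Healthcare"),
    ("PatientInfo", "contains", "PatientInfo.name"),  # hypothetical field
}
facts, rounds = fixed_point(initial, [phi_rule, downward_rule])
print(rounds, sorted(facts))
```

Because each productive round must add at least one assertion and rules only derive from a finite vocabulary, the loop terminates, and the number of rounds cannot exceed the number of derivable assertions.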

Example: Healthcare Compliance Scenario

Compliance Environment

The example models a healthcare organization with two databases across three governance scopes:

Data Structures:

MedicalDB contains PatientInfo, PatientEncounters, and PatientTreatments tables
EmployeeDB contains ProvidersInfo table
ResearchDB contains AggregatedHealth (k=3) and AggregatedHealth12 (k=12) tables

Governance Scopes:

MedicalGovernanceScope: Clinical operations accessing medical tables only
HumanResourcesScope: HR operations accessing employee tables only
ResearchScope: Research activities accessing all tables

Joinable Relationships:

PatientInfo joinable with PatientEncounters (via patient ID)
PatientEncounters joinable with ProvidersInfo (via provider ID)

Initial Assertions:

PatientInfo has facets: Individual, Healthcare, DirectIdentifier
PatientEncounters has facets: Individual, Healthcare
PatientTreatments has facets: Individual, Healthcare
ProvidersInfo has facets: Individual, DirectIdentifier
AggregatedHealth has facets: OpenGroup, InternalIdentifier (k=3)

Challenge 1: Context-Dependent Classification

The same data receives a different compliance status depending on governance scope. ProvidersInfo is classified as Individual data in HumanResourcesScope, where only employee data is available, but becomes PHI in ResearchScope, where it can be joined with patient data.

┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┓
┃ scope               ┃ container      ┃ framework       ┃ label               ┃
┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━┩
│ :HumanResourcesSco… │ :ProvidersInfo │ :BaseFramework  │ :Individual         │
│ :HumanResourcesSco… │ :ProvidersInfo │ :GDPRFramework  │ :Individual         │
│ :ResearchScope      │ :ProvidersInfo │ :BaseFramework  │ :Individual         │
│ :ResearchScope      │ :ProvidersInfo │ :GDPRFramework  │ :Individual         │
│ :ResearchScope      │ :ProvidersInfo │ :HIPAAFramework │ :ProtectedHealthIn… │
└─────────────────────┴────────────────┴─────────────────┴─────────────────────┘

The propagation chain in ResearchScope works as follows: PatientInfo, which carries identifiers in a healthcare context, becomes PHI; PatientEncounters, joinable with PatientInfo, inherits PHI; and ProvidersInfo, joinable with PatientEncounters, inherits PHI in turn.
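The scope dependence of this chain can be reproduced with a small Python sketch. It assumes that a label crosses a joinable relationship only when both tables are visible in the scope, and it seeds PHI with the simple Individual + Healthcare rule; the table and scope names come from the example above, while `phi_tables` and the data layout are hypothetical.

```python
SCOPES = {
    "MedicalGovernanceScope": {"PatientInfo", "PatientEncounters",
                               "PatientTreatments"},
    "HumanResourcesScope": {"ProvidersInfo"},
    "ResearchScope": {"PatientInfo", "PatientEncounters", "PatientTreatments",
                      "ProvidersInfo", "AggregatedHealth", "AggregatedHealth12"},
}

JOINABLE = {("PatientInfo", "PatientEncounters"),
            ("PatientEncounters", "ProvidersInfo")}

FACETS = {
    "PatientInfo": {"Individual", "Healthcare", "DirectIdentifier"},
    "PatientEncounters": {"Individual", "Healthcare"},
    "PatientTreatments": {"Individual", "Healthcare"},
    "ProvidersInfo": {"Individual", "DirectIdentifier"},
}


def phi_tables(scope):
    """Tables labeled PHI in `scope`, after propagation over visible joins."""
    visible = SCOPES[scope]
    # Seed: tables in scope whose facets imply PHI directly.
    phi = {t for t in visible
           if {"Individual", "Healthcare"} <= FACETS.get(t, set())}
    changed = True
    while changed:
        changed = False
        for a, b in JOINABLE:
            for src, dst in ((a, b), (b, a)):   # joins are symmetric
                if src in phi and dst in visible and dst not in phi:
                    phi.add(dst)
                    changed = True
    return phi


for scope in SCOPES:
    print(scope, sorted(phi_tables(scope)))
```

In HumanResourcesScope no patient table is visible, so nothing can reach ProvidersInfo; in ResearchScope the join through PatientEncounters carries PHI to it.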

Challenge 2: Framework Divergence

Different frameworks classify the same joined data differently. HIPAA uses an expansive model where joined data inherits PHI status, while GDPR maintains field-level precision where data retains its original classification.

┏━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ scope             ┃ container         ┃ framework        ┃ label             ┃
┡━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
│ :ResearchScope    │ :PatientTreatmen… │ :GDPRFramework   │ :PersonalData     │
│ :ResearchScope    │ :PatientTreatmen… │ :GDPRFramework   │ :SpecialCategory… │
│ :ResearchScope    │ :PatientTreatmen… │ :HIPAAFramework  │ :ProtectedHealth… │
│ :ResearchScope    │ :ProvidersInfo    │ :GDPRFramework   │ :PersonalData     │
│ :ResearchScope    │ :ProvidersInfo    │ :HIPAAFramework  │ :ProtectedHealth… │
└───────────────────┴───────────────────┴──────────────────┴───────────────────┘

(Selected rows shown. Full output includes MedicalGovernanceScope and additional base framework labels.)

Under HIPAA, ProvidersInfo receives PHI via joinable propagation; under GDPR it receives only PersonalData (not SpecialCategoryData). PatientTreatments is PHI under HIPAA, whereas under GDPR it is labeled both PersonalData and SpecialCategoryData.
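One way to picture this divergence is to give each framework its own rule set and keep the results side by side. The sketch below is a simplification under stated assumptions: the facet combinations that trigger PersonalData and SpecialCategoryData are guesses chosen to match the table above, and the `FRAMEWORKS` layout and `classify` function are hypothetical rather than taken from the package.

```python
FACETS = {
    "PatientInfo": {"Individual", "Healthcare", "DirectIdentifier"},
    "PatientEncounters": {"Individual", "Healthcare"},
    "PatientTreatments": {"Individual", "Healthcare"},
    "ProvidersInfo": {"Individual", "DirectIdentifier"},
}

JOINABLE = {("PatientInfo", "PatientEncounters"),
            ("PatientEncounters", "ProvidersInfo")}

FRAMEWORKS = {
    "HIPAAFramework": {
        "implications": [({"Individual", "Healthcare"},
                          "ProtectedHealthInformation")],
        # Expansive model: PHI spreads across joinable relationships.
        "join_propagated": {"ProtectedHealthInformation"},
    },
    "GDPRFramework": {
        # Assumed triggers, chosen to reproduce the rows shown above.
        "implications": [({"Individual"}, "PersonalData"),
                         ({"Individual", "Healthcare"}, "SpecialCategoryData")],
        # Field-level precision: nothing is inherited through joins.
        "join_propagated": set(),
    },
}


def classify(framework):
    """Labels per table under one framework, computed independently of others."""
    spec = FRAMEWORKS[framework]
    labels = {table: {label for required, label in spec["implications"]
                      if required <= facets}
              for table, facets in FACETS.items()}
    changed = True
    while changed:
        changed = False
        for a, b in JOINABLE:
            for src, dst in ((a, b), (b, a)):
                extra = (labels[src] & spec["join_propagated"]) - labels[dst]
                if extra:
                    labels[dst] |= extra
                    changed = True
    return labels


# Parallel outputs: one result per framework, never merged or reconciled.
results = {fw: classify(fw) for fw in FRAMEWORKS}
for fw, labels in results.items():
    print(fw, sorted(labels["ProvidersInfo"]), sorted(labels["PatientTreatments"]))
```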

Challenge 3: Propagation Semantics

Joinable relationships fundamentally affect compliance classification. With joinable relationships declared, ProvidersInfo inherits PHI status. Without them, it remains non-PHI.

# WITH joinable relationships:
┏━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ framework       ┃ label                       ┃
┡━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ :GDPRFramework  │ :PersonalData               │
│ :HIPAAFramework │ :ProtectedHealthInformation │
└─────────────────┴─────────────────────────────┘

# WITHOUT joinable relationships:
┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ framework      ┃ label         ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ :GDPRFramework │ :PersonalData │
└────────────────┴───────────────┘

Challenge 4: De-identification Standards

K-anonymity analysis with framework-specific thresholds shows how different standards evaluate re-identification risk. Tables with k<3 trigger HighReidentificationRisk under HIPAA Expert Determination.

┏━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┓
┃ table              ┃ label                     ┃ kValue               ┃
┡━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━┩
│ :AggregatedHealth  │ :KAnonymityAnalysis       │ 3                    │
│ :PatientEncounters │ :HighReidentificationRisk │ (triggered by k < 3) │
│ :PatientEncounters │ :KAnonymityAnalysis       │ 1                    │
│ :PatientInfo       │ :HighReidentificationRisk │ (triggered by k < 3) │
│ :PatientInfo       │ :KAnonymityAnalysis       │ 1                    │
│ :ProvidersInfo     │ :HighReidentificationRisk │ (triggered by k < 3) │
│ :ProvidersInfo     │ :KAnonymityAnalysis       │ 1                    │
└────────────────────┴───────────────────────────┴──────────────────────┘
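For readers unfamiliar with the kValue column, k is conventionally the size of the smallest group of records that share the same quasi-identifier values. The Python sketch below computes it that way and applies the k < 3 trigger stated above; how Parajudica itself derives k is not shown in this document, and the column names and rows are invented for the example.

```python
from collections import Counter


def k_anonymity(rows, quasi_identifiers):
    """Size of the smallest group of rows sharing all quasi-identifier values."""
    if not rows:
        raise ValueError("k is undefined for an empty table")
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values())


def high_reidentification_risk(k, threshold=3):
    """True when k falls below the threshold (k < 3 in the table above)."""
    return k < threshold


# Invented aggregated rows: two groups of three records each.
aggregated = (
    [{"age_band": "30-39", "region": "North"}] * 3
    + [{"age_band": "40-49", "region": "South"}] * 3
)
# Adding a single unique record drops k to 1.
with_outlier = aggregated + [{"age_band": "80-89", "region": "East"}]

quasi = ["age_band", "region"]
print(k_anonymity(aggregated, quasi), k_anonymity(with_outlier, quasi))
```

A table with one row per person, such as PatientInfo, has every group of size one, which is why its kValue is 1.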

Challenge 5: Framework Divergence Table

This challenge reproduces the paper's comparison, showing how identical k-anonymized data is evaluated differently across frameworks. A dataset with k=3 is acceptable under HIPAA but not under EMA or the Italian DPA.

┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━┳━━━━━┳━━━━━━━┓
┃ table               ┃ kValue ┃ hipaa ┃ ema ┃ italy ┃
┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━╇━━━━━╇━━━━━━━┩
│ :AggregatedHealth   │ 3      │ YES   │ NO  │ NO    │
│ :AggregatedHealth12 │ 12     │ YES   │ YES │ NO    │
│ :PatientInfo        │ 1      │ NO    │ NO  │ NO    │
└─────────────────────┴────────┴───────┴─────┴───────┘

HIPAA Expert Determination accepts k>=3, EMA requires k>=12, and the Italian DPA rejects any dataset with unique identifiers that enable singling out, regardless of k value.
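The thresholds in this paragraph translate into a short decision function. The sketch below is a hedged reading of the stated rules, not the packaged rule set: `acceptable` and `K_THRESHOLDS` are hypothetical names, the Italian DPA check models only the unique-identifier condition mentioned above, and it assumes that AggregatedHealth12 carries an identifier just as AggregatedHealth does.

```python
# Minimum k accepted by each threshold-based framework, as stated above.
K_THRESHOLDS = {"hipaa": 3, "ema": 12}


def acceptable(k, has_unique_identifier):
    """Per-framework verdicts for a table with the given k value."""
    verdict = {name: k >= minimum for name, minimum in K_THRESHOLDS.items()}
    # Italian DPA: unique identifiers that enable singling out mean rejection,
    # whatever the k value. Only that one condition is modeled here.
    verdict["italy"] = not has_unique_identifier
    return verdict


tables = {
    "AggregatedHealth": (3, True),
    "AggregatedHealth12": (12, True),   # identifier presence is an assumption
    "PatientInfo": (1, True),
}
for name, (k, has_id) in tables.items():
    print(name, k, acceptable(k, has_id))
```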

Paper

The paper describing the metamodel and formalization is available in paper/main.tex. It presents the formal semantics of the system, explains fixed-point computation, and provides comparative analysis of regulatory frameworks. The paper documents the implementation and includes illustrative healthcare validation scenarios drawn from documented regulatory requirements.

Quick Start

To use the reasoner as a library or command line tool, install the published package from PyPI. The packaged metamodel ontologies are bundled, so no checkout is required.

pip install parajudica   # or: uv pip install parajudica

# Then run the CLI
parajudica --help

To work on the source or reproduce the paper's challenges, clone the repository and do an editable install. Once installed, the challenges can be run through the provided Makefile. These challenges illustrate the main concepts: context-dependent classification, divergence between frameworks such as HIPAA and GDPR, the role of propagation semantics across hierarchical or joinable relationships, and standards for de-identification such as k-anonymity under different frameworks. Running all challenges together reproduces the comparisons described in the accompanying paper.

# Editable install with uv (recommended) or pip
uv pip install -e .
# pip install -e .

# Run all challenges
make challenges

Citation

@article{moreau2025parajudica,
  title={Parajudica: A Metamodel and RDF/SPARQL-Based Reasoning System for Context-Dependent Data Compliance Assessments},
  author={Moreau, Luc and Rossi, Alfred and Stalla-Bourdillon, Sophie},
  year={2025}
}

Version 0.1.0, released Jun 19, 2026
