retrieval-acl-filter-py

License: MIT

Enforce document ACLs after retrieval and before prompting. Your vector store doesn't know who's asking. Pipe its results through this filter to drop any chunk the current user shouldn't see, before those chunks land in the prompt. Zero runtime dependencies.

Python port of @mukundakatta/retrieval-acl-filter.

Install

pip install retrieval-acl-filter-py

Usage

from retrieval_acl_filter import filter_by_acl, explain_acl_drop

docs = [
    {"id": "public",  "text": "...", "acl": {"public": True}},
    {"id": "team",    "text": "...", "acl": {"groups": ["eng"]}},
    {"id": "personal","text": "...", "acl": {"users":  ["alice"]}},
    {"id": "secret",  "text": "...", "acl": {"groups": ["execs"]}},
    {"id": "no_acl",  "text": "..."},  # missing ACL -> open by default
]

user = {"id": "alice", "groups": ["eng"]}

allowed = filter_by_acl(docs, user)
[d["id"] for d in allowed]   # ["public", "team", "personal", "no_acl"]

# Audit which docs got dropped and why:
report = explain_acl_drop(docs, user)
report[3]   # {"id": "secret", "allowed": False, "reason": "missing_acl_match"}
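For audit logging, the per-doc verdicts can be condensed into a short summary. The sketch below is not part of the package: `summarize_report` is a hypothetical helper, and `sample_report` is a hand-written stand-in for the output of `explain_acl_drop(docs, user)`. It assumes only the `id`, `allowed`, and `reason` keys shown above. The `reason` values for allowed docs are not documented here, so the stand-in leaves them out.

```python
from collections import Counter


def summarize_report(report):
    """Condense a per-doc verdict list into counts and dropped ids.

    Hypothetical helper: relies only on the "id", "allowed", and
    "reason" keys of each entry.
    """
    dropped = [entry for entry in report if not entry["allowed"]]
    return {
        "total": len(report),
        "allowed": len(report) - len(dropped),
        "dropped_ids": [entry["id"] for entry in dropped],
        # Count dropped docs per reason string.
        "dropped_by_reason": dict(Counter(entry.get("reason") for entry in dropped)),
    }


# Hand-written stand-in for explain_acl_drop(docs, user); in real use,
# pass the list returned by the library instead.
sample_report = [
    {"id": "public", "allowed": True},
    {"id": "team", "allowed": True},
    {"id": "personal", "allowed": True},
    {"id": "secret", "allowed": False, "reason": "missing_acl_match"},
    {"id": "no_acl", "allowed": True},
]

summary = summarize_report(sample_report)
print(summary)
```

Logging `dropped_ids` next to the requesting user's id makes it possible to trace later why a chunk never reached the prompt.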

Custom ACL field name

Some retrievers store the ACL under a different key, such as permissions. Pass that key as acl_field:

filter_by_acl(docs, user, acl_field="permissions")

API

| Symbol | Behavior |
| --- | --- |
| filter_by_acl(docs, user, *, acl_field='acl') | Returns the subset of docs user may see. |
| explain_acl_drop(docs, user, *, acl_field='acl') | Per-doc verdict list with id, allowed, reason. |

Match semantics (matches the JS sibling)

A document is allowed if any of these is true:

  1. doc.acl.users includes user.id.
  2. doc.acl.groups shares any value with user.groups.
  3. doc.acl.public is True.
  4. The doc has no ACL at all (acl is missing or empty) -- open-by-default.

Otherwise the document is dropped with reason="missing_acl_match".
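The four rules above can be sketched as a single predicate. This is an illustrative re-implementation written from the rules as stated, not the package's actual source. The name `is_allowed` is hypothetical, and reading `user` and the ACL with `.get()` defaults is an assumption about how missing keys should behave.

```python
def is_allowed(doc, user, acl_field="acl"):
    """Return True if `user` may see `doc` under the rules listed above.

    Illustrative sketch only; the library's own implementation may differ.
    """
    acl = doc.get(acl_field)
    # Rule 4: a missing or empty ACL is open by default.
    if not acl:
        return True
    # Rule 3: explicitly public documents.
    if acl.get("public") is True:
        return True
    # Rule 1: the user is listed by id.
    if user.get("id") in acl.get("users", []):
        return True
    # Rule 2: the doc and the user share at least one group.
    user_groups = set(user.get("groups", []))
    return bool(user_groups.intersection(acl.get("groups", [])))


docs = [
    {"id": "public", "text": "...", "acl": {"public": True}},
    {"id": "team", "text": "...", "acl": {"groups": ["eng"]}},
    {"id": "personal", "text": "...", "acl": {"users": ["alice"]}},
    {"id": "secret", "text": "...", "acl": {"groups": ["execs"]}},
    {"id": "no_acl", "text": "..."},
]
user = {"id": "alice", "groups": ["eng"]}

visible_ids = [d["id"] for d in docs if is_allowed(d, user)]
print(visible_ids)
```

Only the "secret" doc fails every rule for this user, which matches the Usage example.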

Heads-up: the open-by-default rule is intentional for back-compat with the JS sibling, but if you're enforcing real ACLs you probably want the opposite behavior. Tag every chunk with an explicit, non-empty ACL at retrieval time and the open-default never fires (an empty ACL counts as missing, so it still opens the doc).
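One way to get closed-by-default behavior without changing the library is to set aside untagged chunks before filtering. The sketch below assumes this pre-filter approach. `split_by_acl_presence` is a hypothetical helper, not part of the package, and it treats an empty ACL the same as a missing one, per rule 4.

```python
def split_by_acl_presence(docs, acl_field="acl"):
    """Split docs into (tagged, untagged) by whether they carry a non-empty ACL.

    Hypothetical pre-filter: only the `tagged` list should be handed to
    the ACL filter, so the open-by-default rule never applies.
    """
    tagged, untagged = [], []
    for doc in docs:
        if doc.get(acl_field):
            tagged.append(doc)
        else:
            # Missing or empty ACL: treat as closed instead of open.
            untagged.append(doc)
    return tagged, untagged


docs = [
    {"id": "team", "text": "...", "acl": {"groups": ["eng"]}},
    {"id": "no_acl", "text": "..."},
    {"id": "empty_acl", "text": "...", "acl": {}},
]

tagged, untagged = split_by_acl_presence(docs)
print([d["id"] for d in tagged])
print([d["id"] for d in untagged])
```

The `tagged` list can then be passed to `filter_by_acl(tagged, user)`. The `untagged` ids are worth logging, since they point at chunks that were indexed without permissions.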

See the JS sibling's README for the full design notes.
