Privacy scanner with GDPR compliance reports - Zero config, instant insights

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

Privalyse

These details have not been verified by PyPI

Project description

Privalyse Logo

🔒 Privalyse – Catch Privacy Leaks in AI-Assisted Codebases

Code can be a black box. Data moves through invisible paths. Privalyse makes these paths explicit.

We are generating code faster than ever, but we are losing sight of where our data actually goes. LLMs write logic, but they don't see the flow. They happily pipe PII into logs, send secrets to third-party APIs, or expose internal state.

Privalyse is not just a linter. It builds a Semantic Data Flow Graph of your application to tell Flow Stories:

❌ Traditional Linter: "Variable user_email used in line 42."
✅ Privalyse: "User Email (Source) → Prompt Template → OpenAI API (Sink) → Logs (Leak)."
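As a rough illustration of what a flow story is, the sketch below walks a tiny hand-written graph from a source to a leak and prints the path. It is not Privalyse's engine: the `FLOW_EDGES` dictionary and the `flow_story` helper are hypothetical names invented for this example, and the real graph is built by static analysis rather than typed in by hand.

```python
from collections import deque

# Hypothetical data flow graph: each edge means "data moves from A to B".
FLOW_EDGES = {
    "User Email (Source)": ["Prompt Template"],
    "Prompt Template": ["OpenAI API (Sink)"],
    "OpenAI API (Sink)": ["Logs (Leak)"],
    "Logs (Leak)": [],
}


def flow_story(edges, source, target):
    """Return the shortest source-to-target path as a readable story, or None."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return " → ".join(path)
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no path: the data never reaches the target


print(flow_story(FLOW_EDGES, "User Email (Source)", "Logs (Leak)"))
```

The point of the graph view is that a finding is a whole path rather than a single line number, which is what separates it from the linter message above.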

With its deterministic static analysis engine, it serves as the perfect counterpart to AI-assisted coding: ensuring reproducible results and providing a safety net to recheck your entire codebase before deployment.

⭐️ Star if you believe in visible data flows.

🚀 Alpha Release - We're building the privacy scanner that modern development deserves. Zero config, instant insights, built for speed.

📚 Quick Start • 🔍 What We Detect • 🗺️ Roadmap • 🐛 Report Bug • ✨ Request Feature

pip install privalyse-cli
privalyse
# ✅ Done. Markdown report ready (scan_results.md).

✨ AI-Native Privacy & Guardrails (New in v0.3.0)

Privalyse now includes specialized features for AI-Model integrations:

🤖 AI Guardrails: Detects PII leaking into LLM prompts (OpenAI, LangChain, etc.).
🌍 Data Sovereignty: Flags data transfers to non-EU providers (AWS, Azure, OpenAI) to help with GDPR compliance.
🛡️ Policy as Code: Enforce blocked countries or providers via privalyse.toml.
🧼 Smart Sanitization: Recognizes hash(), anonymize() and other cleaning functions to reduce false positives.
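To make the sanitization idea concrete, here is a minimal sketch built on Python's standard `ast` module. It flags PII-named variables passed to `print()` unless they are wrapped in a known cleaning function. The `SANITIZERS` and `PII_NAMES` sets and both helper functions are assumptions made for this example; the scanner's actual rules and name lists are not documented here.

```python
import ast

# Assumed names for illustration only.
SANITIZERS = {"hash", "anonymize"}
PII_NAMES = {"email", "user_email", "phone"}


def raw_pii_names(node):
    """Collect PII-named variables in an expression that are not inside a sanitizer call."""
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Name) and node.func.id in SANITIZERS:
        return []  # everything inside a sanitizer call is treated as cleaned
    if isinstance(node, ast.Name) and node.id in PII_NAMES:
        return [node.id]
    found = []
    for child in ast.iter_child_nodes(node):
        found.extend(raw_pii_names(child))
    return found


def unsanitized_pii_in_prints(source):
    """Return sorted (line, variable) pairs for raw PII passed to print()."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name) and node.func.id == "print":
            for arg in node.args:
                findings.extend((node.lineno, name) for name in raw_pii_names(arg))
    return sorted(findings)


SAMPLE = '''
email = get_email()
print(email)
print(hash(email))
print("user: " + str(email))
'''

print(unsanitized_pii_in_prints(SAMPLE))
```

The call on line 4 of the sample is skipped because `hash()` wraps the value, while `str()` on line 5 is not a sanitizer and is still reported.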

🔄 Continuous Monitoring (CI/CD)

👁️ Data Flow Visibility & Monitoring

Modern applications are complex webs of data movement. Privalyse provides Data Flow Visibility to help you oversee where sensitive data travels.

Continuous Monitoring

Privalyse is designed to be run in your CI/CD pipeline to provide continuous monitoring of data flows.

Detect: Catch new leaks before they merge.
Visualize: See the path of data from Source to Sink.
Comply: Ensure every data flow has a legal basis.

To achieve true visibility, Privalyse should be part of your continuous integration pipeline. This ensures that every code change is monitored for new data leaks.

GitHub Actions Example

name: Privacy Monitor
on: [push, pull_request]
jobs:
  privalyse-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - name: Run Privalyse Scanner
        uses: privalyse/privalyse-cli@v0.3.0
        with:
          root: '.'
          format: 'markdown'
          output: 'report.md'
          
      - name: Upload Report
        uses: actions/upload-artifact@v4
        with:
          name: privacy-report
          path: report.md

The generated report.md is a human-readable Markdown report that you can view directly in GitHub Actions artifacts or attach to Pull Requests. It provides a clear, visual summary of all findings, compliance risks, and data flow stories.

Installation

pip install privalyse-cli

Quick Start

# Scan current directory (defaults to Markdown output)
privalyse

# Scan specific folder
privalyse --root ./backend

# Output as JSON (Structured)
privalyse --root ./backend --format json --out results.json

# Output as HTML (Visual Dashboard)
privalyse --root ./backend --format html --out report.html

⚙️ Configuration (Policy as Code)

You can enforce privacy policies using a privalyse.toml file in your project root.

# privalyse.toml
[policy]
blocked_countries = ["US", "CN"]  # Fail if data flows to these countries
blocked_providers = ["openai"]    # Fail if data flows to these providers

When a policy violation is detected (e.g., sending PII to a US server), Privalyse will report a CRITICAL finding and exit with a failure code.

🎥 See It In Action

Privalyse CLI Demo

📊 Example Reports

See how Privalyse analyzes different types of projects:

Project Type	Description	Report
Bad Practice App	A vulnerable app full of security holes and GDPR violations.	View Report
Modern Fullstack	A typical React/Node.js stack with some common issues.	View Report
Best Practice App	A secure, compliant application following GDPR standards.	View Report

⚡ Try It Now (30 seconds)

No configuration needed - works in any Python project:

pip install privalyse-cli && privalyse --root . --out report.md && cat report.md | head -50

🎯 Boom. Privacy report generated in 3 seconds.

🤖 AI Agent Integration

Privalyse is designed to be "Agent-Ready". If you are building an AI coding agent or using LLMs to fix code, Privalyse provides structured, context-rich output that agents can understand.

For Coding Agents

When using Privalyse as a tool for an agent:

Run with JSON output: privalyse --format json --out report.json
Parse the findings array: Each finding now includes:
- code_context: The actual lines of code (with surrounding context) where the issue was found.
- context_start_line / context_end_line: Precise line numbers.
- suggested_fix: A human-readable suggestion for fixing the issue.
- confidence_score: To help the agent decide whether to act.

Example JSON Output for Agents

{
  "rule": "HARDCODED_SECRET",
  "file": "src/config.py",
  "line": 15,
  "severity": "critical",
  "suggested_fix": "Move secret to environment variable (os.environ.get) or secrets manager.",
  "confidence_score": 1.0,
  "code_context": [
    "def connect_db():",
    "    db_password = \"super_secret_password_123\"  # <--- Finding here",
    "    return connect(password=db_password)"
  ]
}

This allows agents to self-correct code without needing to read the file separately.
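An agent-side consumer might look something like the sketch below. It assumes the report is a JSON object with a top-level `findings` array whose items match the example above; that wrapper shape, the `PII_IN_LOG` rule name, the 0.8 threshold, and both helper functions are assumptions for illustration. In practice the report would be read from `report.json` rather than from an inline string.

```python
import json

# Inline stand-in for report.json; the top-level "findings" key is an assumption.
REPORT_JSON = """
{
  "findings": [
    {
      "rule": "HARDCODED_SECRET",
      "file": "src/config.py",
      "line": 15,
      "severity": "critical",
      "suggested_fix": "Move secret to environment variable (os.environ.get) or secrets manager.",
      "confidence_score": 1.0,
      "code_context": ["def connect_db():", "    db_password = 'super_secret_password_123'"]
    },
    {
      "rule": "PII_IN_LOG",
      "file": "src/views.py",
      "line": 40,
      "severity": "medium",
      "suggested_fix": "Remove the field from the log call.",
      "confidence_score": 0.4,
      "code_context": ["logger.info(user)"]
    }
  ]
}
"""


def actionable_findings(report, min_confidence=0.8):
    """Keep only findings the agent should act on without human review."""
    return [f for f in report.get("findings", []) if f.get("confidence_score", 0.0) >= min_confidence]


def to_agent_task(finding):
    """Format one finding as a compact task description for an LLM agent."""
    context = "\n".join(finding.get("code_context", []))
    return (
        f"{finding['file']}:{finding['line']} [{finding['rule']}]\n"
        f"Suggested fix: {finding['suggested_fix']}\n"
        f"Context:\n{context}"
    )


report = json.loads(REPORT_JSON)
tasks = [to_agent_task(f) for f in actionable_findings(report)]
print("\n\n".join(tasks))
```

Filtering on `confidence_score` first keeps low-confidence findings in front of a human instead of letting the agent rewrite code on a guess.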

JSON Schema for Agents

For strict validation, you can use the official JSON schema located at privalyse_scanner/models/output_schema.json. This helps LLM agents understand the exact structure of the output they are processing.

What It Does

Privalyse performs Static Monitoring (Detection) to ensure data safety:

Data Flow Visualization: Tracking where user data moves across your codebase (Source -> Sink).
Hardcoded Secrets: Detecting API keys, passwords, and tokens.
PII Leakage: Identifying Personally Identifiable Information in logs and external calls.
GDPR Violations: Mapping findings to specific GDPR articles (Art. 5, 6, 9, 32).
Security Misconfigurations: Checking for HTTP vs HTTPS, CORS, and security headers.

Note on Monitoring: In the context of this CLI, "Monitoring" refers to the continuous detection of vulnerabilities in your codebase (e.g., via CI/CD or pre-commit hooks). It does not currently perform live runtime traffic interception.

The scanner uses AST (Abstract Syntax Tree) parsing for both Python and JavaScript/TypeScript to ensure deep understanding of your code structure.
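For readers unfamiliar with AST-based scanning, the following sketch shows the general technique on the Python side using the standard `ast` module. It is a simplified illustration, not the scanner's rule: the `SECRET_NAME` pattern and the `hardcoded_secrets` function are hypothetical, and only the `HARDCODED_SECRET` rule name is taken from the example output above.

```python
import ast
import re

# Assumed naming heuristic for this sketch.
SECRET_NAME = re.compile(r"(password|secret|token|api_key)", re.IGNORECASE)


def hardcoded_secrets(source, filename="<string>"):
    """Flag assignments of non-empty string literals to secret-looking names."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Assign):
            continue
        value = node.value
        # Only literal strings count; values read from the environment are fine.
        if not (isinstance(value, ast.Constant) and isinstance(value.value, str) and value.value):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and SECRET_NAME.search(target.id):
                findings.append({"rule": "HARDCODED_SECRET", "line": node.lineno, "name": target.id})
    return findings


SAMPLE = '''
import os

def connect_db():
    db_password = "super_secret_password_123"
    api_token = os.environ.get("API_TOKEN")
    return connect(password=db_password)
'''

print(hardcoded_secrets(SAMPLE))
```

Working on the tree rather than on raw text is what lets the check tell a literal assignment apart from the `os.environ.get(...)` call on the next line, even though both lines contain a secret-looking name.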

Features

Python & JavaScript/TypeScript support
AST-based analysis for Python and JS/TS (deterministic, deep data flow tracking)
Cross-file taint tracking (follows data flows across imports and modules)
Cross-stack tracing (links Frontend API calls to Backend routes)
GDPR article mapping (Art. 5, 6, 9, 32)
Structured Reports (Executive Summary, Compliance View, File Hotspots)
Multiple output formats (JSON, Markdown, HTML)
Ignore file support (.privalyseignore for false positives)
100% Local Execution (no code leaves your machine)

💡 Why Privalyse?

We believe security shouldn't be a question of price. Everyone deserves data safety and secure code. That's why Privalyse is MIT Licensed and free to use.

1. The "Audit-Ready" Approach

Don't just find bugs—generate documentation. When your CTO asks "Are we GDPR compliant?", you can't send them a JSON file. Privalyse generates reports you can actually hand to your Data Protection Officer (DPO).

2. Focus on Data Flows

We find problems even in massive codebases. Privalyse goes beyond simple pattern matching by implementing Cross-File & Cross-Stack Taint Tracking. It traces the journey of sensitive data throughout your application—from database models to API endpoints, across network calls to the frontend, and finally to sinks like logging or third-party APIs. By understanding how modules and services interact, we can detect when a variable defined in one file is insecurely used in another, effectively connecting the dots across your entire project structure.
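The cross-file part can be sketched in a few lines of Python. The example below is a deliberately simplified, flow-insensitive illustration of taint tracking across imports, not Privalyse's algorithm. The `SOURCES` and `SINKS` sets, the `find_leaks` function, and the two in-memory modules are all hypothetical.

```python
import ast

SOURCES = {"get_user_email"}  # hypothetical functions that return sensitive data
SINKS = {"print"}             # hypothetical functions that leak their arguments


def _call_name(node):
    """Return the called function's name for simple calls like f(x), else None."""
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
        return node.func.id
    return None


def _is_tainted(expr, module, tainted):
    """True if the expression calls a source or reads a tainted variable."""
    for node in ast.walk(expr):
        if _call_name(node) in SOURCES:
            return True
        if isinstance(node, ast.Name) and (module, node.id) in tainted:
            return True
    return False


def find_leaks(modules):
    """Return sorted (module, line) pairs where tainted data reaches a sink."""
    trees = {name: ast.parse(src) for name, src in modules.items()}
    tainted = set()  # set of (module, variable) pairs
    changed = True
    while changed:  # iterate to a fixed point so module order does not matter
        changed = False
        for module, tree in trees.items():
            for node in ast.walk(tree):
                new = set()
                if isinstance(node, ast.Assign) and _is_tainted(node.value, module, tainted):
                    new = {(module, t.id) for t in node.targets if isinstance(t, ast.Name)}
                elif isinstance(node, ast.ImportFrom):
                    # An imported name is tainted if it is tainted where it was defined.
                    new = {(module, a.asname or a.name) for a in node.names
                           if (node.module, a.name) in tainted}
                if not new <= tainted:
                    tainted |= new
                    changed = True
    leaks = []
    for module, tree in trees.items():
        for node in ast.walk(tree):
            if _call_name(node) in SINKS and any(_is_tainted(a, module, tainted) for a in node.args):
                leaks.append((module, node.lineno))
    return sorted(leaks)


MODULES = {
    "models": "email = get_user_email()\n",
    "views": "from models import email\nmessage = 'Hello ' + email\nprint(message)\nprint('ok')\n",
}

print(find_leaks(MODULES))
```

The variable is defined in `models` but leaks in `views`, which a per-file pattern match would miss. A production analysis would also need to handle function calls, attributes, and statement order.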

Note: Visual data flow graphs are on the Roadmap!

3. The Human-in-the-Loop

The Markdown results are perfect for reviewing AI-generated code before merging. This helps keep control where it really counts.

The Problem: ChatGPT just wrote 500 lines. Did it leak user emails into logs?
The Solution: privalyse --root ./new-feature --format markdown

🎯 Use Cases

For Developers

✅ Review AI-Generated Code: Catch hardcoded secrets and PII leaks before merging.
✅ Clean Up Debug Code: Find forgotten print() and console.log() statements.
✅ Learn GDPR: Understand privacy requirements while you code.

For Security Teams

✅ Quick Audits: Generate compliance reports in seconds.
✅ Track Progress: Monitor privacy improvements over time.
✅ CI/CD Integration (Roadmap): Catch issues early in the pipeline.

🗺️ Roadmap

Current (Alpha v0.1):

✅ Python & JavaScript/TypeScript analysis
✅ Cross-file taint tracking
✅ GDPR article mapping (Art. 5, 6, 9, 32)
✅ JSON, Markdown, HTML export
✅ .privalyseignore support

Next Up:

🔜 Data Flow display
🔜 Smarter detection: improving the rules and patterns.
🔜 More Compliance Standards (CCPA, HIPAA, etc.)
🔜 GitHub Actions integration (CI/CD ready)
🔜 Enhanced test coverage

Vision (Future):

🎯 Multi-language (Java, Go, Ruby, C#)
🔜 VS Code extension (lint as you code)
🎯 Team features (shared reports, trends)
🎯 AI-assisted fixes (not just detection)
🎯 Pre-commit hooks

Contributing

We're building this in the open. Contributions welcome!

Report bugs or suggest features via Issues
See CONTRIBUTING.md for guidelines

License & Disclaimer

MIT License - See LICENSE for details.

⚠️ Alpha Software: Privalyse helps identify privacy issues but:

Does not guarantee complete GDPR compliance
Not a substitute for legal counsel
Should be part of a broader security strategy
May have false positives/negatives as we improve

Always consult privacy professionals for compliance decisions.

Built by developers who care about privacy.
Report a bug • Request a feature • Contribute

Release history

0.3.3

Dec 26, 2025

0.3.2

Dec 25, 2025

This version

0.3.1

Dec 23, 2025

0.3.0

Dec 23, 2025

0.2.1

Dec 22, 2025

0.2.0

Dec 21, 2025

0.1.0

Dec 17, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

privalyse_cli-0.3.1.tar.gz (152.9 kB)

Uploaded Dec 23, 2025 Source

Built Distribution

privalyse_cli-0.3.1-py3-none-any.whl (134.8 kB)

Uploaded Dec 23, 2025 Python 3

File details

Details for the file privalyse_cli-0.3.1.tar.gz.

File metadata

Download URL: privalyse_cli-0.3.1.tar.gz
Upload date: Dec 23, 2025
Size: 152.9 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for privalyse_cli-0.3.1.tar.gz
Algorithm	Hash digest
SHA256	`a9b06b2d8e17171c9e360e14ecd2d92bf0e0672d5807ca1b808c979702921bfa`
MD5	`68bb5d57f354f326c8c09c141f74837d`
BLAKE2b-256	`ae5df286fb7ef83c507b77895774f582e3103d1ed9fc83417058d177fb091ddc`

Provenance

The following attestation bundles were made for privalyse_cli-0.3.1.tar.gz:

Publisher: publish.yml on Privalyse/privalyse-cli

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: privalyse_cli-0.3.1.tar.gz
- Subject digest: a9b06b2d8e17171c9e360e14ecd2d92bf0e0672d5807ca1b808c979702921bfa
- Sigstore transparency entry: 777568255
- Sigstore integration time: Dec 23, 2025
Source repository:
- Permalink: Privalyse/privalyse-cli@35dcb4585055e846ddc4a74340e4154754c26e9e
- Branch / Tag: refs/tags/v0.3.1
- Owner: https://github.com/Privalyse
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@35dcb4585055e846ddc4a74340e4154754c26e9e
- Trigger Event: release

File details

Details for the file privalyse_cli-0.3.1-py3-none-any.whl.

File metadata

Download URL: privalyse_cli-0.3.1-py3-none-any.whl
Upload date: Dec 23, 2025
Size: 134.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for privalyse_cli-0.3.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`57a36126eeb4dbd6358384dcebf309cfcfd3f8754ba46a4b7dd654369f8e439a`
MD5	`58770de15c907147eef57c86ddc6ceb9`
BLAKE2b-256	`8372aabe96f9eb0b3009b21950cf0bc5559fe98be4b759ab78ff163ebe55a8c5`

Provenance

The following attestation bundles were made for privalyse_cli-0.3.1-py3-none-any.whl:

Publisher: publish.yml on Privalyse/privalyse-cli

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: privalyse_cli-0.3.1-py3-none-any.whl
- Subject digest: 57a36126eeb4dbd6358384dcebf309cfcfd3f8754ba46a4b7dd654369f8e439a
- Sigstore transparency entry: 777568303
- Sigstore integration time: Dec 23, 2025
Source repository:
- Permalink: Privalyse/privalyse-cli@35dcb4585055e846ddc4a74340e4154754c26e9e
- Branch / Tag: refs/tags/v0.3.1
- Owner: https://github.com/Privalyse
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@35dcb4585055e846ddc4a74340e4154754c26e9e
- Trigger Event: release

privalyse-cli 0.3.1
