Open-Source Static Analysis for Privacy Data Flows

truScanner

truScanner is a static code analysis tool designed to discover and analyze personal data elements in your source code. It helps developers and security teams identify privacy-related data flows and generate comprehensive reports.

🚀 Features

  • Comprehensive Detection: Identifies 110+ personal data elements (PII, financial data, device identifiers, etc.)
  • Interactive Menu: Arrow-key navigable menu for selecting output formats
  • Real-time Progress: Visual progress indicator during scanning
  • Multiple Report Formats: Generate reports in TXT, Markdown, or JSON format
  • Backend Integration: Optional upload to backend API for centralized storage
  • Auto-incrementing Reports: Automatically manages report file naming to prevent overwrites

📦 Installation

Prerequisites

  • Python 3.9 or higher
  • pip or uv package manager

Install from Source

  1. Clone or navigate to the truscanner directory:

    cd truscanner
    
  2. Install dependencies:

    Using pip:

    pip install -r requirements.txt
    

    Or using uv:

    uv pip install -e .
    
  3. Verify installation:

    truScanner --help
    

🛠️ Usage

Basic Usage

Scan a directory with the interactive menu:

truScanner scan <directory_path>

Example

truScanner scan ./src
truScanner scan ./my-project
truScanner scan C:\Users\username\projects\my-app

Interactive Workflow

  1. Select Output Format:

    • Use arrow keys (↑↓) to navigate
    • Press Enter to select
    • Options: txt, md, json, or All (generates all three formats)
  2. Scanning Progress:

    • Real-time progress bar shows file count and percentage
    • Example: Scanning: 50/200 (25%) [████████░░░░░░░░░░░░] filename.js
  3. Report Generation:

    • Reports are saved in Reports/{directory_name}/ folder
    • Files are named: truscan_report.txt, truscan_report.md, truscan_report.json
    • Subsequent scans auto-increment: truscan_report1.txt, truscan_report2.txt, etc.
  4. Backend Upload (Optional):

    • After reports are saved, you'll be prompted: Do you want to analyze? (Y/n):
    • Enter Y to upload scan results to backend API
    • Requires TRUSCANNER_BACKEND_URL to be set as an environment variable (for example, in a .env file)
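
The auto-increment naming described in step 3 can be illustrated with a short sketch. This is not truScanner's actual code: the function name `next_report_path` and its parameters are hypothetical, and it only assumes the naming pattern listed above (truscan_report.txt, then truscan_report1.txt, truscan_report2.txt, and so on).

```python
from pathlib import Path


def next_report_path(report_dir: Path, extension: str, stem: str = "truscan_report") -> Path:
    """Return the first unused path: stem.ext, then stem1.ext, stem2.ext, ...

    Hypothetical helper; it mirrors the naming pattern described above.
    """
    candidate = report_dir / f"{stem}.{extension}"
    counter = 0
    # Keep incrementing the numeric suffix until no file with that name exists.
    while candidate.exists():
        counter += 1
        candidate = report_dir / f"{stem}{counter}.{extension}"
    return candidate
```

Checking for existing files on each call, rather than storing a counter, means the next free name is found even if earlier reports were deleted or renamed by hand.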

Command Options

truScanner scan <directory> [OPTIONS]

Options:
  --with-presidio    Enable Presidio NLP scanner (requires model download)
  --with-ai          Enable AI/LLM scanner (requires OPENAI_API_KEY)
  --personal-only    Only report personally identifiable information (PII)
  --help             Show help message

Examples with Options

# Report only PII findings
truScanner scan ./src --personal-only

# Scan with Presidio NLP scanner
truScanner scan ./src --with-presidio

# Scan with AI/LLM scanner
truScanner scan ./src --with-ai

📊 Report Output

Report Location

Reports are saved in: Reports/{sanitized_directory_name}/

Report Formats

  • TXT Report (truscan_report.txt): Plain text format, easy to read
  • Markdown Report (truscan_report.md): Formatted markdown with headers and code blocks
  • JSON Report (truscan_report.json): Structured JSON data for programmatic access
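
For programmatic access, the JSON report can be read with Python's standard library. The report schema is not documented here, so the sketch below makes no assumptions about field names and only inspects the top-level structure. `load_report` and `describe` are hypothetical helper names.

```python
import json


def load_report(path):
    """Parse a JSON report file and return the resulting Python object."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def describe(report):
    """Summarize the top-level shape without assuming any field names."""
    if isinstance(report, dict):
        return sorted(report.keys())
    if isinstance(report, list):
        return [f"list of {len(report)} items"]
    return [type(report).__name__]


# Example (the path is illustrative):
# report = load_report("Reports/my-project/truscan_report.json")
# print(describe(report))
```

Listing the top-level keys first is a cautious way to explore a report before writing code that depends on particular fields.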

Report Contents

Each report includes:

  • Scan Report ID: Unique 32-character MD5 hash identifier
  • Summary: Total findings, time taken, files scanned
  • Findings by File: Detailed list of data elements found in each file
  • Summary by Category: Aggregated statistics by data category

Report ID

Each scan generates a unique Scan Report ID (an MD5 hash, shown as 32 hexadecimal characters) that:

  • Appears in the terminal after scanning
  • Is included at the top of all generated report files
  • Can be used to track and reference specific scans
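
As a rough illustration of how such an ID can be produced, the sketch below hashes a seed string with MD5. What truScanner actually hashes is not stated here; combining the scanned directory with a timestamp is an assumption, and `make_scan_id` is a hypothetical name.

```python
import hashlib


def make_scan_id(directory: str, timestamp: float) -> str:
    """Return an MD5 hex digest for one scan.

    Assumption: the seed is the directory plus a timestamp; the real
    tool may derive its ID from different inputs.
    """
    seed = f"{directory}:{timestamp}".encode("utf-8")
    # An MD5 digest is 128 bits, which hexdigest() renders as 32 hex characters.
    return hashlib.md5(seed).hexdigest()
```

MD5 works as an identifier here, but it is not collision resistant and should not be relied on for security purposes.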

🔧 Configuration

Backend Integration (Optional)

To enable backend upload, create a .env file in your project root or in the truscanner directory:

TRUSCANNER_BACKEND_URL=http://localhost:8000

Or for production:

TRUSCANNER_BACKEND_URL=https://api.example.com

When the backend URL is configured and you answer "Y" to the analysis prompt, scan results are uploaded to the backend API, which stores them in S3.
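
Before answering "Y", it can help to confirm that the variable is set and well formed. The following is a minimal sketch, assuming the value is a plain http or https base URL. It reads only the process environment and does not load a .env file itself, and `get_backend_url` is a hypothetical helper, not part of truScanner.

```python
import os
from urllib.parse import urlparse


def get_backend_url(env=None):
    """Return the configured backend URL, or None if it is not set.

    Raises ValueError when the value is not an http(s) URL with a host.
    """
    env = os.environ if env is None else env
    raw = (env.get("TRUSCANNER_BACKEND_URL") or "").strip()
    if not raw:
        return None
    parsed = urlparse(raw)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"TRUSCANNER_BACKEND_URL is not a valid http(s) URL: {raw!r}")
    # Drop any trailing slash so paths can be appended consistently.
    return raw.rstrip("/")
```

Accepting an explicit `env` mapping makes the check easy to exercise without changing the real environment.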

📁 Project Structure

truscanner/
├── src/
│   ├── main.py              # CLI entry point
│   ├── regex_scanner.py     # Core scanning engine
│   ├── report_utils.py      # Report utilities
│   └── utils.py             # Interactive menu & backend integration
├── data_elements/           # Data element definitions (JSON files)
├── Reports/                 # Generated reports (created automatically)
├── requirements.txt         # Python dependencies
└── README.md
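
The layout suggests a regex-based engine driven by element definitions. The sketch below illustrates that general technique only. The definition fields (`name`, `category`, `pattern`), the three sample patterns, and the function names are all hypothetical; the real schema in data_elements/ and the logic in regex_scanner.py may differ.

```python
import re
from collections import Counter

# Hypothetical element definitions, standing in for the JSON files in data_elements/.
ELEMENTS = [
    {"name": "Email Address", "category": "Contact", "pattern": r"\bemail(?:_address)?\b"},
    {"name": "Phone Number", "category": "Contact", "pattern": r"\bphone(?:_number)?\b"},
    {"name": "IP Address", "category": "Device", "pattern": r"\bip_addr(?:ess)?\b"},
]


def scan_text(text, elements=ELEMENTS):
    """Return one finding per (line, element) pair whose pattern matches."""
    compiled = [(e, re.compile(e["pattern"], re.IGNORECASE)) for e in elements]
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for element, regex in compiled:
            if regex.search(line):
                findings.append(
                    {"line": line_no, "element": element["name"], "category": element["category"]}
                )
    return findings


def summarize_by_category(findings):
    """Count findings per category, similar in spirit to a category summary."""
    return dict(Counter(f["category"] for f in findings))
```

Keeping patterns in data rather than in code means new data elements can be added without touching the scanning loop.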

🐛 Troubleshooting

Interactive Menu Not Working

If the arrow-key menu doesn't appear, ensure inquirer is installed:

pip install inquirer

Backend Upload Fails

  • Check that TRUSCANNER_BACKEND_URL is set in your .env file
  • Verify the backend server is running
  • Check network connectivity to the backend URL

No Reports Generated

  • Ensure you have write permissions in the current directory
  • Check that the directory you're scanning contains readable files
  • Verify Python version is 3.9 or higher
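
The three checks above can be run as a quick preflight script. This is a sketch using only the standard library; `preflight` and `has_readable_file` are hypothetical helpers and not part of truScanner.

```python
import os
import sys


def has_readable_file(scan_dir):
    """Return True if at least one file under scan_dir can be read."""
    for root, _dirs, files in os.walk(scan_dir):
        for name in files:
            if os.access(os.path.join(root, name), os.R_OK):
                return True
    return False


def preflight(scan_dir, output_dir="."):
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < (3, 9):
        problems.append("Python 3.9 or higher is required")
    if not os.access(output_dir, os.W_OK):
        problems.append(f"no write permission in {output_dir}")
    if not os.path.isdir(scan_dir):
        problems.append(f"scan target is not a directory: {scan_dir}")
    elif not has_readable_file(scan_dir):
        problems.append(f"no readable files found in {scan_dir}")
    return problems


# Example: print(preflight("./src"))
```

Returning a list rather than stopping at the first failure reports every problem in a single run.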

📝 License

MIT License - see LICENSE file for details

🤝 Support

For issues, questions, or contributions, please contact: hello@truconsent.io

Download files

Source Distribution

truscanner-0.2.1.tar.gz (31.2 kB)

Uploaded Source

Built Distribution

truscanner-0.2.1-py3-none-any.whl (42.8 kB)

Uploaded Python 3

File details

Details for the file truscanner-0.2.1.tar.gz.

File metadata

  • Download URL: truscanner-0.2.1.tar.gz
  • Upload date:
  • Size: 31.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.12.5

File hashes

Hashes for truscanner-0.2.1.tar.gz
Algorithm Hash digest
SHA256 8f5fa180cdf107dd30abe7c7384003ee6fa7516ee485edf3e9eb7985c2982698
MD5 e8117c5fb3e9e0bde1354fc97219ee88
BLAKE2b-256 b577410b97590543e1e3a530094061085d2e5f7792bd110f221fd42d96a062f3

File details

Details for the file truscanner-0.2.1-py3-none-any.whl.

File metadata

  • Download URL: truscanner-0.2.1-py3-none-any.whl
  • Upload date:
  • Size: 42.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.0.1 CPython/3.12.5

File hashes

Hashes for truscanner-0.2.1-py3-none-any.whl
Algorithm Hash digest
SHA256 28e3af8b55d98118be9ccc06ac56bb80702ca878729ceaab4e406eae2752a546
MD5 f77e0fb7dc10bcc956562f5ea25b4932
BLAKE2b-256 40a00df4e2c47f58005d09e2ab7c930fe7dc56cc196d55086975db88f25eb424
