Tool for detecting sensitive data leaks in files and cloud storage
Data Leak Inspector
Find sensitive data. Fix risky permissions.
Data Leak Inspector (DLI) is a CLI tool that helps you identify potentially exposed files in your storage systems, starting with Google Drive.
Instead of scanning file contents, DLI analyzes metadata and permissions to quickly highlight files that may be publicly accessible or shared.
Features (v0.1)
- Metadata-based scanning (no file content access)
- Google Drive integration
- Scans all files (including nested ones)
- Exposure detection based on permissions
  - Public (anyone with link)
  - Shared (users, groups, domain)
  - Private
- Human-readable explanations
- Clean CLI output with summaries
- Progress bar during scanning
- Demo dataset for quick testing
How It Works
DLI does not read file contents.
Instead, it analyzes:
- File permissions (Google Drive API)
- Sharing settings
- Basic metadata (name, type, timestamps)
This allows:
- Faster scans
- Lower permissions required
- Easier approval for Google APIs
- Better privacy guarantees
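The metadata-only approach described above can be sketched as a paginated listing that requests only the fields needed for permission analysis. This is a minimal sketch, not DLI's actual scanner: `iter_files`, `fetch_page`, and `fake_fetch_page` are hypothetical names, and whether DLI uses the `drive.metadata.readonly` scope is an assumption. The `fields` selector, `pageToken`, and `nextPageToken` belong to the Drive API v3 `files.list` method.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Read-only metadata scope offered by the Drive API (assumed, not confirmed for DLI).
METADATA_SCOPE = "https://www.googleapis.com/auth/drive.metadata.readonly"

# Partial-response selector: names, types, timestamps, and permissions only.
FIELDS = (
    "nextPageToken, "
    "files(id, name, mimeType, modifiedTime, permissions(type, role))"
)

Page = Dict[str, Any]


def iter_files(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[Dict[str, Any]]:
    """Yield file metadata dicts, following nextPageToken until it runs out."""
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        yield from page.get("files", [])
        token = page.get("nextPageToken")
        if not token:
            break


# With google-api-python-client, fetch_page could wrap a real call such as:
#   lambda token: service.files().list(
#       pageSize=100, fields=FIELDS, pageToken=token).execute()

# Hypothetical two-page response used to exercise the loop offline.
_PAGES: Dict[Optional[str], Page] = {
    None: {"files": [{"id": "1", "name": "payroll_2024.xlsx"}], "nextPageToken": "p2"},
    "p2": {"files": [{"id": "2", "name": "team_notes.docx"}]},
}


def fake_fetch_page(token: Optional[str]) -> Page:
    return _PAGES[token]


names = [f["name"] for f in iter_files(fake_fetch_page)]
print(names)
```

Passing the page fetcher in as a callable keeps the loop testable without network access or credentials.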
Example Output
Scanning 12/120: payroll.xlsx ████████████ 100%
SCAN RESULTS (BASIC)
[PUBLIC ] payroll_2024.xlsx
  → anyone with link (reader)
[SHARED ] team_notes.docx
  → shared with 3 user(s)
[PRIVATE] personal.txt
  → only accessible by owner
Summary:
Total files: 120
Public: 5
Shared: 18
Private: 97
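Output in this shape could be produced along the following lines. This is a rough sketch rather than DLI's reporting code: `render` and the `Result` tuple are hypothetical, and the sample values are copied from the example output.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# (level, file name, explanation)
Result = Tuple[str, str, str]


def render(results: Iterable[Result]) -> List[str]:
    """Return report lines: two lines per file, then per-level totals."""
    results = list(results)
    lines: List[str] = []
    for level, name, explanation in results:
        # Pad the level to 7 characters so the closing brackets align.
        lines.append(f"[{level:<7}] {name}")
        lines.append(f"  → {explanation}")
    counts = Counter(level for level, _, _ in results)
    lines.append("Summary:")
    lines.append(f"Total files: {len(results)}")
    for level in ("PUBLIC", "SHARED", "PRIVATE"):
        lines.append(f"{level.capitalize()}: {counts[level]}")
    return lines


sample = [
    ("PUBLIC", "payroll_2024.xlsx", "anyone with link (reader)"),
    ("SHARED", "team_notes.docx", "shared with 3 user(s)"),
    ("PRIVATE", "personal.txt", "only accessible by owner"),
]
report = render(sample)
print("\n".join(report))
```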
Installation
git clone https://github.com/yourusername/data-leak-inspector.git
cd data-leak-inspector
pip install -e .
Run with Demo Data
dli scan --demo
Google Drive Setup
1. Create credentials
- Go to Google Cloud Console
- Enable Google Drive API
- Create OAuth credentials (Desktop app)
- Download credentials.json
2. Place credentials
Create the directory:
~/Documents/dli/
Add:
credentials.json
3. Run scan
dli scan --gdrive
On first run:
- Browser will open for authentication
- A token.json file will be created
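Before running the first scan, it can help to confirm that the credentials file is where step 2 puts it. The check below is a small sketch; `credentials_ready` is a hypothetical helper, not part of DLI, and it makes no assumption about where `token.json` is written.

```python
from pathlib import Path

# Directory and file name taken from step 2 above.
DLI_DIR = Path.home() / "Documents" / "dli"
CREDENTIALS_FILE = DLI_DIR / "credentials.json"


def credentials_ready(path: Path = CREDENTIALS_FILE) -> bool:
    """Return True if the OAuth client file exists and is a regular file."""
    return path.is_file()


if not credentials_ready():
    print(f"Missing {CREDENTIALS_FILE}; download it from Google Cloud Console first.")
```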
CLI Usage
dli scan [OPTIONS]
Options
| Option | Description |
|---|---|
| `--demo` | Use built-in demo dataset |
| `--gdrive` | Scan Google Drive |
| `--verbose` | Show debug logs |
| `--quiet` | Show only errors |
| `--report` | Export results to JSON |
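For illustration, the options above could be modeled with `argparse` roughly as follows. This is a sketch, not the project's actual parser: `build_parser` is a hypothetical name, and treating every option (including `--report`) as a boolean switch is an assumption, since the table does not say whether any option takes a value.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for `dli scan [OPTIONS]` with the documented flags."""
    parser = argparse.ArgumentParser(prog="dli")
    sub = parser.add_subparsers(dest="command", required=True)
    scan = sub.add_parser("scan", help="Scan a storage source")
    # Assumption: all options are on/off switches.
    scan.add_argument("--demo", action="store_true", help="Use built-in demo dataset")
    scan.add_argument("--gdrive", action="store_true", help="Scan Google Drive")
    scan.add_argument("--verbose", action="store_true", help="Show debug logs")
    scan.add_argument("--quiet", action="store_true", help="Show only errors")
    scan.add_argument("--report", action="store_true", help="Export results to JSON")
    return parser


args = build_parser().parse_args(["scan", "--demo", "--report"])
print(args.command, args.demo, args.gdrive, args.report)
```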
Project Structure
leak_inspector/
├── application/
│   ├── scanner.py
│   ├── risk_evaluator.py
│   └── ports/
├── domain/
│   ├── models.py
│   ├── enums.py
│   └── reporting.py
├── infrastructure/
│   ├── storage/
│   └── gdrive/
├── interfaces/
│   └── cli/
Exposure Levels
| Level | Description |
|---|---|
| PUBLIC | Accessible by anyone with the link |
| SHARED | Shared with specific users, groups, or a domain |
| PRIVATE | Only accessible by owner |
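The table above maps onto a simple decision over a file's permission entries. The sketch below assumes entries shaped like Drive API v3 permission resources (with `type` and `role` keys); `Exposure` and `classify` are hypothetical names, and DLI's own `risk_evaluator.py` may apply different rules.

```python
from enum import Enum
from typing import Dict, List


class Exposure(Enum):
    PUBLIC = "PUBLIC"
    SHARED = "SHARED"
    PRIVATE = "PRIVATE"


def classify(permissions: List[Dict[str, str]]) -> Exposure:
    """Map Drive-style permission entries to an exposure level."""
    # The Drive API represents link sharing as a permission of type "anyone".
    if any(p.get("type") == "anyone" for p in permissions):
        return Exposure.PUBLIC
    # Any grant other than the owner's own entry means the file is shared.
    if any(p.get("role") != "owner" for p in permissions):
        return Exposure.SHARED
    return Exposure.PRIVATE


owner = {"type": "user", "role": "owner"}
print(classify([owner, {"type": "anyone", "role": "reader"}]).value)
print(classify([owner, {"type": "group", "role": "writer"}]).value)
print(classify([owner]).value)
```

Checking for `anyone` first means a file that is both link-shared and shared with named users is reported at the higher exposure level.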
Limitations (v0.1)
- No content scanning (PII detection)
- No Google Docs content parsing
- Heuristic-based risk (metadata only)
Roadmap
v0.1 (current)
- Metadata scanning
- Google Drive integration
- Exposure detection
Philosophy
DLI is designed to:
- Minimize permissions
- Respect user privacy
- Deliver fast insights
- Be transparent in analysis
Contributing
Contributions are welcome.
- Fork the repo
- Create a branch
- Submit a PR