Nuvu Scan
Multi-Cloud Data Asset Control CLI - Discover and analyze your cloud data infrastructure across AWS, GCP, Azure, and Databricks.
Installation
pip install nuvu-scan
Usage
# Scan AWS account (uses default credentials)
nuvu scan --provider aws
# Specify credentials via environment variables
export AWS_ACCESS_KEY_ID=your-key
export AWS_SECRET_ACCESS_KEY=your-secret
nuvu scan --provider aws
# Output to JSON
nuvu scan --provider aws --output-format json --output-file report.json
# Scan specific regions
nuvu scan --provider aws --region us-east-1 --region eu-west-1
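When scans are driven from a script, the same command line can be assembled programmatically. The sketch below is one way to do that from Python; it uses only the flags shown above (`--provider`, `--region`, `--output-format`, `--output-file`), and the helper names `build_scan_command` and `run_scan` are hypothetical, not part of nuvu-scan.

```python
import shlex
import subprocess


def build_scan_command(provider, regions=(), output_format=None, output_file=None):
    """Assemble the argument list for `nuvu scan` (hypothetical helper)."""
    cmd = ["nuvu", "scan", "--provider", provider]
    for region in regions:
        # --region is repeated once per region, as in the usage example
        cmd += ["--region", region]
    if output_format:
        cmd += ["--output-format", output_format]
    if output_file:
        cmd += ["--output-file", output_file]
    return cmd


def run_scan(cmd):
    """Run the scan; requires nuvu-scan installed and AWS credentials configured."""
    return subprocess.run(cmd, check=True)


cmd = build_scan_command(
    "aws",
    regions=["us-east-1", "eu-west-1"],
    output_format="json",
    output_file="report.json",
)
# Print the shell-quoted command instead of running it
print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting problems when region or file names come from user input.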
Features
- Asset Discovery: Automatically discovers S3 buckets, Glue databases/tables, Athena workgroups, Redshift clusters, and more
- Cost Estimation: Estimates monthly costs for all discovered assets
- Risk Detection: Flags public access, PII exposure, and other security risks
- Ownership Inference: Attempts to identify asset owners from tags and metadata
- Multiple Output Formats: HTML (default), JSON, and CSV reports
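To make the ownership inference feature more concrete, here is a minimal sketch of how an owner could be derived from resource tags. It is an illustration only: the tag keys in `OWNER_TAG_KEYS` and the function `infer_owner` are assumptions, and the actual rules nuvu-scan applies may differ.

```python
# Hypothetical tag keys, checked in priority order; not taken from nuvu-scan.
OWNER_TAG_KEYS = ("owner", "team", "created-by")


def infer_owner(tags):
    """Return (owner, source_tag) from a tag mapping, or (None, None)."""
    # Tag keys are compared case-insensitively ("Owner" and "owner" match)
    normalized = {key.lower(): value for key, value in tags.items()}
    for key in OWNER_TAG_KEYS:
        value = normalized.get(key, "").strip()
        if value:
            return value, key
    # No recognized tag carried a non-empty value
    return None, None


print(infer_owner({"Owner": "data-platform", "env": "prod"}))
print(infer_owner({"env": "dev"}))
```

Returning the tag that produced the match lets a report show why an owner was assigned, which helps when the guess is wrong.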
Output Formats
- HTML: Beautiful, interactive report with summary statistics
- JSON: Machine-readable format for integration with other tools
- CSV: Spreadsheet-friendly format for analysis
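The JSON and CSV formats carry the same records in different shapes. The sketch below shows one way such a pair of formatters could work using only the standard library; the `Asset` fields are hypothetical and do not reflect the real report schema.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass, fields


@dataclass
class Asset:
    # Hypothetical fields for illustration; the real report schema may differ.
    provider: str
    service: str
    name: str
    monthly_cost_usd: float


def to_json(assets):
    """Serialize assets as a JSON array of objects."""
    return json.dumps([asdict(a) for a in assets], indent=2)


def to_csv(assets):
    """Serialize assets as CSV with a header row derived from the dataclass."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(Asset)])
    writer.writeheader()
    writer.writerows(asdict(a) for a in assets)
    return buffer.getvalue()


assets = [
    Asset("aws", "s3", "logs-bucket", 12.5),
    Asset("aws", "redshift", "analytics-cluster", 180.0),
]
print(to_json(assets))
print(to_csv(assets))
```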
Cloud Provider Support
AWS (v1 - Available Now)
Nuvu requires read-only access to your AWS account. The tool uses the following AWS services:
- S3 (list buckets, get bucket metadata)
- Glue (list databases, tables)
- Athena (list workgroups, query history)
- Redshift (describe clusters, namespaces)
- CloudWatch (metrics)
- CloudTrail (audit logs)
See the IAM Policy Documentation for the exact permissions required.
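Before attaching a policy, it can be useful to check that it grants nothing beyond read access. The following is a rough sketch of such a check based on action-name prefixes. The action list is illustrative only and is not the project's required policy, and the prefix heuristic is an assumption that will not catch every sensitive action.

```python
# Prefixes that conventionally mark non-mutating AWS API actions (heuristic).
READ_ONLY_PREFIXES = ("Get", "List", "Describe", "Lookup")

# Illustrative actions only; see the IAM Policy Documentation for the real list.
EXAMPLE_ACTIONS = [
    "s3:ListAllMyBuckets",
    "s3:GetBucketLocation",
    "glue:GetDatabases",
    "glue:GetTables",
    "athena:ListWorkGroups",
    "redshift:DescribeClusters",
    "cloudwatch:GetMetricData",
    "cloudtrail:LookupEvents",
]


def non_read_only(actions):
    """Return the actions whose operation name does not look read-only."""
    flagged = []
    for action in actions:
        # Split "service:Operation"; a missing colon yields an empty operation
        _service, _, operation = action.partition(":")
        if not operation.startswith(READ_ONLY_PREFIXES):
            flagged.append(action)
    return flagged


print(non_read_only(EXAMPLE_ACTIONS))
print(non_read_only(["s3:PutObject", "s3:*", "glue:GetTables"]))
```

Wildcards such as `s3:*` are flagged because they can expand to write actions.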
GCP, Azure, Databricks (Coming Soon)
Multi-cloud support is built into the architecture. Additional providers will be added in future releases.
License
Apache 2.0
Website
Visit https://nuvu.dev for the SaaS version with continuous monitoring.
Development
Prerequisites
- Python 3.10+ (Python 3.8 and 3.9 are EOL)
- uv - Fast Python package installer and resolver
Setup Development Environment
# Clone the repository
git clone https://github.com/flexilogix/nuvu-scan.git
cd nuvu-scan
# Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install dependencies (uv automatically creates .venv)
uv sync --dev
Note: With uv, you don't need to manually activate a virtual environment! uv run automatically uses the .venv created by uv sync.
Running Tests
# Run all tests (uv automatically uses .venv)
uv run pytest
# Run with coverage
uv run pytest --cov=nuvu_scan --cov-report=html
# Run specific test file
uv run pytest tests/test_s3_collector.py
Code Quality
# Format code with black
uv run black .
# Lint with ruff
uv run ruff check .
# Type checking with mypy
uv run mypy nuvu_scan
Building the Package
# Build distribution packages (uses pyproject.toml)
uv build
# This creates:
# - dist/nuvu_scan-{version}.tar.gz (source distribution)
# - dist/nuvu_scan-{version}-py3-none-any.whl (wheel)
Note: uv uses pyproject.toml (PEP 621 standard) - no setup.py needed!
Running Locally
# Use uv run (automatically uses .venv, no activation needed)
uv run nuvu scan --provider aws
# Or install in development mode (optional)
uv pip install -e .
nuvu scan --provider aws
Contributing
We welcome contributions! Here's how to get started:
1. Fork and Clone
# Fork the repository on GitHub, then clone your fork
git clone https://github.com/your-username/nuvu-scan.git
cd nuvu-scan
# Add upstream remote
git remote add upstream https://github.com/flexilogix/nuvu-scan.git
2. Create a Branch
# Create a feature branch
git checkout -b feature/your-feature-name
# Or a bugfix branch
git checkout -b fix/your-bug-description
3. Make Changes
- Follow the existing code style (enforced by black and ruff)
- Add tests for new features
- Update documentation as needed
- Ensure all tests pass:
uv run pytest
- Run code quality checks:
uv run black . && uv run ruff check .
4. Commit and Push
# Commit your changes
git add .
git commit -m "Description of your changes"
# Push to your fork
git push origin feature/your-feature-name
5. Create a Pull Request
- Go to https://github.com/flexilogix/nuvu-scan
- Click "New Pull Request"
- Select your branch
- Fill out the PR template
- Wait for review and CI checks to pass
Adding a New Cloud Provider
To add support for a new cloud provider (e.g., GCP):
1. Create the provider module structure:
mkdir -p nuvu_scan/core/providers/gcp/collectors
2. Implement the CloudProviderScan interface:
- Create nuvu_scan/core/providers/gcp/gcp_scanner.py
- Inherit from CloudProviderScan
- Implement list_assets(), get_usage_metrics(), get_cost_estimate()
3. Create service collectors:
- One collector per service (e.g., gcs.py, bigquery.py)
- Follow the pattern from AWS collectors
4. Register in CLI:
- Update nuvu_scan/cli/commands/scan.py to support --provider gcp
- Add provider to choices
5. Add tests:
- Create tests in tests/providers/gcp/
- Mock API responses
6. Update documentation:
- Update README.md
- Add provider-specific IAM/permissions docs
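The provider steps can be sketched roughly as follows. Only the class name `CloudProviderScan` and the three method names come from this README; the method signatures, the `GCPScanner` and `FakeGCSCollector` classes, the `PROVIDERS` registry, and the cost rate are all assumptions made for illustration. Check nuvu_scan/core/base.py for the real interface.

```python
from abc import ABC, abstractmethod


class CloudProviderScan(ABC):
    """Stand-in for the interface in nuvu_scan/core/base.py (signatures assumed)."""

    @abstractmethod
    def list_assets(self):
        """Return all discovered assets."""

    @abstractmethod
    def get_usage_metrics(self, asset):
        """Return usage metrics for one asset."""

    @abstractmethod
    def get_cost_estimate(self, asset):
        """Return an estimated monthly cost for one asset."""


class FakeGCSCollector:
    """Hypothetical collector returning canned data instead of calling GCP."""

    def collect(self):
        return [{"service": "gcs", "name": "example-bucket", "size_gb": 100}]


class GCPScanner(CloudProviderScan):
    """Hypothetical provider that aggregates one collector per service."""

    def __init__(self, collectors):
        self.collectors = collectors

    def list_assets(self):
        assets = []
        for collector in self.collectors:
            assets.extend(collector.collect())
        return assets

    def get_usage_metrics(self, asset):
        # Placeholder: a real implementation would query monitoring APIs
        return {"last_accessed": None}

    def get_cost_estimate(self, asset):
        # Placeholder rate per GB-month, not a real price
        return asset.get("size_gb", 0) * 0.02


# Hypothetical registry mapping a --provider value to its scanner class
PROVIDERS = {"gcp": GCPScanner}

scanner = PROVIDERS["gcp"]([FakeGCSCollector()])
assets = scanner.list_assets()
print(assets[0]["name"], scanner.get_cost_estimate(assets[0]))
```

Injecting collectors through the constructor keeps the scanner testable with fakes like the one above, which matches the advice to mock API responses in tests.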
Project Structure
nuvu-scan/
├── nuvu_scan/ # Main package
│ ├── core/ # Core scanning engine
│ │ ├── base.py # CloudProviderScan interface
│ │ ├── providers/ # Provider implementations
│ │ │ ├── aws/ # AWS provider (v1)
│ │ │ ├── gcp/ # GCP provider (future)
│ │ │ └── azure/ # Azure provider (future)
│ │ └── models/ # Data models
│ └── cli/ # CLI interface
│ ├── commands/ # CLI commands
│ └── formatters/ # Output formatters
├── tests/ # Test suite
├── .github/
│ └── workflows/ # CI/CD workflows
├── pyproject.toml # Project configuration (uv)
└── README.md
Release Process
Releases are automated via GitHub Actions:
1. Create a release tag:
git tag -a v0.1.0 -m "Release v0.1.0"
git push origin v0.1.0
2. Create a GitHub Release:
- Go to https://github.com/flexilogix/nuvu-scan/releases
- Click "Draft a new release"
- Select the tag
- Add release notes
- Publish
3. Automated publishing. GitHub Actions will automatically:
- Build the package
- Publish to PyPI
- Use trusted publishing (no API tokens needed)
CI/CD
The project uses GitHub Actions for:
CI (.github/workflows/ci.yml):
- Runs on every push and PR
- Tests on Python 3.8-3.12
- Runs linters (ruff, black)
- Runs type checker (mypy)
- Runs test suite
- Uploads coverage reports
Publish (.github/workflows/publish.yml):
- Triggers on GitHub releases
- Builds package
- Publishes to PyPI using trusted publishing
Questions?
- Open an issue for bugs or feature requests
- Check existing issues before creating new ones
- Join discussions in pull requests