Multi-Cloud Data Asset Control - CLI tool for discovering and analyzing cloud data infrastructure

Nuvu Scan

Multi-Cloud Data Asset Control CLI - Discover and analyze your cloud data infrastructure. AWS is supported today; support for GCP, Azure, and Databricks is planned.

Installation

pip install nuvu-scan

Usage

# Scan AWS account (uses default credentials)
nuvu scan --provider aws

# Specify credentials via environment variables
export AWS_ACCESS_KEY_ID=your-key
export AWS_SECRET_ACCESS_KEY=your-secret
nuvu scan --provider aws

# Output to JSON
nuvu scan --provider aws --output-format json --output-file report.json

# Scan specific regions
nuvu scan --provider aws --region us-east-1 --region eu-west-1
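
For scripted or scheduled runs, the flags shown above can be assembled programmatically. The following is a minimal sketch and not part of nuvu-scan: the helper name build_scan_command is hypothetical, and it relies only on the flags documented in this section.

```python
from typing import Optional, Sequence


def build_scan_command(
    provider: str = "aws",
    regions: Sequence[str] = (),
    output_format: Optional[str] = None,
    output_file: Optional[str] = None,
) -> list[str]:
    """Assemble a `nuvu scan` argument list from the documented flags."""
    cmd = ["nuvu", "scan", "--provider", provider]
    if output_format:
        cmd += ["--output-format", output_format]
    if output_file:
        cmd += ["--output-file", output_file]
    for region in regions:
        # --region is repeated once per region, as in the example above.
        cmd += ["--region", region]
    return cmd
```

Passing the returned list to subprocess.run(cmd, check=True) would start the scan, provided nuvu-scan is installed and AWS credentials are available.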

Features

  • Asset Discovery: Automatically discovers S3 buckets, Glue databases/tables, Athena workgroups, Redshift clusters, and more
  • Cost Estimation: Estimates monthly costs for all discovered assets
  • Risk Detection: Flags public access, PII exposure, and other security risks
  • Ownership Inference: Attempts to identify asset owners from tags and metadata
  • Multiple Output Formats: HTML (default), JSON, and CSV reports
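
To illustrate what tag-based ownership inference can look like, here is a minimal sketch. It is not nuvu-scan's actual logic: the function name, the candidate tag keys, and their priority order are all assumptions made for illustration.

```python
from typing import Mapping, Optional

# Hypothetical tag keys, checked in priority order (an assumption, not nuvu-scan's list).
OWNER_TAG_KEYS = ("owner", "team", "created-by")


def infer_owner(tags: Mapping[str, str]) -> Optional[str]:
    """Return the first non-empty owner-like tag value, or None if no tag matches."""
    # Tag keys are compared case-insensitively because tagging conventions vary.
    normalized = {key.lower(): value.strip() for key, value in tags.items()}
    for key in OWNER_TAG_KEYS:
        value = normalized.get(key)
        if value:
            return value
    return None
```

An asset with no matching tag yields None, which a report could surface as "owner unknown" rather than guessing.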

Output Formats

  • HTML: Beautiful, interactive report with summary statistics
  • JSON: Machine-readable format for integration with other tools
  • CSV: Spreadsheet-friendly format for analysis
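
The difference between the machine-readable formats can be sketched with the standard library. The asset records and field names below are hypothetical and do not reflect nuvu-scan's real report schema; the sketch only shows how the same records map to JSON and CSV.

```python
import csv
import io
import json

# Hypothetical asset records; the field names are assumptions for illustration.
assets = [
    {"name": "logs-bucket", "service": "s3", "monthly_cost_usd": 12.5},
    {"name": "analytics", "service": "redshift", "monthly_cost_usd": 540.0},
]


def to_json(records: list[dict]) -> str:
    """Serialize records as an indented JSON array."""
    return json.dumps(records, indent=2)


def to_csv(records: list[dict]) -> str:
    """Serialize records as CSV with a header row taken from the first record."""
    if not records:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

JSON preserves value types and nesting, which suits integrations; CSV flattens every record to one row, which suits spreadsheets.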

Cloud Provider Support

AWS (v1 - Available Now)

Nuvu requires read-only access to your AWS account. The tool uses the following AWS services:

  • S3 (list buckets, get bucket metadata)
  • Glue (list databases, tables)
  • Athena (list workgroups, query history)
  • Redshift (describe clusters, namespaces)
  • CloudWatch (metrics)
  • CloudTrail (audit logs)

See the IAM Policy Documentation for the exact permissions required.
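
As a rough illustration of what a read-only policy for these services might contain, the sketch below builds a policy document in Python. The action names are real IAM actions, but the selection is an assumption and not the project's published policy; the function name and the Sid are hypothetical. The IAM Policy Documentation remains the authoritative list.

```python
import json

# Illustrative read-only actions for the services listed above. The selection
# is an assumption, not the policy that nuvu-scan actually requires.
READ_ONLY_ACTIONS = [
    "s3:ListAllMyBuckets",
    "s3:GetBucketLocation",
    "s3:GetBucketTagging",
    "glue:GetDatabases",
    "glue:GetTables",
    "athena:ListWorkGroups",
    "athena:ListQueryExecutions",
    "redshift:DescribeClusters",
    "cloudwatch:GetMetricStatistics",
    "cloudtrail:LookupEvents",
]


def build_read_only_policy(actions: list[str]) -> dict:
    """Wrap the given actions in a single-statement IAM policy document."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "NuvuScanReadOnlySketch",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": sorted(actions),
                "Resource": "*",
            }
        ],
    }


policy_json = json.dumps(build_read_only_policy(READ_ONLY_ACTIONS), indent=2)
```

Every action in the sketch is a List, Get, Describe, or Lookup call, which is what keeps the policy read-only.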

GCP, Azure, Databricks (Coming Soon)

Multi-cloud support is built into the architecture. Additional providers will be added in future releases.

License

Apache 2.0

Website

Visit https://nuvu.dev for the SaaS version with continuous monitoring.


Development

Prerequisites

  • Python 3.10+ (Python 3.8 and 3.9 are EOL)
  • uv - Fast Python package installer and resolver

Setup Development Environment

# Clone the repository
git clone https://github.com/flexilogix/nuvu-scan.git
cd nuvu-scan

# Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install dependencies (uv automatically creates .venv)
uv sync --dev

Note: With uv, you don't need to manually activate a virtual environment! uv run automatically uses the .venv created by uv sync.

Running Tests

# Run all tests (uv automatically uses .venv)
uv run pytest

# Run with coverage
uv run pytest --cov=nuvu_scan --cov-report=html

# Run specific test file
uv run pytest tests/test_s3_collector.py
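
Collector tests can run without AWS access by mocking the client. The sketch below shows the general pattern; list_bucket_names is a hypothetical helper, not a function from nuvu-scan, and the mocked response follows the shape of the S3 ListBuckets result.

```python
from unittest.mock import MagicMock


def list_bucket_names(s3_client) -> list[str]:
    """Hypothetical collector helper: return bucket names from an S3 client."""
    response = s3_client.list_buckets()
    return [bucket["Name"] for bucket in response.get("Buckets", [])]


def test_list_bucket_names_uses_mocked_client():
    # The mock stands in for a boto3 S3 client, so no AWS call is made.
    client = MagicMock()
    client.list_buckets.return_value = {
        "Buckets": [{"Name": "logs"}, {"Name": "raw-data"}]
    }

    assert list_bucket_names(client) == ["logs", "raw-data"]
    client.list_buckets.assert_called_once_with()
```

Because the client is injected as an argument, the same helper works with a real boto3 client in production and a mock in tests.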

Code Quality

# Format code with black
uv run black .

# Lint with ruff
uv run ruff check .

# Type checking with mypy
uv run mypy nuvu_scan

Building the Package

# Build distribution packages (uses pyproject.toml)
uv build

# This creates:
# - dist/nuvu_scan-{version}.tar.gz (source distribution)
# - dist/nuvu_scan-{version}-py3-none-any.whl (wheel)

Note: uv uses pyproject.toml (PEP 621 standard) - no setup.py needed!

Running Locally

# Use uv run (automatically uses .venv, no activation needed)
uv run nuvu scan --provider aws

# Or install in development mode (optional)
uv pip install -e .
nuvu scan --provider aws

Contributing

We welcome contributions! Here's how to get started:

1. Fork and Clone

# Fork the repository on GitHub, then clone your fork
git clone https://github.com/your-username/nuvu-scan.git
cd nuvu-scan

# Add upstream remote
git remote add upstream https://github.com/flexilogix/nuvu-scan.git

2. Create a Branch

# Create a feature branch
git checkout -b feature/your-feature-name

# Or a bugfix branch
git checkout -b fix/your-bug-description

3. Make Changes

  • Follow the existing code style (enforced by black and ruff)
  • Add tests for new features
  • Update documentation as needed
  • Ensure all tests pass: uv run pytest
  • Run code quality checks: uv run black . && uv run ruff check .

4. Commit and Push

# Commit your changes
git add .
git commit -m "Description of your changes"

# Push to your fork
git push origin feature/your-feature-name

5. Create a Pull Request

Adding a New Cloud Provider

To add support for a new cloud provider (e.g., GCP):

  1. Create provider module structure:

    mkdir -p nuvu_scan/core/providers/gcp/collectors
    
  2. Implement CloudProviderScan interface:

    • Create nuvu_scan/core/providers/gcp/gcp_scanner.py
    • Inherit from CloudProviderScan
    • Implement list_assets(), get_usage_metrics(), get_cost_estimate()
  3. Create service collectors:

    • One collector per service (e.g., gcs.py, bigquery.py)
    • Follow the pattern from AWS collectors
  4. Register in CLI:

    • Update nuvu_scan/cli/commands/scan.py to support --provider gcp
    • Add provider to choices
  5. Add tests:

    • Create tests in tests/providers/gcp/
    • Mock API responses
  6. Update documentation:

    • Update README.md
    • Add provider-specific IAM/permissions docs
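
Steps 2 and 3 above can be sketched as follows. This is an illustration under stated assumptions, not the project's code: the three method names come from this README, but their signatures, the Asset dataclass, the collect() method on collectors, and the placeholder price are all hypothetical. The real interface lives in nuvu_scan/core/base.py.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Asset:
    """Hypothetical asset record; the real models live in nuvu_scan/core/models/."""
    name: str
    service: str
    size_gb: float = 0.0


class CloudProviderScan(ABC):
    """Stand-in for the interface in nuvu_scan/core/base.py (signatures assumed)."""

    @abstractmethod
    def list_assets(self) -> list[Asset]: ...

    @abstractmethod
    def get_usage_metrics(self, asset: Asset) -> dict: ...

    @abstractmethod
    def get_cost_estimate(self, asset: Asset) -> float: ...


class GCPScanner(CloudProviderScan):
    """Sketch of a provider that delegates discovery to per-service collectors."""

    # Placeholder price per GB-month, not a real GCP rate.
    PRICE_PER_GB_MONTH = 0.02

    def __init__(self, collectors):
        # Each collector is any object with a collect() method returning assets.
        self.collectors = collectors

    def list_assets(self) -> list[Asset]:
        assets = []
        for collector in self.collectors:
            assets.extend(collector.collect())
        return assets

    def get_usage_metrics(self, asset: Asset) -> dict:
        # A real implementation would query the provider's monitoring API.
        return {"size_gb": asset.size_gb}

    def get_cost_estimate(self, asset: Asset) -> float:
        return round(asset.size_gb * self.PRICE_PER_GB_MONTH, 2)


class FakeGCSCollector:
    """Test double that returns canned assets instead of calling GCP."""

    def collect(self) -> list[Asset]:
        return [Asset(name="raw-events", service="gcs", size_gb=100.0)]
```

Keeping one collector per service means the scanner only aggregates results, and each collector can be tested in isolation with a fake such as FakeGCSCollector.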

Project Structure

nuvu-scan/
├── nuvu_scan/              # Main package
│   ├── core/               # Core scanning engine
│   │   ├── base.py         # CloudProviderScan interface
│   │   ├── providers/      # Provider implementations
│   │   │   ├── aws/        # AWS provider (v1)
│   │   │   ├── gcp/        # GCP provider (future)
│   │   │   └── azure/      # Azure provider (future)
│   │   └── models/         # Data models
│   └── cli/                # CLI interface
│       ├── commands/       # CLI commands
│       └── formatters/     # Output formatters
├── tests/                  # Test suite
├── .github/
│   └── workflows/          # CI/CD workflows
├── pyproject.toml          # Project configuration (uv)
└── README.md

Release Process

Releases are automated via GitHub Actions:

  1. Create a release tag:

    git tag -a v0.1.0 -m "Release v0.1.0"
    git push origin v0.1.0
    
  2. Create GitHub Release:

  3. Automated Publishing:

    • GitHub Actions will automatically:
      • Build the package
      • Publish to PyPI
      • Use trusted publishing (no API tokens needed)

CI/CD

The project uses GitHub Actions for:

  • CI (.github/workflows/ci.yml):

    • Runs on every push and PR
    • Tests on Python 3.10-3.12
    • Runs linters (ruff, black)
    • Runs type checker (mypy)
    • Runs test suite
    • Uploads coverage reports
  • Publish (.github/workflows/publish.yml):

    • Triggers on GitHub releases
    • Builds package
    • Publishes to PyPI using trusted publishing

Questions?

  • Open an issue for bugs or feature requests
  • Check existing issues before creating new ones
  • Join discussions in pull requests
