Cloud-agnostic resource harvesting with a unified resource model.

Cloud Harvester

Cloud-agnostic harvesting for AWS and Azure inventories. The collect() entry point fans out to built-in collectors across compute, containers/serverless, networking and edge, storage, databases, identity/security, and observability; limit scope with providers or inject your own boto3/Azure clients.

Every record is normalized into a Resource dataclass with fields like id, provider, kind, resource (service), name, region, status, network_id, subnetwork_id, tags, and the raw source payload for downstream use.
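To make the shape of a normalized record concrete, here is a minimal sketch of a dataclass carrying the fields listed above. It is a stand-in, not the library's actual definition: the class name ResourceSketch, the field types, the defaults, the attribute name raw for the source payload, and the sample values are all assumptions made for illustration.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ResourceSketch:
    """Illustrative stand-in for a normalized record (not the real Resource class)."""

    id: str
    provider: str                        # e.g. "aws" or "azure"
    kind: str                            # assumed to be a broad category label
    resource: str                        # the originating service
    name: Optional[str] = None
    region: Optional[str] = None
    status: Optional[str] = None
    network_id: Optional[str] = None
    subnetwork_id: Optional[str] = None
    tags: Dict[str, str] = field(default_factory=dict)
    raw: Dict[str, Any] = field(default_factory=dict)  # assumed name for the source payload

    def to_dict(self) -> Dict[str, Any]:
        """Return a plain dict so records can be serialized or tabulated."""
        return asdict(self)


# Hypothetical sample values, chosen only to show the flattened output.
example = ResourceSketch(
    id="i-0123456789abcdef0",
    provider="aws",
    kind="compute",
    resource="ec2",
    name="web-1",
    region="eu-west-1",
    status="running",
    tags={"env": "prod"},
)
record = example.to_dict()
print(record["provider"], record["resource"], record["tags"])
```

Keeping every provider on one flat schema like this is what lets downstream code filter or group AWS and Azure records without provider-specific branches.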

Quickstart

import boto3
from azure.identity import ClientSecretCredential
from cloud_harvester import collect

# AWS: static credentials (replace with real values)
aws_session = boto3.Session(
    aws_access_key_id="FAKEAWSACCESSKEY123",
    aws_secret_access_key="FAKEAWSSECRETKEY456",
)

# Azure: service principal credentials (replace with real values)
azure_credential = ClientSecretCredential(
    tenant_id="00000000-0000-0000-0000-000000000000",
    client_id="11111111-1111-1111-1111-111111111111",
    client_secret="fake-azure-client-secret",
)
azure_subscription_id = "22222222-2222-2222-2222-222222222222"

# Collect from both providers with injected sessions/credentials
resources = collect(
    providers=["aws", "azure"],
    aws_session=aws_session,
    azure_credential=azure_credential,
    azure_subscription_id=azure_subscription_id,
)

for res in resources:
    print(res.to_dict())

AWS collection scans all enabled regions by default. To limit AWS scope, pass an explicit region list:

resources = collect(
    providers=["aws"],
    aws_session=aws_session,
    aws_regions=["eu-west-1", "us-east-1"],
)

Alternatively, set the CLOUD_HARVESTER_AWS_REGIONS environment variable to a comma-separated list of regions. When no region list is given, Cloud Harvester discovers and scans every enabled AWS region.
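As a rough illustration of the comma-separated format, the sketch below turns such a value into a region list. The helper parse_region_list is hypothetical, and the choices to trim whitespace and ignore empty entries are assumptions; the library's own parsing rules are not documented here.

```python
import os
from typing import List, Optional


def parse_region_list(value: Optional[str]) -> Optional[List[str]]:
    """Split a comma-separated region string; return None when nothing usable is set."""
    if not value:
        return None
    regions = [part.strip() for part in value.split(",") if part.strip()]
    return regions or None


# Set the variable for this process only, then read it back.
os.environ["CLOUD_HARVESTER_AWS_REGIONS"] = "eu-west-1, us-east-1"
regions = parse_region_list(os.environ.get("CLOUD_HARVESTER_AWS_REGIONS"))
print(regions)
```

Returning None rather than an empty list keeps "not configured" distinct from "configured with regions", which matches the documented fallback to full region discovery.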

Collectors run concurrently by default. To tune API pressure, pass max_workers or set CLOUD_HARVESTER_MAX_WORKERS:

resources = collect(
    providers=["aws"],
    aws_session=aws_session,
    max_workers=8,
)
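To show what a worker cap controls, here is a small sketch of a bounded fan-out using the standard library's thread pool. It is not Cloud Harvester's implementation: the run_collectors function and the three toy collectors are hypothetical, and the assumption that the library uses threads at all is ours.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List


def run_collectors(
    collectors: Dict[str, Callable[[], List[dict]]],
    max_workers: int = 8,
) -> List[dict]:
    """Run each collector in a pool of at most max_workers threads and merge the results."""
    results: List[dict] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fn): name for name, fn in collectors.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results.extend(future.result())
            except Exception as exc:
                # One failing collector should not discard the others' results.
                print(f"collector {name} failed: {exc}")
    return results


def collect_vms() -> List[dict]:
    return [{"id": "vm-1"}]


def collect_buckets() -> List[dict]:
    return [{"id": "bucket-1"}]


def collect_broken() -> List[dict]:
    raise RuntimeError("access denied")


collected = run_collectors(
    {"vms": collect_vms, "buckets": collect_buckets, "broken": collect_broken},
    max_workers=2,
)
print(sorted(item["id"] for item in collected))
```

A lower cap means fewer simultaneous API calls, which trades speed for a smaller chance of hitting provider rate limits.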

Logging

Cloud Harvester uses Python's standard logging module. Enable INFO logs to see provider progress, AWS region discovery, per-region collector progress, collector result counts, and collector failures:

import logging

logging.basicConfig(level=logging.INFO)
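If INFO output from every library is too noisy, the level can be raised for one logger only. The sketch below assumes Cloud Harvester logs under a logger named "cloud_harvester", following the common logging.getLogger(__name__) convention; that name is an assumption and should be checked against the actual log output.

```python
import logging

# Keep other libraries at WARNING and add timestamps plus logger names.
logging.basicConfig(
    level=logging.WARNING,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)

# Assumed logger name: child loggers such as "cloud_harvester.aws" inherit this level.
harvester_logger = logging.getLogger("cloud_harvester")
harvester_logger.setLevel(logging.INFO)
```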

Credentials

  • AWS: In the AWS console, create or reuse an IAM role or user with read permissions. Minimum managed policies to attach:

    • ReadOnlyAccess
    • AmazonEC2ReadOnlyAccess
    • AmazonEKSMCPReadOnlyAccess

    Generate access keys, then either:

    • Export AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN (if using temporary credentials), or
    • Store them in an AWS profile and point AWS_PROFILE/CLOUD_HARVESTER_AWS_PROFILE at it.

    Optionally set CLOUD_HARVESTER_AWS_REGIONS to limit collection to specific regions. If it is omitted, Cloud Harvester discovers and scans all enabled regions in the account.

  • Azure: Create an App Registration (service principal) in Microsoft Entra ID and assign it the required RBAC roles on your subscription (Reader, Security Reader, Key Vault Reader). Capture:

    • tenant_id, client_id, client_secret from the service principal
    • subscription_id for the target subscription

    If Azure AD collectors are needed, add Microsoft Graph application permissions (e.g., Directory.Read.All) and have an admin grant consent.

    Either set AZURE_SUBSCRIPTION_ID / AZURE_TENANT_ID (or CLOUD_HARVESTER_*) or pass a ClientSecretCredential created from these values.
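Before starting a collection run, it can help to confirm that the expected environment variables are present. The sketch below is a hypothetical pre-flight check, not part of Cloud Harvester. AZURE_CLIENT_ID and AZURE_CLIENT_SECRET are the names azure-identity's EnvironmentCredential reads; whether Cloud Harvester itself consumes them is an assumption.

```python
import os
from typing import Iterable, List, Mapping

AWS_VARS = ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"]
AZURE_VARS = [
    "AZURE_TENANT_ID",
    "AZURE_CLIENT_ID",       # assumed: azure-identity naming
    "AZURE_CLIENT_SECRET",   # assumed: azure-identity naming
    "AZURE_SUBSCRIPTION_ID",
]


def missing_variables(required: Iterable[str], env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or empty in env."""
    return [name for name in required if not env.get(name)]


# Placeholder environment for demonstration; pass nothing to check os.environ instead.
sample_env = {
    "AZURE_TENANT_ID": "00000000-0000-0000-0000-000000000000",
    "AZURE_SUBSCRIPTION_ID": "22222222-2222-2222-2222-222222222222",
}
missing = missing_variables(AZURE_VARS, sample_env)
print("Missing Azure settings:", missing)
```

Failing early with a list of missing names is usually easier to act on than an authentication error raised partway through a scan.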
