Cloud-agnostic resource harvesting with a unified resource model.

Cloud Harvester

Cloud-agnostic harvesting for AWS and Azure inventories. The collect() entry point fans out to built-in collectors across compute, containers/serverless, networking and edge, storage, databases, identity/security, and observability; limit scope with providers or inject your own boto3/Azure clients.

Every record is normalized into a Resource dataclass with fields like id, provider, kind, resource (service), name, region, status, network_id, subnetwork_id, tags, and the raw source payload for downstream use. Resources also include cloud-agnostic graph fields: scope, placements, relationships, and addresses.
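Because every record shares the same field names, downstream code can treat AWS and Azure resources uniformly. The sketch below is only an illustration: ResourceSketch is a hypothetical stand-in that copies a subset of the field names listed above (the real Resource class, its field types, and the kind values used here are assumptions), and count_by_provider_and_kind is a helper invented for this example.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional


@dataclass
class ResourceSketch:
    """Hypothetical stand-in for Resource; only a subset of its fields."""

    id: str
    provider: str
    kind: str
    name: Optional[str] = None
    region: Optional[str] = None
    tags: Dict[str, str] = field(default_factory=dict)


def count_by_provider_and_kind(resources: Iterable[ResourceSketch]) -> Counter:
    """Count records per (provider, kind) pair, regardless of cloud."""
    return Counter((r.provider, r.kind) for r in resources)


# Made-up sample records; "compute" as a kind value is an assumption.
sample = [
    ResourceSketch(id="i-1", provider="aws", kind="compute", region="eu-west-1"),
    ResourceSketch(id="i-2", provider="aws", kind="compute", region="us-east-1"),
    ResourceSketch(id="vm-1", provider="azure", kind="compute"),
]
counts = count_by_provider_and_kind(sample)
print(counts)
```

The same function should work on the objects in result.resources as long as they expose provider and kind attributes, as described above.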

Quickstart

import boto3
from azure.identity import ClientSecretCredential
from cloud_harvester import collect

# AWS: static credentials (replace with real values)
aws_session = boto3.Session(
    aws_access_key_id="FAKEAWSACCESSKEY123",
    aws_secret_access_key="FAKEAWSSECRETKEY456",
)

# Azure: service principal credentials (replace with real values)
azure_credential = ClientSecretCredential(
    tenant_id="00000000-0000-0000-0000-000000000000",
    client_id="11111111-1111-1111-1111-111111111111",
    client_secret="fake-azure-client-secret",
)
azure_subscription_id = "22222222-2222-2222-2222-222222222222"

# Collect from both providers with injected sessions/credentials
result = collect(
    providers=["aws", "azure"],
    aws_session=aws_session,
    azure_credential=azure_credential,
    azure_subscription_id=azure_subscription_id,
)

for res in result.resources:
    print(res.to_dict())

for err in result.errors:
    print(err.provider, err.collector, err.region, err.error_code, err.error_message)

collect() returns a CollectionResult with two fields:

  • resources (List[Resource]): every resource successfully gathered.
  • errors (List[CollectorError]): one entry per collector invocation that raised an exception. Each entry carries the provider, the collector name, the region (AWS only; None for global and Azure collectors), an error_code parsed from the underlying exception when possible (botocore ClientError.response["Error"]["Code"] for AWS, azure.core.exceptions.HttpResponseError.error.code for Azure), and the raw error_message. Failures are still logged at ERROR level, but the structured list is the canonical surface for callers that want to show users what went wrong (e.g. an AccessDenied on a specific collector and region).
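The error_code lookup described above can be sketched without importing either SDK. This is not the library's implementation: extract_error_code and FakeClientError are hypothetical names, and the function only mirrors the two attribute paths the text mentions (response["Error"]["Code"] and error.code), returning None when neither is present.

```python
from typing import Optional


def extract_error_code(exc: BaseException) -> Optional[str]:
    """Best-effort error code lookup (hypothetical helper, not library code)."""
    # botocore ClientError shape: exc.response["Error"]["Code"]
    response = getattr(exc, "response", None)
    if isinstance(response, dict):
        code = (response.get("Error") or {}).get("Code")
        if code:
            return str(code)
    # azure-core HttpResponseError shape: exc.error.code
    error = getattr(exc, "error", None)
    code = getattr(error, "code", None)
    return str(code) if code else None


class FakeClientError(Exception):
    """Test double that imitates the botocore ClientError response layout."""

    def __init__(self, code: str) -> None:
        super().__init__(code)
        self.response = {"Error": {"Code": code, "Message": "simulated"}}


print(extract_error_code(FakeClientError("AccessDenied")))
print(extract_error_code(ValueError("no structured code")))
```

Duck typing keeps the sketch importable when only one of the two SDKs is installed; a real implementation may well check exception types explicitly instead.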

AWS collection scans all enabled regions by default. To limit AWS scope, pass an explicit region list:

result = collect(
    providers=["aws"],
    aws_session=aws_session,
    aws_regions=["eu-west-1", "us-east-1"],
)

Alternatively, set the CLOUD_HARVESTER_AWS_REGIONS environment variable to a comma-separated list of regions. If neither is provided, Cloud Harvester discovers and scans all enabled AWS regions.
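To make the comma-separated format concrete, here is a minimal sketch of how such a variable could be parsed. regions_from_env is a hypothetical helper; trimming whitespace and ignoring empty entries are assumptions, and the library's own parsing may differ.

```python
import os
from typing import List, Mapping, Optional


def regions_from_env(env: Mapping[str, str] = os.environ) -> Optional[List[str]]:
    """Parse CLOUD_HARVESTER_AWS_REGIONS; None means 'no explicit limit'."""
    raw = env.get("CLOUD_HARVESTER_AWS_REGIONS", "")
    # Assumption: surrounding whitespace and empty entries are ignored.
    regions = [part.strip() for part in raw.split(",") if part.strip()]
    return regions or None


print(regions_from_env({"CLOUD_HARVESTER_AWS_REGIONS": "eu-west-1, us-east-1"}))
print(regions_from_env({}))
```

A value returned by this helper could be passed as aws_regions to collect(); None corresponds to omitting the argument.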

Collectors run concurrently by default. To tune API pressure, pass max_workers or set CLOUD_HARVESTER_MAX_WORKERS:

result = collect(
    providers=["aws"],
    aws_session=aws_session,
    max_workers=8,
)
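The effect of max_workers can be pictured with a bounded thread pool. The following is a sketch under stated assumptions, not Cloud Harvester's internals: run_collectors and resolve_max_workers are hypothetical, the fallback default of 4 is invented, and giving the argument precedence over the environment variable is a guess.

```python
import os
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List, Optional, Tuple


def resolve_max_workers(max_workers: Optional[int] = None, default: int = 4) -> int:
    """Argument first, then CLOUD_HARVESTER_MAX_WORKERS (assumed precedence)."""
    if max_workers is not None:
        return max_workers
    raw = os.environ.get("CLOUD_HARVESTER_MAX_WORKERS")
    return int(raw) if raw else default


def run_collectors(
    collectors: Dict[str, Callable[[], list]],
    max_workers: Optional[int] = None,
) -> Tuple[list, List[Tuple[str, str]]]:
    """Run collectors concurrently; a failing collector does not stop the rest."""
    resources: list = []
    errors: List[Tuple[str, str]] = []
    with ThreadPoolExecutor(max_workers=resolve_max_workers(max_workers)) as pool:
        futures = {pool.submit(fn): name for name, fn in collectors.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                resources.extend(future.result())
            except Exception as exc:  # record the failure, keep going
                errors.append((name, str(exc)))
    return resources, errors


def list_vms() -> list:
    return ["vm-1", "vm-2"]


def list_buckets() -> list:
    raise PermissionError("AccessDenied")


resources, errors = run_collectors(
    {"compute": list_vms, "storage": list_buckets}, max_workers=2
)
print(sorted(resources), errors)
```

A smaller pool issues fewer simultaneous API calls, which is the usual lever against provider-side throttling, at the cost of a longer total run.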

Development

Run the checks with:

make fix
make test

Logging

Cloud Harvester uses Python's standard logging module. Enable INFO logs to see provider progress, AWS region discovery, per-region collector progress, collector result counts, and collector failures:

import logging

logging.basicConfig(level=logging.INFO)
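With INFO enabled globally, the AWS and Azure SDKs can also become noisy. One possible setup is sketched below; the logger name "cloud_harvester" is an assumption based on the package name, while "botocore" and "azure" are the logger namespaces those SDKs use.

```python
import logging

# Timestamped output makes per-region progress easier to follow.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)

# Assumption: Cloud Harvester logs under a logger named after its package.
logging.getLogger("cloud_harvester").setLevel(logging.INFO)

# Keep SDK chatter at WARNING and above.
logging.getLogger("botocore").setLevel(logging.WARNING)
logging.getLogger("azure").setLevel(logging.WARNING)
```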

Credentials

  • AWS: In the AWS console, create or reuse an IAM role or user with read permissions. Minimum managed policies to attach:

    • ReadOnlyAccess
    • AmazonEC2ReadOnlyAccess
    • AmazonEKSMCPReadOnlyAccess

    Generate access keys, then either:

    • Export AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and (for temporary credentials) AWS_SESSION_TOKEN, or
    • Store them in an AWS profile and point AWS_PROFILE or CLOUD_HARVESTER_AWS_PROFILE at it.

    Optionally set CLOUD_HARVESTER_AWS_REGIONS to limit collection to specific regions. If it is omitted, Cloud Harvester discovers and scans all regions enabled for the account.
  • Azure: Create an App Registration (service principal) in Microsoft Entra ID and assign it the required RBAC roles on your subscription (Reader, Security Reader, Key Vault Reader). Capture:

    • tenant_id, client_id, and client_secret from the service principal
    • subscription_id for the target subscription

    If Azure AD collectors are needed, add Microsoft Graph application permissions (e.g., Directory.Read.All) and have an admin grant consent.
    Either set AZURE_SUBSCRIPTION_ID / AZURE_TENANT_ID (or CLOUD_HARVESTER_*) or pass a ClientSecretCredential created from these values.
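For the AWS profile option, the choice between the two variables can be sketched as follows. resolve_aws_profile is a hypothetical helper, and preferring CLOUD_HARVESTER_AWS_PROFILE over AWS_PROFILE is an assumption; the text above does not state which one wins.

```python
import os
from typing import Mapping, Optional


def resolve_aws_profile(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Pick a profile name from the environment (assumed precedence)."""
    # Assumption: the tool-specific variable overrides the generic one.
    return env.get("CLOUD_HARVESTER_AWS_PROFILE") or env.get("AWS_PROFILE") or None


print(resolve_aws_profile({"AWS_PROFILE": "default", "CLOUD_HARVESTER_AWS_PROFILE": "audit"}))
print(resolve_aws_profile({"AWS_PROFILE": "default"}))
print(resolve_aws_profile({}))
```

A non-None result can be handed to boto3.Session(profile_name=...), and the resulting session passed to collect() as aws_session, as in the Quickstart.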
