GCP Project Inventory Collection Library
pygcpinventory
GCP Project Inventory Collection Library - A lightweight Python library for collecting metadata from Google Cloud Platform objects across multiple services.
Overview
pygcpinventory provides a unified interface to discover and collect metadata from GCP resources including:
- ⏰ Cloud Scheduler jobs (TRIGGER)
- 🔄 Cloud Workflows (WORKFLOW)
- ⚡ Cloud Functions (FUNCTION)
- 📊 BigQuery datasets (DATASET)
- 🪣 Cloud Storage buckets (BUCKET)
- 📨 Pub/Sub topics (TOPIC)
- 📝 Logging sinks (SINK)
Features
- Stable Object IDs: Consistent ID generation across runs (OBJ0001, OBJ0002, ...)
- Type-Safe: Full type hints and enum-based object types
- Minimal Dependencies: Only requires GCP client libraries
- Test-Driven: 91% code coverage with comprehensive tests
Installation
# Editable install from a source checkout
pip install -e .
Quick Start
from gcpinventory import ETLObject, ObjectType, ObjectIDAssigner
# Create GCP objects
objects = [
    ETLObject(
        object_id=None,
        object_type=ObjectType.TRIGGER,
        name="daily-scheduler",
        gcp_resource_name="projects/my-project/locations/us-central1/jobs/daily-scheduler"
    ),
    ETLObject(
        object_id=None,
        object_type=ObjectType.WORKFLOW,
        name="etl-workflow",
        gcp_resource_name="projects/my-project/locations/us-central1/workflows/etl-workflow"
    ),
]
# Assign stable IDs
assigner = ObjectIDAssigner()
assigner.assign_ids(objects)
# Use the objects
for obj in objects:
    print(f"{obj.object_id}: {obj.name} ({obj.object_type.value})")
Output:
OBJ0001: daily-scheduler (TRIGGER)
OBJ0002: etl-workflow (WORKFLOW)
Core Components
ETLObject
Represents a discovered GCP object with metadata:
- object_id: Unique identifier (OBJ0001, OBJ0002, ...)
- object_type: Type of GCP resource (ObjectType enum)
- name: Object name
- gcp_resource_name: Full GCP resource path
- metadata: Additional service-specific metadata (dict)
ObjectType Enum
Seven supported GCP object types:
- TRIGGER - Cloud Scheduler jobs
- WORKFLOW - Cloud Workflows
- FUNCTION - Cloud Functions
- DATASET - BigQuery datasets
- BUCKET - Cloud Storage buckets
- TOPIC - Pub/Sub topics
- SINK - Logging sinks
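To make the shape of these two models concrete, here is a minimal sketch written from the field and type lists above. It is an illustrative stand-in, not the library's source: the dataclass layout, the default for metadata, and the exact keys returned by to_dict() are assumptions. The enum values mirror the member names, which matches the Quick Start output.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, Optional


class ObjectType(Enum):
    """The seven object types listed above; each value mirrors its name."""
    TRIGGER = "TRIGGER"
    WORKFLOW = "WORKFLOW"
    FUNCTION = "FUNCTION"
    DATASET = "DATASET"
    BUCKET = "BUCKET"
    TOPIC = "TOPIC"
    SINK = "SINK"


@dataclass
class ETLObject:
    """Illustrative stand-in for the library's ETLObject (assumed layout)."""
    object_id: Optional[str]          # None until an assigner fills it in
    object_type: ObjectType
    name: str
    gcp_resource_name: str
    metadata: Dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> Dict[str, Any]:
        # Assumed serialization: the enum is flattened to its string value.
        return {
            "object_id": self.object_id,
            "object_type": self.object_type.value,
            "name": self.name,
            "gcp_resource_name": self.gcp_resource_name,
            "metadata": dict(self.metadata),
        }


dataset = ETLObject(
    object_id=None,
    object_type=ObjectType.DATASET,
    name="analytics",
    gcp_resource_name="projects/my-project/datasets/analytics",
    metadata={"location": "US"},
)
print(dataset.to_dict()["object_type"])  # prints DATASET
```

In real code, import ETLObject and ObjectType from gcpinventory as in the Quick Start rather than redefining them.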
ObjectIDAssigner
Assigns stable, unique IDs to objects:
- Generates IDs in format OBJ0001, OBJ0002, ...
- Deduplication: the same object always gets the same ID
- Supports reverse lookup (ID → name)
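The three behaviors above can be sketched as follows. This is a hypothetical in-memory version, not the library's implementation: the class name SketchIDAssigner, the Item record, the name_for method, and the choice of (object_type, name) as the identity key are all assumptions made for illustration. The source does not say how IDs stay stable between separate runs; a sketch like this one would need deterministic input order or persisted state for that.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass
class Item:
    """Hypothetical minimal record; the real library works on ETLObject."""
    object_type: str
    name: str
    object_id: Optional[str] = None


class SketchIDAssigner:
    """Illustrative assigner: sequential IDs, deduplication, reverse lookup."""

    def __init__(self) -> None:
        self._ids: Dict[Tuple[str, str], str] = {}  # (type, name) -> ID
        self._names: Dict[str, str] = {}            # ID -> name

    def assign_ids(self, items: Iterable[Item]) -> None:
        for item in items:
            key = (item.object_type, item.name)  # assumed identity key
            if key not in self._ids:
                # Next ID in sequence, zero-padded to four digits.
                new_id = f"OBJ{len(self._ids) + 1:04d}"
                self._ids[key] = new_id
                self._names[new_id] = item.name
            item.object_id = self._ids[key]

    def name_for(self, object_id: str) -> Optional[str]:
        """Reverse lookup: ID -> name, or None for an unknown ID."""
        return self._names.get(object_id)


assigner = SketchIDAssigner()
first = [Item("TRIGGER", "daily-scheduler"), Item("WORKFLOW", "etl-workflow")]
again = [Item("WORKFLOW", "etl-workflow")]  # same object seen a second time
assigner.assign_ids(first)
assigner.assign_ids(again)
print([i.object_id for i in first], again[0].object_id)
# prints ['OBJ0001', 'OBJ0002'] OBJ0002
```

Keeping both the forward and reverse maps in the assigner makes each lookup a single dictionary access, at the cost of storing every name twice.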
Development
# Install dev dependencies
pip install -r requirements-dev.txt
# Run tests
pytest
# Run tests with coverage
pytest --cov=gcpinventory --cov-report=html
# Format code
black gcpinventory tests
isort gcpinventory tests
Testing
Test coverage: 91%. All 27 tests pass.
# Run all tests (unit + integration)
pytest tests/ -v
# Run unit tests only
pytest tests/test_models.py tests/test_assigner.py -v
# Run integration tests with real GCP credentials
pytest tests/test_integration_gcp.py -v
Integration Tests
Integration tests validate the package with real GCP service accounts:
- Authentication: Verifies service account loading and GCP API connectivity
- BigQuery Collection: Tests fetching real datasets and creating ETLObjects
- Cloud Scheduler: Tests collecting Cloud Scheduler jobs as TRIGGER objects
- ID Assignment: Validates stable ID generation with production data
- Serialization: Tests ETLObject.to_dict() with real GCP metadata
Requirements:
- Service account file at: E:\A\GCP_ETL_Pipeline\hackathon\SyncFlow_GCP_Intelligence\config\service-account.json
- GCP project: prismatic-smoke-463810-c1
- APIs enabled: BigQuery, Cloud Scheduler
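Because these requirements tie the integration tests to one machine and one project, a suite like this is often guarded so that it skips cleanly when no credentials are present. The sketch below shows one possible guard; it is not taken from tests/test_integration_gcp.py. The helper find_credentials and the test class are hypothetical names, and reading the path from the GOOGLE_APPLICATION_CREDENTIALS environment variable (the variable used by Google's Application Default Credentials) is an assumption about how the file location could be supplied.

```python
import os
import unittest
from pathlib import Path
from typing import Optional


def find_credentials() -> Optional[Path]:
    """Return the file named by GOOGLE_APPLICATION_CREDENTIALS if it exists."""
    raw = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not raw:
        return None
    path = Path(raw)
    return path if path.is_file() else None


# Skip the whole class when no service account file is configured.
@unittest.skipUnless(find_credentials(), "no service account file configured")
class IntegrationSmokeTest(unittest.TestCase):
    def test_credentials_file_is_not_empty(self) -> None:
        path = find_credentials()
        self.assertTrue(path.read_text().strip())
```

pytest collects unittest.TestCase subclasses, so a guard of this kind works with the pytest commands shown above.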
Project Structure
gcpinventory/
├── __init__.py # Public API
├── version.py # Version info
├── models.py # Data models (ETLObject, ObjectType, EdgeType)
└── assigner.py # ID assignment logic
tests/
├── test_models.py # Model tests (11 tests)
├── test_assigner.py # Assigner tests (8 tests)
└── test_integration_gcp.py # Integration tests with real GCP (8 tests)
License
Apache License 2.0