AI-enhanced Managed File Transfer for Google Cloud (GCS, Filestore, Storage Transfer Service)

gcp-mft-ai

gcp-mft-ai is an open-source Python library that layers ML-based prediction, anomaly detection, and encryption on top of file transfers on Google Cloud Platform (GCP).

It automates, predicts, protects, and optimizes file movement across:

  • Google Cloud Storage (GCS)
  • Cloud Filestore (NFS-based File System)
  • Storage Transfer Service (STS API)

Designed for large-scale enterprises, DevOps engineers, and AI/ML pipelines, gcp-mft-ai brings the future of AI-enhanced Managed File Transfer (MFT) into your cloud workflows.

Features

  • Upload/download large files intelligently
  • AES-256 encryption support
  • Predict transfer time with ML
  • Find the best transfer windows
  • Detect anomalies in transfer logs

Core Capabilities

  • Multi-Service MFT: GCS bucket transfers, Filestore filesystem moves, GCP Storage Transfer Service orchestration
  • Encryption at Source: AES-256-GCM authenticated encryption (optional per transfer)
  • ML-Based Transfer Time Prediction: Predict upload/download times using Linear Regression or Random Forest models
  • Anomaly Detection: Identify unusual slowdowns, spikes, or transfer errors automatically using Isolation Forest
  • Transfer Window Optimization: Find the best network window (hour of day) to minimize congestion and maximize throughput
  • Resilient Transfers: Automatic retries, resumable uploads for large objects, and handling of GCP API throttling
  • Config-Driven Automation: Manage all settings via simple YAML or JSON configuration files
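
The transfer time prediction listed above can be illustrated with a small sketch. This is not the library's own code: it fits a straight line by ordinary least squares in plain Python, standing in for the scikit-learn Linear Regression model the project names. The helper names and the history values are hypothetical and made up for illustration.

```python
from typing import List, Tuple


def fit_linear_model(samples: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Fit transfer_time_sec = slope * file_size_mb + intercept by least squares."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("file sizes must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_time_sec(model: Tuple[float, float], file_size_mb: float) -> float:
    """Apply the fitted line to a new file size."""
    slope, intercept = model
    return slope * file_size_mb + intercept


# Made-up history: (file_size_mb, transfer_time_sec) pairs.
history = [(100.0, 12.0), (200.0, 22.0), (400.0, 42.0), (800.0, 82.0)]
model = fit_linear_model(history)
estimate = predict_time_sec(model, 500.0)  # about 52 seconds for this data
```

A linear fit like this only captures a fixed overhead plus a constant throughput; the Random Forest option mentioned above would be the choice when transfer time depends on size in a non-linear way.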

Internal Architecture

  • GCS Transfers: Built atop the google-cloud-storage SDK for resumable, secure, and reliable object transfers.

  • Filestore Transfers: Abstracted over NFS filesystem mounts, allowing simple shutil-based secure moves between instances or buckets.

  • Storage Transfer Service API: Dynamically creates and monitors cloud-to-cloud transfer jobs via authenticated REST API calls (fully IAM compliant).

  • Prediction Engine: Trained on historical transfer data (file_size_mb, transfer_time_sec). Supports Linear Regression (lightweight, fast) and Random Forest (higher accuracy, captures non-linear patterns).

  • Anomaly Detection: An Isolation Forest model isolates transfers whose duration is unusual for their file size, flagging spikes, failures, and risks early.

  • Encryption Layer: AES-256 encryption with GCM mode ensures data integrity and confidentiality before movement.

  • Optimization Layer: Hour-by-hour analysis of historical transfer speeds to recommend the best operational windows.
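
The hour-by-hour analysis in the Optimization Layer could look roughly like the following sketch. It is an illustration under stated assumptions, not the library's implementation: the CSV column names (timestamp, file_size_mb, transfer_time_sec), the best_transfer_hour helper, and the sample log are all hypothetical, and only the standard library is used.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime
from statistics import mean


def best_transfer_hour(csv_text: str) -> int:
    """Return the hour of day (0-23) with the highest mean throughput in MB/s."""
    speeds_by_hour = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        seconds = float(row["transfer_time_sec"])
        if seconds <= 0:
            continue  # skip rows that cannot yield a throughput
        hour = datetime.fromisoformat(row["timestamp"]).hour
        speeds_by_hour[hour].append(float(row["file_size_mb"]) / seconds)
    if not speeds_by_hour:
        raise ValueError("no usable rows in log")
    return max(speeds_by_hour, key=lambda h: mean(speeds_by_hour[h]))


# Made-up log: hour 2 averages 20 MB/s, hour 14 averages 4.5 MB/s.
SAMPLE_LOG = """timestamp,file_size_mb,transfer_time_sec
2024-01-01T02:10:00,500,25
2024-01-01T02:40:00,300,15
2024-01-01T14:05:00,500,100
2024-01-02T14:30:00,200,50
"""
best_hour = best_transfer_hour(SAMPLE_LOG)
```

Averaging per-transfer throughput treats small and large files equally; weighting by file size is a reasonable alternative if large transfers matter most.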

Security-First Design

  • Encryption: Native AES-256-GCM encryption/decryption for any file before or after cloud storage.

  • Token Management: Secure OAuth2 token usage for Storage Transfer Service API access.

  • No plaintext secrets: Designed for service account usage via environment or config.
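
For readers unfamiliar with AES-256-GCM, here is a minimal sketch using the cryptography package's AESGCM class. The encrypt_bytes and decrypt_bytes helpers and the nonce-prefix layout are assumptions made for this example; the source does not describe how gcp-mft-ai stores nonces or keys.

```python
import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

NONCE_SIZE = 12  # 96-bit nonce, the size recommended for GCM


def encrypt_bytes(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt and authenticate plaintext; the random nonce is prepended to the output."""
    nonce = os.urandom(NONCE_SIZE)  # must never repeat for the same key
    return nonce + AESGCM(key).encrypt(nonce, plaintext, None)


def decrypt_bytes(key: bytes, blob: bytes) -> bytes:
    """Split off the nonce, then decrypt; raises InvalidTag if the data was modified."""
    nonce, ciphertext = blob[:NONCE_SIZE], blob[NONCE_SIZE:]
    return AESGCM(key).decrypt(nonce, ciphertext, None)


key = AESGCM.generate_key(bit_length=256)  # 32-byte key for AES-256
message = b"example file contents"
sealed = encrypt_bytes(key, message)
restored = decrypt_bytes(key, sealed)
```

Because GCM is an authenticated mode, decryption fails outright on tampered data instead of returning corrupted plaintext, which is what gives the integrity guarantee mentioned above.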

Usage Overview

  • Upload to GCS: upload_to_gcs(source_path, bucket, destination_path)
  • Download from GCS: download_from_gcs(blob_name, bucket, destination_path)
  • Upload to Filestore: upload_to_filestore(source_path, mount_point, relative_path)
  • Launch Storage Transfer Job: launch_storage_transfer_job(source_bucket, destination_bucket, project_id)
  • Predict Transfer Time: predict_transfer_time(file_size_mb)
  • Detect Anomalies: detect_transfer_anomalies(csv_log_path)
  • Find Best Transfer Window: find_best_transfer_window(csv_log_path)
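
To make the Storage Transfer Job entry more concrete, the sketch below builds the kind of JSON body that the Storage Transfer Service REST API accepts when creating a bucket-to-bucket job. The build_transfer_job_body helper, the one-time schedule, and the bucket and project names are assumptions for illustration; how launch_storage_transfer_job assembles its request internally is not documented here.

```python
import json
from datetime import date

# REST endpoint for creating transfer jobs (POST with an OAuth2 bearer token).
STS_ENDPOINT = "https://storagetransfer.googleapis.com/v1/transferJobs"


def build_transfer_job_body(source_bucket: str, destination_bucket: str,
                            project_id: str, run_date: date) -> dict:
    """Build a transferJobs.create body for a one-time GCS-to-GCS copy."""
    day = {"year": run_date.year, "month": run_date.month, "day": run_date.day}
    return {
        "description": f"Copy {source_bucket} to {destination_bucket}",
        "status": "ENABLED",
        "projectId": project_id,
        "transferSpec": {
            "gcsDataSource": {"bucketName": source_bucket},
            "gcsDataSink": {"bucketName": destination_bucket},
        },
        # Identical start and end dates make the job run once.
        "schedule": {"scheduleStartDate": day, "scheduleEndDate": day},
    }


# Placeholder names; replace with real buckets and a real project ID.
body = build_transfer_job_body(
    "example-source-bucket", "example-destination-bucket",
    "example-project", date(2024, 1, 15),
)
payload = json.dumps(body, indent=2)
```

The payload would then be sent to STS_ENDPOINT with an Authorization header carrying the service account's access token; that network call is left out so the sketch runs without credentials.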

Real-World Use Cases

  • Media & Entertainment: Migrate large UHD videos to GCS for editing pipelines
  • AI/ML Model Training: Transfer terabyte datasets securely and predictably to TPU training zones
  • Backup & Disaster Recovery: Automate and encrypt cross-region backup uploads with anomaly alerting
  • Healthcare & Finance: Securely move critical records across cloud environments with end-to-end encryption
  • Retail Analytics: Optimize massive log file ingestion pipelines to GCP data lakes

Technology Stack

  • Python 3.7+

  • Google Cloud SDK (google-cloud-storage) and requests (REST calls to the Storage Transfer Service API)

  • Cryptography (AES-256-GCM secure cipher)

  • scikit-learn (ML Models: Linear Regression, Random Forest, Isolation Forest)

  • pandas (Data preparation for ML)

  • pyyaml (Config loading)

  • joblib (Model persistence)
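
As an illustration of how the scikit-learn Isolation Forest from this stack can flag an unusual transfer, consider the sketch below. The data is made up, and the contamination and random_state values are assumptions chosen for the example; the library's actual feature set and parameters are not stated in this description.

```python
from sklearn.ensemble import IsolationForest

# Made-up rows of [file_size_mb, transfer_time_sec]; time grows steadily with size.
normal_rows = [[100 + 10 * i, 10 + i] for i in range(20)]
# One transfer that took far longer than its size suggests.
slow_row = [150, 600]
rows = normal_rows + [slow_row]

# contamination is the assumed share of anomalies in the log.
detector = IsolationForest(n_estimators=100, contamination=0.1, random_state=0)
labels = detector.fit_predict(rows)  # -1 marks an anomaly, 1 marks a normal row

flagged = [row for row, label in zip(rows, labels) if label == -1]
```

With a contamination of 0.1 the model flags roughly the most isolated tenth of the rows, so a few borderline normal transfers may be flagged alongside the obvious outlier; tuning that value is a trade-off between missed anomalies and false alarms.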

MIT License

Author: Raghava Chellu

The MIT License permits free use in academic, personal, and commercial projects.

Installation

pip install gcp-mft-ai

Deployment Readiness

  • PyPI-ready (setup.py, pyproject.toml)

  • Full unit testing (unittest framework)

  • Full documentation (README.md, examples/)

  • Cloud deployment friendly (Docker/CI/CD pipelines)

Conclusion

Traditional file transfers are simple. Modern file transfers must be intelligent, secure, and predictive. gcp-mft-ai brings AI and cloud-native automation to Managed File Transfer on Google Cloud, securing your data, optimizing your operations, and helping you move smarter, stronger, and faster.
