AI-enhanced Managed File Transfer for Google Cloud (GCS, Filestore, Storage Transfer Service)

gcp-mft-ai

gcp-mft-ai is an open-source, production-grade Python library that transforms traditional file transfers on Google Cloud Platform (GCP) into intelligent, ML-optimized, secure operations.

It automates, predicts, protects, and optimizes file movement across:

  • Google Cloud Storage (GCS)
  • Cloud Filestore (NFS-based File System)
  • Storage Transfer Service (STS API)

Designed for large-scale enterprises, DevOps engineers, and AI/ML pipelines, gcp-mft-ai brings the future of AI-enhanced Managed File Transfer (MFT) into your cloud workflows.

Features

  • Upload/download large files intelligently
  • AES-256 encryption support
  • Predict transfer time with ML
  • Find the best transfer windows
  • Detect anomalies in transfer logs
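
As a rough illustration of the anomaly-detection feature, the sketch below runs scikit-learn's IsolationForest over a handful of (file_size_mb, transfer_time_sec) pairs. The sample data, the flag_anomalies helper, and the contamination value are assumptions made for this example; they are not taken from the library's source.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Made-up transfer log: (file_size_mb, transfer_time_sec) per transfer.
transfers = np.array([
    [10, 1.0], [20, 2.1], [30, 2.9], [40, 4.2], [50, 5.0],
    [60, 6.1], [70, 6.8], [80, 8.3], [90, 9.0],
    [50, 500.0],  # a 50 MB file that took far longer than its peers
])


def flag_anomalies(rows, contamination=0.1, seed=0):
    """Return a boolean mask that is True where a transfer looks anomalous."""
    model = IsolationForest(contamination=contamination, random_state=seed)
    labels = model.fit_predict(rows)  # -1 marks an outlier, 1 an inlier
    return labels == -1


mask = flag_anomalies(transfers)
print(transfers[mask])
```

The contamination parameter is the expected share of anomalous rows, so it would normally be tuned against real logs rather than fixed at 0.1.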

Core Capabilities

  • Multi-Service MFT: GCS bucket transfers, Filestore filesystem moves, GCP Storage Transfer Service orchestration
  • Encryption at Source: AES-256-GCM authenticated encryption (optional per transfer)
  • ML-Based Transfer Time Prediction: Predict upload/download times using Linear Regression or Random Forest models
  • Anomaly Detection: Identify unusual slowdowns, spikes, or transfer errors automatically using Isolation Forest
  • Transfer Window Optimization: Find the best network window (hour of day) to minimize congestion and maximize throughput
  • Resilient Transfers: Automatic retries, resumable uploads for large objects, GCP API throttling handling
  • Config-Driven Automation: Manage all settings via simple YAML or JSON configuration files
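
The "automatic retries" and "throttling handling" items can be pictured as retry with exponential backoff. The following is a minimal standard-library sketch, not the library's implementation: TransientTransferError, with_retries, and flaky_upload are hypothetical names introduced here for illustration.

```python
import random
import time


class TransientTransferError(Exception):
    """Hypothetical stand-in for a retryable failure such as HTTP 429 or 503."""


def with_retries(operation, max_attempts=5, base_delay=0.5, max_delay=30.0,
                 sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is exhausted."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientTransferError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Exponential backoff capped at max_delay, with full jitter.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))


# Simulated upload that is throttled twice before succeeding.
calls = {"count": 0}


def flaky_upload():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientTransferError("throttled")
    return "uploaded"


result = with_retries(flaky_upload, sleep=lambda seconds: None)
print(result, calls["count"])
```

Passing sleep as a parameter keeps the demo instant; real code would leave the default time.sleep in place.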

Internal Architecture

  • GCS Transfers: Built atop the google-cloud-storage SDK for resumable, secure, and reliable object transfers.

  • Filestore Transfers: Abstracted over NFS filesystem mounts, allowing simple shutil-based secure moves between instances or buckets.

  • Storage Transfer Service API: Dynamically creates and monitors cloud-to-cloud transfer jobs via authenticated REST API calls (fully IAM compliant).

  • Prediction Engine: Trained on historical transfer data (file_size_mb, transfer_time_sec). Supports Linear Regression (lightweight, fast) and Random Forest (higher accuracy, non-linear patterns).

  • Anomaly Detection: An Isolation Forest model isolates unusual file-size-versus-time behavior, flagging spikes, failures, and risks early.

  • Encryption Layer: AES-256 encryption with GCM mode ensures data integrity and confidentiality before movement.

  • Optimization Layer: Hour-by-hour analysis of historical transfer speeds to recommend the best operational windows.
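
The hour-by-hour analysis in the Optimization Layer could look roughly like the sketch below, which groups historical transfers by hour of day and picks the hour with the highest mean throughput. The CSV column name timestamp, the best_transfer_hour helper, and the sample rows are assumptions for illustration; only file_size_mb and transfer_time_sec appear in the description above.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime
from statistics import mean


def best_transfer_hour(csv_file):
    """Return (hour, mean_mb_per_sec) for the hour with the highest mean throughput."""
    speeds_by_hour = defaultdict(list)
    for row in csv.DictReader(csv_file):
        hour = datetime.fromisoformat(row["timestamp"]).hour
        size_mb = float(row["file_size_mb"])
        seconds = float(row["transfer_time_sec"])
        if seconds > 0:  # skip rows that would divide by zero
            speeds_by_hour[hour].append(size_mb / seconds)
    if not speeds_by_hour:
        raise ValueError("log contains no usable rows")
    hour = max(speeds_by_hour, key=lambda h: mean(speeds_by_hour[h]))
    return hour, mean(speeds_by_hour[hour])


# Made-up log: two night-time transfers and two afternoon transfers.
sample_log = io.StringIO(
    "timestamp,file_size_mb,transfer_time_sec\n"
    "2024-01-01T02:15:00,100,10\n"
    "2024-01-02T02:40:00,200,25\n"
    "2024-01-01T14:05:00,100,40\n"
    "2024-01-02T14:30:00,200,50\n"
)
hour, speed = best_transfer_hour(sample_log)
print(hour, speed)
```

In this sample the 02:00 hour averages 9 MB/s against 3.25 MB/s at 14:00, so hour 2 is recommended.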

Security-First Design

  • Encryption: Native AES-256-GCM encryption/decryption for any file before or after cloud storage.

  • Token Management: Secure OAuth2 token usage for Storage Transfer Service API access.

  • No plaintext secrets: Designed for service account usage via environment or config.
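
For readers unfamiliar with AES-256-GCM, here is a minimal sketch using the AESGCM class from the cryptography package. The encrypt_bytes and decrypt_bytes helpers and the nonce-prepended layout are assumptions chosen for this example; the library's own on-disk format is not documented here.

```python
import os

from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

NONCE_SIZE = 12  # 96-bit nonce, the usual size for GCM


def encrypt_bytes(key, plaintext, associated_data=None):
    """Encrypt with AES-GCM; the random nonce is prepended to the ciphertext."""
    nonce = os.urandom(NONCE_SIZE)  # must never repeat under the same key
    return nonce + AESGCM(key).encrypt(nonce, plaintext, associated_data)


def decrypt_bytes(key, blob, associated_data=None):
    """Split off the nonce, then decrypt and verify the authentication tag."""
    nonce, ciphertext = blob[:NONCE_SIZE], blob[NONCE_SIZE:]
    return AESGCM(key).decrypt(nonce, ciphertext, associated_data)


key = AESGCM.generate_key(bit_length=256)  # 32-byte key, i.e. AES-256
plaintext = b"quarterly-report contents"
blob = encrypt_bytes(key, plaintext, b"report.csv")
recovered = decrypt_bytes(key, blob, b"report.csv")

# Flipping one bit makes decryption fail instead of returning bad data.
tampered = blob[:-1] + bytes([blob[-1] ^ 1])
try:
    decrypt_bytes(key, tampered, b"report.csv")
    tamper_detected = False
except InvalidTag:
    tamper_detected = True
```

Because GCM is authenticated, a modified ciphertext raises InvalidTag rather than decrypting to garbage, which is what provides the integrity guarantee mentioned above. In practice the key would come from a secret manager or service-account-protected store, not be generated inline.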

Usage Overview

  • Upload to GCS: upload_to_gcs(source_path, bucket, destination_path)
  • Download from GCS: download_from_gcs(blob_name, bucket, destination_path)
  • Upload to Filestore: upload_to_filestore(source_path, mount_point, relative_path)
  • Launch Storage Transfer Job: launch_storage_transfer_job(source_bucket, destination_bucket, project_id)
  • Predict Transfer Time: predict_transfer_time(file_size_mb)
  • Detect Anomalies: detect_transfer_anomalies(csv_log_path)
  • Find Best Transfer Window: find_best_transfer_window(csv_log_path)
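
Since Filestore is described as an NFS mount handled with shutil, a call such as upload_to_filestore(source_path, mount_point, relative_path) might behave roughly like the sketch below. This is a hypothetical re-implementation under that assumption, named upload_to_filestore_sketch to keep it distinct from the library's function; it copies rather than moves the file, and the path-escape check is an addition of this example.

```python
import shutil
import tempfile
from pathlib import Path


def upload_to_filestore_sketch(source_path, mount_point, relative_path):
    """Copy source_path to mount_point/relative_path and return the destination."""
    mount = Path(mount_point).resolve()
    destination = (mount / relative_path).resolve()
    # Refuse paths such as "../../etc" that would land outside the mount.
    if mount not in destination.parents:
        raise ValueError(f"{relative_path!r} escapes the mount point")
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source_path, destination)  # copies data and file metadata
    return destination


# Demo with a temporary directory standing in for an NFS mount.
with tempfile.TemporaryDirectory() as workdir:
    source = Path(workdir) / "report.csv"
    source.write_text("id,value\n1,42\n")
    fake_mount = Path(workdir) / "mnt" / "filestore"
    fake_mount.mkdir(parents=True)
    copied = upload_to_filestore_sketch(source, fake_mount, "incoming/report.csv")
    copied_text = copied.read_text()
```

On a real instance, mount_point would be the directory where the Filestore share is mounted, and the same function works unchanged because NFS mounts behave like local directories.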

Real-World Use Cases

  • Media & Entertainment: Migrate large UHD videos to GCS for editing pipelines
  • AI/ML Model Training: Transfer terabyte datasets securely and predictably to TPU training zones
  • Backup & Disaster Recovery: Automate and encrypt cross-region backup uploads with anomaly alerting
  • Healthcare & Finance: Securely move critical records across cloud environments with end-to-end encryption
  • Retail Analytics: Optimize massive log file ingestion pipelines to GCP data lakes

Technology Stack

  • Python 3.7+

  • google-cloud-storage (Google Cloud SDK) and requests (HTTP client for REST API calls)

  • Cryptography (AES-256-GCM secure cipher)

  • scikit-learn (ML Models: Linear Regression, Random Forest, Isolation Forest)

  • pandas (Data preparation for ML)

  • pyyaml (Config loading)

  • joblib (Model persistence)
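
To show how scikit-learn and joblib from this stack fit together, the sketch below fits a LinearRegression on file_size_mb versus transfer_time_sec, saves it with joblib, reloads it, and predicts. The training numbers and the file name transfer_time_model.joblib are invented for this example and do not reflect the library's bundled model.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up history: 2 seconds per MB plus 5 seconds of fixed overhead.
file_size_mb = np.array([[10], [50], [100], [200], [400]])
transfer_time_sec = np.array([25, 105, 205, 405, 805])

model = LinearRegression().fit(file_size_mb, transfer_time_sec)

with tempfile.TemporaryDirectory() as workdir:
    model_path = Path(workdir) / "transfer_time_model.joblib"
    joblib.dump(model, model_path)      # persist the fitted model
    restored = joblib.load(model_path)  # reload it, e.g. in another process

# Estimate the transfer time for a 300 MB file with the reloaded model.
predicted_sec = float(restored.predict(np.array([[300]]))[0])
print(round(predicted_sec, 1))
```

Swapping LinearRegression for sklearn.ensemble.RandomForestRegressor would follow the same fit, dump, load, and predict steps when the size-to-time relationship is not linear.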

MIT License

Author: Raghava Chellu

The MIT License permits free use in academic, personal, and commercial projects.

Installation

pip install gcp-mft-ai

Deployment Readiness

  • PyPI-ready (setup.py, pyproject.toml)

  • Full unit testing (unittest framework)

  • Full documentation (README.md, examples/)

  • Cloud deployment friendly (Docker/CI/CD pipelines)

Conclusion

Traditional file transfers are simple. Modern file transfers must be intelligent, secure, and predictive. gcp-mft-ai brings AI and cloud-native automation to Managed File Transfer on Google Cloud, securing your data, optimizing your operations, and helping you move files smarter and faster.

Download files

Source Distribution

gcp_mft_ai-0.1.2.tar.gz (5.1 kB)

Uploaded Source

Built Distribution

gcp_mft_ai-0.1.2-py3-none-any.whl (4.0 kB)

Uploaded Python 3

File details

Details for the file gcp_mft_ai-0.1.2.tar.gz.

File metadata

  • Download URL: gcp_mft_ai-0.1.2.tar.gz
  • Upload date:
  • Size: 5.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.11

File hashes

Hashes for gcp_mft_ai-0.1.2.tar.gz
Algorithm Hash digest
SHA256 febd51c22ac6dfc8b498f2518c29d5958728bc5c34b0140f822738c339eb7e0f
MD5 d664fb4ac8e9df7de49221e7e172bf7d
BLAKE2b-256 f0bcd57d3bd1627c52f66046f7adb252581cfa1b488b41c103408c2f898e6561

File details

Details for the file gcp_mft_ai-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: gcp_mft_ai-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 4.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.11

File hashes

Hashes for gcp_mft_ai-0.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 3444315bfd9d3cbe05e422a3b5b66a4480bb1281d83207767dcb477d00d4653f
MD5 25106da61598690f5285836cc1e1e299
BLAKE2b-256 7c5d3ee39d563c2abb4e26a5810cb0b2233d79cc956e1b95d23d2a73ff10d40f
