Auto-sync Kaggle notebook outputs to Google Drive or local machine via ngrok
kgout
Auto-sync Kaggle notebook outputs to Google Drive or your local machine.
When running long ML experiments on Kaggle, kernels can time out or sessions expire — and your output files disappear. kgout watches /kaggle/working/ in the background and automatically syncs new or modified files to Google Drive or exposes them via an ngrok tunnel for instant local download.
Drop it into any notebook as a single cell.
Install
pip install kgout[gdrive] # Google Drive (recommended)
pip install kgout[local] # ngrok tunnel (quick experiments < 2h)
pip install kgout[all] # both
Quick Start
Google Drive (Recommended)
Works for runs of any length and survives session disconnects. Files upload automatically on the first scan after they are saved (every 30 seconds by default).
One-time setup (5 minutes, on your local machine):
pip install kgout[gdrive]
kgout-auth --client-secrets /path/to/client_secrets.json
This opens a browser where you log in to Google, then saves kgout_token.json. Upload that file to Kaggle as a private dataset.
How to get client_secrets.json:
- Go to Google Cloud Console → Credentials
- Click Create Credentials → OAuth client ID
- Application type: Desktop app
- Download the JSON
In your Kaggle notebook:
!pip install kgout[gdrive] -q
from kgout import KgOut
kg = KgOut(
    folder_id="1aBcDeFgHiJkLmNoPqRsTuVwXyZ",  # from Drive folder URL
    credentials="/kaggle/input/kgout-credentials/kgout_token.json",
).start()
# ... your training code ...
# Every new file auto-uploads to Google Drive.
# No kg.stop() needed — uploads continue until the kernel ends.
Local Download via ngrok
Exposes /kaggle/working/ as a browsable URL. Good for quick experiments.
import os
os.environ["NGROK_AUTH_TOKEN"] = "your_token" # free at ngrok.com
from kgout import KgOut
kg = KgOut("local").start()
# Open the printed URL in your browser.
# ⚠️ ngrok free tier: tunnel disconnects after ~2 hours.
Both at Once
Google Drive for persistence, ngrok for instant browsing while it lasts:
kg = KgOut(
    dest=["gdrive", "local"],
    folder_id="1aBcDeFgHiJkLmNoPqRsTuVwXyZ",
    credentials="/kaggle/input/kgout-credentials/kgout_token.json",
).start()
Context manager vs manual start
# ✅ RECOMMENDED — stays alive after training ends
kg = KgOut(...).start()
train_model()
# ← still running, syncing continues
# ⚠️ Context manager — STOPS when the block ends
with KgOut(...) as kg:
    train_model()
# ← dead here, no more syncing
For Kaggle, always use .start(). The context manager kills everything when your code finishes.
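The difference comes down to when stop() gets called. As a rough illustration, here is a toy class whose start() launches a daemon thread and whose __exit__ calls stop(), which is the usual way a context manager tears down a background worker. ToySync and all of its attributes are hypothetical stand-ins, not kgout's actual implementation.

```python
import threading


class ToySync:
    """Hypothetical stand-in for a background syncer (not kgout's real class)."""

    def __init__(self, interval=0.01):
        self.interval = interval
        self._stop = threading.Event()
        self._thread = None

    def start(self):
        # A daemon thread keeps running after the calling cell returns and
        # ends only when stop() is called or the process exits.
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()
        return self  # returning self allows `kg = ToySync().start()`

    def stop(self):
        self._stop.set()
        if self._thread is not None:
            self._thread.join()

    def _loop(self):
        while not self._stop.is_set():
            # Event.wait doubles as an interruptible sleep between "scans".
            self._stop.wait(self.interval)

    @property
    def running(self):
        return self._thread is not None and self._thread.is_alive()

    def __enter__(self):
        return self.start()

    def __exit__(self, exc_type, exc, tb):
        self.stop()  # leaving the `with` block always stops the worker
        return False


manual = ToySync().start()
with ToySync() as scoped:
    running_inside_block = scoped.running

status = (manual.running, running_inside_block, scoped.running)
print(status)
manual.stop()
```

The worker started with start() is still alive after the with block has finished, while the one created by the context manager has already been stopped.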
Setting Up Google Drive
Step 1: Create OAuth2 Credentials (one-time)
- Go to Google Cloud Console
- Create a project (or use existing) and enable the Google Drive API
- Go to APIs & Services → Credentials
- Click Create Credentials → OAuth client ID
- Application type: Desktop app → Create
- Download the JSON (this is your client_secrets.json)
Step 2: Generate Token (one-time, on your local machine)
pip install kgout[gdrive]
kgout-auth --client-secrets /path/to/client_secrets.json
A browser opens. Log in with your Google account and grant access. A file called kgout_token.json is saved.
Step 3: Upload Token to Kaggle
- Go to https://www.kaggle.com/datasets/new
- Name: kgout-credentials → make it Private
- Upload kgout_token.json → Create
Step 4: Get Your Folder ID
In Google Drive, create a folder for outputs. The folder ID is in the URL:
https://drive.google.com/drive/folders/1aBcDeFgHiJkLmNoPqRsTuVwXyZ
└──── this is folder_id ────┘
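If you would rather paste the whole URL than copy the ID by hand, a small helper can pull it out. This is only a sketch: folder_id_from_url is a hypothetical helper, not part of kgout, and it assumes the ID is the path segment right after "folders".

```python
from urllib.parse import urlparse


def folder_id_from_url(url: str) -> str:
    """Return the path segment that follows 'folders' in a Drive folder URL."""
    # urlparse separates the path from any query string such as ?usp=sharing.
    parts = [p for p in urlparse(url).path.split("/") if p]
    if "folders" not in parts:
        raise ValueError(f"not a Drive folder URL: {url!r}")
    idx = parts.index("folders")
    if idx + 1 >= len(parts):
        raise ValueError(f"no folder ID after 'folders' in {url!r}")
    return parts[idx + 1]


folder_id = folder_id_from_url(
    "https://drive.google.com/drive/folders/1aBcDeFgHiJkLmNoPqRsTuVwXyZ"
)
print(folder_id)
```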
Step 5: Use in Notebook
!pip install kgout[gdrive] -q
from kgout import KgOut
kg = KgOut(
    folder_id="1aBcDeFgHiJkLmNoPqRsTuVwXyZ",
    credentials="/kaggle/input/kgout-credentials/kgout_token.json",
).start()
Done. Every file saved to /kaggle/working/ auto-uploads to your Drive folder.
Service Accounts (Alternative)
Service accounts still work for Google Workspace Shared Drives. If you have a Workspace account (university, company), you can use a service account JSON directly:
kg = KgOut(
    folder_id="SHARED_DRIVE_FOLDER_ID",
    credentials="/kaggle/input/my-creds/service_account.json",
).start()
Note: Service accounts cannot upload to regular (personal) Google Drive folders — Google returns storageQuotaExceeded. Use OAuth2 credentials for personal Drive.
Setting Up ngrok
- Create a free account at ngrok.com
- Copy your auth token from the dashboard
- In your notebook:
import os
os.environ["NGROK_AUTH_TOKEN"] = "your_token"
Tip: Store the token as a Kaggle Secret:
import os
from kaggle_secrets import UserSecretsClient
os.environ["NGROK_AUTH_TOKEN"] = UserSecretsClient().get_secret("NGROK_AUTH_TOKEN")
Configuration
| Parameter | Default | Description |
|---|---|---|
| dest | "gdrive" | "gdrive", "local", or ["gdrive", "local"] |
| watch_dir | /kaggle/working | Directory to watch (recursive) |
| interval | 30 | Seconds between scans (min: 5) |
| ignore | see below | Glob patterns for files to skip |
| snapshot_existing | True | If True, skip files that exist before start() |
| folder_id | (none) | Google Drive folder ID |
| credentials | (none) | Path to credentials JSON (OAuth2 token or service account) |
| ngrok_token | (none) | ngrok auth token |
| port | 8384 | Local file server port |
| verbose | True | Enable logging output |
Environment Variables
| Variable | Description |
|---|---|
| KGOUT_GDRIVE_FOLDER_ID | Google Drive folder ID |
| KGOUT_GDRIVE_CREDENTIALS | Path to credentials JSON |
| NGROK_AUTH_TOKEN | ngrok authentication token |
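If you rely on these variables rather than passing arguments, it can help to check that they are set before starting a long run. The sketch below assumes kgout falls back to these variables when the matching constructor arguments are omitted; the README does not spell out the precedence. missing_vars is a hypothetical helper, not part of kgout.

```python
import os

# Variable names are taken from the table above.
GDRIVE_VARS = ("KGOUT_GDRIVE_FOLDER_ID", "KGOUT_GDRIVE_CREDENTIALS")


def missing_vars(names=GDRIVE_VARS, environ=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


# Example with an explicit mapping, so the real environment is not modified.
example_env = {"KGOUT_GDRIVE_FOLDER_ID": "1aBcDeFgHiJkLmNoPqRsTuVwXyZ"}
still_missing = missing_vars(environ=example_env)
print(still_missing)
```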
Default Ignore Patterns
These files are never synced: *.ipynb, *.pyc, *.tmp, *.lock, *.log, *.swp, *.swo, .DS_Store, Thumbs.db, hidden files (starting with .), and directories .ipynb_checkpoints, __pycache__, .git.
Override with ignore=["*.csv"] or pass ignore=[] to sync everything.
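To make the matching rules concrete, here is a minimal sketch of how glob-style ignore patterns like the ones above could be applied with Python's fnmatch module. The is_ignored function and its parameters are hypothetical, and kgout's real matching logic may differ. In particular, whether hidden files and the listed directories are still skipped when ignore=[] is passed is an assumption made here, not something the README states.

```python
import fnmatch
from pathlib import PurePosixPath

# Copied from the default list above.
DEFAULT_PATTERNS = ["*.ipynb", "*.pyc", "*.tmp", "*.lock", "*.log",
                    "*.swp", "*.swo", ".DS_Store", "Thumbs.db"]
DEFAULT_DIRS = {".ipynb_checkpoints", "__pycache__", ".git"}


def is_ignored(rel_path, patterns=DEFAULT_PATTERNS,
               ignored_dirs=DEFAULT_DIRS, skip_hidden=True):
    """Return True if a path relative to the watch directory should be skipped."""
    path = PurePosixPath(rel_path)
    # Skip anything that lives under an ignored directory.
    if any(part in ignored_dirs for part in path.parts[:-1]):
        return True
    # Assumption: hidden files are skipped regardless of the pattern list.
    if skip_hidden and path.name.startswith("."):
        return True
    # fnmatchcase keeps matching case-sensitive on every platform.
    return any(fnmatch.fnmatchcase(path.name, pat) for pat in patterns)


checks = {p: is_ignored(p) for p in
          ["model.pt", "notebook.ipynb", "runs/__pycache__/x.pyc", ".env"]}
csv_only = is_ignored("results.csv", patterns=["*.csv"])
print(checks, csv_only)
```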
How It Works
- Snapshot: On start(), kgout fingerprints all existing files so they don't trigger syncs
- Poll: A daemon thread scans the watch directory every N seconds
- Settle check: Files modified in the last 2 seconds are skipped (still being written)
- Compare: Each file's fingerprint is compared against the snapshot
- Sync: New or modified files are sent to the configured destination(s)
- Cleanup: On stop(), the watcher thread and connections shut down
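The snapshot, settle check, and compare steps can be sketched in a few lines. This is an illustration under stated assumptions, not kgout's code: the fingerprint here is simply (size, mtime), which may not be what kgout uses, and the function names are hypothetical. The `now` parameter exists only so the example can simulate time passing.

```python
import os
import tempfile
import time


def fingerprint(path):
    """Cheap fingerprint: (size, mtime in ns). An assumption for this sketch."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


def scan(watch_dir):
    """Map each file's relative path to its fingerprint (recursive)."""
    found = {}
    for root, _dirs, files in os.walk(watch_dir):
        for name in files:
            full = os.path.join(root, name)
            found[os.path.relpath(full, watch_dir)] = fingerprint(full)
    return found


def changed_files(watch_dir, snapshot, settle_seconds=2.0, now=None):
    """Return settled files that are new or modified; update snapshot in place."""
    now = time.time() if now is None else now
    changed = []
    for rel, fp in scan(watch_dir).items():
        if now - fp[1] / 1e9 < settle_seconds:
            continue  # modified too recently; look again on the next scan
        if snapshot.get(rel) != fp:
            snapshot[rel] = fp
            changed.append(rel)
    return sorted(changed)


with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "old.txt"), "w").close()
    snapshot = scan(d)                      # snapshot: existing files are known
    with open(os.path.join(d, "new.txt"), "w") as f:
        f.write("weights")
    too_soon = changed_files(d, snapshot)   # new.txt has not settled yet
    later = time.time() + 10                # pretend 10 seconds have passed
    first = changed_files(d, snapshot, now=later)
    second = changed_files(d, snapshot, now=later)

print(too_soon, first, second)
```

A real watcher would call changed_files from a loop in a daemon thread, sleeping for the configured interval between scans, and hand each returned path to the upload step.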
Known Limitations
- Polling-based, not instant: Scans every N seconds (default 30). Not real-time.
- ngrok free tier disconnects after ~2 hours: Use gdrive for long runs. kgout warns when the tunnel dies.
- Restricted networks: University/corporate firewalls may block ngrok. Use gdrive instead.
- Public ngrok URL: Anyone with the URL can download your files. Don't share it.
- GDrive flat upload: Subdirectories are flattened to filenames (e.g., subdir/file.csv → subdir_file.csv).
- Partial file risk: For multi-GB files, write to a temp name and rename when complete.
- Kaggle internet required: Settings → Internet → On.
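For the partial file risk, one way to follow the write-then-rename advice is sketched below. save_atomically is a hypothetical helper, not part of kgout. It relies on two things: *.tmp is in the default ignore list, so the in-progress file should be skipped, and os.replace is atomic when source and destination are on the same filesystem.

```python
import os
import tempfile


def save_atomically(final_path, write_fn, tmp_suffix=".tmp"):
    """Write to final_path + tmp_suffix, then rename into place."""
    tmp_path = final_path + tmp_suffix
    with open(tmp_path, "wb") as f:
        write_fn(f)
        f.flush()
        os.fsync(f.fileno())  # make sure the bytes are on disk before renaming
    # The final name appears only once the file is complete.
    os.replace(tmp_path, final_path)
    return final_path


with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "model.bin")
    save_atomically(target, lambda f: f.write(b"\x00" * 1024))
    names = sorted(os.listdir(d))
    size = os.path.getsize(target)

print(names, size)
```

In a notebook, write_fn would wrap whatever produces the large file, for example a function that serializes a checkpoint into the open file handle.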
Security
See SECURITY.md for the full security policy.
Development
git clone https://github.com/vybhav72954/kgout
cd kgout
pip install -e ".[dev,all]"
pytest tests/ -v
License
MIT — see LICENSE