Auto-sync Kaggle notebook outputs to Google Drive or local machine via ngrok
kgout
Auto-sync Kaggle notebook outputs to Google Drive or your local machine.
When running long ML experiments on Kaggle, kernels can time out or sessions expire — and your output files disappear. kgout watches /kaggle/working/ in the background and automatically syncs new or modified files to Google Drive or exposes them via an ngrok tunnel for instant local download.
Drop it into any notebook as a single cell.
Install
# With local/ngrok tunnel support (recommended)
pip install kgout[local]
# With Google Drive support
pip install kgout[gdrive]
# Everything
pip install kgout[all]
Quick Start
Local Download via ngrok (Recommended)
Exposes your /kaggle/working/ directory as a public URL — open it in any browser on your phone, laptop, anywhere. Every new file appears instantly.
import os
os.environ["NGROK_AUTH_TOKEN"] = "your_token_here"  # free at ngrok.com
from kgout import KgOut
with KgOut("local") as kg:
    # ┌────────────────────────────────────────────────┐
    # │ kgout — files available at: │
    # │ https://abc123.ngrok-free.app │
    # └────────────────────────────────────────────────┘
    # ... your training code ...
    # Every new file saved to /kaggle/working/ is instantly
    # browsable and downloadable from the URL above.
    pass
How it works: kgout starts a file server on localhost, creates an ngrok tunnel to it, and gives you the public URL. The file server serves your watch directory live — any file your notebook saves appears immediately in the browser. A background watcher thread logs every new file and its direct download link.
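The file-server half of that description can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not kgout's actual implementation: `serve_directory` is a hypothetical helper name, and the ngrok tunnel step is left out because it needs an auth token and network access.

```python
import functools
import http.server
import threading


def serve_directory(directory, port=8384):
    """Serve `directory` over HTTP on 127.0.0.1 from a background thread.

    Hypothetical helper for illustration; kgout's internals may differ.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(directory)
    )
    # Bind to localhost only, so nothing outside this machine can connect
    # directly; a tunnel process running locally would forward to this port.
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    # Daemon thread: it does not keep the interpreter alive on shutdown.
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

Calling `serve_directory("/kaggle/working")` returns the running server; `server.shutdown()` followed by `server.server_close()` stops it. Because files are read from disk on each request, anything saved into the directory is visible on the next page load.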
Google Drive Auto-Upload
Every new CSV, checkpoint, or plot is uploaded to a Drive folder on the first scan after it is saved (scans run every 30 seconds by default).
from kgout import KgOut
with KgOut(
    "gdrive",
    folder_id="1ABCxyz_your_drive_folder_id",
    credentials="/kaggle/input/my-secrets/service_account.json",
) as kg:
    # ... your training code ...
    pass
Both at Once
with KgOut(
    dest=["local", "gdrive"],
    folder_id="1ABCxyz",
    credentials="/path/to/sa.json",
) as kg:
    pass
Manual start/stop (no context manager)
kg = KgOut("local")
kg.start()
# ... long training ...
print(kg.stats) # {'files_tracked': 12, 'events_fired': 5}
kg.stop()
Configuration
| Parameter | Default | Description |
|---|---|---|
| dest | "local" | "local", "gdrive", or ["local", "gdrive"] |
| watch_dir | /kaggle/working | Directory to watch (recursive) |
| interval | 30 | Seconds between scans (min: 5) |
| ignore | see below | Glob patterns for files to skip |
| snapshot_existing | True | If True, skip files that exist before start() |
| folder_id | - | Google Drive folder ID (required for gdrive) |
| credentials | - | Service account JSON path (required for gdrive) |
| ngrok_token | - | ngrok auth token (or set NGROK_AUTH_TOKEN env var) |
| port | 8384 | Local file server port |
| verbose | True | Enable logging output |
Environment Variables
Instead of passing tokens directly, you can set these environment variables:
| Variable | Used by | Description |
|---|---|---|
| NGROK_AUTH_TOKEN | local destination | ngrok authentication token |
| KGOUT_GDRIVE_CREDENTIALS | gdrive destination | Path to service account JSON |
See .env.example in the repo for a template.
Default Ignore Patterns
These files are never synced:
- *.ipynb, *.pyc, *.tmp, *.lock, *.log, *.swp, *.swo
- .DS_Store, Thumbs.db
- Hidden files (starting with .)
- Directories: .ipynb_checkpoints, __pycache__, .git
Override with ignore=["*.csv"] or pass ignore=[] to sync everything.
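To make the matching rules concrete, here is a minimal sketch of how glob-based ignore rules like these could be evaluated. `should_ignore`, `DEFAULT_IGNORE`, and `IGNORED_DIRS` are hypothetical names, the hidden-file rule is modelled as the pattern `.*`, and the sketch assumes patterns are matched against the file name only; kgout's actual matching may differ.

```python
import fnmatch
from pathlib import PurePosixPath

# Patterns and directory names copied from the defaults listed above.
# ".*" stands in for the "hidden files" rule (an assumption of this sketch).
DEFAULT_IGNORE = ["*.ipynb", "*.pyc", "*.tmp", "*.lock", "*.log",
                  "*.swp", "*.swo", ".DS_Store", "Thumbs.db", ".*"]
IGNORED_DIRS = (".ipynb_checkpoints", "__pycache__", ".git")


def should_ignore(rel_path, patterns=DEFAULT_IGNORE, ignored_dirs=IGNORED_DIRS):
    """Return True if `rel_path` (relative to the watch dir) should be skipped."""
    path = PurePosixPath(rel_path)
    # Skip anything located inside an ignored directory.
    if any(part in ignored_dirs for part in path.parts[:-1]):
        return True
    # Skip file names that match any glob pattern (case-sensitive).
    return any(fnmatch.fnmatchcase(path.name, pat) for pat in patterns)
```

With the defaults, `should_ignore("notebook.ipynb")` is true and `should_ignore("results/metrics.csv")` is false; passing `patterns=["*.csv"]` flips the second result.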
Setting Up ngrok (for local destination)
- Create a free account at ngrok.com
- Copy your auth token from the dashboard
- In your Kaggle notebook:
import os
os.environ["NGROK_AUTH_TOKEN"] = "your_token"
Or pass it directly: KgOut("local", ngrok_token="your_token")
Tip: On Kaggle, you can store the token as a Kaggle Secret and load it with:
import os
from kaggle_secrets import UserSecretsClient
os.environ["NGROK_AUTH_TOKEN"] = UserSecretsClient().get_secret("NGROK_AUTH_TOKEN")
Setting Up Google Drive (for gdrive destination)
- Go to Google Cloud Console
- Create a project (or use existing) and enable the Google Drive API
- Go to IAM & Admin > Service Accounts > Create a service account
- Create a key (JSON) > download it
- Upload the JSON to Kaggle as a private dataset (e.g., my-secrets)
- In Google Drive, right-click your target folder > Share > paste the service account email (the client_email field in the JSON) > give it Editor access
- Copy the folder ID from the Drive URL: https://drive.google.com/drive/folders/THIS_PART_IS_THE_ID
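If you would rather not copy the ID by hand, the last step can be sketched in a few lines. `drive_folder_id` is a hypothetical helper, not part of kgout, and it assumes the URL has the `/folders/<id>` shape shown above.

```python
from urllib.parse import urlparse


def drive_folder_id(url):
    """Return the path segment that follows 'folders' in a Drive folder URL."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    if "folders" not in segments:
        raise ValueError(f"not a Drive folder URL: {url!r}")
    index = segments.index("folders")
    if index + 1 >= len(segments):
        raise ValueError(f"no folder ID found in: {url!r}")
    # Query strings such as ?usp=sharing are not part of the path,
    # so they never end up in the returned ID.
    return segments[index + 1]
```

The result is the value to pass as `folder_id`.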
Security
kgout takes the following security measures:
- Localhost-only binding: The HTTP file server binds to 127.0.0.1, not 0.0.0.0. Only the ngrok tunnel can reach it, not other devices on the same network.
- Path traversal protection: Requests that attempt to escape the served directory (e.g., /../../../etc/passwd) are blocked.
- Security headers: All HTTP responses include X-Content-Type-Options: nosniff, X-Frame-Options: DENY, and a Content Security Policy.
- No symlink following: The watcher uses followlinks=False to prevent symlink-based escapes.
- Dangerous directory guard: Attempting to watch /, /etc, /home, or other sensitive paths raises a ValueError.
- Credential masking: ngrok tokens are redacted from error messages.
- Partial file guard: Files are only synced after they haven't been modified for 2 seconds, preventing sync of half-written files.
- Minimal GDrive scope: Uses the drive.file scope, so the service account can only access files it created, not your entire Drive.
See SECURITY.md for the full security policy and vulnerability reporting.
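As an illustration of the path traversal item, one common way to implement such a check is to resolve the requested path and confirm it still sits under the served root. This is a sketch of that general technique; `resolve_inside` is a hypothetical name and kgout's own check may be written differently.

```python
from pathlib import Path


def resolve_inside(root, requested):
    """Map a request path to a location under `root`, or raise if it escapes."""
    root = Path(root).resolve()
    # Strip the leading slash so the request is joined relative to root,
    # then resolve() collapses any ".." segments and symlinks.
    candidate = (root / requested.lstrip("/")).resolve()
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"path escapes served directory: {requested!r}")
    return candidate
```

A request for `/sub/file.txt` resolves to a path under the root, while `/../../../etc/passwd` raises `PermissionError`.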
How It Works
- Snapshot: On start(), kgout fingerprints all existing files (mtime + size) so they don't trigger syncs
- Poll: A daemon thread scans the watch directory every N seconds
- Settle check: Files modified in the last 2 seconds are skipped (still being written)
- Compare: Each file's fingerprint is compared against the snapshot
- Sync: New or modified files are sent to the configured destination(s)
- Cleanup: On stop() (or context manager exit), watcher thread and tunnels shut down
The watcher runs as a daemon thread — it won't block your notebook or prevent kernel shutdown.
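The snapshot, settle, and compare steps can be sketched as follows. This is a simplified model built from the description above, not kgout's source: `fingerprint`, `take_snapshot`, and `scan_once` are hypothetical names, and the `now` parameter exists only to make the settle check easy to exercise.

```python
import os
import time


def fingerprint(path):
    """Return the (mtime, size) pair used to detect changes."""
    st = os.stat(path)
    return (st.st_mtime, st.st_size)


def take_snapshot(watch_dir):
    """Fingerprint every file under watch_dir, not descending into symlinked dirs."""
    snapshot = {}
    for dirpath, _dirnames, filenames in os.walk(watch_dir, followlinks=False):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                snapshot[path] = fingerprint(path)
            except OSError:
                continue  # file vanished between listing and stat
    return snapshot


def scan_once(watch_dir, known, settle_seconds=2.0, now=None):
    """Return files that are new or changed relative to `known`, updating it.

    Files modified within the last `settle_seconds` are skipped on this pass
    and picked up by a later one.
    """
    now = time.time() if now is None else now
    changed = []
    for path, fp in take_snapshot(watch_dir).items():
        mtime, _size = fp
        if now - mtime < settle_seconds:
            continue  # probably still being written
        if known.get(path) != fp:
            known[path] = fp
            changed.append(path)
    return sorted(changed)
```

A watcher thread would call `take_snapshot` once at start, then call `scan_once` every `interval` seconds and hand the returned paths to the configured destinations.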
Known Limitations
- Polling-based: Uses periodic scanning, not filesystem events, so there's a configurable delay (interval)
- ngrok free tier: Limited to 1 tunnel; sessions may disconnect after ~2 hours
- GDrive flat upload: Subdirectories are flattened to filenames (e.g., subdir_file.csv) in v1.0
- Public URL: Anyone with the ngrok URL can download files. Don't share it with untrusted parties.
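To show what the flat-upload limitation means for file names, here is a small sketch of the flattening. The underscore separator is inferred from the subdir_file.csv example, and `flat_name` is a hypothetical helper rather than kgout's own function.

```python
from pathlib import PurePosixPath


def flat_name(rel_path, sep="_"):
    """Join the parts of a relative path into a single file name."""
    return sep.join(PurePosixPath(rel_path).parts)
```

Under this scheme `subdir/file.csv` becomes `subdir_file.csv`. Note that two different paths, such as `a/b.csv` and `a_b.csv`, would map to the same name, so distinct file names across subdirectories are the safer choice.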
Development
git clone https://github.com/vybhavchaturvedi/kgout
cd kgout
pip install -e ".[dev,all]"
pytest tests/ -v
License
MIT — see LICENSE