Auto-sync Kaggle notebook outputs to Google Drive or local machine via ngrok
kgout
Auto-sync Kaggle notebook outputs to Google Drive or your local machine.
When running long ML experiments on Kaggle, kernels can time out or sessions expire — and your output files disappear. kgout watches /kaggle/working/ in the background and automatically syncs new or modified files to Google Drive or exposes them via an ngrok tunnel for instant local download.
Drop it into any notebook as a single cell.
Install
# With local/ngrok tunnel support (recommended)
pip install kgout[local]
# With Google Drive support
pip install kgout[gdrive]
# Everything
pip install kgout[all]
Quick Start
Local Download via ngrok (Recommended)
Exposes your /kaggle/working/ directory as a public URL — open it in any browser on your phone, laptop, anywhere. Every new file appears instantly.
import os
os.environ["NGROK_AUTH_TOKEN"] = "your_token_here" # free at ngrok.com
from kgout import KgOut
with KgOut("local") as kg:
    # ┌────────────────────────────────────────────────┐
    # │ kgout — files available at: │
    # │ https://abc123.ngrok-free.app │
    # └────────────────────────────────────────────────┘
    # ... your training code ...
    # Every new file saved to /kaggle/working/ is instantly
    # browsable and downloadable from the URL above.
    pass
How it works: kgout starts a file server on localhost, creates an ngrok tunnel to it, and gives you the public URL. The file server serves your watch directory live — any file your notebook saves appears immediately in the browser. A background watcher thread logs every new file and its direct download link.
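As a rough illustration of the first half of that pipeline, the sketch below serves a directory over HTTP on localhost using only the Python standard library. It is not kgout's actual implementation: `start_file_server` is a hypothetical helper, and the ngrok tunnel step is left out because it needs network access and an auth token.

```python
import functools
import threading
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def start_file_server(directory, port=0):
    """Serve `directory` over HTTP on localhost from a background daemon thread.

    Hypothetical helper, not part of kgout. port=0 asks the OS for any free
    port; the chosen port is available as server.server_address[1].
    """
    # Bind the handler to the directory so files are read live on each request.
    handler = functools.partial(SimpleHTTPRequestHandler, directory=str(directory))
    server = ThreadingHTTPServer(("127.0.0.1", port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

Because the handler reads from disk on every request, a file saved after the server starts is visible on the next page load. Call `server.shutdown()` followed by `server.server_close()` to stop it.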
Google Drive Auto-Upload
Every new CSV, checkpoint, or plot auto-uploads to a Drive folder the moment it's saved.
from kgout import KgOut
with KgOut(
    "gdrive",
    folder_id="1ABCxyz_your_drive_folder_id",
    credentials="/kaggle/input/my-secrets/service_account.json",
) as kg:
    # ... your training code ...
    pass
Both at Once
with KgOut(
    dest=["local", "gdrive"],
    folder_id="1ABCxyz",
    credentials="/path/to/sa.json",
) as kg:
    pass
Manual start/stop (no context manager)
kg = KgOut("local")
kg.start()
# ... long training ...
print(kg.stats) # {'files_tracked': 12, 'events_fired': 5}
kg.stop()
Configuration
| Parameter | Default | Description |
|---|---|---|
| dest | "local" | "local", "gdrive", or ["local", "gdrive"] |
| watch_dir | /kaggle/working | Directory to watch (recursive) |
| interval | 30 | Seconds between scans (min: 5) |
| ignore | see below | Glob patterns for files to skip |
| snapshot_existing | True | If True, skip files that exist before start() |
| folder_id | (none) | Google Drive folder ID (required for gdrive) |
| credentials | (none) | Service account JSON path (required for gdrive) |
| ngrok_token | (none) | ngrok auth token (or set NGROK_AUTH_TOKEN env var) |
| port | 8384 | Local file server port |
| verbose | True | Enable logging output |
Environment Variables
Instead of passing tokens directly, you can set these environment variables:
| Variable | Used by | Description |
|---|---|---|
| NGROK_AUTH_TOKEN | local destination | ngrok authentication token |
| KGOUT_GDRIVE_CREDENTIALS | gdrive destination | Path to service account JSON |
See .env.example in the repo for a template.
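The fallback from an explicit argument to an environment variable can be pictured with a small helper. This is only a sketch: `resolve_setting` is a hypothetical name, and the precedence shown (explicit argument first, then environment) is an assumption, since this page does not state which one kgout prefers when both are set.

```python
import os


def resolve_setting(explicit, env_var):
    """Return the explicit value if given, else the environment variable, else None.

    Hypothetical helper; the argument-over-environment order is an assumption.
    """
    if explicit:
        return explicit
    # Treat an empty environment variable the same as an unset one.
    return os.environ.get(env_var) or None
```

For example, `resolve_setting(None, "NGROK_AUTH_TOKEN")` returns whatever the notebook exported earlier, or `None` if nothing was set.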
Default Ignore Patterns
These files are never synced:
- *.ipynb, *.pyc, *.tmp, *.lock, *.log, *.swp, *.swo
- .DS_Store, Thumbs.db
- Hidden files (starting with .)
- Directories: .ipynb_checkpoints, __pycache__, .git
Override with ignore=["*.csv"] or pass ignore=[] to sync everything.
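To make the matching rules concrete, here is a minimal sketch of glob-based filtering with the standard `fnmatch` module. The function `is_ignored` and the constants are hypothetical, and the sketch assumes patterns are matched against the file name only; kgout's real matcher may differ, and the page does not say whether `ignore=[]` also lifts the hidden-file and directory rules (here it does not).

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Defaults copied from the list above.
IGNORE_FILES = ["*.ipynb", "*.pyc", "*.tmp", "*.lock", "*.log",
                "*.swp", "*.swo", ".DS_Store", "Thumbs.db"]
IGNORE_DIRS = {".ipynb_checkpoints", "__pycache__", ".git"}


def is_ignored(rel_path, patterns=IGNORE_FILES, ignore_dirs=IGNORE_DIRS):
    """Return True if a path relative to the watch directory should be skipped."""
    path = PurePosixPath(rel_path)
    # Skip anything that lives inside an ignored directory.
    if any(part in ignore_dirs for part in path.parts[:-1]):
        return True
    # Skip hidden files (assumed to apply regardless of `patterns`).
    if path.name.startswith("."):
        return True
    # Otherwise match the file name against each glob pattern.
    return any(fnmatch(path.name, pattern) for pattern in patterns)
```

With the defaults, `is_ignored("results/metrics.csv")` is false, while passing `patterns=["*.csv"]` makes it true.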
Setting Up ngrok (for local destination)
- Create a free account at ngrok.com
- Copy your auth token from the dashboard
- In your Kaggle notebook:
import os
os.environ["NGROK_AUTH_TOKEN"] = "your_token"
Or pass it directly: KgOut("local", ngrok_token="your_token")
Tip: On Kaggle, you can store the token as a Kaggle Secret and load it with:
import os
from kaggle_secrets import UserSecretsClient
os.environ["NGROK_AUTH_TOKEN"] = UserSecretsClient().get_secret("NGROK_AUTH_TOKEN")
Setting Up Google Drive (for gdrive destination)
- Go to Google Cloud Console
- Create a project (or use existing) and enable the Google Drive API
- Go to IAM & Admin > Service Accounts > Create a service account
- Create a key (JSON) > download it
- Upload the JSON to Kaggle as a private dataset (e.g., my-secrets)
- In Google Drive, right-click your target folder > Share > paste the service account email (the client_email field in the JSON) > give it Editor access
- Copy the folder ID from the Drive URL: https://drive.google.com/drive/folders/THIS_PART_IS_THE_ID
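Two of the steps above involve copying values by hand: the service account email and the folder ID. The helpers below are a convenience sketch for reading them programmatically; both function names are hypothetical and neither is part of kgout.

```python
import json
from urllib.parse import urlparse


def service_account_email(credentials_path):
    """Read the client_email field from a service account JSON key file."""
    with open(credentials_path, encoding="utf-8") as fh:
        return json.load(fh)["client_email"]


def folder_id_from_url(url):
    """Return the path segment that follows 'folders' in a Drive folder URL."""
    parts = urlparse(url).path.strip("/").split("/")
    if "folders" not in parts or parts.index("folders") + 1 >= len(parts):
        raise ValueError(f"no folder ID found in {url!r}")
    return parts[parts.index("folders") + 1]
```

Using `urlparse` means a trailing query string such as `?usp=sharing` is not mistaken for part of the ID.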
Security
See SECURITY.md for the full security policy and vulnerability reporting.
How It Works
- Snapshot: On start(), kgout fingerprints all existing files (mtime + size) so they don't trigger syncs
- Poll: A daemon thread scans the watch directory every N seconds
- Settle check: Files modified in the last 2 seconds are skipped (still being written)
- Compare: Each file's fingerprint is compared against the snapshot
- Sync: New or modified files are sent to the configured destination(s)
- Cleanup: On stop() (or context manager exit), watcher thread and tunnels shut down
The watcher runs as a daemon thread — it won't block your notebook or prevent kernel shutdown.
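The snapshot, poll, settle, and compare steps can be sketched in a few functions. This is an illustrative reimplementation under stated assumptions, not kgout's source: all names are hypothetical, the fingerprint is the (mtime, size) pair described above, and the sync step is reduced to a callback.

```python
import os
import threading
import time

SETTLE_SECONDS = 2  # files modified more recently than this are skipped


def fingerprint_tree(watch_dir):
    """Map every file path under watch_dir to its (mtime, size) fingerprint."""
    prints = {}
    for root, _dirs, files in os.walk(watch_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished between listing and stat
            prints[path] = (st.st_mtime, st.st_size)
    return prints


def scan_once(watch_dir, seen, now=None, settle=SETTLE_SECONDS):
    """Return new or modified paths and record them in `seen`."""
    now = time.time() if now is None else now
    changed = []
    for path, fp in fingerprint_tree(watch_dir).items():
        if now - fp[0] < settle:
            continue  # probably still being written; retry on the next scan
        if seen.get(path) != fp:
            seen[path] = fp
            changed.append(path)
    return sorted(changed)


def start_watcher(watch_dir, on_change, interval=30):
    """Poll in a daemon thread; set the returned Event to stop."""
    seen = fingerprint_tree(watch_dir)  # snapshot: existing files do not fire
    stop = threading.Event()

    def loop():
        # Event.wait returns False on timeout, True once stop is set.
        while not stop.wait(interval):
            for path in scan_once(watch_dir, seen):
                on_change(path)

    threading.Thread(target=loop, daemon=True).start()
    return stop
```

Using `Event.wait(interval)` as the sleep lets the loop exit promptly when the event is set, instead of finishing a full 30-second sleep first.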
Known Limitations
- Polling-based, not instant: kgout scans the directory every N seconds (default 30). Files won't appear until the next scan completes. Not suitable for real-time streaming.
- ngrok free tier: Limited to 1 tunnel at a time. Sessions may disconnect after ~2 hours. URL changes every time kgout starts.
- Restricted networks: ngrok requires outbound internet access on ports 443/4443. Institutional networks (university campuses, corporate firewalls, research lab servers) may block ngrok traffic. If the tunnel fails to start, your network likely blocks it; use the gdrive destination instead.
- Public URL: Anyone with the ngrok URL can browse and download your files. Don't share it with untrusted parties. The URL is random and temporary, but not password-protected.
- GDrive flat upload: Subdirectories are flattened into filenames (e.g., subdir/file.csv becomes subdir_file.csv) in v1.x.
- Partial file risk: If a very large file is still being written when a scan occurs, it may sync an incomplete version. kgout waits 2 seconds after last modification (settle time), but for multi-GB files, write to a temp name and rename when complete.
- No resumable downloads: If the ngrok tunnel disconnects mid-download, you need to re-download. There's no resume support.
- Kaggle internet required: The Kaggle notebook must have internet access enabled (Settings → Internet → On) for both local and gdrive destinations.
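The workaround suggested under "Partial file risk" (write to a temp name, then rename) might look like the sketch below. `write_then_rename` is a hypothetical helper; the `.tmp` suffix is chosen because `*.tmp` appears in the default ignore patterns, so the watcher should skip the file while it is incomplete.

```python
import os


def write_then_rename(final_path, chunks):
    """Write bytes to '<final_path>.tmp', then rename into place when complete."""
    tmp_path = f"{final_path}.tmp"
    with open(tmp_path, "wb") as fh:
        for chunk in chunks:
            fh.write(chunk)
        # Push buffered data to disk before the file becomes visible.
        fh.flush()
        os.fsync(fh.fileno())
    # os.replace is atomic when source and target are on the same filesystem.
    os.replace(tmp_path, final_path)
    return final_path
```

The file appears under its final name only once it is fully written, so a scan never picks up a half-written checkpoint.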
Development
git clone https://github.com/vybhavchaturvedi/kgout
cd kgout
pip install -e ".[dev,all]"
pytest tests/ -v
License
MIT — see LICENSE
File details
Details for the file kgout-1.0.1.tar.gz.
File metadata
- Download URL: kgout-1.0.1.tar.gz
- Upload date:
- Size: 21.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f2bf10366497ecb415e8c7dfe06a29d74a130baf113ce6774e23544a68a52aa3 |
| MD5 | e1993f645947e382bfeeed63d457175d |
| BLAKE2b-256 | 5c65252c184a4ae28a036467634d5d61e2bb838961e1bde178b20a3470a0fbab |
File details
Details for the file kgout-1.0.1-py3-none-any.whl.
File metadata
- Download URL: kgout-1.0.1-py3-none-any.whl
- Upload date:
- Size: 18.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3ebfb75beb8d3e0b10fd3de13a784f6b2e68141c7e91b7e60019a7ba45251a07 |
| MD5 | c0348c639238ef0808cdf183e4a32a90 |
| BLAKE2b-256 | 443d4ff48894396e6328efe1e168b27203538e351c1ccd10a0d046ed923357b0 |
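If you download one of these files manually, you can check it against the SHA256 digest listed above. The snippet is a generic sketch using the standard `hashlib` module; `sha256_of` is a hypothetical helper name, and the local file path is whatever your download produced.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Compare the returned string with the SHA256 row for the matching file; any difference means the download is corrupt or not the published artifact.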