Run Ollama coding models on your own Kaggle GPU instance

These details have not been verified by PyPI

Project links

Project description

KaggleLLM-IDE

Run frontier coding models on your own free Kaggle GPU — no shared server, no VRAM limits on your laptop.

Each user brings their own Kaggle account. The app spins up a private GPU kernel, installs Ollama, pulls your chosen model, and tunnels the inference endpoint directly to your browser via ngrok. Your prompts never touch anyone else's server.

You → Setup form → Kaggle API → Your GPU kernel → ngrok tunnel → Chat UI

Why this exists

Most coding LLMs worth running (Qwen2.5-Coder 32B, DeepSeek-Coder-V2) need 16–20 GB of VRAM. Most laptops have 4–8 GB. Kaggle gives every verified account 30 free GPU hours per week on P100/T4 instances. This app is the bridge.

Supported models

Model	VRAM	Best for
`qwen2.5-coder:7b`	~5 GB	Fast completions, everyday coding
`qwen2.5-coder:14b`	~9 GB	Balanced speed + quality
`qwen2.5-coder:32b`	~20 GB	Best quality (P100 only)
`deepseek-coder-v2:16b`	~10 GB	Strong reasoning + code
`codellama:13b`	~8 GB	General purpose

Prerequisites

You need three things before starting:

1. Kaggle account + API key

Sign up at kaggle.com
Go to Settings → API → Create New Token — this downloads kaggle.json
Your username and key are inside that file

2. Phone verification (required for GPU)

Kaggle requires phone verification to unlock GPU kernels. Complete it at: kaggle.com/settings/phone-verification

3. ngrok auth token

Sign up at ngrok.com (free)
Go to Dashboard → Your Authtoken
Copy the token

Getting started

Backend

cd backend
pip install -r requirements.txt
uvicorn main:app --reload
# Runs on http://localhost:8000

Frontend

cd frontend
npm install
npm run dev
# Runs on http://localhost:5173

Open http://localhost:5173, fill in your credentials, pick a model, and hit Launch GPU Kernel.

First launch takes 5–8 minutes (Ollama install + model download). Subsequent launches of the same model are faster since Kaggle caches the environment.

Project structure

kaggle-gpu-ide/
├── backend/
│   ├── main.py              # FastAPI app — session lifecycle endpoints
│   ├── kaggle_runner.py     # Kaggle API client — push, poll, stop kernels
│   ├── auth.py              # Credential + GPU access verification
│   └── requirements.txt
├── frontend/
│   ├── index.html
│   ├── vite.config.js
│   ├── package.json
│   └── src/
│       ├── App.jsx          # Setup / launching / chat screens
│       ├── index.css
│       ├── main.jsx
│       ├── hooks/
│       │   ├── useSession.js   # Session state + polling
│       │   └── useChat.js      # Message history + inference
│       └── lib/
│           └── api.js          # Backend + tunnel API clients
├── kernel/
│   └── kernel_bootstrap.py  # Runs ON Kaggle GPU — installs Ollama + ngrok
└── pyproject.toml

How it works

1. You enter Kaggle credentials + ngrok token in the setup form

2. Backend calls POST /auth/verify
   ├── Checks API key is valid
   └── Confirms GPU access is enabled on your account

3. Backend calls Kaggle API to push kernel_bootstrap.py
   to your Kaggle account with GPU + internet enabled

4. kernel_bootstrap.py runs on Kaggle's GPU:
   ├── Installs Ollama
   ├── Pulls your chosen model
   ├── Starts a FastAPI inference server on port 8000
   └── Opens an ngrok tunnel → prints [TUNNEL_URL]https://...[/TUNNEL_URL]

5. Backend polls the kernel log every 8s until the URL appears

6. Frontend receives the tunnel URL → all prompts go DIRECTLY
   to your Kaggle kernel — no inference traffic through any shared server

API reference

Backend (`http://localhost:8000`)

Method	Path	Description
`POST`	`/auth/verify`	Validate credentials + GPU access
`POST`	`/session/start`	Push kernel + begin polling
`GET`	`/session/{id}/status`	`launching` / `ready` / `error`
`DELETE`	`/session/{id}`	Stop kernel + clear session
`GET`	`/health`	Liveness check

Kernel tunnel (`https://<ngrok-url>`)

Method	Path	Description
`GET`	`/health`	Kernel liveness
`GET`	`/models`	List pulled Ollama models
`POST`	`/generate`	Run inference

Environment variables

Create a .env file in backend/ (see .env.example):

# Not strictly required — credentials are passed per-request from the frontend.
# Set these if you want to hardcode a single-user deployment.
KAGGLE_USERNAME=
KAGGLE_KEY=
NGROK_AUTHTOKEN=
SECRET_KEY=

Kaggle GPU quota

30 hours/week per verified account
Resets every Sunday
Kernels auto-terminate after 12 hours of continuous runtime
GPU types: Tesla P100 (16 GB VRAM) or Tesla T4 (15 GB VRAM) — assigned automatically

Contributing

# Fork → clone → create a feature branch
git checkout -b feat/your-feature

# Make changes, then open a PR against dev (not main)

Please keep PRs focused. One feature or fix per PR.

License

MIT — see LICENSE.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.1.0

May 21, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

kaggle_gpu_ide-0.1.0.tar.gz (42.8 kB view details)

Uploaded May 21, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

kaggle_gpu_ide-0.1.0-py3-none-any.whl (28.5 kB view details)

Uploaded May 21, 2026 Python 3

File details

Details for the file kaggle_gpu_ide-0.1.0.tar.gz.

File metadata

Download URL: kaggle_gpu_ide-0.1.0.tar.gz
Upload date: May 21, 2026
Size: 42.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for kaggle_gpu_ide-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`f4438459ed481c876344edb45625f0b35440101c86332d6a98da9cc423c915ae`
MD5	`8f24a4776fed5db08404df350b04f722`
BLAKE2b-256	`05a776e210b6ac17b9f95da0d61fb65fdf7aa22cb18f8cfa79d3a423e47cf9d9`

See more details on using hashes here.

File details

Details for the file kaggle_gpu_ide-0.1.0-py3-none-any.whl.

File metadata

Download URL: kaggle_gpu_ide-0.1.0-py3-none-any.whl
Upload date: May 21, 2026
Size: 28.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for kaggle_gpu_ide-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`9b2b7f40f6decf31204e4c5d944438e6e17c345a2e6a879fdfde67befbc8abaf`
MD5	`db513426b5b0e09185def0c325b4af03`
BLAKE2b-256	`f4a56cc35d14c8ace90b880689f7a83d4eb02f87ba6f2e0b22ed4536c393a941`

See more details on using hashes here.

kaggle-gpu-ide 0.1.0

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Project description

KaggleLLM-IDE

Why this exists

Supported models

Prerequisites

1. Kaggle account + API key

2. Phone verification (required for GPU)

3. ngrok auth token

Getting started

Backend

Frontend

Project structure

How it works

API reference

Backend (http://localhost:8000)

Kernel tunnel (https://<ngrok-url>)

Environment variables

Kaggle GPU quota

Contributing

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes

Backend (`http://localhost:8000`)

Kernel tunnel (`https://<ngrok-url>`)