Plan local and free-tier GPU workflows around llmfit and curated provider data.
free-gpu
free-gpu is a terminal-first planner for free and near-free compute.
It is designed to sit on top of llmfit:
- llmfit answers: what models fit my local hardware?
- free-gpu answers: given this workload and compute need, which providers should I use, and how should I split work across local plus remote stages?
The point is not to clone llmfit. The point is to use llmfit as the local-fit engine, then add provider filtering, role-aware ranking, and workflow planning around free, cheap, and grant-style compute.
What users actually get
free-gpu helps answer questions like:
- "I need quick inference for a small coding task. Which free-tier provider is the least painful?"
- "I need a few hours of GPU time for LoRA fine-tuning. Should I look at credits, trials, or a cloud free tier?"
- "This task is too heavy for casual free tiers. Which grant or program lane should I think about instead?"
- "What should stay local, and what should move to remote compute?"
What the repo includes
- The original provider dataset in free_gpu/gpu_compute_database.csv
- A Python CLI for provider ranking and workflow planning
- A Textual TUI focused on provider selection rather than local model browsing
- A small MCP server so external agents can ask for provider plans programmatically
- A GitHub Pages-ready project page in docs/index.html
Core product rules
- Role is a ranking lens, not a hard exclusion filter.
- Budget buckets are semantic UX buckets, not literal accounting truth.
- Grant-like providers behave like card-required options.
- The planner should surface the right provider lane for the task instead of treating every task as the same generic ranking problem.
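The "role is a lens, not a filter" rule can be sketched as a scoring pass. This is an illustrative Python sketch only: the field names, weights, and provider entries below are invented for demonstration and are not free-gpu's real schema.

```python
# Hypothetical sketch: role acts as a score boost, never as a hard filter.
def rank_providers(providers, role=None):
    def score(p):
        base = p.get("base_score", 0)
        # Providers matching the requested role float up; others stay ranked.
        boost = 2 if role and role in p.get("roles", []) else 0
        return base + boost
    return sorted(providers, key=score, reverse=True)

providers = [
    {"name": "colab-like", "base_score": 5, "roles": ["notebook"]},
    {"name": "api-tier", "base_score": 4, "roles": ["inference"]},
]
ranked = rank_providers(providers, role="inference")
# "api-tier" overtakes "colab-like" (4 + 2 > 5), but nothing is excluded.
```

The key property is that every provider survives the pass; the role only reorders them.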
Provider lanes
free-gpu is not only about "free" in the narrow sense. It plans across several practical lanes:
- free tier: browser notebooks, starter quotas, short session access
- under-25: credits, trials, starter plans, light paid-but-cheap access
- grant: startup programs, research allocations, application-based access, and other heavier support paths
That matters because different tasks naturally fall into different lanes:
- quick demos and lightweight inference often fit the free-tier lane
- medium notebook work and moderate fine-tunes often fit the under-25 or credit lane
- heavier training and long-running work often belong in the grant lane
Workflow logic
The planner estimates a compute lane from:
- workload
- model size
- target VRAM
- estimated task hours
- parallel jobs
- API needs
Then it schedules providers accordingly:
- burst: short runs, quick inference, fast-start options
- session: notebook or credit-backed work that lasts longer
- heavy: bigger VRAM or sustained remote compute
- grant-scale: tasks that look more like allocations, programs, or heavy research/startup support
Each workflow step carries its own compute summary, so a multi-stage plan can recommend different provider types for prep, fine-tune, eval, and serving.
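The lane estimation described above can be sketched as a small threshold function. The cutoffs below are invented for illustration and are not free-gpu's actual heuristics.

```python
# Illustrative sketch: map task hours and VRAM need to a compute lane.
# Thresholds are assumptions for demonstration, not free-gpu's real logic.
def estimate_lane(task_hours: float, min_vram_gb: int) -> str:
    if min_vram_gb >= 40 or task_hours >= 24:
        return "grant-scale"
    if min_vram_gb >= 16 or task_hours >= 4:
        return "heavy"
    if task_hours >= 1:
        return "session"
    return "burst"

print(estimate_lane(0.5, 8))   # quick inference  -> burst
print(estimate_lane(6, 16))    # LoRA fine-tune   -> heavy
print(estimate_lane(24, 40))   # long training    -> grant-scale
```

A real planner would also weigh parallel jobs and API needs, but the shape is the same: estimate a lane first, then rank providers within it.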
How Pages and MCP fit together
The project has two different surfaces:
- GitHub Pages hosts the public project site and docs
- the MCP server runs locally on the user's own machine
GitHub Pages cannot run the planner logic or host the Python MCP server. It is only the public website.
The actual MCP workflow is:
- a user installs free-gpu
- the user runs free-gpu-mcp locally
- their coding agent connects to that local MCP server
- the agent can call planner tools such as plan_provider_workflow
That means:
- no hosting cost for you
- no central backend to maintain
- users keep control because the tool runs locally
- any MCP-capable coding agent can use it if it supports local MCP servers
This repository also supports an optional hosted HTTP deployment. If you deploy it on Vercel, the MCP endpoint is exposed at /mcp.
Install
python -m pip install free-gpu
Quick Start
All user-facing modes start from the same install:
python -m pip install free-gpu
TUI
Best when you want an interactive local browser for provider lanes, filters, and workflow summaries.
free-gpu ui
Remote MCP
Best when your MCP client supports remote HTTP MCP and you do not want to run the server yourself.
Endpoint:
https://free-gpu.vercel.app/mcp
Local MCP
Best when you want your coding agent to talk to free-gpu on your own machine over stdio.
free-gpu-mcp
Conceptual client config:
{
"mcpServers": {
"free-gpu": {
"command": "free-gpu-mcp"
}
}
}
Terminal view
Best when you want direct terminal commands and scriptable output.
free-gpu local
free-gpu providers --workload inference --budget free
free-gpu plan --workload finetune-lora --model llama-3.1-8b --budget under-25 --task-hours 6 --min-vram-gb 16
For local development from the repository:
python -m pip install -e .
CLI
Local profile
free-gpu local
free-gpu local --ram-gb 32 --vram-gb 12
Provider ranking
free-gpu providers --workload inference --budget free
free-gpu providers --workload agent-loop --budget under-25 --task-hours 3 --parallel-jobs 4 --requires-api
Workflow planning
free-gpu plan --workload inference --model qwen2.5-coder-7b --ram-gb 32 --vram-gb 8
free-gpu plan --workload finetune-lora --model llama-3.1-8b --budget under-25 --task-hours 6 --min-vram-gb 16
free-gpu plan --workload scratch-train --budget grant --task-hours 24 --min-vram-gb 40
Useful planning flags:
- --task-hours
- --min-vram-gb
- --parallel-jobs
- --requires-api
- --budget any|free|under-25|grant
Every command also accepts --json.
Terminal UI
Run:
free-gpu ui
The TUI is inspired by llmfit's visual grammar, but it stays focused on provider planning:
- a top system bar with local hardware context from llmfit
- broad provider browsing by default
- role, workload, budget, and payment filters
- a central provider table
- bottom panes for links, recommendation context, and workflow summary
Current budget options in the TUI:
Budget: Any, Free, <25, Grant
llmfit integration
If llmfit is installed, free-gpu will try to use:
- llmfit system --json
- llmfit recommend -n N --json
The adapter uses structured JSON output rather than scraping terminal text. If llmfit is missing or parsing fails, free-gpu continues in provider-first mode and reports what it could not infer.
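The adapter pattern described above, calling llmfit's JSON mode and degrading gracefully, can be sketched as follows. The command and flags come from the text; the function name and return convention are assumptions for illustration.

```python
import json
import subprocess

# Hedged sketch of the llmfit adapter: invoke `llmfit system --json` and
# fall back to provider-first mode (None) if llmfit is missing or its
# output cannot be parsed as JSON.
def load_local_profile():
    try:
        out = subprocess.run(
            ["llmfit", "system", "--json"],
            capture_output=True, text=True, check=True,
        )
        return json.loads(out.stdout)
    except (FileNotFoundError, subprocess.CalledProcessError,
            json.JSONDecodeError):
        # llmfit unavailable or unparsable: continue without local context.
        return None

profile = load_local_profile()
```

Returning None rather than raising keeps the planner usable on machines where llmfit is not installed, which is the fallback behavior the text describes.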
MCP server
Run:
free-gpu-mcp
The MCP server exposes tools for compute-aware planning, including:
- plan_provider_workflow
- rank_providers_for_task
- assess_task_compute
It also exposes a small dataset summary resource:
providers://snapshot
What the MCP is for
The MCP lets an agent ask questions such as:
- "Plan a cheap inference workflow for this task"
- "Rank providers for a 6-hour fine-tune that needs about 16 GB VRAM"
- "Does this task look like free-tier, credit-tier, or grant-scale work?"
Generic local MCP setup
If your coding agent supports local MCP servers over stdio, the setup is conceptually:
{
"mcpServers": {
"free-gpu": {
"command": "free-gpu-mcp"
}
}
}
The exact config file depends on the agent, but the idea is the same: point the client at the local free-gpu-mcp command.
Hosted HTTP MCP on Vercel
This repository also includes a Vercel-friendly HTTP entrypoint via app.py.
When deployed on Vercel:
- / returns a small service description
- /health returns a simple health check
- /mcp is the MCP endpoint to connect to
- the live hosted endpoint for this repo is https://free-gpu.vercel.app/mcp
That means an MCP-capable client that supports remote HTTP MCP can connect to:
https://free-gpu.vercel.app/mcp
If you open /mcp in a browser, it may return a protocol-level error such as 406 Not Acceptable. That is expected: the route is meant for MCP clients, not normal browser navigation.
Example MCP-style request shape:
{
"tool": "plan_provider_workflow",
"arguments": {
"workload": "agent-loop",
"budget": "under-25",
"task_hours": 3,
"parallel_jobs": 4,
"requires_api": true
}
}
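A server handling this request would typically check the shape before dispatching. This is a minimal sketch: the field names mirror the example above, but the required/optional split and the function name are assumptions.

```python
# Hypothetical sketch: validate a plan_provider_workflow request before
# dispatch. Only `workload` is treated as required here (an assumption).
def validate_plan_request(payload: dict) -> dict:
    if payload.get("tool") != "plan_provider_workflow":
        raise ValueError("unexpected tool")
    args = payload.get("arguments", {})
    if "workload" not in args:
        raise ValueError("workload is required")
    return args

request = {
    "tool": "plan_provider_workflow",
    "arguments": {
        "workload": "agent-loop",
        "budget": "under-25",
        "task_hours": 3,
        "parallel_jobs": 4,
        "requires_api": True,
    },
}
args = validate_plan_request(request)
```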
Example agent flow:
- You ask your coding agent: "I need to fine-tune an 8B model for about 6 hours and want to stay near free."
- The agent calls plan_provider_workflow.
- free-gpu estimates the compute lane.
- The agent gets back a structured plan with:
- local vs remote recommendation
- stage-by-stage workflow
- top providers for that compute need
- whether the task fits free tier, cheap credits, or a grant-style path
GitHub Pages
A project page is included in docs/index.html.
On GitHub, enable Pages and point it at:
- Branch: main
- Folder: /docs
Tests
Run:
python -m unittest tests.test_planner -v
Packaging and publishing
The project is structured so end users do not need to clone the repository.
After publishing to PyPI, users can install it with:
pip install free-gpu
To build distribution artifacts locally:
python -m pip install ".[publish]"
python -m build
python -m twine check dist/*
The repository also includes a GitHub Actions Trusted Publishing workflow for PyPI releases.
To publish manually with Twine as a fallback:
python -m twine upload dist/*
Project details
File details
Details for the file free_gpu-0.1.2.tar.gz.
File metadata
- Download URL: free_gpu-0.1.2.tar.gz
- Size: 30.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | 00db6dec16f9de6189d76916f5c58f6a76c7ec526be13aedddf9270b8294b287 |
| MD5 | 9804ffe4587b09254b213c45c6b00fb3 |
| BLAKE2b-256 | 493d9fd73dc8cc4de63532e190e5ab7d84d8a977449ecfba4a3d18f537fe5f13 |
Provenance
The following attestation bundles were made for free_gpu-0.1.2.tar.gz:
Publisher: pypi-publish.yml on francescoopiccolo/free-gpu
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: free_gpu-0.1.2.tar.gz
- Subject digest: 00db6dec16f9de6189d76916f5c58f6a76c7ec526be13aedddf9270b8294b287
- Sigstore transparency entry: 1285904578
Source repository:
- Permalink: francescoopiccolo/free-gpu@db5f641e2e155d14805dab977bb1c7e749780912
- Branch / Tag: refs/tags/v0.1.2
- Owner: https://github.com/francescoopiccolo
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-publish.yml@db5f641e2e155d14805dab977bb1c7e749780912
- Trigger Event: release
File details
Details for the file free_gpu-0.1.2-py3-none-any.whl.
File metadata
- Download URL: free_gpu-0.1.2-py3-none-any.whl
- Size: 29.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes

| Algorithm | Hash digest |
|---|---|
| SHA256 | 3c85c195e04b6dfd9f91b2266e9f20a3074295275b6517637a02f4c02b4822df |
| MD5 | e4de1e9ce4b55fa75c820c16937f778c |
| BLAKE2b-256 | 4996b39cf5708023c2bdc5f9de5e6f89b1f0f28a5f89b178648b49fd7b1f1b26 |
Provenance
The following attestation bundles were made for free_gpu-0.1.2-py3-none-any.whl:
Publisher: pypi-publish.yml on francescoopiccolo/free-gpu
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: free_gpu-0.1.2-py3-none-any.whl
- Subject digest: 3c85c195e04b6dfd9f91b2266e9f20a3074295275b6517637a02f4c02b4822df
- Sigstore transparency entry: 1285904665
Source repository:
- Permalink: francescoopiccolo/free-gpu@db5f641e2e155d14805dab977bb1c7e749780912
- Branch / Tag: refs/tags/v0.1.2
- Owner: https://github.com/francescoopiccolo
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-publish.yml@db5f641e2e155d14805dab977bb1c7e749780912
- Trigger Event: release