Plan local and free-tier GPU workflows around llmfit and curated provider data.

free-gpu

free-gpu is a terminal-first planner for free and near-free compute.

It is designed to sit on top of llmfit:

  • llmfit answers: what models fit my local hardware?
  • free-gpu answers: given this workload and compute need, which providers should I use, and how should I split work across local plus remote stages?

The point is not to clone llmfit. The point is to use llmfit as the local-fit engine, then add provider filtering, role-aware ranking, and workflow planning around free, cheap, and grant-style compute.

What users actually get

free-gpu helps answer questions like:

  • "I need quick inference for a small coding task. Which free-tier provider is the least painful?"
  • "I need a few hours of GPU time for LoRA fine-tuning. Should I look at credits, trials, or a cloud free tier?"
  • "This task is too heavy for casual free tiers. Which grant or program lane should I think about instead?"
  • "What should stay local, and what should move to remote compute?"

What the repo includes

  • The original provider dataset in free_gpu/gpu_compute_database.csv
  • A Python CLI for provider ranking and workflow planning
  • A Textual TUI focused on provider selection rather than local model browsing
  • A small MCP server so external agents can ask for provider plans programmatically
  • A GitHub Pages-ready project page in docs/index.html

Core product rules

  • Role is a ranking lens, not a hard exclusion filter.
  • Budget buckets are semantic UX buckets, not literal accounting truth.
  • Grant-like providers behave like card-required options.
  • The planner should surface the right provider lane for the task instead of treating every task as the same generic ranking problem.
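The first rule above ("role is a ranking lens, not a hard exclusion filter") can be sketched as a score bonus rather than a filter. The provider fields and weights below are invented for illustration and are not free-gpu's real schema:

```python
# Illustrative sketch: role as a ranking lens, not a hard filter.
# Provider fields and weights here are hypothetical, not free-gpu's schema.

def rank_providers(providers, role):
    """Score every provider; matching the role boosts rank but never excludes."""
    def score(p):
        base = p["base_score"]
        # Role match is a bonus, so off-role providers still appear, just lower.
        bonus = 2.0 if role in p["roles"] else 0.0
        return base + bonus
    return sorted(providers, key=score, reverse=True)

providers = [
    {"name": "notebook-host", "base_score": 5.0, "roles": ["inference"]},
    {"name": "credit-cloud", "base_score": 4.0, "roles": ["finetune"]},
]

ranked = rank_providers(providers, role="finetune")
# Every provider survives the ranking; the role match only reorders them.
```

The key property is that `rank_providers` always returns the full list: a role mismatch lowers a provider's position instead of hiding it.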

Provider lanes

free-gpu is not only about "free" in the narrow sense. It plans across several practical lanes:

  • free tier: browser notebooks, starter quotas, short session access
  • under-25: credits, trials, starter plans, light paid-but-cheap access
  • grant: startup programs, research allocations, application-based access, and other heavier support paths

That matters because different tasks naturally fall into different lanes:

  • quick demos and lightweight inference often fit the free-tier lane
  • medium notebook work and moderate fine-tunes often fit the under-25 or credit lane
  • heavier training and long-running work often belong in the grant lane

Workflow logic

The planner estimates a compute lane from:

  • workload
  • model size
  • target VRAM
  • estimated task hours
  • parallel jobs
  • API needs

Then it schedules providers accordingly:

  • burst: short runs, quick inference, fast-start options
  • session: notebook or credit-backed work that lasts longer
  • heavy: bigger VRAM or sustained remote compute
  • grant-scale: tasks that look more like allocations, programs, or heavy research/startup support

Each workflow step carries its own compute summary, so a multi-stage plan can recommend different provider types for prep, fine-tune, eval, and serving.
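The lane estimate above can be pictured as a simple threshold heuristic. The cutoffs below are illustrative only and do not reflect free-gpu's actual planner logic:

```python
# Hypothetical lane-estimation heuristic; thresholds are illustrative
# and do not reflect free-gpu's actual planner.

def estimate_lane(task_hours: float, min_vram_gb: int, parallel_jobs: int = 1) -> str:
    """Map a compute need onto one of the planner's lanes."""
    total_hours = task_hours * parallel_jobs
    if min_vram_gb >= 40 or total_hours >= 24:
        return "grant-scale"   # allocation/program territory
    if min_vram_gb >= 16 or total_hours >= 6:
        return "heavy"         # bigger VRAM or sustained remote compute
    if total_hours >= 1:
        return "session"       # notebook or credit-backed work
    return "burst"             # short runs, quick inference
```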

How Pages and MCP fit together

The project has two different surfaces:

  • GitHub Pages hosts the public project site and docs
  • the MCP server runs locally on the user's own machine

GitHub Pages cannot run the planner logic or host the Python MCP server. It is only the public website.

The actual MCP workflow is:

  1. a user installs free-gpu
  2. the user runs free-gpu-mcp locally
  3. their coding agent connects to that local MCP server
  4. the agent can call planner tools such as plan_provider_workflow

That means:

  • no hosting cost for you
  • no central backend to maintain
  • users keep control because the tool runs locally
  • any coding agent that supports local MCP servers can use it

This repository also supports an optional hosted HTTP deployment. If you deploy it on Vercel, the MCP endpoint is exposed at /mcp.

Install

python -m pip install free-gpu

Quick Start

All user-facing modes start from the same install:

python -m pip install free-gpu

TUI

Best when you want an interactive local browser for provider lanes, filters, and workflow summaries.

free-gpu ui

Remote MCP

Best when your MCP client supports remote HTTP MCP and you do not want to run the server yourself.

Endpoint:

https://free-gpu.vercel.app/mcp

Local MCP

Best when you want your coding agent to talk to free-gpu on your own machine over stdio.

free-gpu-mcp

Conceptual client config:

{
  "mcpServers": {
    "free-gpu": {
      "command": "free-gpu-mcp"
    }
  }
}

Terminal view

Best when you want direct terminal commands and scriptable output.

free-gpu local
free-gpu providers --workload inference --budget free
free-gpu plan --workload finetune-lora --model llama-3.1-8b --budget under-25 --task-hours 6 --min-vram-gb 16

For local development from the repository:

python -m pip install -e .

CLI

Local profile

free-gpu local
free-gpu local --ram-gb 32 --vram-gb 12

Provider ranking

free-gpu providers --workload inference --budget free
free-gpu providers --workload agent-loop --budget under-25 --task-hours 3 --parallel-jobs 4 --requires-api

Workflow planning

free-gpu plan --workload inference --model qwen2.5-coder-7b --ram-gb 32 --vram-gb 8
free-gpu plan --workload finetune-lora --model llama-3.1-8b --budget under-25 --task-hours 6 --min-vram-gb 16
free-gpu plan --workload scratch-train --budget grant --task-hours 24 --min-vram-gb 40

Useful planning flags:

  • --task-hours
  • --min-vram-gb
  • --parallel-jobs
  • --requires-api
  • --budget any|free|under-25|grant

Every command also accepts --json.
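The --json flag makes every command scriptable. A minimal sketch of post-processing that output; the payload below is invented for illustration, since the real output schema may differ:

```python
import json

# Invented sample of what `free-gpu providers --json` might emit;
# the real field names may differ.
raw = """
{
  "providers": [
    {"name": "alpha", "lane": "free", "max_vram_gb": 16},
    {"name": "beta", "lane": "under-25", "max_vram_gb": 24}
  ]
}
"""

data = json.loads(raw)
# Keep only providers that meet a VRAM floor, e.g. for a heavier fine-tune.
fits = [p["name"] for p in data["providers"] if p["max_vram_gb"] >= 24]
```

In practice you would pipe the command's stdout into `json.loads` (or `jq`) instead of the inline sample.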

Terminal UI

Run:

free-gpu ui

The TUI is inspired by llmfit's visual grammar, but it stays focused on provider planning:

  • a top system bar with local hardware context from llmfit
  • broad provider browsing by default
  • role, workload, budget, and payment filters
  • a central provider table
  • bottom panes for links, recommendation context, and workflow summary

Current budget options in the TUI:

  • Budget Any
  • Free
  • <25
  • Grant

llmfit integration

If llmfit is installed, free-gpu will try to use:

  • llmfit system --json
  • llmfit recommend -n N --json

The adapter uses structured JSON output rather than scraping terminal text. If llmfit is missing or parsing fails, free-gpu continues in provider-first mode and reports what it could not infer.

In that provider-first mode:

  • the planner still ranks providers and builds workflows normally
  • local machine characteristics are simply left undescribed
  • the result focuses on provider logic rather than pretending the local machine is inadequate

MCP server

Run:

free-gpu-mcp

The MCP server exposes tools for compute-aware planning, including:

  • plan_provider_workflow
  • rank_providers_for_task
  • assess_task_compute

It also exposes a small dataset summary resource:

  • providers://snapshot

What the MCP is for

The MCP lets an agent ask questions such as:

  • "Plan a cheap inference workflow for this task"
  • "Rank providers for a 6-hour fine-tune that needs about 16 GB VRAM"
  • "Does this task look like free-tier, credit-tier, or grant-scale work?"

Generic local MCP setup

If your coding agent supports local MCP servers over stdio, the setup is conceptually:

{
  "mcpServers": {
    "free-gpu": {
      "command": "free-gpu-mcp"
    }
  }
}

The exact config file depends on the agent, but the idea is the same: point the client at the local free-gpu-mcp command.

Hosted HTTP MCP on Vercel

This repository also includes a Vercel-friendly HTTP entrypoint via app.py.

When deployed on Vercel:

  • / returns a small service description
  • /health returns a simple health check
  • /mcp is the MCP endpoint to connect to
  • the live hosted endpoint for this repo is https://free-gpu.vercel.app/mcp

That means an MCP-capable client that supports remote HTTP MCP can connect to:

https://free-gpu.vercel.app/mcp

If you open /mcp in a browser, it may return a protocol-level error such as 406 Not Acceptable. That is expected: the route is meant for MCP clients, not normal browser navigation.

Example MCP-style request shape:

{
  "tool": "plan_provider_workflow",
  "arguments": {
    "workload": "agent-loop",
    "budget": "under-25",
    "task_hours": 3,
    "parallel_jobs": 4,
    "requires_api": true
  }
}
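The same request shape can be built programmatically before handing it to an MCP client. This mirrors the documented example only; it is not a full MCP client:

```python
import json

# Build the request shape shown above; the tool name and argument keys
# come from the documented example.
request = {
    "tool": "plan_provider_workflow",
    "arguments": {
        "workload": "agent-loop",
        "budget": "under-25",
        "task_hours": 3,
        "parallel_jobs": 4,
        "requires_api": True,
    },
}

payload = json.dumps(request)
```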

Example agent flow:

  1. You ask your coding agent: "I need to fine-tune an 8B model for about 6 hours and want to stay near free."
  2. The agent calls plan_provider_workflow.
  3. free-gpu estimates the compute lane.
  4. The agent gets back a structured plan with:
    • local vs remote recommendation
    • stage-by-stage workflow
    • top providers for that compute need
    • whether the task fits free tier, cheap credits, or a grant-style path

GitHub Pages

A project page is included in docs/index.html.

On GitHub, enable Pages and point it at:

  • Branch: main
  • Folder: /docs

Tests

Run:

python -m unittest tests.test_planner -v

Packaging and publishing

The project is structured so end users do not need to clone the repository.

After publishing to PyPI, users can install it with:

pip install free-gpu

To build distribution artifacts locally:

python -m pip install ".[publish]"
python -m build
python -m twine check dist/*

The repository also includes a GitHub Actions Trusted Publishing workflow for PyPI releases.

To publish manually with Twine as a fallback:

python -m twine upload dist/*
