Plan local and free-tier GPU workflows around llmfit and curated provider data.

free-gpu

free-gpu is a terminal-first planner for free and near-free compute.

It is designed to sit on top of llmfit:

  • llmfit answers: what models fit my local hardware?
  • free-gpu answers: given this workload and compute need, which providers should I use, and how should I split work across local plus remote stages?

The point is not to clone llmfit. The point is to use llmfit as the local-fit engine, then add provider filtering, role-aware ranking, and workflow planning around free, cheap, and grant-style compute.

What users actually get

free-gpu helps answer questions like:

  • "I need quick inference for a small coding task. Which free-tier provider is the least painful?"
  • "I need a few hours of GPU time for LoRA fine-tuning. Should I look at credits, trials, or a cloud free tier?"
  • "This task is too heavy for casual free tiers. Which grant or program lane should I think about instead?"
  • "What should stay local, and what should move to remote compute?"

What the repo includes

  • The original provider dataset in free_gpu/gpu_compute_database.csv
  • A Python CLI for provider ranking and workflow planning
  • A Textual TUI focused on provider selection rather than local model browsing
  • A small MCP server so external agents can ask for provider plans programmatically
  • A GitHub Pages-ready project page in docs/index.html

Core product rules

  • Role is a ranking lens, not a hard exclusion filter.
  • Budget buckets are semantic UX buckets, not literal accounting truth.
  • Grant-like providers behave like card-required options.
  • The planner should surface the right provider lane for the task instead of treating every task as the same generic ranking problem.
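The first rule above ("role is a ranking lens, not a hard exclusion filter") can be sketched as a score bonus rather than a filter. The provider fields and weights below are invented for illustration and are not free-gpu's real schema:

```python
# Illustrative sketch: role as a ranking lens, not a hard filter.
# Provider fields and weights here are hypothetical, not free-gpu's schema.

def rank_providers(providers, role):
    """Score every provider; matching the role boosts rank but never excludes."""
    def score(p):
        base = p["base_score"]
        # Role match is a bonus, so off-role providers still appear, just lower.
        bonus = 2.0 if role in p["roles"] else 0.0
        return base + bonus
    return sorted(providers, key=score, reverse=True)

providers = [
    {"name": "notebook-host", "base_score": 5.0, "roles": ["inference"]},
    {"name": "credit-cloud", "base_score": 4.0, "roles": ["finetune"]},
]

ranked = rank_providers(providers, role="finetune")
# Every provider survives the ranking; the role match only reorders them.
```

The key property is that `rank_providers` always returns the full list: a role mismatch lowers a provider's position instead of hiding it.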

Provider lanes

free-gpu is not only about "free" in the narrow sense. It plans across several practical lanes:

  • free tier: browser notebooks, starter quotas, short session access
  • under-25: credits, trials, starter plans, light paid-but-cheap access
  • grant: startup programs, research allocations, application-based access, and other heavier support paths

That matters because different tasks naturally fall into different lanes:

  • quick demos and lightweight inference often fit the free-tier lane
  • medium notebook work and moderate fine-tunes often fit the under-25 or credit lane
  • heavier training and long-running work often belong in the grant lane

Workflow logic

The planner estimates a compute lane from:

  • workload
  • model size
  • target VRAM
  • estimated task hours
  • parallel jobs
  • API needs

Then it schedules providers accordingly:

  • burst: short runs, quick inference, fast-start options
  • session: notebook or credit-backed work that lasts longer
  • heavy: bigger VRAM or sustained remote compute
  • grant-scale: tasks that look more like allocations, programs, or heavy research/startup support

Each workflow step carries its own compute summary, so a multi-stage plan can recommend different provider types for prep, fine-tune, eval, and serving.
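The lane estimate above can be pictured as a simple threshold heuristic. The cutoffs below are illustrative only and do not reflect free-gpu's actual planner logic:

```python
# Hypothetical lane-estimation heuristic; thresholds are illustrative
# and do not reflect free-gpu's actual planner.

def estimate_lane(task_hours: float, min_vram_gb: int, parallel_jobs: int = 1) -> str:
    """Map a compute need onto one of the planner's lanes."""
    total_hours = task_hours * parallel_jobs
    if min_vram_gb >= 40 or total_hours >= 24:
        return "grant-scale"   # allocation/program territory
    if min_vram_gb >= 16 or total_hours >= 6:
        return "heavy"         # bigger VRAM or sustained remote compute
    if total_hours >= 1:
        return "session"       # notebook or credit-backed work
    return "burst"             # short runs, quick inference
```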

How Pages and MCP fit together

The project has two different surfaces:

  • GitHub Pages hosts the public project site and docs
  • the MCP server runs locally on the user's own machine

GitHub Pages cannot run the planner logic or host the Python MCP server. It is only the public website.

The actual MCP workflow is:

  1. a user installs free-gpu
  2. the user runs free-gpu-mcp locally
  3. their coding agent connects to that local MCP server
  4. the agent can call planner tools such as plan_provider_workflow

That means:

  • no hosting cost for you
  • no central backend to maintain
  • users keep control because the tool runs locally
  • any coding agent that supports local MCP servers can use it

This repository also supports an optional hosted HTTP deployment. If you deploy it on Vercel, the MCP endpoint is exposed at /mcp.

Install

python -m pip install free-gpu

Quick Start

All user-facing modes start from the same install:

python -m pip install free-gpu

TUI

Best when you want an interactive local browser for provider lanes, filters, and workflow summaries.

free-gpu ui

Remote MCP

Best when your MCP client supports remote HTTP MCP and you do not want to run the server yourself.

Endpoint:

https://free-gpu.vercel.app/mcp

Local MCP

Best when you want your coding agent to talk to free-gpu on your own machine over stdio.

free-gpu-mcp

Conceptual client config:

{
  "mcpServers": {
    "free-gpu": {
      "command": "free-gpu-mcp"
    }
  }
}

Terminal view

Best when you want direct terminal commands and scriptable output.

free-gpu local
free-gpu providers --workload inference --budget free
free-gpu plan --workload finetune-lora --model llama-3.1-8b --budget under-25 --task-hours 6 --min-vram-gb 16

For local development from the repository:

python -m pip install -e .

CLI

Local profile

free-gpu local
free-gpu local --ram-gb 32 --vram-gb 12

Provider ranking

free-gpu providers --workload inference --budget free
free-gpu providers --workload agent-loop --budget under-25 --task-hours 3 --parallel-jobs 4 --requires-api

Workflow planning

free-gpu plan --workload inference --model qwen2.5-coder-7b --ram-gb 32 --vram-gb 8
free-gpu plan --workload finetune-lora --model llama-3.1-8b --budget under-25 --task-hours 6 --min-vram-gb 16
free-gpu plan --workload scratch-train --budget grant --task-hours 24 --min-vram-gb 40

Useful planning flags:

  • --task-hours
  • --min-vram-gb
  • --parallel-jobs
  • --requires-api
  • --budget any|free|under-25|grant

Every command also accepts --json.
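The --json flag makes every command scriptable. A minimal sketch of post-processing that output; the payload below is invented for illustration, since the real output schema may differ:

```python
import json

# Invented sample of what `free-gpu providers --json` might emit;
# the real field names may differ.
raw = """
{
  "providers": [
    {"name": "alpha", "lane": "free", "max_vram_gb": 16},
    {"name": "beta", "lane": "under-25", "max_vram_gb": 24}
  ]
}
"""

data = json.loads(raw)
# Keep only providers that meet a VRAM floor, e.g. for a heavier fine-tune.
fits = [p["name"] for p in data["providers"] if p["max_vram_gb"] >= 24]
```

In practice you would pipe the command's stdout into `json.loads` (or `jq`) instead of the inline sample.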

Terminal UI

Run:

free-gpu ui

The TUI is inspired by llmfit's visual grammar, but it stays focused on provider planning:

  • a top system bar with local hardware context from llmfit
  • broad provider browsing by default
  • role, workload, budget, and payment filters
  • a central provider table
  • bottom panes for links, recommendation context, and workflow summary

Current budget options in the TUI:

  • Budget Any
  • Free
  • <25
  • Grant

llmfit integration

If llmfit is installed, free-gpu will try to use:

  • llmfit system --json
  • llmfit recommend -n N --json

The adapter uses structured JSON output rather than scraping terminal text. If llmfit is missing or parsing fails, free-gpu continues in provider-first mode and reports what it could not infer.

In that provider-first mode:

  • the planner still ranks providers and builds workflows normally
  • local machine characteristics are simply left undescribed
  • the result focuses on provider logic rather than pretending the local machine is inadequate

MCP server

Run:

free-gpu-mcp

The MCP server exposes tools for compute-aware planning, including:

  • plan_provider_workflow
  • rank_providers_for_task
  • assess_task_compute

It also exposes a small dataset summary resource:

  • providers://snapshot

What the MCP is for

The MCP lets an agent ask questions such as:

  • "Plan a cheap inference workflow for this task"
  • "Rank providers for a 6-hour fine-tune that needs about 16 GB VRAM"
  • "Does this task look like free-tier, credit-tier, or grant-scale work?"

Generic local MCP setup

If your coding agent supports local MCP servers over stdio, the setup is conceptually:

{
  "mcpServers": {
    "free-gpu": {
      "command": "free-gpu-mcp"
    }
  }
}

The exact config file depends on the agent, but the idea is the same: point the client at the local free-gpu-mcp command.

Hosted HTTP MCP on Vercel

This repository also includes a Vercel-friendly HTTP entrypoint via app.py.

When deployed on Vercel:

  • / returns a small service description
  • /health returns a simple health check
  • /mcp is the MCP endpoint to connect to
  • the live hosted endpoint for this repo is https://free-gpu.vercel.app/mcp

That means an MCP-capable client that supports remote HTTP MCP can connect to:

https://free-gpu.vercel.app/mcp

If you open /mcp in a browser, it may return a protocol-level error such as 406 Not Acceptable. That is expected: the route is meant for MCP clients, not normal browser navigation.

Example MCP-style request shape:

{
  "tool": "plan_provider_workflow",
  "arguments": {
    "workload": "agent-loop",
    "budget": "under-25",
    "task_hours": 3,
    "parallel_jobs": 4,
    "requires_api": true
  }
}
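The same request shape can be built programmatically before handing it to an MCP client. This mirrors the documented example only; it is not a full MCP client:

```python
import json

# Build the request shape shown above; the tool name and argument keys
# come from the documented example.
request = {
    "tool": "plan_provider_workflow",
    "arguments": {
        "workload": "agent-loop",
        "budget": "under-25",
        "task_hours": 3,
        "parallel_jobs": 4,
        "requires_api": True,
    },
}

payload = json.dumps(request)
```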

Example agent flow:

  1. You ask your coding agent: "I need to fine-tune an 8B model for about 6 hours and want to stay near free."
  2. The agent calls plan_provider_workflow.
  3. free-gpu estimates the compute lane.
  4. The agent gets back a structured plan with:
    • local vs remote recommendation
    • stage-by-stage workflow
    • top providers for that compute need
    • whether the task fits free tier, cheap credits, or a grant-style path

GitHub Pages

A project page is included in docs/index.html.

On GitHub, enable Pages and point it at:

  • Branch: main
  • Folder: /docs

Tests

Run:

python -m unittest tests.test_planner -v

Packaging and publishing

The project is structured so end users do not need to clone the repository.

After publishing to PyPI, users can install it with:

pip install free-gpu

To build distribution artifacts locally:

python -m pip install ".[publish]"
python -m build
python -m twine check dist/*

The repository also includes a GitHub Actions Trusted Publishing workflow for PyPI releases.

To publish manually with Twine as a fallback:

python -m twine upload dist/*
