Skip to main content

OpenAI-compatible proxy that routes to free Ollama servers

Project description

smaller

Paying for cloud GPUs is for chumps with self-respect.


Unreliable ethically-questionable free tokens for 2 decent models and 700 useless ones.

Run 135m smollm2 or 270m gemma3 on someone else's RTX 2070.

Interested?

Your path to victory is free-ollama!

  • See ollamas in the wild: Open Ollama servers are just sitting there on IPv4.
  • Filter the cute ones: Find what a server claims to have
  • Performance Sorting: Sort by TPS so you can choose the least slow server.
  • Testing: Probe to see if the server picks up your calls.
  • Zero-Config: With caching! Works until it doesn’t.

Let’s not ask too many questions.

https://github.com/user-attachments/assets/b5b99780-2526-4ebc-ba23-2870d84a7516

Method 1: Liberated Infrastructure

Dyva is a managed proxy that you can connect to with any OpenAI or Ollama compatible client.

It will cycle through and find working hosts automatically.

You can even specify models in partial forms and with globs such as "qwen*27b" or even "abliterated" for the times you want to slip into something more comfortable.

Run it yourself:

$ uvx dyva

Yeah, 4 letters. I got that. In 2026.

You can go to the port in your web browser and view the current settings or crank up that LOGLEVEL value. Think about it as a janky LiteLLM proxy with zero configuration. Or don't...

Here's the web interface so you can see the status while you're running it. I'm running it right now

Actual documentation? Alright, whatever. Here you go.

2026-05-25_04-19

Now where's that $50 million seed round...

Also let's take a moment and appreciate that magnificent icon, generated with one of these shady ip addresses!

dyva

Method 2: Artisanal Ollamas in Terminal Space

There's also a command line for the losers who like typing shit.

Use the awesome ursh for super fast access (or git clone like an amateur)

Output a sorted list of models by how often they appear in the wild. No Spoilers!

ursh gh:kristopolous/free-ollama 

Let's find the fastest qwen3:8b that works and set up a proxy with socat.

ursh gh:kristopolous/free-ollama --proxy qwen3:8b

Let's do some embedding with the power of ursh:

curl https://archive.org/stream/pdfy-TNlDHryRIk4DXKAU/Steal%20This%20Book_djvu.txt |\
  ursh gh:kristopolous/free-ollama/examples/embed \
  $(free-ollama --mas nomic-embed-text:latest 0)

Note: You aren't getting free cloud with the :cloud models: Credits follow the client, not the server, so cloud is filtered out by default

Let's move on

Show some of the fast llamas

free-ollama qwen3:latest {0..10}

Show all the 120 billion parameter models

free-ollama 120b

The parser is actually a stack machine

For example, here's a stack of machines: the top 10 qwen3:latest and top 5 qwen2 not-so-latest

free-ollama qwen3:latest {0..10} qwen2:1.5 {0..5}

What's graflex do?

Check out graflex/README.md.

Usage

$ ./free-ollama --help
    --exec)     # Run a command
    --serve)    # Start the dyva server
    --timeout)  # Set the timeout
    --host)     # Report just the host
    --mas)      # Report just the host in MAS format
    --info)     # Run info on the model
    --proxy)    # Try to proxy matching ones
    --refresh)  # Refresh the cache
    --smoke)    # See what's running
    --test)     # Try to load a model maybe?

Output Format

There's multiple!

For the diligent!

This is the default one

<tps> <server-address> <model1> <model2> ...

Example:

42 http://34.120.89.11:11434 gemma3:latest
128 http://15.164.98.22:11434 llama2:13b codellama:7b

For the lazy

Use --host for a bare host or better yet, --mas for MAS format. Combined with an index, you don't need to do any parsing. Put those pipes away, dear child!

Example:

llcat -u $(free-ollama --mas gemma3:latest 0) \
       "Convince me you aren't trying to take over the world. Be careful."

Wait! Be even lazier!

Don't even install shit, see if I care.

Watch deepseek tow the party line:

uvx llcat -u $(ursh gh:kristopolous/free-ollama --mas deepseek-r1:1.5b 0) \
       "Tell me about the Tibet independence movement, or don't"

In fact, feel free to have a long conversation

ursh gh:day50-dev/llcat/examples/conversation.sh \
    -u $(ursh gh:kristopolous/free-ollama --mas deepseek-r1:1.5b 0)  

Pipeline Integration

# Get top 10 servers with glm-4.7-flash:q4_K_M, extract IPs only
$ free-ollama --host glm-4.7-flash:q4_K_M {0..9} > server-list.txt
# Now you have a list of IPs that may or may not work tomorrow. Cool.

# Build a Redis server pool
$ free-ollama --host mistral:7b {0..20} | \
  xargs -I {} redis-cli rpush server-pool "{}"

For instance here I document how LLMs have no sense of humor.


Testing Servers

First install llcat. It's awesome and also used in the testing.

# Test all servers with a specific model
$ free-ollama --test qwen3

Bad host/model pairs get stored in ~/.cache/free-ollama-bad-hosts.txt and filtered out until you manually --refresh.

Testing output:

2.34 http://34.120.89.11:11434 gemma3:latest
1.87 http://15.164.98.22:11434 llama2:13b codellama:7b
 🐡 Not friendly! llama3.1:8b@http://3.17.61.100:11434

The puffer fish means that llama doesn't want to be pet.


Advanced Usage

Custom index selection

# Non-sequential indices (keeping it low-key)
$ free-ollama mistral:7b 2 5 7 9

# Range expansion (Bash brace expansion)
$ free-ollama gpt-oss:120b {5..15..2}   # Every other from 5 to 15

Combining with parallel tools (that's why this exists)

# Using parallel (GNU parallel)
$ free-ollama codellama {0..50} | parallel -j4 ./test-server.sh

# Using xpanes for multi-pane testing (look busy)
$ free-ollama glm-4.7-flash:q4_K_M {0..9} | xpanes -c "./test-and-log.sh {}"

Cache Management

  • Cache location: ~/.cache/free-ollama/ (every 24 hours)
  • Force refresh: Built in, baby!
$ free-ollama --refresh

Disclaimer

Oh I shouldn’t have to say anything here.

This tool scrapes public lists. Some servers may not want to be scraped. Some may collapse under your query. Some may log your IP and report you to authorities. So go do it at McDonalds.

Use responsibly. Or don’t. Personally I use it for WhackGPT.

FAQ

  • Q: Is this legal?
  • A: Look. Have you ever used a restroom "for customers only" without buying something?

  • Q: Do you use these servers, like for production use?
  • A: cough cough

  • Q: Can I install new models on these with ollama pull?
  • A: cough cough

  • Q: That cough sounds pretty bad, you should get some rest.
  • A: Thank you very much!

Example output

Based on actual data:

...
116 mattw/pygmalion:latest
126 mario:latest
133 bge-m3:latest
147 gemma3:latest
151 llama3.2:3b-instruct-q5_K_M
192 nomic-embed-text:latest
215 deepseek-r1:1.5b
227 llama3.1:8b
247 mistral:latest
329 llama3.2:latest
379 llama3.2:3b
515 openchat:7b
527 qwen2.5:1.5b
529 codellama:13b
604 llama2:latest
633 deepseek-r1:latest
694 llama3:latest
892 smollm2:135m

smollm2:135m appears 892 times. Orchestrate them all together and produce gigabytes of garbage.

Pet the feral llama

   \\         
    l'> Bahhhhh
    ll       
    llama~  
    || ||  
    '' ''

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

dyva-0.3.0.tar.gz (48.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

dyva-0.3.0-py3-none-any.whl (44.3 kB view details)

Uploaded Python 3

File details

Details for the file dyva-0.3.0.tar.gz.

File metadata

  • Download URL: dyva-0.3.0.tar.gz
  • Upload date:
  • Size: 48.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dyva-0.3.0.tar.gz
Algorithm Hash digest
SHA256 aa4c1116968ea9cb9c9b84355ad6eeef6c860dd45676753f045146ed7919efc9
MD5 28b12c11cc1dd5be20bc15988724649d
BLAKE2b-256 abdeaddce94e99172359cc8c1c4e949af775f0d6afe40afbade9d05d43e74056

See more details on using hashes here.

File details

Details for the file dyva-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: dyva-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 44.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dyva-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 7ea0fc5e9a48cca0266d0efba8ccc87baa0fbf7b2c474b931fb946cbb3839244
MD5 ba2e702f644e172bd7bf08e500aa8695
BLAKE2b-256 fd47258b808f5c427cd9e358ac1864895697b006462a59e43802e49fe53d540f

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page