dyva · PyPI

OpenAI-compatible proxy that routes to free Ollama servers

Project description


Paying for cloud GPUs is for chumps with self-respect.

Unreliable, ethically questionable free tokens for 2 decent models and 700 useless ones.

Run 135m smollm2 or 270m gemma3 on someone else's RTX 2070.

Interested?

Your path to victory is free-ollama!

See ollamas in the wild: Open Ollama servers are just sitting there on IPv4.
Filter the cute ones: Find what a server claims to have.
Performance Sorting: Sort by TPS so you can choose the least slow server.
Testing: Probe to see if the server picks up your calls.
Zero-Config: With caching! Works until it doesn’t.

Let’s not ask too many questions.

https://github.com/user-attachments/assets/b5b99780-2526-4ebc-ba23-2870d84a7516

Method 1: Liberated Infrastructure

Dyva is a managed proxy that you can connect to with any OpenAI- or Ollama-compatible client.

It will cycle through and find working hosts automatically.

You can even specify models in partial forms and with globs such as "qwen*27b" or even "abliterated" for the times you want to slip into something more comfortable.
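The exact matching rules aren't spelled out here, so take this as a minimal sketch under one assumption: a pattern matches when it appears anywhere in the model name, with * acting as a wildcard. The helper name model_matches and the sample model names are made up for illustration; dyva's real resolver may behave differently.

```shell
# Hypothetical helper: succeeds when the pattern matches anywhere in the model name.
# Assumption: partial forms act like an unanchored glob. Not dyva's actual code.
model_matches() {
    pattern=$1
    model=$2
    case $model in
        *$pattern*) return 0 ;;   # unquoted on purpose so * stays a wildcard
        *) return 1 ;;
    esac
}

# Sample names for illustration only.
for model in qwen3.5:27b qwen3:8b huihui_ai/qwen3-abliterated:8b; do
    if model_matches 'qwen*27b' "$model"; then
        echo "qwen*27b matches $model"
    fi
    if model_matches 'abliterated' "$model"; then
        echo "abliterated matches $model"
    fi
done
```

Quote the pattern on the command line, otherwise your shell may try to expand the * against local filenames before dyva ever sees it.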

Run it yourself:

$ uvx dyva

Yeah, 4 letters. I got that. In 2026.

You can go to the port in your web browser and view the current settings or crank up that LOGLEVEL value. Think of it as a janky LiteLLM proxy with zero configuration. Or don't...

Here's the web interface so you can see the status while you're running it. I'm running it right now.

Actual documentation? Alright, whatever. Here you go.

Now where's that $50 million seed round...

Also, let's take a moment and appreciate that magnificent icon, generated with one of these shady IP addresses!

Method 2: Artisanal Ollamas in Terminal Space

There's also a command line for the losers who like typing shit.

Use the awesome ursh for super fast access (or git clone like an amateur).

Output a sorted list of models by how often they appear in the wild. No Spoilers!

ursh gh:kristopolous/free-ollama

Let's find the fastest qwen3:8b that works and set up a proxy with socat.

ursh gh:kristopolous/free-ollama --proxy qwen3:8b

Let's do some embedding with the power of ursh:

curl https://archive.org/stream/pdfy-TNlDHryRIk4DXKAU/Steal%20This%20Book_djvu.txt |\
  ursh gh:kristopolous/free-ollama/examples/embed \
  $(free-ollama --mas nomic-embed-text:latest 0)

Note: You aren't getting free cloud with the :cloud models. Credits follow the client, not the server, so cloud models are filtered out by default.

Let's move on

Show some of the fast llamas

free-ollama qwen3:latest {0..10}

Show all the 120 billion parameter models

free-ollama 120b

The parser is actually a stack machine

For example, here's a stack of machines: the top 10 qwen3:latest and top 5 qwen2 not-so-latest (indices start at 0)

free-ollama qwen3:latest {0..9} qwen2:1.5 {0..4}
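One way to picture the argument walk, as a sketch and not free-ollama's actual parser: a non-numeric argument becomes the current model, and every integer after it selects that index for the current model. The function name parse_args is hypothetical, and a single "current model" variable stands in for the stack here.

```shell
# Hypothetical sketch of the argument walk. Not the real parser.
# Non-numeric argument: becomes the current model.
# All-digit argument: emits "<current model> <index>".
parse_args() {
    current=
    for arg in "$@"; do
        case $arg in
            ''|*[!0-9]*) current=$arg ;;
            *) echo "$current $arg" ;;
        esac
    done
}

parse_args qwen3:latest 0 1 2 qwen2:1.5 0 1
```

Each output line pairs a model with one index, which is why mixing several models and ranges on one command line works without any extra separators.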

What's graflex do?

Check out graflex/README.md.

Usage

$ ./free-ollama --help
    --exec)     # Run a command
    --serve)    # Start the dyva server
    --timeout)  # Set the timeout
    --host)     # Report just the host
    --mas)      # Report just the host in MAS format
    --info)     # Run info on the model
    --proxy)    # Try to proxy matching ones
    --refresh)  # Refresh the cache
    --smoke)    # See what's running
    --test)     # Try to load a model maybe?

Output Format

There's multiple!

For the diligent!

This is the default one

<tps> <server-address> <model1> <model2> ...

Example:

42 http://34.120.89.11:11434 gemma3:latest
128 http://15.164.98.22:11434 llama2:13b codellama:7b
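If you do want to slice the default format yourself, it is whitespace-delimited, so sort and awk are enough. A small sketch using the two sample lines above; the function names fastest_host and host_model_pairs are made up for this example.

```shell
# Sample lines copied from the example above.
sample='42 http://34.120.89.11:11434 gemma3:latest
128 http://15.164.98.22:11434 llama2:13b codellama:7b'

# Highest TPS first: numeric descending sort on field 1, then print the host.
fastest_host() {
    sort -rn -k1,1 | awk 'NR == 1 { print $2 }'
}

# One "host model" pair per line, handy for loops or xargs.
host_model_pairs() {
    awk '{ for (i = 3; i <= NF; i++) print $2, $i }'
}

printf '%s\n' "$sample" | fastest_host
printf '%s\n' "$sample" | host_model_pairs
```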

For the lazy

Use --host for a bare host or better yet, --mas for MAS format. Combined with an index, you don't need to do any parsing. Put those pipes away, dear child!

Example:

llcat -u $(free-ollama --mas gemma3:latest 0) \
       "Convince me you aren't trying to take over the world. Be careful."

Wait! Be even lazier!

Don't even install shit, see if I care.

Watch deepseek toe the party line:

uvx llcat -u $(ursh gh:kristopolous/free-ollama --mas deepseek-r1:1.5b 0) \
       "Tell me about the Tibet independence movement, or don't"

In fact, feel free to have a long conversation

ursh gh:day50-dev/llcat/examples/conversation.sh \
    -u $(ursh gh:kristopolous/free-ollama --mas deepseek-r1:1.5b 0)

Pipeline Integration

# Get top 10 servers with glm-4.7-flash:q4_K_M, extract IPs only
$ free-ollama --host glm-4.7-flash:q4_K_M {0..9} > server-list.txt
# Now you have a list of IPs that may or may not work tomorrow. Cool.

# Build a Redis server pool
$ free-ollama --host mistral:7b {0..20} | \
  xargs -I {} redis-cli rpush server-pool "{}"

For instance, here I document how LLMs have no sense of humor.

Testing Servers

First install llcat. It's awesome and also used in the testing.

# Test all servers with a specific model
$ free-ollama --test qwen3

Bad host/model pairs get stored in ~/.cache/free-ollama-bad-hosts.txt and filtered out until you manually --refresh.

Testing output:

2.34 http://34.120.89.11:11434 gemma3:latest
1.87 http://15.164.98.22:11434 llama2:13b codellama:7b
 🐡 Not friendly! llama3.1:8b@http://3.17.61.100:11434

The puffer fish means that llama doesn't want to be pet.

Advanced Usage

Custom index selection

# Non-sequential indices (keeping it low-key)
$ free-ollama mistral:7b 2 5 7 9

# Range expansion (Bash brace expansion)
$ free-ollama gpt-oss:120b {5..15..2}   # Every other from 5 to 15
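The braces are expanded by the shell, not by free-ollama, so the tool only ever sees plain integers. A quick way to check what a range turns into before you fire it off (stepped ranges need Bash 4 or newer; the seq line is a stand-in for shells that lack them and assumes seq is installed):

```shell
# Bash expands the braces before the command runs.
# bash -c keeps this working no matter which shell you are typing in.
bash -c 'echo {5..15..2}'   # 5 7 9 11 13 15

# Same indices without brace expansion (assumes seq is available).
echo $(seq 5 2 15)          # 5 7 9 11 13 15
```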

Combining with parallel tools (that's why this exists)

# Using parallel (GNU parallel)
$ free-ollama codellama {0..50} | parallel -j4 ./test-server.sh

# Using xpanes for multi-pane testing (look busy)
$ free-ollama glm-4.7-flash:q4_K_M {0..9} | xpanes -c "./test-and-log.sh {}"

Cache Management

Cache location: ~/.cache/free-ollama/ (refreshed every 24 hours)
Force refresh: Built in, baby!

$ free-ollama --refresh

Disclaimer

Oh I shouldn’t have to say anything here.

This tool scrapes public lists. Some servers may not want to be scraped. Some may collapse under your query. Some may log your IP and report you to authorities. So go do it at McDonald's.

Use responsibly. Or don’t. Personally I use it for WhackGPT.

FAQ

Q: Is this legal?
A: Look. Have you ever used a restroom "for customers only" without buying something?

Q: Do you use these servers, like for production use?
A: cough cough

Q: Can I install new models on these with ollama pull?
A: cough cough

Q: That cough sounds pretty bad, you should get some rest.
A: Thank you very much!

Example output

Based on actual data:

...
116 mattw/pygmalion:latest
126 mario:latest
133 bge-m3:latest
147 gemma3:latest
151 llama3.2:3b-instruct-q5_K_M
192 nomic-embed-text:latest
215 deepseek-r1:1.5b
227 llama3.1:8b
247 mistral:latest
329 llama3.2:latest
379 llama3.2:3b
515 openchat:7b
527 qwen2.5:1.5b
529 codellama:13b
604 llama2:latest
633 deepseek-r1:latest
694 llama3:latest
892 smollm2:135m

smollm2:135m appears 892 times. Orchestrate them all together and produce gigabytes of garbage.
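A tally in this shape can be rebuilt from the default output format with standard tools. This is a sketch, assuming input lines look like <tps> <server-address> <model1> <model2> ...; the function name count_models is hypothetical, and it is not necessarily how free-ollama produces its own list.

```shell
# Hypothetical helper: count how many servers list each model.
# Reads default-format lines on stdin, prints "<count> <model>", rarest first.
count_models() {
    awk '{ for (i = 3; i <= NF; i++) print $i }' |
        sort | uniq -c | sort -n |
        awk '{ print $1, $2 }'   # strip the padding uniq -c adds
}

# Two made-up servers for illustration only.
printf '%s\n' \
    '42 http://192.0.2.10:11434 gemma3:latest smollm2:135m' \
    '128 http://192.0.2.20:11434 smollm2:135m' |
    count_models
```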

Pet the feral llama

   \\         
    l'> Bahhhhh
    ll       
    llama~  
    || ||  
    '' ''

Project details

Release history

0.3.0 (this version): Jun 14, 2026
0.2.5: Jun 11, 2026
0.2.4: Jun 5, 2026
0.2.3: Jun 1, 2026
0.2.2: Jun 1, 2026
0.2.1: Jun 1, 2026
0.2.0: Jun 1, 2026
0.1.2: May 31, 2026
0.1.1: May 29, 2026
0.1.0: May 29, 2026

Download files


Source Distribution

dyva-0.3.0.tar.gz (48.8 kB)

Uploaded Jun 14, 2026 Source

Built Distribution


dyva-0.3.0-py3-none-any.whl (44.3 kB)

Uploaded Jun 14, 2026 Python 3

File details

Details for the file dyva-0.3.0.tar.gz.

File metadata

Download URL: dyva-0.3.0.tar.gz
Upload date: Jun 14, 2026
Size: 48.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dyva-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`aa4c1116968ea9cb9c9b84355ad6eeef6c860dd45676753f045146ed7919efc9`
MD5	`28b12c11cc1dd5be20bc15988724649d`
BLAKE2b-256	`abdeaddce94e99172359cc8c1c4e949af775f0d6afe40afbade9d05d43e74056`


File details

Details for the file dyva-0.3.0-py3-none-any.whl.

File metadata

Download URL: dyva-0.3.0-py3-none-any.whl
Upload date: Jun 14, 2026
Size: 44.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dyva-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`7ea0fc5e9a48cca0266d0efba8ccc87baa0fbf7b2c474b931fb946cbb3839244`
MD5	`ba2e702f644e172bd7bf08e500aa8695`
BLAKE2b-256	`fd47258b808f5c427cd9e358ac1864895697b006462a59e43802e49fe53d540f`

