The simplest way to chat with Llama-3.3-70B for free
FreeLlama
The simplest Python client for free Llama-3.3-70B access via public proxies
freellama is a lightweight, easy-to-use Python package that gives you instant access to the powerful Llama-3.3-70B-Instruct model — completely free, no API key, no registration required.
It works by communicating directly with public LLM web interfaces, delivering high-quality AI responses with minimal setup.
Features
- Zero setup — no accounts, no keys
- Simple `.ask("your message")` interface
- Quality modes: `fast` (default, quicker) or `best` (higher quality)
- Optional conversation memory (sends full history when `limit` is enabled)
- Per-conversation message limit with automatic reset
- Streaming support (token-by-token output)
- Clean and intuitive CLI
- Minimal dependencies (only `requests`)
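To make the "per-conversation message limit with automatic reset" behavior concrete, here is a minimal sketch of how such a buffer could work. This is a hypothetical stand-in written for illustration; freellama's actual internals may differ, and the `MemoryBuffer` class is not part of the package.

```python
# Hypothetical sketch of limit-based memory with automatic reset.
# Only the documented behavior is modeled: keep history until the
# cap on user messages is reached, then start a fresh conversation.
class MemoryBuffer:
    def __init__(self, limit=None):
        self.limit = limit      # max user messages before reset
        self.history = []       # list of {"role", "content"} dicts
        self.user_count = 0

    def add_user(self, text):
        # Automatic reset once the cap is hit.
        if self.limit is not None and self.user_count >= self.limit:
            self.reset()
        self.history.append({"role": "user", "content": text})
        self.user_count += 1

    def add_assistant(self, text):
        self.history.append({"role": "assistant", "content": text})

    def reset(self):
        self.history = []
        self.user_count = 0

buf = MemoryBuffer(limit=2)
buf.add_user("My name is Alice")
buf.add_assistant("Hi Alice!")
buf.add_user("What is my name?")
buf.add_user("New topic")        # third user message: buffer resets first
print(len(buf.history))          # 1
```

The key design point is that the reset happens lazily, on the next user message after the cap, so the final exchange of a conversation is never silently truncated mid-turn.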
Installation
pip install freellama
Requires Python 3.8+
Quick Start
Programmatic Use
from freellama import FreeLlama
# One-shot query (fast mode by default)
print(FreeLlama().ask("Tell me a joke"))
# Best quality mode
bot = FreeLlama(mode="best")
print(bot.ask("Write a detailed poem about the ocean"))
# With memory + limit
bot = FreeLlama(mode="fast", limit=20)
bot.ask("My name is Alice")
print(bot.ask("What is my name?"))
Interactive Chat (CLI)
freellama # fast mode, no memory
freellama --mode best # higher quality responses
freellama --limit 15 # enable memory (up to 15 messages)
freellama --stream # token-by-token streaming
freellama --mode best --limit 20 --stream # all features combined
freellama "Hello world!" # one-shot message
Usage Examples
# High-quality persistent chat
bot = FreeLlama(mode="best", limit=10)
bot.ask("Explain quantum entanglement in detail")
bot.ask("Now give a real-world analogy")
bot.ask("Make it even simpler")
# Fast stateless queries
questions = ["Capital of France?", "2+2?", "Best pizza topping?"]
for q in questions:
    print(FreeLlama(mode="fast").ask(q))
CLI Options
freellama --help
usage: freellama [-h] [--mode {fast,best}] [--limit LIMIT] [--stream] [message]
FreeLlama - Free Llama-3.3-70B Chat Client
positional arguments:
message Send a single message and exit
options:
  -h, --help          show this help message and exit
--mode {fast,best} Quality mode: fast (default) or best
--limit LIMIT Enable memory: max user messages before conversation reset
--stream Show response token-by-token (streaming)
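The option surface above maps directly onto a small argparse parser. The following is a sketch of an equivalent command-line interface, not the package's actual entry-point source:

```python
import argparse

# Sketch of a parser matching the documented CLI options;
# freellama's real entry point may be structured differently.
def build_parser():
    p = argparse.ArgumentParser(
        prog="freellama",
        description="FreeLlama - Free Llama-3.3-70B Chat Client",
    )
    p.add_argument("message", nargs="?",
                   help="Send a single message and exit")
    p.add_argument("--mode", choices=["fast", "best"], default="fast",
                   help="Quality mode: fast (default) or best")
    p.add_argument("--limit", type=int,
                   help="Enable memory: max user messages before reset")
    p.add_argument("--stream", action="store_true",
                   help="Show response token-by-token (streaming)")
    return p

args = build_parser().parse_args(["--mode", "best", "--limit", "20", "Hello"])
print(args.mode, args.limit, args.message)  # best 20 Hello
```

Note that `nargs="?"` makes the positional message optional, which is what lets the same command serve both one-shot queries and interactive chat.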
Important Note on Memory
The underlying public services are free anonymous proxies and do not officially guarantee multi-turn conversation memory.
When you set --limit or limit=N, FreeLlama sends the full conversation history with every request — this is the correct format and gives the best possible chance of context retention.
Memory works best for short conversations and may vary depending on backend routing and load.
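Sending the full history on every turn, which is what `limit` enables, amounts to building a request payload like the one below. The field names (`model`, `messages`, `role`, `content`) follow the common chat-completion convention and are illustrative only; they are not necessarily freellama's real wire format.

```python
# Illustrative only: demonstrates the "resend everything" pattern,
# not freellama's actual request schema.
def build_payload(history, new_message):
    messages = history + [{"role": "user", "content": new_message}]
    return {"model": "llama-3.3-70b-instruct", "messages": messages}

history = [
    {"role": "user", "content": "My name is Alice"},
    {"role": "assistant", "content": "Nice to meet you, Alice!"},
]
payload = build_payload(history, "What is my name?")
print(len(payload["messages"]))  # 3
```

Because the whole transcript travels with each request, context retention does not depend on the backend remembering anything between calls, which is why this format gives the best chance of coherent multi-turn answers over anonymous proxies.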
Author
IMApurbo
GitHub: @IMApurbo
License
MIT License — free to use, modify, and distribute.
Enjoy frontier-level AI for free — no barriers, no costs! 🚀
Made with ❤️ by IMApurbo
File details
Details for the file freellama-1.0.1-py3-none-any.whl.
File metadata
- Download URL: freellama-1.0.1-py3-none-any.whl
- Upload date:
- Size: 6.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `0471055d8e8fe061db05550acbc16e0465147168bd5b011708bfdf741a140d30` |
| MD5 | `268dd1a1ed1fb0062b7f25408e4c307e` |
| BLAKE2b-256 | `4a3302630916e9286babf7747310f0e7a609f1104a60687c1e8cc7b93dd225b6` |