

FreeLlama

The simplest Python client for free Llama-3.3-70B access via public proxies

freellama is a lightweight, easy-to-use Python package that gives you instant access to the powerful Llama-3.3-70B-Instruct model — completely free, no API key, no registration required.

It works by communicating directly with public LLM web interfaces, delivering high-quality AI responses with minimal setup.

Features

  • Zero setup — no accounts, no keys
  • Simple .ask("your message") interface
  • Quality modes: fast (default, lower latency) or best (higher quality)
  • Optional conversation memory (sends full history when limit is enabled)
  • Per-conversation message limit with automatic reset
  • Streaming support (token-by-token output)
  • Clean and intuitive CLI
  • Minimal dependencies (only requests)

Installation

pip install freellama

Requires Python 3.8+

Quick Start

Programmatic Use

from freellama import FreeLlama

# One-shot query (fast mode by default)
print(FreeLlama().ask("Tell me a joke"))

# Best quality mode
bot = FreeLlama(mode="best")
print(bot.ask("Write a detailed poem about the ocean"))

# With memory + limit
bot = FreeLlama(mode="fast", limit=20)
bot.ask("My name is Alice")
print(bot.ask("What is my name?")) 

Interactive Chat (CLI)

freellama                    # fast mode, no memory
freellama --mode best        # higher quality responses
freellama --limit 15         # enable memory (up to 15 messages)
freellama --stream           # token-by-token streaming
freellama --mode best --limit 20 --stream   # all features combined
freellama "Hello world!"     # one-shot message

Usage Examples

from freellama import FreeLlama

# High-quality persistent chat
bot = FreeLlama(mode="best", limit=10)
bot.ask("Explain quantum entanglement in detail")
bot.ask("Now give a real-world analogy")
bot.ask("Make it even simpler")

# Fast stateless queries
questions = ["Capital of France?", "2+2?", "Best pizza topping?"]
for q in questions:
    print(FreeLlama(mode="fast").ask(q))
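
The limit parameter also drives the automatic reset mentioned in the feature list: once the cap on user messages is reached, the conversation starts over. Here is a minimal sketch of that behavior using only the documented constructor and ask(); the exact reset semantics are an assumption based on the --limit help text.

from freellama import FreeLlama

# limit=3 keeps at most three user messages of context; per the
# feature list, the conversation resets automatically after that.
bot = FreeLlama(mode="fast", limit=3)
bot.ask("My favorite color is green")   # user message 1
bot.ask("I live in Berlin")             # user message 2
bot.ask("What's my favorite color?")    # user message 3: context available
print(bot.ask("Where do I live?"))      # message 4: history may have reset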

CLI Options

freellama --help
usage: freellama [-h] [--mode {fast,best}] [--limit LIMIT] [--stream] [message]

FreeLlama - Free Llama-3.3-70B Chat Client

positional arguments:
  message               Send a single message and exit

options:
  -h, --help            show this help message
  --mode {fast,best}    Quality mode: fast (default) or best
  --limit LIMIT         Enable memory: max user messages before conversation reset
  --stream              Show response token-by-token (streaming)

Important Note on Memory

The underlying public services are free anonymous proxies and do not officially guarantee multi-turn conversation memory.

When you set --limit or limit=N, FreeLlama sends the full conversation history with every request; this matches the standard chat format and gives the best possible chance of context retention.

Memory works best for short conversations and may vary depending on backend routing and load.
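
For intuition, "sends the full conversation history" means every request carries all prior turns, not just the newest message. The sketch below shows the general pattern; the role/content field names follow the common chat-completion convention and are an assumption, not FreeLlama's confirmed wire format.

# Hypothetical illustration of history accumulation when a limit is set.
# "role"/"content" follow the common chat-message convention; FreeLlama's
# actual request format is not documented in this README.
history = []

def ask_with_history(user_message, send):
    history.append({"role": "user", "content": user_message})
    reply = send(history)  # the FULL history list goes out on every call
    history.append({"role": "assistant", "content": reply})
    return reply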

Author

IMApurbo
GitHub: @IMApurbo

License

MIT License — free to use, modify, and distribute.


Enjoy frontier-level AI for free — no barriers, no costs! 🚀

Made with ❤️ by IMApurbo

Download files

Download the file for your platform.

Source Distributions

No source distribution files are available for this release.

Built Distribution


freellama-1.0.1-py3-none-any.whl (6.9 kB, Python 3)

File details

Details for the file freellama-1.0.1-py3-none-any.whl.

File metadata

  • Download URL: freellama-1.0.1-py3-none-any.whl
  • Upload date:
  • Size: 6.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.5

File hashes

Hashes for freellama-1.0.1-py3-none-any.whl:

  • SHA256: 0471055d8e8fe061db05550acbc16e0465147168bd5b011708bfdf741a140d30
  • MD5: 268dd1a1ed1fb0062b7f25408e4c307e
  • BLAKE2b-256: 4a3302630916e9286babf7747310f0e7a609f1104a60687c1e8cc7b93dd225b6

