FreeLlama

The simplest Python client for free Llama-3.3-70B access via public proxies

freellama is a lightweight, easy-to-use Python package that gives you instant access to the Llama-3.3-70B-Instruct model, completely free: no API key and no registration required.

It works by communicating directly with public LLM web interfaces, making high-quality AI available to everyone with just a few lines of code.

Features

  • Zero setup — no accounts, no keys
  • Simple .ask("your message") interface
  • Optional conversation memory (sends full history when limit is enabled)
  • Per-conversation message limit with automatic reset
  • Clean CLI for interactive chatting
  • Streaming support
  • Minimal dependencies (only requests)

Installation

pip install freellama

Requires Python 3.8+

Quick Start

Programmatic Use

from freellama import FreeLlama

# One-shot queries (no memory)
print(FreeLlama().ask("Tell me a joke"))

# With conversation memory
bot = FreeLlama(limit=20)  # remembers up to 20 user messages
bot.ask("My name is Alice")
print(bot.ask("What is my name?"))  # → Should recall "Alice" if backend supports it

Interactive Chat (CLI)

freellama                 # starts interactive chat
freellama --limit 15      # enables memory, resets after 15 messages
freellama "Hello!"        # one-shot message
freellama --stream        # streaming output

Usage Examples

# Persistent conversation
bot = FreeLlama(limit=10)
bot.ask("Explain quantum entanglement")
bot.ask("Now give a real-world analogy")
bot.ask("Make it simpler")
# ... continues with context until limit reached

# Quick stateless queries
questions = ["Capital of Japan?", "2+2?", "Best pizza topping?"]
for q in questions:
    print(FreeLlama().ask(q))

CLI Options

freellama --help
usage: freellama [-h] [--limit LIMIT] [--stream] [message]

FreeLlama - Free Llama-3.3-70B Chat Client

positional arguments:
  message         Send a single message and exit

options:
  -h, --help      show this help message
  --limit LIMIT   Enable memory: max user messages before conversation reset
  --stream        Show response token-by-token

Important Note on Memory

The underlying public services are free anonymous proxies and do not officially guarantee multi-turn conversation memory.

When you set --limit or limit=N, FreeLlama sends the full conversation history with every request. This is the standard multi-turn chat format and gives the best possible chance of context retention.

Memory works best for short conversations and may vary depending on the backend load and model routing.
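The limit-and-reset behavior described above can be sketched in plain Python. This is an illustrative model of the pattern, not freellama's actual source: the `HistorySketch` class and its placeholder reply are invented for the example, and the real client makes a network request where the placeholder stands.

```python
# Illustrative sketch of limit-based history management (not FreeLlama's
# actual internals): each ask() appends to the history, the full history
# is included with every request, and the history resets automatically
# once the number of user messages reaches the limit.

class HistorySketch:
    def __init__(self, limit=None):
        self.limit = limit
        self.history = []  # list of {"role": ..., "content": ...} dicts

    def ask(self, message):
        if self.limit is None:
            # No memory: each request carries only the new message.
            payload = [{"role": "user", "content": message}]
        else:
            self.history.append({"role": "user", "content": message})
            payload = list(self.history)  # full history goes to the backend
        reply = f"(reply to: {message})"  # placeholder for the network call
        if self.limit is not None:
            self.history.append({"role": "assistant", "content": reply})
            user_turns = sum(1 for m in self.history if m["role"] == "user")
            if user_turns >= self.limit:
                self.history = []  # automatic reset after `limit` user messages
        return payload, reply

bot = HistorySketch(limit=2)
payload1, _ = bot.ask("My name is Alice")
payload2, _ = bot.ask("What is my name?")
print(len(payload1))  # 1: first request carries one message
print(len(payload2))  # 3: second request carries the full history
print(bot.history)    # []: history was reset after 2 user messages
```

Because the full payload is rebuilt on every call, the backend sees the whole conversation each time; once the limit trips, the next request starts from a clean slate.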

Author

IMApurbo
GitHub: @IMApurbo

License

MIT License — free to use, modify, and distribute.


Enjoy frontier-level AI for free — no barriers, no costs! 🚀

Made with ❤️ by IMApurbo
