gpt-client-lite

Minimal OpenAI GPT client for Python — chat, embeddings, function calling, and streaming with zero heavy dependencies.

Built entirely on the Python standard library (urllib, json, dataclasses). Works on Python 3.9 and above.

Install

pip install gpt-client-lite

Quick start

from gpt_client_lite import GPTClient

client = GPTClient()              # reads OPENAI_API_KEY from environment
# client = GPTClient("sk-...")   # or pass the key directly

Chat

Single-turn prompt

reply = client.chat.simple("What is the capital of Japan?")
print(reply)  # "Tokyo"

With a system prompt

reply = client.chat.simple(
    "Tell me a joke.",
    system="You are a stand-up comedian who only tells puns.",
)

Full response object

from gpt_client_lite import Message

resp = client.chat.complete(
    [Message.user("Explain black holes in one sentence.")],
    model="gpt-4o",
    temperature=0.7,
    max_tokens=120,
)

print(resp.text)          # assistant reply
print(resp.usage)         # Usage(prompt=..., completion=..., total=...)
print(resp.finish_reason) # "stop"

JSON mode

data = client.chat.json("List 5 world capitals as a JSON array of strings.")
# data is already a parsed Python object: ["Tokyo", "Paris", ...]

Summarise / translate helpers

short = client.chat.summarise(long_article, max_words=80)
es    = client.chat.translate("Good morning, have a great day!", "Spanish")

Multi-turn conversation

conv = client.chat.conversation(system="You are a helpful travel guide.")

print(conv.say("I'm visiting Tokyo next month."))
print(conv.say("What neighbourhood should I stay in?"))
print(conv.say("What's the food like there?"))

# Inspect history
for msg in conv.history:
    print(f"[{msg.role}] {msg.content[:60]}")

conv.clear()  # reset (keeps system message by default)
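A conversation object like the one above presumably just accumulates the message history and resends it on every turn. The sketch below illustrates that shape; `MiniConversation` and its `_send` stub are made up for illustration and are not the library's actual class.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Msg:
    role: str
    content: str

class MiniConversation:
    """Toy stand-in for a multi-turn session; _send is a stub."""

    def __init__(self, system: Optional[str] = None):
        self.history: List[Msg] = []
        if system:
            self.history.append(Msg("system", system))

    def _send(self, history: List[Msg]) -> str:
        # Stub: a real client would POST the whole history to the API here.
        return "(reply to: " + history[-1].content + ")"

    def say(self, text: str) -> str:
        self.history.append(Msg("user", text))
        reply = self._send(self.history)
        self.history.append(Msg("assistant", reply))
        return reply

    def clear(self, keep_system: bool = True) -> None:
        self.history = [m for m in self.history if keep_system and m.role == "system"]

conv = MiniConversation(system="You are a travel guide.")
conv.say("I'm visiting Tokyo.")
print(len(conv.history))  # 3: system + user + assistant
```

The key design point is that the model is stateless: each `say()` call sends the entire history, which is why `clear()` exists to cap growth.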

Streaming

# Print tokens to stdout as they arrive
client.streaming.simple("Write a haiku about the ocean").print_stream()

# Iterate manually
for chunk in client.streaming.simple("Count from 1 to 5."):
    print(chunk.text, end="", flush=True)

# Callback style
full_text = client.streaming.simple("Tell me a story").on_token(
    lambda token: print(token, end="", flush=True)
)

# Collect the whole text at once (buffers internally)
text = client.streaming.simple("Explain recursion.").text

Streaming from a messages list

from gpt_client_lite import Message

msgs = [
    Message.system("You are a Python expert."),
    Message.user("Show me a quick sort implementation."),
]

stream = client.streaming.chat(msgs, temperature=0.2)
stream.print_stream()
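Under the hood, OpenAI streaming responses arrive as Server-Sent Events: each `data:` line carries a JSON chunk whose `delta` holds an incremental text fragment, and the stream ends with `data: [DONE]`. A stdlib-only sketch of extracting the text, using an illustrative sample payload (this is a sketch of the wire format, not the library's internals):

```python
import json

def iter_stream_text(lines):
    """Yield text fragments from the 'data:' lines of a chat stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # SSE comments and blank keep-alive lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            yield delta["content"]

sample = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))  # Hello
```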

Embeddings

# Embed a single text
vec = client.embeddings.embed("The quick brown fox")
print(len(vec))  # 1536 for text-embedding-3-small

# Embed multiple texts (one API call)
vecs = client.embeddings.embed_batch(["cat", "dog", "car", "truck"])

# Semantic similarity (0 – 1)
score = client.embeddings.similarity("puppy", "dog")
print(f"{score:.3f}")  # ~0.92

# Find most similar
results = client.embeddings.most_similar(
    "machine learning",
    ["deep learning", "pasta cooking", "neural networks", "astronomy"],
    top_k=2,
)
for text, score in results:
    print(f"{score:.3f}  {text}")

# Rank documents by relevance
ranked = client.embeddings.rank("Python async programming", documents)
for idx, text, score in ranked:
    print(f"[{idx}] {score:.3f}  {text[:60]}")

# Cluster texts (pure-Python k-means, no numpy)
groups = client.embeddings.cluster(sentences, n_clusters=3)
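The similarity, ranking, and clustering helpers above presumably all reduce to cosine similarity between embedding vectors, which needs nothing beyond the standard library. A minimal sketch with made-up three-dimensional vectors (real embeddings have 1536+ dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vec, labeled_vecs, top_k=2):
    """Sort (label, vector) pairs by similarity to query_vec, best first."""
    scored = [(label, cosine_similarity(query_vec, v)) for label, v in labeled_vecs]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]

q = [1.0, 0.0, 1.0]
candidates = [("close", [0.9, 0.1, 1.1]), ("far", [-1.0, 1.0, 0.0])]
for label, score in rank_by_similarity(q, candidates):
    print(f"{score:.3f}  {label}")
```

Note that raw cosine ranges over [-1, 1]; a library reporting 0–1 scores presumably rescales or clamps, which is omitted here.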

Function / Tool Calling

import json
from gpt_client_lite import Message

# Register functions — schema is inferred from type hints + docstring
@client.functions.register
def get_weather(location: str, unit: str = "celsius") -> str:
    """Get the current weather for a location.

    Args:
        location: City name or address.
        unit: Temperature unit — celsius or fahrenheit.
    """
    # Your real implementation here
    return json.dumps({"location": location, "temperature": 22, "unit": unit})


@client.functions.register
def search_web(query: str, max_results: int = 5) -> str:
    """Search the web and return a summary.

    Args:
        query: Search query string.
        max_results: Maximum number of results to return.
    """
    return f"Top {max_results} results for: {query}"


# Single round
messages = [Message.user("What's the weather like in Paris?")]
tools    = client.functions.build_tools()
resp     = client.chat.with_tools(messages, tools=tools)

if resp.has_tool_calls:
    # Execute and get ready-to-send result messages
    tool_results = client.functions.execute_all_tool_calls(resp.tool_calls)

    messages.append(resp.message.to_dict())
    messages.extend(tool_results)

    final = client.chat.complete(messages)
    print(final.text)
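Inferring a tool schema "from type hints + docstring", as the registration decorator does, can be done with `inspect` and `typing` alone. The sketch below shows one plausible approach; `infer_tool_schema` and its type map are illustrative assumptions, not the library's implementation.

```python
import inspect
import typing

# Map a few Python annotations to JSON Schema type names.
_PY_TO_JSON = {str: "string", int: "integer", float: "number", bool: "boolean"}

def infer_tool_schema(fn):
    """Build an OpenAI-style tool definition from a function's signature."""
    sig = inspect.signature(fn)
    hints = typing.get_type_hints(fn)
    props, required = {}, []
    for name, param in sig.parameters.items():
        props[name] = {"type": _PY_TO_JSON.get(hints.get(name, str), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default => the model must supply it
    doc = inspect.getdoc(fn) or ""
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": doc.splitlines()[0] if doc else "",
            "parameters": {"type": "object", "properties": props, "required": required},
        },
    }

def get_weather(location: str, unit: str = "celsius") -> str:
    """Get the current weather for a location."""
    return ""

schema = infer_tool_schema(get_weather)
print(schema["function"]["name"])                     # get_weather
print(schema["function"]["parameters"]["required"])   # ['location']
```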

Agentic loop (auto-executes until the model stops calling tools)

messages = [Message.user("Search for Python news and summarise the weather in London.")]

final = client.functions.run_agentic_loop(
    messages=[m.to_dict() for m in messages],
    chat_api=client.chat,
    max_rounds=5,
)
print(final.text)
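The agentic loop's control flow is worth seeing in isolation: call the model, execute any tool calls it requests, append the results as `tool` messages, and repeat until a round produces no tool calls or the round cap is hit. A self-contained sketch with a mocked model and tool executor (`run_loop`, `fake_chat`, and `fake_execute` are all made up for illustration):

```python
import json

def run_loop(messages, chat_fn, execute_fn, max_rounds=5):
    """Call the model, run requested tools, feed results back, repeat."""
    resp = None
    for _ in range(max_rounds):
        resp = chat_fn(messages)
        if not resp.get("tool_calls"):
            return resp  # model produced a final answer
        messages.append({"role": "assistant", "tool_calls": resp["tool_calls"]})
        for tc in resp["tool_calls"]:
            result = execute_fn(tc)
            messages.append({"role": "tool", "tool_call_id": tc["id"], "content": result})
    return resp  # round cap reached

# Mock model: requests one tool call, then answers.
calls = {"n": 0}
def fake_chat(messages):
    calls["n"] += 1
    if calls["n"] == 1:
        return {"tool_calls": [{"id": "t1", "function": {"name": "get_time", "arguments": "{}"}}]}
    return {"content": "It is noon.", "tool_calls": None}

def fake_execute(tc):
    return json.dumps({"time": "12:00"})

history = [{"role": "user", "content": "What time is it?"}]
final = run_loop(history, fake_chat, fake_execute)
print(final["content"])  # It is noon.
```

The `max_rounds` cap matters: without it, a model that keeps requesting tools would loop (and bill) indefinitely.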

Configuration

client = GPTClient(
    api_key="sk-...",          # or OPENAI_API_KEY env var
    model="gpt-4o",            # default model for chat + streaming
    timeout=30,                # seconds (default 60)
    max_retries=5,             # exponential backoff retries (default 3)
    organization="org-...",    # optional OpenAI org ID
    project="proj-...",        # optional project ID
    base_url="https://...",    # override for proxies / compatible APIs
    extra_headers={"X-Custom": "value"},
)

# Change the default model after construction
client.set_model("gpt-4o-mini")
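The `max_retries` option above mentions exponential backoff; the pattern itself is simple to sketch. `with_retries` below is an illustrative stand-in, not the client's code, and the delays are kept tiny so the example runs instantly:

```python
import time

def with_retries(fn, max_retries=3, base_delay=0.01):
    """Retry fn on exception, doubling the delay each attempt."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            time.sleep(base_delay * (2 ** attempt))  # 0.01s, 0.02s, 0.04s, ...

attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TimeoutError("transient")
    return "ok"

print(with_retries(flaky))  # ok (after two transient failures)
```

A production implementation would typically retry only transient failures (timeouts, 429s, 5xx) and honor any `Retry-After` header rather than retrying every exception.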

Error handling

from gpt_client_lite import (
    AuthenticationError,
    RateLimitError,
    ContextLengthExceededError,
    InvalidRequestError,
    APIError,
    GPTClientError,
)

try:
    reply = client.chat.simple("Hello!")
except AuthenticationError:
    print("Invalid API key.")
except RateLimitError as e:
    print(f"Rate limited. Retry after: {e.retry_after}s")
except ContextLengthExceededError:
    print("Prompt too long for this model.")
except APIError as e:
    print(f"Server error {e.status_code}: {e}")
except GPTClientError as e:
    print(f"Unexpected error: {e}")
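A typed exception hierarchy like the one above is usually produced by mapping HTTP status codes to exception classes at one choke point. The sketch below shows that pattern with simplified stand-in classes; the mapping and constructors are assumptions for illustration, not the library's own definitions.

```python
class GPTClientError(Exception):
    """Base class: catching this catches everything below."""

class AuthenticationError(GPTClientError):
    pass

class RateLimitError(GPTClientError):
    def __init__(self, message, retry_after=None):
        super().__init__(message)
        self.retry_after = retry_after

class APIError(GPTClientError):
    def __init__(self, message, status_code=None):
        super().__init__(message)
        self.status_code = status_code

def error_for_status(status, message, retry_after=None):
    """Translate an HTTP status code into a typed exception."""
    if status == 401:
        return AuthenticationError(message)
    if status == 429:
        return RateLimitError(message, retry_after=retry_after)
    if status >= 500:
        return APIError(message, status_code=status)
    return GPTClientError(message)

err = error_for_status(429, "slow down", retry_after=2)
print(type(err).__name__, err.retry_after)  # RateLimitError 2
```

Because every class derives from `GPTClientError`, the final `except GPTClientError` clause in the example above acts as a catch-all for anything the more specific handlers miss.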

API Reference

GPTClient

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `api_key` | `str` | `$OPENAI_API_KEY` | API key |
| `model` | `str` | `"gpt-4o-mini"` | Default model |
| `timeout` | `int` | `60` | Request timeout in seconds |
| `max_retries` | `int` | `3` | Retry attempts for transient errors |
| `organization` | `str` | `None` | OpenAI organization ID |
| `project` | `str` | `None` | OpenAI project ID |
| `base_url` | `str` | OpenAI API | Override base URL |
| `extra_headers` | `dict` | `None` | Additional HTTP headers |

client.chat

| Method | Returns | Description |
| --- | --- | --- |
| `.simple(prompt, system, **kw)` | `str` | One-shot prompt |
| `.complete(messages, **kw)` | `ChatResponse` | Full completions call |
| `.json(prompt, system, **kw)` | `Any` | JSON-mode, auto-parsed |
| `.summarise(text, max_words)` | `str` | Summarise text |
| `.translate(text, language)` | `str` | Translate text |
| `.with_tools(messages, tools)` | `ChatResponse` | Tool-calling request |
| `.conversation(system, **kw)` | `Conversation` | Multi-turn session |

client.streaming

| Method | Returns | Description |
| --- | --- | --- |
| `.simple(prompt, system, **kw)` | `StreamResult` | Stream a prompt |
| `.chat(messages, **kw)` | `StreamResult` | Stream a messages list |
| `.complete_to_string(msgs, on_token)` | `str` | Stream + collect |

StreamResult

| Method / Property | Description |
| --- | --- |
| `iter_text()` | Iterator over text fragments |
| `collect()` / `.text` | Buffer and return full text |
| `print_stream()` | Write to stdout, return full text |
| `on_token(callback)` | Call `callback(token)` per fragment |
| `to_message()` | Return buffered result as `Message` |

client.embeddings

| Method | Returns | Description |
| --- | --- | --- |
| `.embed(text)` | `List[float]` | Single embedding vector |
| `.embed_batch(texts)` | `List[List[float]]` | Batch embedding |
| `.similarity(a, b)` | `float` | Semantic similarity 0–1 |
| `.most_similar(query, candidates, top_k)` | `List[(str, float)]` | Nearest texts |
| `.rank(query, documents)` | `List[(int, str, float)]` | Ranked with original index |
| `.cluster(texts, n_clusters)` | `List[List[str]]` | K-means clustering |

client.functions

| Method | Description |
| --- | --- |
| `@.register` | Decorator to register a function |
| `.build_tools(names?)` | List of `FunctionDefinition` for the API |
| `.execute_tool_call(tc)` | Execute one tool call dict |
| `.execute_all_tool_calls(tcs)` | Execute all, return result messages |
| `.run_agentic_loop(messages, chat_api)` | Auto loop until no more tool calls |

License

MIT © Vladyslav Zaiets
