# gpt-client-lite

Minimal OpenAI GPT client for Python — chat, embeddings, function calling, and streaming with zero heavy dependencies.

Built entirely on the Python standard library (`urllib`, `json`, `dataclasses`). Works on Python 3.9 and above.
## Install

```shell
pip install gpt-client-lite
```
## Quick start

```python
from gpt_client_lite import GPTClient

client = GPTClient()  # reads OPENAI_API_KEY from the environment
# client = GPTClient("sk-...")  # or pass the key directly
```
## Chat

### Single-turn prompt

```python
reply = client.chat.simple("What is the capital of Japan?")
print(reply)  # "Tokyo"
```

### With a system prompt

```python
reply = client.chat.simple(
    "Tell me a joke.",
    system="You are a stand-up comedian who only tells puns.",
)
```
### Full response object

```python
from gpt_client_lite import Message

resp = client.chat.complete(
    [Message.user("Explain black holes in one sentence.")],
    model="gpt-4o",
    temperature=0.7,
    max_tokens=120,
)
print(resp.text)           # assistant reply
print(resp.usage)          # Usage(prompt=..., completion=..., total=...)
print(resp.finish_reason)  # "stop"
```
### JSON mode

```python
data = client.chat.json("List 5 world capitals as a JSON array of strings.")
# data is already a parsed Python object: ["Tokyo", "Paris", ...]
```
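Some models still wrap JSON-mode output in markdown fences, so a defensive parser is a common pattern. The helper below is a hypothetical sketch of that pattern, not part of this library's API:

```python
import json

def parse_json_reply(text: str):
    """Parse a model reply that may wrap its JSON in ```json fences."""
    cleaned = text.strip()
    if cleaned.startswith("```"):
        # Drop the opening fence (with optional language tag) and the closing fence.
        cleaned = cleaned.split("\n", 1)[1]
        cleaned = cleaned.rsplit("```", 1)[0]
    return json.loads(cleaned)
```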
### Summarise / translate helpers

```python
short = client.chat.summarise(long_article, max_words=80)
es = client.chat.translate("Good morning, have a great day!", "Spanish")
```
### Multi-turn conversation

```python
conv = client.chat.conversation(system="You are a helpful travel guide.")
print(conv.say("I'm visiting Tokyo next month."))
print(conv.say("What neighbourhood should I stay in?"))
print(conv.say("What's the food like there?"))

# Inspect history
for msg in conv.history:
    print(f"[{msg.role}] {msg.content[:60]}")

conv.clear()  # reset (keeps the system message by default)
```
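A conversation helper like this presumably just keeps an ordered message list and replays it on every call. A minimal sketch of that pattern, with a stubbed model call standing in for the real API (`MiniConversation` is illustrative, not the library's class):

```python
class MiniConversation:
    """Toy multi-turn session: keeps history, replays it on each turn."""

    def __init__(self, send, system=None):
        self.send = send    # callable: list[dict] -> str (the model call)
        self.system = system
        self.history = [{"role": "system", "content": system}] if system else []

    def say(self, text):
        self.history.append({"role": "user", "content": text})
        reply = self.send(self.history)
        self.history.append({"role": "assistant", "content": reply})
        return reply

    def clear(self):
        # Reset, keeping the system message, mirroring conv.clear() above.
        self.history = [{"role": "system", "content": self.system}] if self.system else []
```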
## Streaming

```python
# Print tokens to stdout as they arrive
client.streaming.simple("Write a haiku about the ocean").print_stream()

# Iterate manually
for chunk in client.streaming.simple("Count from 1 to 5."):
    print(chunk.text, end="", flush=True)

# Callback style
full_text = client.streaming.simple("Tell me a story").on_token(
    lambda token: print(token, end="", flush=True)
)

# Collect the whole text at once (buffers internally)
text = client.streaming.simple("Explain recursion.").text
```
### Streaming from a messages list

```python
from gpt_client_lite import Message

msgs = [
    Message.system("You are a Python expert."),
    Message.user("Show me a quick sort implementation."),
]
stream = client.streaming.chat(msgs, temperature=0.2)
stream.print_stream()
```
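Under the hood, OpenAI-style streaming arrives as server-sent events: each `data:` line carries a JSON chunk, and `data: [DONE]` terminates the stream. A rough sketch of extracting text deltas from such a payload (the field names follow the public chat-completions chunk format; this library's internals may differ):

```python
import json

def iter_sse_text(raw: str):
    """Yield text deltas from an OpenAI-style SSE payload."""
    for line in raw.splitlines():
        if not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]
```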
## Embeddings

```python
# Embed a single text
vec = client.embeddings.embed("The quick brown fox")
print(len(vec))  # 1536 for text-embedding-3-small

# Embed multiple texts (one API call)
vecs = client.embeddings.embed_batch(["cat", "dog", "car", "truck"])

# Semantic similarity (0–1)
score = client.embeddings.similarity("puppy", "dog")
print(f"{score:.3f}")  # ~0.92

# Find the most similar candidates
results = client.embeddings.most_similar(
    "machine learning",
    ["deep learning", "pasta cooking", "neural networks", "astronomy"],
    top_k=2,
)
for text, score in results:
    print(f"{score:.3f}  {text}")

# Rank documents by relevance
ranked = client.embeddings.rank("Python async programming", documents)
for idx, text, score in ranked:
    print(f"[{idx}] {score:.3f}  {text[:60]}")

# Cluster texts (pure-Python k-means, no numpy)
groups = client.embeddings.cluster(sentences, n_clusters=3)
```
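The similarity helpers above are presumably plain cosine similarity over the returned vectors; since the library avoids numpy, the pure-Python version is short. A sketch of what such a computation looks like:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```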
## Function / Tool Calling

```python
import json

from gpt_client_lite import Message

# Register functions — the schema is inferred from type hints + docstring
@client.functions.register
def get_weather(location: str, unit: str = "celsius") -> str:
    """Get the current weather for a location.

    Args:
        location: City name or address.
        unit: Temperature unit — celsius or fahrenheit.
    """
    # Your real implementation here
    return json.dumps({"location": location, "temperature": 22, "unit": unit})

@client.functions.register
def search_web(query: str, max_results: int = 5) -> str:
    """Search the web and return a summary.

    Args:
        query: Search query string.
        max_results: Maximum number of results to return.
    """
    return f"Top {max_results} results for: {query}"

# Single round
messages = [Message.user("What's the weather like in Paris?")]
tools = client.functions.build_tools()
resp = client.chat.with_tools(messages, tools=tools)

if resp.has_tool_calls:
    # Execute the calls and get ready-to-send result messages
    tool_results = client.functions.execute_all_tool_calls(resp.tool_calls)
    messages.append(resp.message.to_dict())
    messages.extend(tool_results)
    final = client.chat.complete(messages)
    print(final.text)
```
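Inferring a tool schema from type hints and the docstring is straightforward with the standard library. The following is a simplified sketch of what a decorator like `@client.functions.register` might derive; the library's actual inference may differ:

```python
import inspect

# Map Python annotations to JSON Schema type names.
_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}

def infer_schema(fn):
    """Build a minimal JSON-Schema-style tool definition from a function."""
    sig = inspect.signature(fn)
    props, required = {}, []
    for name, param in sig.parameters.items():
        props[name] = {"type": _TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip().split("\n")[0],
        "parameters": {"type": "object", "properties": props, "required": required},
    }
```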
### Agentic loop

Auto-executes tool calls until the model stops requesting them:

```python
messages = [Message.user("Search for Python news and summarise the weather in London.")]
final = client.functions.run_agentic_loop(
    messages=[m.to_dict() for m in messages],
    chat_api=client.chat,
    max_rounds=5,
)
print(final.text)
```
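The agentic loop boils down to: call the model, execute any tool calls, feed the results back, and repeat until the model answers in plain text or the round cap is hit. A stripped-down skeleton of that control flow, with callables standing in for the real client (this is an illustration of the pattern, not the library's implementation):

```python
def agentic_loop(messages, call_model, execute_tools, max_rounds=5):
    """Repeat model calls until no tool calls remain or max_rounds is hit."""
    for _ in range(max_rounds):
        resp = call_model(messages)      # returns a dict with optional "tool_calls"
        messages.append(resp)
        tool_calls = resp.get("tool_calls")
        if not tool_calls:
            return resp                  # final plain-text answer
        messages.extend(execute_tools(tool_calls))
    return resp
```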
## Configuration

```python
client = GPTClient(
    api_key="sk-...",            # or the OPENAI_API_KEY env var
    model="gpt-4o",              # default model for chat + streaming
    timeout=30,                  # seconds (default 60)
    max_retries=5,               # exponential backoff retries (default 3)
    organization="org-...",      # optional OpenAI org ID
    project="proj-...",          # optional project ID
    base_url="https://...",      # override for proxies / compatible APIs
    extra_headers={"X-Custom": "value"},
)

# Change the default model after construction
client.set_model("gpt-4o-mini")
```
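`max_retries` with exponential backoff typically means doubling the wait between attempts, often with a cap. A hedged sketch of such a delay schedule; the library's exact base delay and cap are not documented here:

```python
def backoff_delays(max_retries, base=1.0, cap=30.0):
    """Exponential backoff schedule: base, 2*base, 4*base, ... capped at `cap` seconds."""
    return [min(base * (2 ** attempt), cap) for attempt in range(max_retries)]
```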
## Error handling

```python
from gpt_client_lite import (
    AuthenticationError,
    RateLimitError,
    ContextLengthExceededError,
    InvalidRequestError,
    APIError,
    GPTClientError,
)

try:
    reply = client.chat.simple("Hello!")
except AuthenticationError:
    print("Invalid API key.")
except RateLimitError as e:
    print(f"Rate limited. Retry after: {e.retry_after}s")
except ContextLengthExceededError:
    print("Prompt too long for this model.")
except APIError as e:
    print(f"Server error {e.status_code}: {e}")
except GPTClientError as e:
    print(f"Unexpected error: {e}")
```
## API Reference

### GPTClient

| Parameter | Type | Default | Description |
|---|---|---|---|
| `api_key` | `str` | `$OPENAI_API_KEY` | API key |
| `model` | `str` | `"gpt-4o-mini"` | Default model |
| `timeout` | `int` | `60` | Request timeout in seconds |
| `max_retries` | `int` | `3` | Retry attempts for transient errors |
| `organization` | `str` | `None` | OpenAI organization ID |
| `project` | `str` | `None` | OpenAI project ID |
| `base_url` | `str` | OpenAI API | Override base URL |
| `extra_headers` | `dict` | `None` | Additional HTTP headers |
### client.chat

| Method | Returns | Description |
|---|---|---|
| `.simple(prompt, system, **kw)` | `str` | One-shot prompt |
| `.complete(messages, **kw)` | `ChatResponse` | Full completions call |
| `.json(prompt, system, **kw)` | `Any` | JSON mode, auto-parsed |
| `.summarise(text, max_words)` | `str` | Summarise text |
| `.translate(text, language)` | `str` | Translate text |
| `.with_tools(messages, tools)` | `ChatResponse` | Tool-calling request |
| `.conversation(system, **kw)` | `Conversation` | Multi-turn session |
### client.streaming

| Method | Returns | Description |
|---|---|---|
| `.simple(prompt, system, **kw)` | `StreamResult` | Stream a prompt |
| `.chat(messages, **kw)` | `StreamResult` | Stream a messages list |
| `.complete_to_string(msgs, on_token)` | `str` | Stream + collect |
### StreamResult

| Method / Property | Description |
|---|---|
| `iter_text()` | Iterator over text fragments |
| `collect()` / `.text` | Buffer and return the full text |
| `print_stream()` | Write to stdout, return the full text |
| `on_token(callback)` | Call `callback(token)` per fragment |
| `to_message()` | Return the buffered result as a `Message` |
### client.embeddings

| Method | Returns | Description |
|---|---|---|
| `.embed(text)` | `List[float]` | Single embedding vector |
| `.embed_batch(texts)` | `List[List[float]]` | Batch embedding |
| `.similarity(a, b)` | `float` | Semantic similarity 0–1 |
| `.most_similar(query, candidates, top_k)` | `List[(str, float)]` | Nearest texts |
| `.rank(query, documents)` | `List[(int, str, float)]` | Ranked with original index |
| `.cluster(texts, n_clusters)` | `List[List[str]]` | K-means clustering |
### client.functions

| Method | Description |
|---|---|
| `@.register` | Decorator to register a function |
| `.build_tools(names?)` | List of `FunctionDefinition` for the API |
| `.execute_tool_call(tc)` | Execute one tool-call dict |
| `.execute_all_tool_calls(tcs)` | Execute all, return result messages |
| `.run_agentic_loop(messages, chat_api)` | Auto-loop until no more tool calls |
## License

MIT © Vladyslav Zaiets
## Project details
### File details

Details for the file `gpt_client_lite-1.0.0.tar.gz`.

- Download URL: gpt_client_lite-1.0.0.tar.gz
- Upload date:
- Size: 25.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes:

| Algorithm | Hash digest |
|---|---|
| SHA256 | `b4db962d10e88bf1ab0485548674a482ddff48405d2220db90ace2b6af097448` |
| MD5 | `521249c023abcf0fda97f9a08cc6815b` |
| BLAKE2b-256 | `b6e5b43c2a780782ce95f7e478d701507eb60bdb22aecb5f4784c7efdc7326eb` |
### File details

Details for the file `gpt_client_lite-1.0.0-py3-none-any.whl`.

- Download URL: gpt_client_lite-1.0.0-py3-none-any.whl
- Upload date:
- Size: 26.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes:

| Algorithm | Hash digest |
|---|---|
| SHA256 | `31f55d2b1b31550d8847621d867b475b1b4e92774f0a8d71d1e4bd0d2958972d` |
| MD5 | `9528a7875a19705984d3ca21dc430c99` |
| BLAKE2b-256 | `63efc3d48c47576cbc4aa7a04e53ced3c0d80348f15afbdcdc091cadc41f8e21` |