# gpt-client-lite

Minimal OpenAI GPT client for Python — chat, embeddings, function calling, and streaming with zero heavy dependencies.

Built entirely on the Python standard library (`urllib`, `json`, `dataclasses`). Works on Python 3.9 and above.
## Install

```shell
pip install gpt-client-lite
```
## Quick start

```python
from gpt_client_lite import GPTClient

client = GPTClient()  # reads OPENAI_API_KEY from the environment
# client = GPTClient("sk-...")  # or pass the key directly
```
## Chat

### Single-turn prompt

```python
reply = client.chat.simple("What is the capital of Japan?")
print(reply)  # "Tokyo"
```

### With a system prompt

```python
reply = client.chat.simple(
    "Tell me a joke.",
    system="You are a stand-up comedian who only tells puns.",
)
```
### Full response object

```python
from gpt_client_lite import Message

resp = client.chat.complete(
    [Message.user("Explain black holes in one sentence.")],
    model="gpt-4o",
    temperature=0.7,
    max_tokens=120,
)
print(resp.text)           # assistant reply
print(resp.usage)          # Usage(prompt=..., completion=..., total=...)
print(resp.finish_reason)  # "stop"
```
### JSON mode

```python
data = client.chat.json("List 5 world capitals as a JSON array of strings.")
# data is already a parsed Python object: ["Tokyo", "Paris", ...]
```
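Some models still wrap JSON-mode output in markdown fences, so a defensive parser is a common pattern. The helper below is a hypothetical sketch of that pattern, not part of this library's API:

```python
import json

def parse_json_reply(text: str):
    """Parse a model reply that may wrap its JSON in ```json fences."""
    cleaned = text.strip()
    if cleaned.startswith("```"):
        # Drop the opening fence (with optional language tag) and the closing fence.
        cleaned = cleaned.split("\n", 1)[1]
        cleaned = cleaned.rsplit("```", 1)[0]
    return json.loads(cleaned)
```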
### Summarise / translate helpers

```python
short = client.chat.summarise(long_article, max_words=80)
es = client.chat.translate("Good morning, have a great day!", "Spanish")
```
### Multi-turn conversation

```python
conv = client.chat.conversation(system="You are a helpful travel guide.")
print(conv.say("I'm visiting Tokyo next month."))
print(conv.say("What neighbourhood should I stay in?"))
print(conv.say("What's the food like there?"))

# Inspect history
for msg in conv.history:
    print(f"[{msg.role}] {msg.content[:60]}")

conv.clear()  # reset (keeps the system message by default)
```
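A conversation helper like this presumably just keeps an ordered message list and replays it on every call. A minimal sketch of that pattern, with a stubbed model call standing in for the real API (`MiniConversation` is illustrative, not the library's class):

```python
class MiniConversation:
    """Toy multi-turn session: keeps history, replays it on each turn."""

    def __init__(self, send, system=None):
        self.send = send    # callable: list[dict] -> str (the model call)
        self.system = system
        self.history = [{"role": "system", "content": system}] if system else []

    def say(self, text):
        self.history.append({"role": "user", "content": text})
        reply = self.send(self.history)
        self.history.append({"role": "assistant", "content": reply})
        return reply

    def clear(self):
        # Reset, keeping the system message, mirroring conv.clear() above.
        self.history = [{"role": "system", "content": self.system}] if self.system else []
```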
## Streaming

```python
# Print tokens to stdout as they arrive
client.streaming.simple("Write a haiku about the ocean").print_stream()

# Iterate manually
for chunk in client.streaming.simple("Count from 1 to 5."):
    print(chunk.text, end="", flush=True)

# Callback style
full_text = client.streaming.simple("Tell me a story").on_token(
    lambda token: print(token, end="", flush=True)
)

# Collect the whole text at once (buffers internally)
text = client.streaming.simple("Explain recursion.").text
```
### Streaming from a messages list

```python
from gpt_client_lite import Message

msgs = [
    Message.system("You are a Python expert."),
    Message.user("Show me a quick sort implementation."),
]
stream = client.streaming.chat(msgs, temperature=0.2)
stream.print_stream()
```
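Under the hood, OpenAI-style streaming arrives as server-sent events: each `data:` line carries a JSON chunk, and `data: [DONE]` terminates the stream. A rough sketch of extracting text deltas from such a payload (the field names follow the public chat-completions chunk format; this library's internals may differ):

```python
import json

def iter_sse_text(raw: str):
    """Yield text deltas from an OpenAI-style SSE payload."""
    for line in raw.splitlines():
        if not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]
```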
## Embeddings

```python
# Embed a single text
vec = client.embeddings.embed("The quick brown fox")
print(len(vec))  # 1536 for text-embedding-3-small

# Embed multiple texts (one API call)
vecs = client.embeddings.embed_batch(["cat", "dog", "car", "truck"])

# Semantic similarity (0–1)
score = client.embeddings.similarity("puppy", "dog")
print(f"{score:.3f}")  # ~0.92

# Find the most similar candidates
results = client.embeddings.most_similar(
    "machine learning",
    ["deep learning", "pasta cooking", "neural networks", "astronomy"],
    top_k=2,
)
for text, score in results:
    print(f"{score:.3f}  {text}")

# Rank documents by relevance
ranked = client.embeddings.rank("Python async programming", documents)
for idx, text, score in ranked:
    print(f"[{idx}] {score:.3f}  {text[:60]}")

# Cluster texts (pure-Python k-means, no numpy)
groups = client.embeddings.cluster(sentences, n_clusters=3)
```
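The similarity helpers above are presumably plain cosine similarity over the returned vectors; since the library avoids numpy, the pure-Python version is short. A sketch of what such a computation looks like:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```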
## Function / Tool Calling

```python
import json

from gpt_client_lite import Message

# Register functions — the schema is inferred from type hints + docstring
@client.functions.register
def get_weather(location: str, unit: str = "celsius") -> str:
    """Get the current weather for a location.

    Args:
        location: City name or address.
        unit: Temperature unit — celsius or fahrenheit.
    """
    # Your real implementation here
    return json.dumps({"location": location, "temperature": 22, "unit": unit})

@client.functions.register
def search_web(query: str, max_results: int = 5) -> str:
    """Search the web and return a summary.

    Args:
        query: Search query string.
        max_results: Maximum number of results to return.
    """
    return f"Top {max_results} results for: {query}"

# Single round
messages = [Message.user("What's the weather like in Paris?")]
tools = client.functions.build_tools()
resp = client.chat.with_tools(messages, tools=tools)

if resp.has_tool_calls:
    # Execute the calls and get ready-to-send result messages
    tool_results = client.functions.execute_all_tool_calls(resp.tool_calls)
    messages.append(resp.message.to_dict())
    messages.extend(tool_results)
    final = client.chat.complete(messages)
    print(final.text)
```
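Inferring a tool schema from type hints and the docstring is straightforward with the standard library. The following is a simplified sketch of what a decorator like `@client.functions.register` might derive; the library's actual inference may differ:

```python
import inspect

# Map Python annotations to JSON Schema type names.
_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}

def infer_schema(fn):
    """Build a minimal JSON-Schema-style tool definition from a function."""
    sig = inspect.signature(fn)
    props, required = {}, []
    for name, param in sig.parameters.items():
        props[name] = {"type": _TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip().split("\n")[0],
        "parameters": {"type": "object", "properties": props, "required": required},
    }
```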
### Agentic loop

Auto-executes tool calls until the model stops requesting them:

```python
messages = [Message.user("Search for Python news and summarise the weather in London.")]
final = client.functions.run_agentic_loop(
    messages=[m.to_dict() for m in messages],
    chat_api=client.chat,
    max_rounds=5,
)
print(final.text)
```
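The agentic loop boils down to: call the model, execute any tool calls, feed the results back, and repeat until the model answers in plain text or the round cap is hit. A stripped-down skeleton of that control flow, with callables standing in for the real client (this is an illustration of the pattern, not the library's implementation):

```python
def agentic_loop(messages, call_model, execute_tools, max_rounds=5):
    """Repeat model calls until no tool calls remain or max_rounds is hit."""
    for _ in range(max_rounds):
        resp = call_model(messages)      # returns a dict with optional "tool_calls"
        messages.append(resp)
        tool_calls = resp.get("tool_calls")
        if not tool_calls:
            return resp                  # final plain-text answer
        messages.extend(execute_tools(tool_calls))
    return resp
```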
## Configuration

```python
client = GPTClient(
    api_key="sk-...",            # or the OPENAI_API_KEY env var
    model="gpt-4o",              # default model for chat + streaming
    timeout=30,                  # seconds (default 60)
    max_retries=5,               # exponential backoff retries (default 3)
    organization="org-...",      # optional OpenAI org ID
    project="proj-...",          # optional project ID
    base_url="https://...",      # override for proxies / compatible APIs
    extra_headers={"X-Custom": "value"},
)

# Change the default model after construction
client.set_model("gpt-4o-mini")
```
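`max_retries` with exponential backoff typically means doubling the wait between attempts, often with a cap. A hedged sketch of such a delay schedule; the library's exact base delay and cap are not documented here:

```python
def backoff_delays(max_retries, base=1.0, cap=30.0):
    """Exponential backoff schedule: base, 2*base, 4*base, ... capped at `cap` seconds."""
    return [min(base * (2 ** attempt), cap) for attempt in range(max_retries)]
```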
## Error handling

```python
from gpt_client_lite import (
    AuthenticationError,
    RateLimitError,
    ContextLengthExceededError,
    InvalidRequestError,
    APIError,
    GPTClientError,
)

try:
    reply = client.chat.simple("Hello!")
except AuthenticationError:
    print("Invalid API key.")
except RateLimitError as e:
    print(f"Rate limited. Retry after: {e.retry_after}s")
except ContextLengthExceededError:
    print("Prompt too long for this model.")
except APIError as e:
    print(f"Server error {e.status_code}: {e}")
except GPTClientError as e:
    print(f"Unexpected error: {e}")
```
## API Reference

### GPTClient

| Parameter | Type | Default | Description |
|---|---|---|---|
| `api_key` | `str` | `$OPENAI_API_KEY` | API key |
| `model` | `str` | `"gpt-4o-mini"` | Default model |
| `timeout` | `int` | `60` | Request timeout in seconds |
| `max_retries` | `int` | `3` | Retry attempts for transient errors |
| `organization` | `str` | `None` | OpenAI organization ID |
| `project` | `str` | `None` | OpenAI project ID |
| `base_url` | `str` | OpenAI API | Override base URL |
| `extra_headers` | `dict` | `None` | Additional HTTP headers |
### client.chat

| Method | Returns | Description |
|---|---|---|
| `.simple(prompt, system, **kw)` | `str` | One-shot prompt |
| `.complete(messages, **kw)` | `ChatResponse` | Full completions call |
| `.json(prompt, system, **kw)` | `Any` | JSON mode, auto-parsed |
| `.summarise(text, max_words)` | `str` | Summarise text |
| `.translate(text, language)` | `str` | Translate text |
| `.with_tools(messages, tools)` | `ChatResponse` | Tool-calling request |
| `.conversation(system, **kw)` | `Conversation` | Multi-turn session |
### client.streaming

| Method | Returns | Description |
|---|---|---|
| `.simple(prompt, system, **kw)` | `StreamResult` | Stream a prompt |
| `.chat(messages, **kw)` | `StreamResult` | Stream a messages list |
| `.complete_to_string(msgs, on_token)` | `str` | Stream + collect |
### StreamResult

| Method / Property | Description |
|---|---|
| `iter_text()` | Iterator over text fragments |
| `collect()` / `.text` | Buffer and return the full text |
| `print_stream()` | Write to stdout, return the full text |
| `on_token(callback)` | Call `callback(token)` per fragment |
| `to_message()` | Return the buffered result as a `Message` |
### client.embeddings

| Method | Returns | Description |
|---|---|---|
| `.embed(text)` | `List[float]` | Single embedding vector |
| `.embed_batch(texts)` | `List[List[float]]` | Batch embedding |
| `.similarity(a, b)` | `float` | Semantic similarity 0–1 |
| `.most_similar(query, candidates, top_k)` | `List[(str, float)]` | Nearest texts |
| `.rank(query, documents)` | `List[(int, str, float)]` | Ranked with original index |
| `.cluster(texts, n_clusters)` | `List[List[str]]` | K-means clustering |
### client.functions

| Method | Description |
|---|---|
| `@.register` | Decorator to register a function |
| `.build_tools(names?)` | List of `FunctionDefinition` for the API |
| `.execute_tool_call(tc)` | Execute one tool-call dict |
| `.execute_all_tool_calls(tcs)` | Execute all, return result messages |
| `.run_agentic_loop(messages, chat_api)` | Auto-loop until no more tool calls |
## License

MIT © Vladyslav Zaiets
## Project details
### File details

Details for the file `gpt_client_lite-1.0.0.tar.gz`.

- Download URL: gpt_client_lite-1.0.0.tar.gz
- Upload date:
- Size: 25.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes:

| Algorithm | Hash digest |
|---|---|
| SHA256 | `b4db962d10e88bf1ab0485548674a482ddff48405d2220db90ace2b6af097448` |
| MD5 | `521249c023abcf0fda97f9a08cc6815b` |
| BLAKE2b-256 | `b6e5b43c2a780782ce95f7e478d701507eb60bdb22aecb5f4784c7efdc7326eb` |
### File details

Details for the file `gpt_client_lite-1.0.0-py3-none-any.whl`.

- Download URL: gpt_client_lite-1.0.0-py3-none-any.whl
- Upload date:
- Size: 26.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes:

| Algorithm | Hash digest |
|---|---|
| SHA256 | `31f55d2b1b31550d8847621d867b475b1b4e92774f0a8d71d1e4bd0d2958972d` |
| MD5 | `9528a7875a19705984d3ca21dc430c99` |
| BLAKE2b-256 | `63efc3d48c47576cbc4aa7a04e53ced3c0d80348f15afbdcdc091cadc41f8e21` |