oneping
LLM provider abstraction layer.
Give me a ping, Vasily. One ping only, please.
This is a library for querying LLM providers such as OpenAI or Anthropic, as well as local models. Currently the following providers are supported: openai, anthropic, fireworks, and local (local models).
Requesting a local provider will target localhost and use an OpenAI-compatible API, as in llama.cpp or llama-cpp-python. Also included is a simple function to start a llama-cpp-python server on the fly (see below).
The various native libraries are soft dependencies, and the library can still partially function with or without any or all of them. The native packages for these providers are: openai, anthropic, and fireworks-ai.
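For instance, when the anthropic package is installed you can opt into it with the native argument documented below; this is just a sketch of how the soft dependency comes into play, with prompt standing in for your query:
response = oneping.reply(prompt, provider='anthropic', native=True)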
There is also a Chat interface that automatically tracks message history. Kind of departing from the "one ping" notion, but oh well. It accepts provider and system arguments; other parameters are passed when calling it directly (an alias for chat) or to stream.
Installation
For standard usage, install with:
pip install oneping
To include the native provider dependencies, install with:
pip install oneping[native]
To include the chat and web interface dependencies, install with:
pip install oneping[chat]
Library Usage
Basic usage with Anthropic through the URL interface:
response = oneping.reply(prompt, provider='anthropic')
The reply function accepts a number of arguments, including the following (some of these have per-provider defaults):
- prompt (required): The prompt to send to the LLM
- provider = local: The provider to use: openai, anthropic, fireworks, or local
- system = None: The system prompt to use (not required, but recommended)
- prefill = None: Start the "assistant" response with a given string (Anthropic doesn't like newlines in this)
- model = None: Indicate the desired model for the provider (provider default)
- max_tokens = 1024: The maximum number of tokens to return
- history = None: List of prior messages, or True to request the full history as the return value
- native = False: Use the native provider libraries
- url = None: Override the default URL for the provider (provider default)
- port = 8000: Which port to use for a local or custom provider
- api_key = None: The API key to use for non-local providers
For example, to use the OpenAI API with a custom system prompt:
response = oneping.reply(prompt, provider='openai', system=system)
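Or, combining a few of the other arguments listed above (the specific values here are purely illustrative):
response = oneping.reply(prompt, provider='anthropic', prefill='Sure,', max_tokens=256)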
To conduct a full conversation with a local LLM:
history = True
history = oneping.reply(prompt1, provider='local', history=history)
history = oneping.reply(prompt2, provider='local', history=history)
For streaming, use the function stream, and for async streaming, use stream_async. Both of these take the same arguments as reply.
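For instance, assuming each of these yields chunks of text as they arrive (a sketch, not a complete program):
for chunk in oneping.stream(prompt, provider='anthropic'):
    print(chunk, end='', flush=True)
And the async variant, which must be run inside an async function or event loop:
async for chunk in oneping.stream_async(prompt, provider='local'):
    print(chunk, end='', flush=True)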
Command Line
You can call the oneping module directly and use the following subcommands:
- reply: get a single response from the LLM
- stream: stream a response from the LLM
- embed: get embeddings from the LLM
- console: start a console (Textual) chat
- web: start a web (FastHTML) chat
These accept the arguments listed above for reply as command line arguments. For example:
python -m oneping stream "Does Jupiter have a solid core?" --provider anthropic
Or you can pipe in your query from stdin:
echo "Does Jupiter have a solid core?" | python -m oneping stream --provider anthropic
Chat Interface
The Chat interface is a simple wrapper for a conversation history. It can be used to chat with an LLM provider or to simply maintain a conversation history for your bot. It supports the usual reply, stream, and stream_async functions, and calling it directly maps to reply.
chat = oneping.Chat(provider='anthropic', system=system)
response1 = chat(prompt1)
response2 = chat(prompt2)
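Streaming over the same tracked history should work through the Chat object's stream function (assuming it yields text chunks like the top-level stream; prompt3 is just another user message):
for chunk in chat.stream(prompt3):
    print(chunk, end='', flush=True)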
There is also a textual-powered console interface and a fasthtml-powered web interface. You can launch these with python -m oneping console or python -m oneping web.
Server
The server module includes a simple function to start a llama-cpp-python server on the fly (oneping.server.start or python -m oneping.server start).
python -m oneping.server start <path-to-gguf>
To run the server in embedding mode, pass the --embedding flag.
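That presumably looks like the following (the flag position is an assumption):
python -m oneping.server start <path-to-gguf> --embedding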
Embeddings
Embedding queries are supported through the embed function. It accepts the relevant arguments from the reply function. Right now only the openai and local providers are supported.
vecs = oneping.embed(text, provider='openai')
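Assuming the return value is a plain vector of floats (or a list of such vectors), a typical downstream use is similarity comparison; numpy is used here purely for illustration and is not a oneping requirement, and text1 and text2 stand in for two input strings:
import numpy as np
# embed both texts and compare them with cosine similarity
vec1 = np.asarray(oneping.embed(text1, provider='openai'), dtype=float)
vec2 = np.asarray(oneping.embed(text2, provider='openai'), dtype=float)
cosine = vec1 @ vec2 / (np.linalg.norm(vec1) * np.linalg.norm(vec2))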