LLM provider abstraction layer.
Project description
oneping
oneping.reply('Give me a ping, Vasily. One ping only, please.', provider='anthropic')
This is a Python library for querying LLM providers such as OpenAI or Anthropic, as well as local models. The main goal is to create an abstraction layer that makes switching between them seamless. The following providers are currently supported: openai, anthropic, fireworks, and local (local models).
There is also a Chat interface that automatically tracks the message history. Kind of departing from the "one ping" notion, but oh well. Additionally, there is a textual powered console interface and a fasthtml powered web interface. Both are components that can be embedded in other applications.
Requesting the local provider will target localhost and use an OpenAI-compatible API as in llama.cpp or llama-cpp-python. The various native libraries are soft dependencies, and the library can still function, at least partially, without any of them. The native packages for these providers are: openai, anthropic, and fireworks-ai.
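As an illustration of the soft-dependency idea, here is a minimal sketch of how one might check which native packages are importable before asking for them. This is not oneping's internal code: the helper names (has_native, available_natives) are hypothetical, and the import name for fireworks-ai is assumed to be fireworks.

```python
import importlib.util

# Assumed import names for the native packages listed above. The
# fireworks-ai distribution is assumed to be imported as `fireworks`.
NATIVE_MODULES = {
    'openai': 'openai',
    'anthropic': 'anthropic',
    'fireworks': 'fireworks',
}

def has_native(provider):
    """Return True if the native package for `provider` looks importable."""
    module = NATIVE_MODULES.get(provider)
    if module is None:
        return False  # e.g. 'local' has no native package in this sketch
    # find_spec locates a top-level module without importing it
    return importlib.util.find_spec(module) is not None

def available_natives():
    """Map each provider name to whether its native package is installed."""
    return {name: has_native(name) for name in NATIVE_MODULES}
```

Calling available_natives() gives a quick picture of which providers could be used with the native libraries on the current machine.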
Installation
For standard usage, install with:
pip install oneping
To install the native provider dependencies, add "[native]" after oneping in the command above. The same goes for the chat interface dependencies with "[chat]".
The easiest way to handle authentication is to set an API key environment variable such as OPENAI_API_KEY, ANTHROPIC_API_KEY, FIREWORKS_API_KEY, etc. You can also pass the api_key argument to any of the functions directly.
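To make the two authentication routes concrete, here is a rough sketch of a key lookup where an explicit argument wins over the environment. The function resolve_api_key and the precedence order are assumptions for illustration, not a description of what oneping does internally.

```python
import os

# Environment variable names from the text above; the mapping itself
# and the lookup order are assumptions made for this sketch.
ENV_KEYS = {
    'openai': 'OPENAI_API_KEY',
    'anthropic': 'ANTHROPIC_API_KEY',
    'fireworks': 'FIREWORKS_API_KEY',
}

def resolve_api_key(provider, api_key=None, environ=None):
    """Return an explicit key if given, else the provider's env variable."""
    environ = os.environ if environ is None else environ
    if api_key is not None:
        return api_key  # explicit argument takes precedence
    var = ENV_KEYS.get(provider)
    if var is None:
        return None  # e.g. 'local' needs no key in this sketch
    return environ.get(var)
```

Passing environ explicitly is only there to make the sketch easy to try out without touching your real environment.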
Library Usage
Basic usage with Anthropic through the URL interface:
response = oneping.reply(query, provider='anthropic')
The reply function accepts a number of arguments, including the following (some of these have per-provider defaults):
- query (required): The query to send to the LLM
- provider=local: The provider to use: openai, anthropic, fireworks, or local
- system=None: The system prompt to use (not required, but recommended)
- prefill=None: Start "assistant" response with a string (Anthropic doesn't like newlines in this)
- model=None: Indicate the desired model for the provider (provider default)
- max_tokens=1024: The maximum number of tokens to return
- history=None: List of prior messages or True to request full history as return value
- native=False: Use the native provider libraries
- url=None: Override the default URL for the provider (provider default)
- port=8000: Which port to use for local or custom provider
- api_key=None: The API key to use for non-local providers
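To give a feel for how arguments like system, prefill, history, and max_tokens might end up in a request, here is a sketch that assembles an OpenAI-style chat payload. The build_payload function is hypothetical and not part of oneping. It only handles history as a list (not the True case), and real providers differ in the details: Anthropic, for instance, takes the system prompt as a top-level field rather than as a message, and not every server will continue from a trailing assistant message.

```python
def build_payload(query, system=None, prefill=None, history=None,
                  model=None, max_tokens=1024):
    """Assemble an OpenAI-style chat payload (illustrative sketch only)."""
    messages = []
    if system is not None:
        messages.append({'role': 'system', 'content': system})
    # prior turns go between the system prompt and the new query
    messages.extend(history or [])
    messages.append({'role': 'user', 'content': query})
    if prefill is not None:
        # assumed convention: a trailing assistant message seeds the reply
        messages.append({'role': 'assistant', 'content': prefill})
    payload = {'messages': messages, 'max_tokens': max_tokens}
    if model is not None:
        payload['model'] = model  # otherwise leave it to the server default
    return payload

payload = build_payload(
    'One ping only, please.',
    system='You are a sonar operator.',
    history=[
        {'role': 'user', 'content': 'Status?'},
        {'role': 'assistant', 'content': 'All quiet.'},
    ],
)
```

Here payload['messages'] holds the system prompt, the two prior turns, and then the new user query, in that order.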
For example, to use the OpenAI API with a custom system prompt:
response = oneping.reply(query, provider='openai', system=system)
To conduct a full conversation with a local LLM, see the Chat interface below. For streaming, use the function stream, and for async streaming, use stream_async. Both of these take the same arguments as reply.
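If you haven't worked with streaming responses before, the consumption pattern looks roughly like the sketch below. The generators fake_stream and fake_stream_async are hypothetical stand-ins so the example runs offline, and the assumption that each chunk is a plain string is mine rather than something stated above.

```python
import asyncio

CHUNKS = ['One ', 'ping ', 'only.']

def fake_stream(query):
    """Hypothetical stand-in for a streaming call: yields text chunks."""
    for chunk in CHUNKS:
        yield chunk

async def fake_stream_async(query):
    """Hypothetical async stand-in: yields the same chunks asynchronously."""
    for chunk in CHUNKS:
        await asyncio.sleep(0)  # hand control back to the event loop
        yield chunk

def collect(query):
    """Consume a synchronous stream and join the chunks."""
    return ''.join(fake_stream(query))

async def collect_async(query):
    """Consume an async stream with `async for` and join the chunks."""
    parts = []
    async for chunk in fake_stream_async(query):
        parts.append(chunk)
    return ''.join(parts)

sync_text = collect('ping?')
async_text = asyncio.run(collect_async('ping?'))
```

In a real application you would typically print or forward each chunk as it arrives instead of joining them at the end.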
Command Line
You can call oneping directly or as a module with python -m oneping and use the following subcommands:
- reply: get a single response from the LLM
- stream: stream a response from the LLM
- embed: get embeddings from the LLM
- console: start a console (Textual) chat
- web: start a web (FastHTML) chat
These accept the arguments listed above for reply as command line arguments. For example:
oneping stream "Does Jupiter have a solid core?" --provider anthropic
Or you can pipe in your query from stdin:
echo "Does Jupiter have a solid core?" | oneping stream --provider anthropic
I've personally found it useful to set up aliases like claude = oneping stream --provider anthropic.
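The "argument or stdin" behavior is a handy pattern for your own scripts too. Here is a minimal sketch of it using argparse. The parse_query function and the 'sketch' program name are hypothetical, and this is not how oneping's own command line is implemented.

```python
import argparse
import io
import sys

def parse_query(argv, stdin=None):
    """Take the query from the positional argument, else read it from stdin."""
    stdin = sys.stdin if stdin is None else stdin
    parser = argparse.ArgumentParser(prog='sketch')
    parser.add_argument('query', nargs='?', default=None)
    parser.add_argument('--provider', default='local')
    args = parser.parse_args(argv)
    if args.query is not None:
        query = args.query
    else:
        query = stdin.read().strip()  # piped input, e.g. from echo
    return query, args.provider

# Simulate: echo "Does Jupiter have a solid core?" | sketch --provider anthropic
piped = io.StringIO('Does Jupiter have a solid core?\n')
query, provider = parse_query(['--provider', 'anthropic'], stdin=piped)
```

The stdin parameter exists only so the sketch can be exercised with an in-memory stream instead of a real pipe.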
Chat Interface
The Chat interface is a simple wrapper for a conversation history. It can be used to chat with an LLM provider or to simply maintain a conversation history for your bot. It has the usual reply, stream, and stream_async functions, and calling it directly will map to reply.
chat = oneping.Chat(provider='anthropic', system=system)
reply1 = chat(query1)
reply2 = chat(query2)
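To show what "a simple wrapper for a conversation history" can look like, here is a toy version with a pluggable backend. The HistoryChat class and echo_backend function are hypothetical and only illustrate the idea. They are not oneping's Chat, whose internals may differ.

```python
class HistoryChat:
    """Toy history-tracking wrapper (illustrative, not oneping's Chat)."""

    def __init__(self, reply_fn, system=None):
        self.reply_fn = reply_fn  # any callable taking (query, system, history)
        self.system = system
        self.history = []

    def reply(self, query):
        # send a copy of the prior turns along with the new query
        answer = self.reply_fn(query, system=self.system,
                               history=list(self.history))
        self.history.append({'role': 'user', 'content': query})
        self.history.append({'role': 'assistant', 'content': answer})
        return answer

    def __call__(self, query):
        # calling the object directly maps to reply
        return self.reply(query)

def echo_backend(query, system=None, history=None):
    """Offline stand-in for an LLM call: reports the turn number."""
    turn = len(history or []) // 2 + 1
    return f'echo {turn}: {query}'

chat = HistoryChat(echo_backend, system='Be brief.')
first = chat('hello')
second = chat('again')
```

After the two calls, chat.history holds four messages, alternating between user and assistant.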
There is also a textual powered console interface and a fasthtml powered web interface. You can call these with oneping console or oneping web.
Server
The server module includes a simple function to start a llama-cpp-python server on the fly (oneping.server.start in Python or oneping server from the command line).
oneping server <path-to-gguf>
To run the server in embedding mode, pass the --embedding flag. You can also specify things like --host and --port or any options supported by llama-cpp-python.
Embeddings
Embedding queries are supported through the embed function. It accepts the relevant arguments from the reply function. Right now only the openai and local providers are supported.
vecs = oneping.embed(text, provider='openai')
and on the command line:
oneping embed "hello world" --provider openai
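A common next step with embeddings is comparing two vectors. Below is a small pure-Python cosine similarity sketch. It assumes each embedding is a plain sequence of floats, which may not match the exact return type of embed, so convert as needed. The cosine_similarity helper is hypothetical and not part of oneping.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (sequences of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)
```

Values near 1 indicate vectors pointing in nearly the same direction, while values near 0 indicate nearly orthogonal vectors.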