openai-to-sqlite
Save OpenAI API results to a SQLite database
This tool is under active development. It is not yet ready for production use.
Installation
Install this tool using pip:
pip install openai-to-sqlite
Usage
For help, run:
openai-to-sqlite --help
You can also use:
python -m openai_to_sqlite --help
Configuration
You will need an OpenAI API key to use this tool.
You can create one at https://beta.openai.com/account/api-keys
You can then either set the API key as an environment variable:
export OPENAI_API_KEY=sk-...
Or pass it to each command using the --token sk-... option.
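Which of the two sources wins when both are set is not stated here. As a minimal sketch of one plausible lookup, assuming an explicitly passed token takes precedence over the environment variable (the function name `resolve_token` is hypothetical and is not part of this tool):

```python
import os


def resolve_token(token_option=None, environ=None):
    """Return the API key to use, or raise if none is configured.

    Assumption: an explicitly passed token takes precedence over the
    OPENAI_API_KEY environment variable.
    """
    environ = os.environ if environ is None else environ
    token = token_option or environ.get("OPENAI_API_KEY")
    if not token:
        raise ValueError("No API key: set OPENAI_API_KEY or pass --token")
    return token
```

Passing the environment in as a dictionary keeps the lookup easy to test without touching the real process environment.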
Embeddings
The first command supported by this tool is embeddings:
openai-to-sqlite embeddings --help
This command can be fed a CSV (or JSON or TSV) file full of content, and it will use the OpenAI API to generate embeddings for each row.
The first column of the CSV file will be treated as the content ID. Any other columns will be concatenated together and used as the text to be embedded.
These embeddings will then be saved as binary blobs in the embeddings table of a SQLite database.
Given a CSV file like this:
id,content
1,This is a test
2,This is another test
Embeddings can be stored like so:
openai-to-sqlite embeddings embeddings.db data.csv --csv
The --csv flag tells the tool that the input file is a CSV file. Without this it will attempt to guess.
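To make the row handling described above concrete, here is a rough sketch of how each CSV row could be split into a content ID and the text to embed. It is an illustration, not the tool's own code: the helper name `rows_to_id_and_text` is hypothetical, and joining the remaining columns with a single space is an assumption, since the separator is not documented here.

```python
import csv
import io

# The sample CSV from above, inlined so the sketch is self-contained.
SAMPLE_CSV = "id,content\n1,This is a test\n2,This is another test\n"


def rows_to_id_and_text(csv_text, separator=" "):
    """Yield (content_id, text) pairs from CSV text that has a header row.

    Assumption: columns after the first are joined with a single space.
    """
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    for row in reader:
        if not row:
            continue  # ignore blank lines
        yield row[0], separator.join(row[1:])


pairs = list(rows_to_id_and_text(SAMPLE_CSV))
```

For the sample file, `pairs` holds one `(id, text)` tuple per data row, with the ID kept as a string to match the TEXT primary key in the schema.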
The resulting schema looks like this:
CREATE TABLE [embeddings] (
   [id] TEXT PRIMARY KEY,
   [embedding] BLOB
);
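As a sketch of what a row in this table might contain, the following packs a short, made-up vector into bytes and stores it using Python's standard `sqlite3` module. The helper `encode_embedding` is hypothetical, and the use of native 32-bit floats is an assumption chosen to match the `"f"` format used when reading the data back.

```python
import sqlite3
import struct


def encode_embedding(vector):
    """Pack a sequence of floats into bytes as native 32-bit floats."""
    return struct.pack("f" * len(vector), *vector)


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE [embeddings] ([id] TEXT PRIMARY KEY, [embedding] BLOB)")

# A short made-up vector; real embeddings are much longer.
blob = encode_embedding([0.25, -0.5, 1.0])
db.execute("INSERT INTO embeddings (id, embedding) VALUES (?, ?)", ("1", blob))
db.commit()

# Read the blob back out of the table.
stored = db.execute(
    "SELECT embedding FROM embeddings WHERE id = ?", ("1",)
).fetchone()[0]
```

Each value takes 4 bytes, so the stored blob is four times as long as the vector has entries.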
The binary data can be extracted into a Python tuple of floating point numbers like this:
import struct

# binary_embedding is the bytes value read from the embedding column
vector = struct.unpack("f" * 1536, binary_embedding)
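The 1536 above is the number of values in each stored vector. If the vector length is not known in advance, it can be inferred from the blob size, since each 32-bit float occupies 4 bytes. A minimal sketch, with `decode_embedding` as a hypothetical helper name:

```python
import struct


def decode_embedding(blob):
    """Unpack a blob of native 32-bit floats into a list of Python floats.

    The number of values is inferred from the blob size (4 bytes per
    float) instead of being hard-coded.
    """
    if len(blob) % 4 != 0:
        raise ValueError("blob length is not a multiple of 4 bytes")
    count = len(blob) // 4
    return list(struct.unpack("f" * count, blob))


# Round-trip a made-up 3-value vector to show the decoding.
example_blob = struct.pack("f" * 3, 0.5, 1.5, -2.0)
decoded = decode_embedding(example_blob)
```

Returning a list rather than the tuple produced by `struct.unpack` is a convenience choice here; either works for further numeric processing.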
Search
Having saved the embeddings for content, you can run searches using the search command:
openai-to-sqlite search embeddings.db 'this is my search term'
The output will be a list of cosine similarity scores and content IDs:
% openai-to-sqlite search blog.db 'cool datasette demo'
0.843 7849
0.830 8036
0.828 8195
0.826 8098
0.818 8086
0.817 8171
0.816 8121
0.815 7860
0.815 7872
0.814 8169
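For readers unfamiliar with the scores in this output, cosine similarity is the dot product of two vectors divided by the product of their lengths, so values closer to 1 mean the vectors point in more similar directions. The sketch below uses that textbook definition to rank a few made-up vectors against a query. How the tool computes the scores internally is not shown here, and the function names `cosine_similarity` and `rank` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank(query_vector, stored):
    """Return (score, content_id) pairs sorted from most to least similar."""
    scored = [
        (cosine_similarity(query_vector, vector), content_id)
        for content_id, vector in stored.items()
    ]
    return sorted(scored, reverse=True)


# Made-up two-dimensional vectors standing in for real embeddings.
stored_vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
results = rank([1.0, 0.0], stored_vectors)
```

In this toy example the query matches "a" exactly, is partially aligned with "c", and is orthogonal to "b", so the results come back in that order.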
Development
To contribute to this tool, first check out the code. Then create a new virtual environment:
cd openai-to-sqlite
python -m venv venv
source venv/bin/activate
Now install the dependencies and test dependencies:
pip install -e '.[test]'
To run the tests:
pytest