Python implementation of the VXDF (Vector Exchange Data Format) for storing text, metadata and vector embeddings in a single portable file.
VXDF Python Library
VXDF (Vector eXchange Data Format) is an AI-native container for text, metadata and vector embeddings: portable, indexable and compressed. If you do RAG, semantic search or compliance audits, VXDF packs your data into one file that you build and inspect with one command (vxdf).
Quick-start
pip install "vxdf[zstd]"  # installs optional Zstandard support (quotes keep zsh from globbing the brackets)
python - << 'PY'
from vxdf import VXDFWriter, VXDFReader

# create a small file
data = [
    {"id": "1", "text": "hello", "vector": [0.1, 0.2]},
    {"id": "2", "text": "world", "vector": [0.3, 0.4]},
]
with VXDFWriter("demo.vxdf", embedding_dim=2, compression="zstd") as w:
    for chunk in data:
        w.add_chunk(chunk)

# read it back
reader = VXDFReader("demo.vxdf")
print(reader.get_chunk("2"))
PY
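Before writing, it can help to check that every record has the shape used in the quick-start. The sketch below is not part of the vxdf API: check_chunks is a hypothetical helper, and it assumes that each chunk needs the keys id, text and vector, that ids should be unique, and that embedding_dim is meant to equal the length of every vector.

```python
from typing import Iterable


def check_chunks(chunks: Iterable[dict], embedding_dim: int) -> list[str]:
    """Return human-readable problems; an empty list means the chunks look fine.

    Hypothetical helper, not part of vxdf. The rules below are assumptions
    drawn from the quick-start example, not from the format specification.
    """
    problems = []
    seen_ids = set()
    for i, chunk in enumerate(chunks):
        # Assumed required keys, taken from the quick-start records.
        missing = [key for key in ("id", "text", "vector") if key not in chunk]
        if missing:
            problems.append(f"chunk {i}: missing keys {missing}")
            continue
        # Assumed: ids are unique, since chunks are fetched by id.
        if chunk["id"] in seen_ids:
            problems.append(f"chunk {i}: duplicate id {chunk['id']!r}")
        seen_ids.add(chunk["id"])
        # Assumed: embedding_dim is the expected vector length.
        if len(chunk["vector"]) != embedding_dim:
            problems.append(
                f"chunk {i}: vector has {len(chunk['vector'])} values, "
                f"expected {embedding_dim}"
            )
    return problems
```

Calling check_chunks(data, embedding_dim=2) on the quick-start data returns an empty list, so the records can be passed on to the writer.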
Command-line
vxdf pack data.jsonl data.vxdf --compression zstd # create
vxdf info data.vxdf # header & stats
vxdf list data.vxdf | head # ids
vxdf get data.vxdf some-id > doc.json # extract
# Pipe stdin to stdout (auto-detects model, disables banner/progress)
cat report.txt | vxdf convert - - > report.vxdf
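The pack command above reads a JSONL file. As a rough sketch, such a file could be produced with the standard library alone. This assumes that each line holds one JSON object with the same id, text and vector keys as the quick-start chunks; the exact schema that vxdf pack expects is not stated here, and write_jsonl is a hypothetical helper name.

```python
import json
from pathlib import Path


def write_jsonl(records, path):
    """Write one JSON object per line (hypothetical helper, not part of vxdf)."""
    path = Path(path)
    with path.open("w", encoding="utf-8") as fh:
        for record in records:
            # ensure_ascii=False keeps non-ASCII text readable in the file.
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")
    return path


if __name__ == "__main__":
    # Record layout assumed to mirror the quick-start chunks.
    records = [
        {"id": "1", "text": "hello", "vector": [0.1, 0.2]},
        {"id": "2", "text": "world", "vector": [0.3, 0.4]},
    ]
    print(write_jsonl(records, "data.jsonl"))
```

The resulting data.jsonl can then be passed to vxdf pack as shown above.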
LangChain integration (preview)
from langchain_community.vectorstores import VXDF
vs = VXDF.from_vxdf("demo.vxdf")
See examples/langchain_integration.py for a minimal adapter.
Authentication
VXDF commands that interact with cloud services need credentials.
OpenAI embeddings
The client looks for an API key in this order (first match wins):
1. --openai-key CLI flag (e.g. vxdf convert my.pdf out.vxdf --model openai --openai-key sk-...)
2. OPENAI_API_KEY environment variable.
3. ~/.vxdf/config.toml under the [openai] table:
[openai]
api_key = "sk-..."
AWS (S3 URLs)
Uses the standard AWS credential chain provided by boto3 – environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY), the AWS CLI config, or an attached IAM role. Run aws configure if unsure.
GCP (gs:// URLs)
Relies on Application Default Credentials. Run gcloud auth application-default login or set the GOOGLE_APPLICATION_CREDENTIALS environment variable pointing at a JSON key file.
If credentials are missing, VXDF exits early with a clear message and a hint on how to configure them.
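A preflight check in the same spirit might look like the sketch below. It is an assumption-laden illustration, not how VXDF detects credentials: credential_hint is a hypothetical helper that only inspects environment variables, so a returned hint does not prove credentials are absent (they may come from the AWS CLI config, an IAM role, or a gcloud login).

```python
import os
from urllib.parse import urlparse


def credential_hint(url, env=None):
    """Return a setup hint for s3:// or gs:// URLs, or None.

    Hypothetical helper, not part of vxdf. It checks environment variables
    only, which is just one link of each provider's credential chain.
    """
    env = os.environ if env is None else env
    scheme = urlparse(url).scheme

    if scheme == "s3":
        if not (env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY")):
            return "No AWS keys in the environment; run `aws configure` or attach an IAM role."
    elif scheme == "gs":
        if not env.get("GOOGLE_APPLICATION_CREDENTIALS"):
            return (
                "GOOGLE_APPLICATION_CREDENTIALS is not set; "
                "run `gcloud auth application-default login`."
            )
    # Local paths and other schemes need no cloud credentials here.
    return None
```

Treating the result as a hint rather than a hard failure avoids rejecting setups that authenticate through another part of the chain.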
Shell completion
Install extra dependencies and activate once:
pip install "vxdf[completion]"
activate-global-python-argcomplete --user # bash/zsh/fish supported
Re-open your terminal and enjoy TAB-completion for vxdf sub-commands and options.
VXDF is BSD-3-licensed. Contributions welcome!
File details
Details for the file vxdf-0.1.3.tar.gz.
File metadata
- Download URL: vxdf-0.1.3.tar.gz
- Upload date:
- Size: 35.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4128e1af704bf5cce6288c4fe595aea71878fe6e1d04f90c7f64b096056e9e5c |
| MD5 | 5e7c2c58b5efcb36aca44e9c05a30b71 |
| BLAKE2b-256 | 207978c8fa8b0536c273c6dd5436e699eecab1aaf3abc8efcda3c6f81dd0b015 |
File details
Details for the file vxdf-0.1.3-py3-none-any.whl.
File metadata
- Download URL: vxdf-0.1.3-py3-none-any.whl
- Upload date:
- Size: 33.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 155c8f764acb7947383556ec568d4e39adb0f3a2df14da151dbbb59f602b26b0 |
| MD5 | 18cdfcfcdfb3249378003146b19173f4 |
| BLAKE2b-256 | a8be46c060da70f0219f64b76f40c8e15ade1251aadddeb4b94c46929c302760 |
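To check a downloaded file against the SHA256 digests listed above, a short standard-library sketch is enough. The helper names sha256_of and matches are illustrative, and the file is assumed to sit in the current directory.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in 1 MiB blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops once read() returns empty bytes.
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches(path, expected_hex):
    """True if the file's SHA256 equals the published digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, matches("vxdf-0.1.3.tar.gz", "4128e1af704bf5cce6288c4fe595aea71878fe6e1d04f90c7f64b096056e9e5c") should return True for an intact source distribution.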