Python SDK for HyperBinder - a neurosymbolic database for AI applications

HyperBinder Python SDK

A Python client for HyperBinder — the compositional semantic database that combines vector search, graph traversal, and SQL-like queries with per-field encoding strategies.

Installation

pip install hybi

This installs the HTTP-only Python SDK — enough to talk to a running HyperBinder server.
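To confirm the install worked before connecting to a server, a minimal sketch like the following can report the installed version. It uses only the standard library; the helper name `installed_version` is hypothetical and not part of the SDK.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="hybi"):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_version() or "hybi is not installed")
```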

Quick Start

from hybi import HyperBinder, RelationalTable, Field, Encoding
import pandas as pd

# Connect to a running HyperBinder server
hb = HyperBinder("http://localhost:8000")

# Sample data
df = pd.DataFrame({
    "id": ["1", "2", "3"],
    "category": ["AI", "Cloud", "Analytics"],
    "text": [
        "Artificial intelligence and machine learning solutions",
        "Cloud computing and infrastructure services",
        "Data analytics and business intelligence",
    ],
    "revenue": [5000000, 3000000, 2000000],
})

# Define a schema with per-field encoding
schema = RelationalTable(
    primary_key="id",
    columns={
        "id": Field(encoding=Encoding.EXACT),
        "category": Field(encoding=Encoding.EXACT),
        "text": Field(encoding=Encoding.SEMANTIC),
        "revenue": Field(encoding=Encoding.NUMERIC),
    },
)

# Ingest
result = hb.ingest(df, collection="companies", schema=schema, dim=384)
print(f"Ingested {result.rows_ingested} rows")

# Semantic search
results = hb.search("AI and machine learning", collection="companies", top_k=3)
for r in results:
    print(f"{r.data['text']}: {r.score:.3f}")

# SQL-like query
filtered = hb.select(
    collection="companies",
    where=[("revenue", ">", 2500000)],
    order_by=[("revenue", True)],  # True = descending
)
for row in filtered.rows:
    print(row)

# Hybrid query (semantic + filters)
results = hb.search(
    "cloud services",
    collection="companies",
    filters=[("revenue", ">", 2000000)],
    top_k=5,
)

Key Features

🎯 Per-Field Encoding Strategies

Unlike vector databases that encode entire documents into a single vector, HyperBinder lets you specify different encoding strategies for each field:

schema = RelationalTable(
    primary_key="product_id",
    columns={
        "product_id": Field(encoding=Encoding.EXACT),     # Hash-based exact match
        "category": Field(encoding=Encoding.EXACT),       # Categorical exact match
        "name": Field(encoding=Encoding.SEMANTIC),        # Embedding-based similarity
        "description": Field(encoding=Encoding.SEMANTIC), # Embedding-based similarity
        "price": Field(encoding=Encoding.NUMERIC),        # Numeric comparison
        "stock": Field(encoding=Encoding.NUMERIC),        # Numeric comparison
    },
)

This enables queries that blend matching types in one call:

# Find products semantically similar to "laptop computer"
# WHERE category exactly matches "Electronics" (not similar, exact)
# AND price is between 500 and 1500 (numeric range)
# AND stock > 0 (numeric comparison)
results = hb.search(
    "laptop computer",
    collection="products",
    filters=[
        ("category", "==", "Electronics"),
        ("price", ">=", 500),
        ("price", "<=", 1500),
        ("stock", ">", 0),
    ],
    top_k=10,
)

Exact match where you need it (IDs, categories)
Semantic search where you need it (descriptions, text)
Numeric comparisons where you need it (prices, counts)
All in one query, one database
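The numeric range in the query above is written as two separate filter tuples. A small client-side helper can build that pair. This is only a sketch: `between` is a hypothetical convenience function, not part of the SDK, and it assumes filters are plain `(field, operator, value)` tuples as in the examples here.

```python
def between(field, low, high):
    """Expand an inclusive range into two (field, operator, value) filter tuples."""
    if low > high:
        raise ValueError(f"low ({low}) must not exceed high ({high})")
    return [(field, ">=", low), (field, "<=", high)]


# Build the same filter list as the laptop example, with the range expanded inline.
filters = [
    ("category", "==", "Electronics"),
    *between("price", 500, 1500),
    ("stock", ">", 0),
]
print(filters)
```

The resulting list can be passed as the `filters` argument in the same way as a hand-written one.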

📊 Hybrid Queries (Semantic + Structured)

Combine semantic search with SQL-like filters:

# Semantic search with exact filters
results = hb.search(
    "machine learning research",
    collection="papers",
    filters=[
        ("year", ">=", "2020"),
        ("citations", ">", 1000),
        ("peer_reviewed", "==", "true"),
    ],
    top_k=10,
)

# Pure SQL-like query
result = hb.select(
    collection="papers",
    where=[
        ("author", "==", "Vaswani"),
        ("year", ">=", "2017"),
    ],
    order_by=[("citations", True)],  # True = descending
    limit=10,
)

Supported operators: =, ==, !=, <>, >, >=, <, <=
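To make the filter semantics concrete, the sketch below evaluates `(field, operator, value)` triples against plain Python dicts. It runs locally and says nothing about how the server evaluates filters. Treating `=` like `==` and `<>` like `!=` is an assumption based on the usual SQL reading, and `OPERATORS` and `matches` are hypothetical names.

```python
import operator

# Operator strings from the list above, mapped to Python comparisons.
# Assumption: "=" behaves like "==" and "<>" like "!=" (the usual SQL reading).
OPERATORS = {
    "=": operator.eq,
    "==": operator.eq,
    "!=": operator.ne,
    "<>": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def matches(row, filters):
    """Return True if the dict `row` satisfies every (field, op, value) triple."""
    for field, op, value in filters:
        if op not in OPERATORS:
            raise ValueError(f"unsupported operator: {op!r}")
        if not OPERATORS[op](row[field], value):
            return False
    return True


# Toy rows for illustration only.
rows = [
    {"name": "A", "price": 299.99, "category": "Electronics"},
    {"name": "B", "price": 19.99, "category": "Books"},
]
where = [("category", "==", "Electronics"), ("price", ">", 200)]
selected = [r["name"] for r in rows if matches(r, where)]
print(selected)
```

A check like this can also catch a mistyped operator on the client before a request is sent.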

Data Ingestion

Basic ingestion with a schema

Always define a schema with encoding types:

from hybi import HyperBinder, RelationalTable, Field, Encoding
import pandas as pd

hb = HyperBinder("http://localhost:8000")

df = pd.DataFrame({
    "id": ["1", "2", "3"],
    "name": ["Product A", "Product B", "Product C"],
    "category": ["Electronics", "Books", "Clothing"],
    "description": ["High-quality electronics", "Bestselling books", "Fashion items"],
    "price": [299.99, 19.99, 49.99],
})

schema = RelationalTable(
    primary_key="id",
    columns={
        "id": Field(encoding=Encoding.EXACT),
        "name": Field(encoding=Encoding.SEMANTIC),
        "category": Field(encoding=Encoding.EXACT),
        "description": Field(encoding=Encoding.SEMANTIC),
        "price": Field(encoding=Encoding.NUMERIC),
    },
)

result = hb.ingest(df, collection="products", schema=schema, dim=384)
print(f"Ingested {result.rows_ingested} rows")

Encoding types

Encoding	Use for	How it works	Example fields
`EXACT`	IDs, categories, tags	Hash-based exact match	`id`, `status`, `category`
`SEMANTIC`	Text, descriptions, titles	Embedding-based similarity	`title`, `description`, `content`
`NUMERIC`	Numbers, prices, counts	Numeric comparison	`price`, `quantity`, `rating`
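When writing a schema by hand for a wide table, a rough first pass over sample values can suggest which row of the table above applies. The heuristic below is an assumption made for illustration (numbers are numeric, multi-word strings are free text, everything else is a label). It is not HyperBinder's auto-detection, and `suggest_encoding` is a hypothetical helper that returns plain strings to be mapped onto `Encoding` members after review.

```python
from numbers import Number


def suggest_encoding(values, max_exact_words=2):
    """Suggest "NUMERIC", "SEMANTIC", or "EXACT" for a column's sample values."""
    values = [v for v in values if v is not None]
    if not values:
        return "EXACT"
    # bool is a Number subclass in Python, so exclude it explicitly.
    if all(isinstance(v, Number) and not isinstance(v, bool) for v in values):
        return "NUMERIC"
    # Multi-word strings look like free text; short tokens look like labels or IDs.
    longest = max(len(str(v).split()) for v in values)
    return "SEMANTIC" if longest > max_exact_words else "EXACT"


# Toy column samples for illustration only.
columns = {
    "id": ["1", "2", "3"],
    "description": ["High-quality electronics for the home", "Bestselling books"],
    "price": [299.99, 19.99, 49.99],
}
print({name: suggest_encoding(vals) for name, vals in columns.items()})
```

Short free-text fields such as two-word product names fall under the label rule here, so the suggestions are a starting point to correct, not a final schema.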

Without a schema

If you don't provide a schema, HyperBinder auto-detects an encoding for each column. The detected encoding may not match how you intend to query a field, so results may be suboptimal:

# Not recommended — auto-detection may not choose the optimal encoding
result = hb.ingest(df, collection="products", dim=384)

Searching

Semantic search

results = hb.search("laptop computers", collection="products", top_k=5)
for r in results:
    print(f"Score: {r.score:.3f}")
    print(f"Name:  {r.data['name']}")
    print(f"Desc:  {r.data['description']}")

Hybrid: semantic + filters

results = hb.search(
    "artificial intelligence",
    collection="products",
    filters=[
        ("category", "==", "Electronics"),
        ("price", ">=", 100),
        ("price", "<=", 500),
        ("in_stock", "==", "true"),
    ],
    top_k=10,
)

Pure SQL-like

result = hb.select(
    collection="products",
    columns=["name", "price", "category"],
    where=[
        ("category", "==", "Electronics"),
        ("price", ">", 200),
    ],
    order_by=[("price", True)],  # True = descending
    limit=10,
)
for row in result.rows:
    print(row)

Collection management

products = hb.collection("products")
if products.exists():
    print(f"Collection has {products.count()} rows")

stats = products.stats()
print(f"Columns:   {stats.columns}")
print(f"Dimension: {stats.dimension}")

for coll in hb.list_collections():
    print(f"{coll.name}: {coll.rows} rows")

# Delete all rows but keep the collection structure.
# Note: the fluent Collection.truncate() form ships with PR
# feat/namespace-row-counts. Until that lands on master, use the
# equivalent hb.truncate(collection="products") instead.
products.truncate()

# Delete the entire collection
products.delete()

Advanced features

Multi-hop graph traversal

results = hb.multihop(
    collection="knowledge_graph",
    start={"entity": "Albert Einstein"},
    hops=[("discovered", "theory"), ("influenced", "scientist")],
    top_k=10,
)
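For readers new to the idea, the toy below shows what multi-hop traversal means in general: start from one entity and follow a sequence of relations, one hop at a time. It works on an in-memory list of triples and is only an illustration. It does not describe how `hb.multihop` interprets its `hops` tuples or how the server executes the query, and the function and data names are hypothetical.

```python
def follow_relations(triples, start, relations):
    """Follow `relations` in order from `start`; return the set of entities reached."""
    frontier = {start}
    for relation in relations:
        # Keep only the objects reachable from the current frontier via this relation.
        frontier = {
            obj
            for subj, rel, obj in triples
            if subj in frontier and rel == relation
        }
    return frontier


# Abstract toy graph: (subject, relation, object).
triples = [
    ("a", "r1", "b"),
    ("a", "r1", "c"),
    ("b", "r2", "d"),
    ("c", "r2", "e"),
    ("c", "r3", "f"),
]
print(follow_relations(triples, "a", ["r1", "r2"]))
```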

RAG context assembly

context = hb.get_context(
    "What are the latest AI developments?",
    collection="research_papers",
    top_k=5,
)

prompt = f"""Context: {context.text}

Question: What are the latest AI developments?
Answer:"""
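The snippet above repeats the question in two places. A small helper keeps the question and the prompt template in one spot. This is a sketch: `build_prompt` is a hypothetical function, and the context string here is a placeholder standing in for `context.text`.

```python
def build_prompt(context_text, question):
    """Place retrieved context ahead of the question in a simple prompt template."""
    return f"Context: {context_text.strip()}\n\nQuestion: {question.strip()}\nAnswer:"


question = "What are the latest AI developments?"
# In real use, the first argument would be the text returned by hb.get_context(...).
prompt = build_prompt("<retrieved context goes here>", question)
print(prompt)
```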

Aggregations

result = hb.aggregate(
    collection="sales",
    group_by=["region", "product_type"],
    aggregations=[
        ("revenue", "sum", "total_revenue"),
        ("orders", "count", "order_count"),
        ("revenue", "avg", "avg_order"),
    ],
    order_by=["total_revenue"],
)

for group in result.groups:
    print(f"{group['region']}: ${group['total_revenue']:,.2f}")
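Each aggregation above is a `(column, function, alias)` triple. The local sketch below shows what such triples compute for `sum`, `count`, and `avg` over grouped rows. It is a plain-Python illustration on made-up toy data, not the server's implementation, and `aggregate_rows` is a hypothetical name.

```python
from collections import defaultdict


def aggregate_rows(rows, group_by, aggregations):
    """Group dict rows by `group_by` columns and apply (column, func, alias) triples."""
    funcs = {
        "sum": sum,
        "count": len,
        "avg": lambda xs: sum(xs) / len(xs),
    }
    buckets = defaultdict(list)
    for row in rows:
        buckets[tuple(row[c] for c in group_by)].append(row)

    groups = []
    for key, members in buckets.items():
        group = dict(zip(group_by, key))
        for column, func, alias in aggregations:
            group[alias] = funcs[func]([m[column] for m in members])
        groups.append(group)
    return groups


# Toy data for illustration only.
rows = [
    {"region": "EU", "revenue": 100.0},
    {"region": "EU", "revenue": 50.0},
    {"region": "US", "revenue": 70.0},
]
groups = aggregate_rows(
    rows,
    group_by=["region"],
    aggregations=[
        ("revenue", "sum", "total_revenue"),
        ("revenue", "count", "order_count"),
        ("revenue", "avg", "avg_revenue"),
    ],
)
for group in groups:
    print(group)
```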

Common issues

Search returns zero results

Make sure you ingested with a schema, not just the raw DataFrame.
Confirm the collection has rows: hb.collection("products").count().

Duplicate results after re-ingest

Clear the collection before re-ingesting:

hb.collection("products").truncate()  # keep schema, drop rows; ships with feat/namespace-row-counts
# or
hb.collection("products").delete()    # drop everything
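The clear-then-ingest step can be wrapped in one function so it is not forgotten. The sketch below uses only calls that appear in this README: `collection(...).exists()`, `hb.truncate(collection=...)`, and `hb.ingest(...)`. The wrapper name `reingest` is hypothetical, and the function takes the client as an argument so it can be exercised without a running server.

```python
def reingest(hb, df, collection, schema, dim=384):
    """Clear existing rows, if any, then ingest `df` into `collection`."""
    if hb.collection(collection).exists():
        # hb.truncate(collection=...) keeps the collection structure and drops rows.
        hb.truncate(collection=collection)
    return hb.ingest(df, collection=collection, schema=schema, dim=dim)
```

Calling `reingest(hb, df, "products", schema)` in place of a bare `hb.ingest(...)` keeps repeated loads from stacking rows on top of earlier ones.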

Quick reference

from hybi import HyperBinder, RelationalTable, Field, Encoding

hb = HyperBinder("http://localhost:8000")

# Schema
schema = RelationalTable(
    primary_key="id",
    columns={
        "id": Field(encoding=Encoding.EXACT),
        "text": Field(encoding=Encoding.SEMANTIC),
        "category": Field(encoding=Encoding.EXACT),
        "price": Field(encoding=Encoding.NUMERIC),
    },
)

# Ingest
hb.ingest(df, collection="data", schema=schema, dim=384)

# Search
results = hb.search("query", collection="data", top_k=10)

# Hybrid search
results = hb.search(
    "query",
    collection="data",
    filters=[("category", "==", "value"), ("price", ">", 100)],
    top_k=10,
)

# SQL-like
result = hb.select(collection="data", where=[...], order_by=[...])

# Collection management
hb.collection("data").exists()
hb.collection("data").count()
hb.collection("data").truncate()  # ships with feat/namespace-row-counts
hb.collection("data").delete()

Contributing

See the Contributing Guide for details.

License

MIT License — see LICENSE for details.

Release history

0.1.1 (this version): May 19, 2026

0.1.0: Apr 29, 2026

Download files

Source Distributions

No source distribution files are available for this release.

Built Distribution

hybi-0.1.1-py3-none-any.whl (329.4 kB view details)

Uploaded May 19, 2026 Python 3

File details

Details for the file hybi-0.1.1-py3-none-any.whl.

File metadata

Download URL: hybi-0.1.1-py3-none-any.whl
Upload date: May 19, 2026
Size: 329.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for hybi-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`96dc256022d74ae7d05a91cb0cccfd8cd561bf25f91a21540b20a39f7748e223`
MD5	`f9716aea2cecd9ca05dd835482c3ea4c`
BLAKE2b-256	`069ab9efb0a6bea0eab56acfaf360e62e756f6bd7ecee49b9ead24a434460fd2`
