context

A filesystem-like interface for ChromaDB with semantic search.

Installation

pip install context

On Python versions before 3.11 (where tomllib is not yet in the standard library), install the toml extra so the config file can be read:

pip install "context[toml]"

Quick Start

from context import connect

# Connect to a collection (uses ~/.config/cvfs/config.toml if available)
fs = connect("my_docs")

# Write files (like Python's built-in open)
with fs.open("/docs/readme.md", "w") as f:
    f.write("# Hello World\n")
    f.write("This is my document.\n")

# Read files
content = fs.read("/docs/readme.md")

# List directory contents
files = fs.listdir("/docs")  # ['readme.md']

# Glob pattern matching
md_files = fs.glob("**/*.md")  # ['/docs/readme.md']

# Check if file exists
if fs.exists("/docs/readme.md"):
    print("File exists!")

# Remove files
fs.remove("/docs/readme.md")

Features

Familiar Python Interface

context provides a filesystem-like API that feels natural to Python developers:

context                    Python equivalent
fs.open(path, mode)        open(path, mode)
fs.read(path)              Path(path).read_text()
fs.write(path, content)    Path(path).write_text(content)
fs.exists(path)            os.path.exists(path)
fs.remove(path)            os.remove(path)
fs.listdir(path)           os.listdir(path)
fs.walk(path)              os.walk(path)
fs.glob(pattern)           glob.glob(pattern)
fs.stat(path)              os.stat(path)

Semantic Search

Pass an embedder function (text in, embedding vector out) to FileSystem to enable semantic search:

from __future__ import annotations  # keeps list[float] valid on Python 3.8

from context import FileSystem

def my_embedder(text: str) -> list[float]:
    # Your embedding function here
    # Returns a 384-dimensional vector
    ...

fs = FileSystem("my_docs", embedder=my_embedder)

# Write some documents
fs.write("/ml/intro.md", "Machine learning is a subset of AI...")
fs.write("/ml/neural.md", "Neural networks are inspired by the brain...")

# Semantic search
results = fs.search("artificial intelligence", k=5)
for r in results:
    print(f"{r.path}: {r.score:.3f}")
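
To make the embedder contract concrete, here is a minimal, self-contained sketch of a function with the same signature (str in, list of floats out) and a cosine-similarity ranking over an in-memory dict. It does not use context or ChromaDB; toy_embedder, rank, and DIM are hypothetical names, and the hashing trick is a stand-in for a real embedding model. It matches shared words only, not meaning, so it is suitable for wiring tests rather than real search.

```python
from __future__ import annotations

import hashlib
import math
import re

DIM = 384  # same dimensionality as the example embedder above


def toy_embedder(text: str) -> list[float]:
    """Hash each token into one of DIM buckets, then L2-normalize the counts."""
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def rank(query: str, docs: dict[str, str], k: int = 5) -> list[tuple[str, float]]:
    """Return up to k (path, score) pairs, highest cosine similarity first."""
    q = toy_embedder(query)
    scored = [
        # Vectors are unit length, so the dot product is the cosine similarity.
        (path, sum(a * b for a, b in zip(q, toy_embedder(body))))
        for path, body in docs.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


docs = {
    "/ml/intro.md": "Machine learning is a subset of AI...",
    "/ml/neural.md": "Neural networks are inspired by the brain...",
}
for path, score in rank("machine learning", docs):
    print(f"{path}: {score:.3f}")
```

Because the toy embedder only counts tokens, a query such as "artificial intelligence" would not match the "AI" in the first document; a real embedding model is what makes the search semantic.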

Configuration

context reads ChromaDB connection settings from ~/.config/cvfs/config.toml (Linux) or ~/Library/Application Support/cvfs/config.toml (macOS):

remote_url = "https://api.trychroma.com"
api_key = "your-api-key"
tenant = "your-tenant-id"
database = "your-database"

Or configure programmatically:

from context import FileSystem

fs = FileSystem(
    "my_collection",
    url="http://localhost:8000",
    tenant="default_tenant",
    database="default_database",
)
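
The config file lookup described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the package's actual implementation: config_path and load_config are hypothetical helpers, the tomli fallback is assumed to be what the toml extra provides, and any platform other than macOS is treated like Linux because the source only names those two.

```python
from __future__ import annotations

import sys
from pathlib import Path

try:
    import tomllib  # standard library on Python 3.11+
except ModuleNotFoundError:
    try:
        import tomli as tomllib  # backport (assumed to come from the toml extra)
    except ModuleNotFoundError:
        tomllib = None  # no TOML parser available


def config_path(platform: str = sys.platform, home: Path | None = None) -> Path:
    """Return the config file location for the given platform."""
    home = home if home is not None else Path.home()
    if platform == "darwin":
        return home / "Library" / "Application Support" / "cvfs" / "config.toml"
    # Assumption: everything else uses the Linux location.
    return home / ".config" / "cvfs" / "config.toml"


def load_config(path: Path) -> dict:
    """Parse the TOML file, or return {} if it is missing or cannot be parsed."""
    if tomllib is None or not path.is_file():
        return {}
    with path.open("rb") as fh:  # tomllib requires a binary file handle
        return tomllib.load(fh)


settings = load_config(config_path())
print(sorted(settings))  # e.g. the keys shown in the config example, if the file exists
```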

API Reference

FileSystem

The main interface for interacting with ChromaDB as a filesystem.

from typing import Callable, List, Optional

class FileSystem:
    def __init__(
        self,
        collection: str,                    # Collection name
        url: Optional[str] = None,          # ChromaDB URL
        tenant: Optional[str] = None,       # Tenant ID
        database: Optional[str] = None,     # Database name
        api_key: Optional[str] = None,      # API key
        auto_create: bool = True,           # Create collection if missing
        embedder: Optional[Callable[[str], List[float]]] = None,  # Embedding function
    ): ...

File Operations

# Open file (returns file-like object)
with fs.open("/path", "w") as f:
    f.write("content")

# Direct read/write
content = fs.read("/path")
fs.write("/path", "content")
fs.append("/path", "more content")
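
One way a file-like object of this kind can work is to buffer writes in memory and commit the full text when the file is closed. The sketch below illustrates that pattern with io.StringIO and a plain dict standing in for the collection; BufferedDocument is a hypothetical class, and nothing here is taken from the package's source.

```python
import io


class BufferedDocument(io.StringIO):
    """In-memory text buffer that saves its contents to a store on close."""

    def __init__(self, store: dict, path: str, mode: str = "w"):
        self._store, self._path = store, path
        # Append mode starts from the existing text; write mode starts empty.
        initial = store.get(path, "") if "a" in mode else ""
        super().__init__(initial)
        if "a" in mode:
            self.seek(0, io.SEEK_END)  # continue writing after existing text

    def close(self):
        if not self.closed:
            # Commit once, when the with-block exits or close() is called.
            self._store[self._path] = self.getvalue()
        super().close()


store = {}  # stand-in for the backing collection

with BufferedDocument(store, "/docs/readme.md", "w") as f:
    f.write("# Hello World\n")
    f.write("This is my document.\n")

with BufferedDocument(store, "/docs/readme.md", "a") as f:
    f.write("more content")

print(store["/docs/readme.md"])
```

Committing on close means a document is stored once per with-block rather than once per write call, which suits a backend where each update replaces the whole document.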

Directory Operations

fs.listdir("/")           # List directory contents
fs.exists("/path")        # Check if file exists
fs.remove("/path")        # Delete file
fs.stat("/path")          # Get file statistics

# Walk directory tree
for dirpath, dirs, files in fs.walk("/"):
    print(dirpath, files)

# Glob pattern matching
fs.glob("**/*.md")        # Find all markdown files
fs.glob("/docs/*.txt")    # Find txt files in /docs
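
Directory-style operations over a document store can be derived from nothing more than a flat set of absolute paths. The sketch below shows one way to do that; listdir, walk, and glob_paths are hypothetical standalone functions (not the package's methods), and the glob semantics, where ** crosses directory separators and * does not, are an assumption chosen to match the examples above.

```python
import re


def listdir(paths, directory):
    """Names of the immediate children of directory, files and subdirectories."""
    prefix = directory.rstrip("/") + "/"
    names = {p[len(prefix):].split("/", 1)[0] for p in paths if p.startswith(prefix)}
    return sorted(n for n in names if n)


def walk(paths, top="/"):
    """Yield (dirpath, dirnames, filenames) top-down, like os.walk."""
    prefix = top.rstrip("/") + "/"
    dirs, files = set(), set()
    for p in paths:
        if p.startswith(prefix):
            head, sep, _ = p[len(prefix):].partition("/")
            if head:
                (dirs if sep else files).add(head)
    yield top, sorted(dirs), sorted(files)
    for d in sorted(dirs):
        yield from walk(paths, prefix + d)


def glob_paths(paths, pattern):
    """Match paths against a pattern where '**/' spans any number of directories."""
    parts, i = [], 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append("(?:.*/)?")
            i += 3
        elif pattern[i] == "*":
            parts.append("[^/]*")
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    regex = re.compile("".join(parts) + r"\Z")
    return sorted(p for p in paths if regex.match(p))


paths = ["/docs/readme.md", "/docs/api/index.md", "/docs/notes.txt", "/todo.txt"]
print(listdir(paths, "/docs"))
print(glob_paths(paths, "**/*.md"))
for dirpath, dirs, files in walk(paths):
    print(dirpath, dirs, files)
```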

Search Operations

# Semantic search (requires embedder)
results = fs.search("query", k=10)
results = fs.search("query", path_pattern="**/*.md")

# Text search (requires Chroma embedding function)
results = fs.search_text("query", k=10)

connect()

Convenience function to create a FileSystem:

from context import connect

fs = connect("my_collection")
fs = connect("my_collection", url="http://localhost:8000")

Requirements

  • Python 3.8+
  • ChromaDB server (local or cloud)

License

MIT
