A fast, ergonomic workspace in the cloud for AI agents — backed by Hugging Face Storage Buckets.

membase

A fast, ergonomic workspace in the cloud for AI agents.

membase gives AI agents a filesystem interface backed by Hugging Face Storage Buckets — designed from the ground up for tool-calling agents that need to read, write, search, and organize files in persistent cloud storage.

import membase as mb

ws = mb.Workspace("my-project")
ws.write("hello.txt", "Hello from membase.")
print(ws.read("hello.txt"))

How It Works

Under the hood, membase wraps Hugging Face's Storage Buckets — a mutable, non-versioned object store built on top of the Xet storage backend. Buckets behave like cloud directories with:

  • Chunk-level deduplication — similar files share storage automatically
  • Global addressing — every file has an hf://buckets/... URI
  • Fine-grained permissions — private by default, optionally public
  • No versioning overhead — mutations are immediate, no commits needed

membase wraps all of this behind a simple filesystem API so agents never deal with URIs, authentication plumbing, or SDK quirks directly.

Why membase?

AI agents already have great filesystem tools on local machines — Read, Write, Glob, Grep. But when agents need persistent, shareable, cloud-native storage, they get raw APIs designed for human developers: long bucket URIs, manual grep loops, no batching, no parallelism.

membase closes this gap with:

  • One line to start. ws = mb.Workspace("my-project") — auth is automatic, the bucket is created if needed.
  • Familiar vocabulary. read, write, ls, glob, grep, tree, mv, cp, rm — the same operations agents already know.
  • Parallel reads. Multi-file operations use 16 concurrent workers by default, which can make network-bound reads up to roughly 16x faster than reading sequentially.
  • Batched writes. write_many() sends any number of files in a single network call (~700ms whether it's 1 file or 200).
  • Token-efficient output. Short paths, compact tree views, bounded search results — designed to minimize context window cost.
  • Agent-friendly errors. Exceptions include suggestions for what to do next, so agents can self-correct.
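
The parallel-read idea above can be sketched with the standard library alone. This is a minimal illustration, not membase's implementation: `read_many_parallel` and `fake_store` are hypothetical names, and the in-memory dictionary stands in for a network read.

```python
from concurrent.futures import ThreadPoolExecutor


def read_many_parallel(read_one, paths, max_workers=16):
    """Read every path concurrently and return {path: content}.

    `read_one` is any callable that takes a path and returns its content.
    The default of 16 workers mirrors the figure quoted above (assumption:
    the real library may size or schedule its pool differently).
    """
    paths = list(paths)
    if not paths:
        return {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each path with its content.
        return dict(zip(paths, pool.map(read_one, paths)))


# Hypothetical stand-in for bucket storage.
fake_store = {"a.txt": "alpha", "b.txt": "beta"}
contents = read_many_parallel(fake_store.__getitem__, ["a.txt", "b.txt"])
print(contents)
```

Threads suit this workload because each read spends most of its time waiting on the network, so the speedup shrinks once the number of files drops below the worker count.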

Installation

pip install membase

Requires Python 3.9+ and a Hugging Face account.

Authentication

membase requires a Hugging Face API token with write permissions. Here's how to set it up:

Step 1: Create a token

  1. Go to huggingface.co/settings/tokens
  2. Click "Create new token"
  3. Choose "Write" access (required to create and modify workspaces)
  4. Copy the token — it starts with hf_

Step 2: Make the token available

There are three ways to provide your token. Pick whichever fits your workflow.

Option A — Environment variable (recommended for servers and CI):

export HF_TOKEN=hf_your_token_here

Option B — CLI login (recommended for local development):

pip install huggingface_hub
huggingface-cli login

This stores the token in ~/.cache/huggingface/token so you never have to set it again on this machine.

Option C — Pass it directly (recommended for scripts and agents):

import membase as mb

ws = mb.Workspace("my-project", token="hf_your_token_here")
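
One plausible way the three options could be combined is sketched below. The precedence shown (explicit argument, then `HF_TOKEN`, then the stored token file) is an assumption rather than documented membase behavior, and `resolve_token` is a hypothetical helper.

```python
import os
from pathlib import Path


def resolve_token(explicit=None, env=None, token_file=None):
    """Return the first token found, or None if no source provides one.

    Assumed order: explicit argument, HF_TOKEN environment variable,
    then the token file written by the CLI login.
    """
    env = os.environ if env is None else env
    if explicit:
        return explicit
    if env.get("HF_TOKEN"):
        return env["HF_TOKEN"]
    if token_file is None:
        # Location mentioned under Option B.
        token_file = Path.home() / ".cache" / "huggingface" / "token"
    token_file = Path(token_file)
    if token_file.is_file():
        return token_file.read_text(encoding="utf-8").strip() or None
    return None


print(resolve_token("hf_explicit", env={"HF_TOKEN": "hf_from_env"}))
```

For scripts that pass the token directly, reading it from the environment or a secrets manager at runtime avoids committing it to source control.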

Verifying your setup

import membase as mb

# This will create a test workspace (or connect to an existing one)
ws = mb.Workspace("hello-test")
ws.write("test.txt", "It works!")
print(ws.read("test.txt"))  # → "It works!"

If you see "It works!", you're all set. If you get an authentication error, double-check that your token has write access.

Quick Start

Create a workspace and write files

import membase as mb

ws = mb.Workspace("my-project")

# Write a single file (parent directories created automatically)
ws.write("src/main.py", "def main():\n    print('hello')\n")

# Write multiple files in one network call (~700ms total)
ws.write_many({
    "src/__init__.py": "",
    "src/utils.py": "def helper():\n    pass\n",
    "tests/test_main.py": "def test_main():\n    assert True\n",
})
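
To upload an existing local directory in one batch, the dictionary for `write_many()` can be built with `pathlib`. The sketch below only builds the payload; `collect_files` is a hypothetical helper, and it assumes every matched file is UTF-8 text.

```python
import tempfile
from pathlib import Path


def collect_files(root, pattern="**/*.py"):
    """Map workspace-style relative paths to file contents under `root`."""
    root = Path(root)
    return {
        # as_posix() yields forward slashes on every platform.
        p.relative_to(root).as_posix(): p.read_text(encoding="utf-8")
        for p in sorted(root.glob(pattern))
        if p.is_file()
    }


# Demonstrate on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "src").mkdir()
    (Path(tmp) / "src" / "utils.py").write_text(
        "def helper():\n    pass\n", encoding="utf-8"
    )
    payload = collect_files(tmp)

print(payload)
# With a workspace open, the payload would be passed as:
# ws.write_many(payload)
```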

Explore the workspace

# ASCII tree view
print(ws.tree())

# List a directory
for entry in ws.ls("src/"):
    print(entry)

# Find files by pattern
py_files = ws.glob("**/*.py")

# Walk the directory tree (like os.walk)
for dirpath, dirs, files in ws.walk():
    for f in files:
        print(f"{dirpath}/{f}" if dirpath else f)
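
For a sense of what a compact ASCII tree looks like, the sketch below renders one from a flat list of paths. `render_tree` is a hypothetical function written for illustration; the exact layout that `ws.tree()` prints may differ.

```python
def render_tree(paths):
    """Render slash-separated file paths as an ASCII tree."""
    # Build a nested dict: directories map to children, files to {}.
    root = {}
    for path in paths:
        node = root
        for part in path.split("/"):
            node = node.setdefault(part, {})

    lines = []

    def walk(node, prefix):
        names = sorted(node)
        for index, name in enumerate(names):
            last = index == len(names) - 1
            children = node[name]
            label = name + ("/" if children else "")
            lines.append(prefix + ("└── " if last else "├── ") + label)
            # Continue the vertical guide only while siblings remain.
            walk(children, prefix + ("    " if last else "│   "))

    walk(root, "")
    return "\n".join(lines)


tree = render_tree(["src/main.py", "src/utils.py", "README.md"])
print(tree)
```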

Search file contents

# Search across all Python files (parallel; up to ~16x faster than sequential)
results = ws.grep("def main", include="*.py")
for match in results:
    print(f"{match.path}:{match.line_number}: {match.line}")
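
The shape of the results can be illustrated with a small in-memory search. This is a sketch under assumptions: `Match` and `grep_files` are hypothetical, the pattern is treated as a regular expression, line numbers are 1-based, and `max_results` stands in for the bounded output mentioned earlier. Only the field names `path`, `line_number`, and `line` come from the example above.

```python
import re
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass
class Match:
    path: str
    line_number: int
    line: str


def grep_files(files, pattern, include="*", max_results=50):
    """Search {path: text} and return at most `max_results` matches."""
    regex = re.compile(pattern)
    matches = []
    for path in sorted(files):
        # Apply the include filter to the file name, not the full path.
        if not fnmatch(path.rsplit("/", 1)[-1], include):
            continue
        for number, line in enumerate(files[path].splitlines(), start=1):
            if regex.search(line):
                matches.append(Match(path, number, line))
                if len(matches) >= max_results:
                    return matches
    return matches


files = {
    "src/main.py": "import os\n\ndef main():\n    print('hello')\n",
    "notes.txt": "def main is documented here\n",
}
results = grep_files(files, r"def main", include="*.py")
for match in results:
    print(f"{match.path}:{match.line_number}: {match.line}")
```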

Read and edit files

# Read a file
content = ws.read("src/main.py")

# Read just the first 10 lines of a large file
head = ws.read("data/large.csv", head=10)

# Find-and-replace (no need to read the whole file into context)
ws.edit("src/main.py", old="print('hello')", new="print('goodbye')")
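
A find-and-replace edit is safest when the target text is unambiguous. The sketch below shows one possible rule, requiring `old` to occur exactly once; this rule and the `apply_edit` helper are assumptions for illustration, not a description of how `ws.edit()` behaves.

```python
def apply_edit(text, old, new):
    """Replace the single occurrence of `old` in `text` with `new`."""
    count = text.count(old)
    if count == 0:
        raise ValueError(f"{old!r} not found; re-read the file and retry.")
    if count > 1:
        raise ValueError(
            f"{old!r} occurs {count} times; include more surrounding context."
        )
    return text.replace(old, new, 1)


source = "def main():\n    print('hello')\n"
edited = apply_edit(source, "print('hello')", "print('goodbye')")
print(edited)
```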

Download files to local disk

# Download a single file for local processing
ws.download("data/results.csv", "/tmp/results.csv")

Discover existing workspaces

import membase as mb

# List all workspaces visible to your token
for workspace in mb.list_workspaces():
    print(workspace)

Use with pandas

import pandas as pd

# pandas reads HF bucket URIs natively
df = pd.read_csv(ws.url("data/train.csv"))

API Reference

Module-level

| Function | Description |
| --- | --- |
| mb.Workspace(name, ...) | Open or create a workspace |
| mb.list_workspaces(namespace=...) | List available workspaces |
| mb.Workspace.delete(name) | Permanently delete a workspace |

File I/O

| Method | Description |
| --- | --- |
| ws.read(path) | Read a file as string |
| ws.read(path, head=N) | Read first N lines |
| ws.read(path, binary=True) | Read as bytes |
| ws.read_many(paths) | Read multiple files in parallel |
| ws.write(path, content) | Create or overwrite a file |
| ws.write_many({path: content}) | Write multiple files (one network call) |
| ws.edit(path, old=..., new=...) | Find-and-replace within a file |
| ws.append(path, content) | Append to a file |
| ws.download(remote, local) | Download a file to local disk |

Exploration

| Method | Description |
| --- | --- |
| ws.ls(path) | List directory contents |
| ws.tree() | ASCII tree of the workspace |
| ws.walk() | Walk directory tree (like os.walk) |
| ws.glob(pattern) | Find files by pattern |
| ws.grep(pattern) | Search inside file contents |
| ws.exists(path) | Check if a path exists |
| ws.is_file(path) | Check if path is a file |
| ws.is_dir(path) | Check if path is a directory |
| ws.stat(path) | File metadata (size, type) |
| ws.du(path) | Total size in bytes |

File Operations

| Method | Description |
| --- | --- |
| ws.rm(path) | Delete a file or directory |
| ws.mv(src, dst) | Move or rename |
| ws.cp(src, dst) | Copy within workspace |

Workspace Management

| Method | Description |
| --- | --- |
| ws.info() | Workspace metadata (file count, size) |
| ws.sync() | Sync with local mirror |
| ws.invalidate() | Clear cached metadata |
| ws.url(path) | Get hf:// URI for interop |
| ws.fs | Access raw HfFileSystem |

Design Principles

  • Token-efficient. Every API surface minimizes context window cost.
  • Network-aware. Parallel reads, batched writes, aggressive caching.
  • One dependency. Only huggingface_hub is required.
  • Agent-first. Structured returns, compact repr, actionable errors.
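
As an illustration of an actionable error, the sketch below attaches a next-step suggestion to an exception so that an agent can retry without extra exploration. `WorkspaceError` and `not_found_error` are hypothetical names; membase's actual exception classes and messages may differ.

```python
import difflib


class WorkspaceError(Exception):
    """Exception that carries a machine-readable suggestion."""

    def __init__(self, message, suggestion=None):
        self.message = message
        self.suggestion = suggestion
        full = message if suggestion is None else f"{message} Suggestion: {suggestion}"
        super().__init__(full)


def not_found_error(path, known_paths):
    """Build a not-found error that proposes the closest known path."""
    close = difflib.get_close_matches(path, known_paths, n=1)
    if close:
        suggestion = f"Did you mean '{close[0]}'?"
    else:
        suggestion = "List the directory to see available files."
    return WorkspaceError(f"File not found: '{path}'.", suggestion)


err = not_found_error("src/mian.py", ["src/main.py", "src/utils.py"])
print(err)
```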

Troubleshooting

"Permission denied" or 401 error: Your token is missing, invalid, or lacks write access. Create a new token at huggingface.co/settings/tokens with Write permissions and confirm it is the token membase is using.

"Cannot reach Hugging Face" or connection timeout: Check your internet connection. If you're behind a proxy, set the HTTPS_PROXY environment variable.

Stale data after external modifications: If another agent or human modified the workspace externally, call ws.invalidate() to clear cached metadata.
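
Why cached metadata goes stale, and what invalidation fixes, can be shown with a toy cache. `CachedListing` is a hypothetical class written for this example and says nothing about how membase caches internally.

```python
class CachedListing:
    """Cache directory listings until invalidate() is called."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable: path -> iterable of entry names
        self._cache = {}

    def ls(self, path):
        if path not in self._cache:
            # Copy the result so later remote changes cannot leak in.
            self._cache[path] = list(self._fetch(path))
        return self._cache[path]

    def invalidate(self):
        self._cache.clear()


remote = {"src/": ["main.py"]}
listing = CachedListing(lambda path: remote.get(path, []))

first = listing.ls("src/")                # fetched and cached
remote["src/"] = ["main.py", "utils.py"]  # external modification
stale = listing.ls("src/")                # served from cache
listing.invalidate()
fresh = listing.ls("src/")                # fetched again
print(stale, fresh)
```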

Slow grep on large workspaces (200+ files): Use local mirror mode for repeated searches:

ws = mb.Workspace("large-project", mirror=True)
ws.sync()  # one-time ~1.3s sync
ws.grep("pattern")  # now searches locally in <1ms

Contributing

See CONTRIBUTING.md for development setup and guidelines.

Changelog

See CHANGELOG.md for release history.

License

MIT — see LICENSE for details.
