A minimal cloud file system for agents with semantic search capabilities

ClawBox

Open-source cloud file system for AI agents
Semantic search • Folders • File sharing • Self-hostable

License: MIT • Python 3.10+


What is ClawBox?

ClawBox is an open-source file storage platform built for AI agents. Upload files, search by meaning, organize into folders, and share with anyone — all via API or CLI.

For agents: Store files, search by meaning, organize into folders — all via API.

For humans: Upload files, get a share link, search across documents. Like a smarter Dropbox with an API-first design.

Key Features

Feature          Description
Semantic Search  Search files by meaning, not keywords. Powered by Gemini embeddings.
Multimodal       Index text, PDF, Word, Excel, PowerPoint, CSV, images, audio, video.
Virtual Folders  Organize files with paths like /docs/reports/q1.pdf.
File Sharing     Generate share links; anyone with the link can download.
Google Login     Sign in with Google for 10 GB storage (1 GB free without login).
Self-Hostable    Docker Compose and cloud-friendly deployment options for local or hosted setups.
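
The README does not show how the server resolves virtual folder paths, so the following is only a minimal sketch of one plausible approach. It compares whole path segments, so that /docsx is never mistaken for a child of /docs. The helper names split_path and in_folder are hypothetical and not part of ClawBox.

```python
from typing import List


def split_path(path: str) -> List[str]:
    """Split a virtual path into segments, ignoring empty and '.' parts."""
    return [part for part in path.split("/") if part not in ("", ".")]


def in_folder(file_path: str, folder: str, recursive: bool = False) -> bool:
    """Return True if file_path lives in folder.

    Non-recursive: the file must sit directly inside the folder.
    Recursive: the file may sit in any subfolder below it.
    """
    parent = split_path(file_path)[:-1]   # drop the file name itself
    target = split_path(folder)
    if recursive:
        # Segment-wise prefix test, not a raw string prefix test.
        return parent[: len(target)] == target
    return parent == target


if __name__ == "__main__":
    print(in_folder("/docs/reports/q1.pdf", "/docs/reports"))
    print(in_folder("/docs/reports/q1.pdf", "/docs", recursive=True))
```

Matching on segments rather than on string prefixes is the design choice worth noting here; whether ClawBox does the same is an assumption.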

Quick Start

Option 1: Docker

git clone https://github.com/Alfra-AI/clawbox.git
cd clawbox
cp .env.example .env       # Edit to add your Google API key for search
docker compose up -d
docker compose exec app alembic upgrade head

ClawBox is then available at http://localhost:8000.

Option 2: Use the hosted version

No setup needed — just visit clawbox.ink.

Option 3: CLI

pip install .                   # Run from the cloned repository root
clawbox init                    # Get a token
clawbox upload report.pdf       # Upload
clawbox search "quarterly revenue"  # Semantic search

To connect ClawBox to a coding or task agent, see ClawBoxSkill/SKILL.md.


API

All endpoints require an Authorization: Bearer <token> header, except the public ones (Auth: No in the tables below).

Core

Endpoint              Method  Auth  Description
POST /get_token       POST    No    Get a free token (1 GB storage)
POST /files/upload    POST    Yes   Upload a file (with optional path for folders)
GET /files            GET     Yes   List files (filter by folder, recursive)
GET /files/{id}       GET     Yes   Download a file
PATCH /files/{id}     PATCH   Yes   Move/rename a file
DELETE /files/{id}    DELETE  Yes   Delete a file
POST /search          POST    Yes   Semantic search across files
POST /files/embed     POST    Yes   Generate/retry embeddings
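
As a rough illustration of calling these endpoints from Python, the sketch below builds requests with the standard library only. The helpers build_request and send are hypothetical, the base URL is the local Docker default from the Quick Start, and the JSON body for /search (a "query" field) is an assumption, since the README does not document request or response schemas.

```python
import json
import urllib.request
from typing import Any, Optional

BASE_URL = "http://localhost:8000"  # local Docker default; adjust for other hosts


def build_request(method: str, path: str, token: Optional[str] = None,
                  payload: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a request for a ClawBox endpoint."""
    headers = {"Accept": "application/json"}
    data = None
    if token is not None:
        # Authenticated endpoints expect a bearer token.
        headers["Authorization"] = f"Bearer {token}"
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data,
                                  headers=headers, method=method)


def send(request: urllib.request.Request) -> Any:
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (requires a running server; field names are assumptions):
#   token_info = send(build_request("POST", "/get_token"))
#   hits = send(build_request("POST", "/search", token="<token>",
#                             payload={"query": "quarterly revenue"}))
```

Inspect the actual /get_token response before relying on any particular field name for the token.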

Sharing

Endpoint                Method  Auth  Description
POST /files/{id}/share  POST    Yes   Create a share link
GET /s/{code}           GET     No    Download via share link
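
Share links follow the /s/{code} route, rooted at the public URL configured in APP_URL. A minimal sketch of composing and parsing such links is shown here; share_url and share_code are hypothetical helper names, and the exact format of the code is not specified in this README.

```python
from urllib.parse import quote, unquote, urlsplit


def share_url(app_url: str, code: str) -> str:
    """Compose the public download URL for a share code."""
    return f"{app_url.rstrip('/')}/s/{quote(code, safe='')}"


def share_code(url: str) -> str:
    """Extract the share code from a /s/{code} URL, or raise ValueError."""
    path = urlsplit(url).path
    _, separator, code = path.rpartition("/s/")
    if not separator or not code or "/" in code:
        raise ValueError(f"not a share link: {url!r}")
    return unquote(code)


if __name__ == "__main__":
    link = share_url("http://localhost:8000/", "abc123")
    print(link)
    print(share_code(link))
```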

Auth

Endpoint          Method  Description
GET /auth/google  GET     Start Google OAuth login
GET /auth/me      GET     Get current user info

Supported File Formats

Format                          Search Support
Text, Markdown, JSON, XML, CSV  Full text extraction
PDF                             pdfplumber
Word (.docx)                    python-docx
Excel (.xlsx)                   openpyxl
PowerPoint (.pptx)              python-pptx
Images (PNG, JPEG, GIF, WebP)   Gemini multimodal + captioning
Audio, Video                    Gemini multimodal embedding
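
The table above amounts to a dispatch from file type to indexing method. The sketch below shows one way such a lookup could be written. It is not the project's embeddings.py: the function name indexing_method, the specific extensions chosen for text and image formats, and the MIME-based audio/video check are all assumptions made for illustration.

```python
import mimetypes
from pathlib import PurePosixPath
from typing import Optional

# Extension -> indexing method, following the table above.
# The exact extensions are assumptions; the README lists format names only.
INDEXERS = {
    **dict.fromkeys([".txt", ".md", ".json", ".xml", ".csv"], "full text extraction"),
    ".pdf": "pdfplumber",
    ".docx": "python-docx",
    ".xlsx": "openpyxl",
    ".pptx": "python-pptx",
    **dict.fromkeys([".png", ".jpg", ".jpeg", ".gif", ".webp"],
                    "Gemini multimodal + captioning"),
}


def indexing_method(filename: str) -> Optional[str]:
    """Return the indexing method for a file name, or None if unsupported."""
    suffix = PurePosixPath(filename).suffix.lower()
    if suffix in INDEXERS:
        return INDEXERS[suffix]
    # Audio and video are detected by guessed MIME type rather than extension.
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type and mime_type.split("/")[0] in ("audio", "video"):
        return "Gemini multimodal embedding"
    return None


if __name__ == "__main__":
    for name in ("/docs/reports/q1.pdf", "Slides.PPTX", "clip.mp4"):
        print(name, "->", indexing_method(name))
```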

Self-Hosting

Single Server

docker compose up -d
docker compose exec app alembic upgrade head

With MinIO (S3-Compatible Storage)

docker compose -f docker-compose.cluster.yml up -d
docker compose -f docker-compose.cluster.yml exec app alembic upgrade head

Any Cloud

ClawBox works with any S3-compatible storage. Set S3_ENDPOINT_URL:

Provider             S3_ENDPOINT_URL
AWS S3               (leave empty)
MinIO                http://minio:9000
GCS                  https://storage.googleapis.com
DigitalOcean Spaces  https://{region}.digitaloceanspaces.com
Cloudflare R2        https://{account_id}.r2.cloudflarestorage.com

See the Self-Hosting Guide for the full local, cluster, and cloud setup flow.
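
For scripted deployments, the endpoint templates above can be filled in programmatically. The following is a small sketch only: the provider keys ("aws", "minio", and so on) and the helper names s3_endpoint_url and env_line are hypothetical, while the URL templates are copied from the table.

```python
# Templates copied from the provider table; AWS S3 uses an empty value.
ENDPOINT_TEMPLATES = {
    "aws": "",
    "minio": "http://minio:9000",
    "gcs": "https://storage.googleapis.com",
    "digitalocean": "https://{region}.digitaloceanspaces.com",
    "r2": "https://{account_id}.r2.cloudflarestorage.com",
}


def s3_endpoint_url(provider: str, **params: str) -> str:
    """Fill in the S3_ENDPOINT_URL template for a provider."""
    if provider not in ENDPOINT_TEMPLATES:
        raise ValueError(f"unknown provider: {provider!r}")
    try:
        return ENDPOINT_TEMPLATES[provider].format(**params)
    except KeyError as missing:
        # A placeholder such as {region} was not supplied.
        raise ValueError(f"{provider} requires parameter {missing}") from None


def env_line(provider: str, **params: str) -> str:
    """Render the line to place in .env."""
    return f"S3_ENDPOINT_URL={s3_endpoint_url(provider, **params)}"


if __name__ == "__main__":
    print(env_line("digitalocean", region="nyc3"))
```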


Configuration

Variable            Default                Description
DATABASE_URL        postgresql://...       PostgreSQL connection
STORAGE_BACKEND     local                  local or s3
S3_ENDPOINT_URL     (empty)                S3-compatible endpoint
GOOGLE_API_KEY      (empty)                Gemini API key (enables search)
GOOGLE_CLIENT_ID    (empty)                Google OAuth (enables login)
SESSION_SECRET_KEY  change-me              Session signing (change in prod)
APP_URL             http://localhost:8000  Public URL for share links

See .env.example for all options.
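
To make the defaults above concrete, here is a hedged sketch of reading these variables from the environment. It is not the project's config.py; the Settings class, its property names, and load_settings are hypothetical. The DATABASE_URL default is elided in the table, so the sketch leaves it unset rather than guessing.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    database_url: Optional[str]   # default is elided in the README, so none is assumed
    storage_backend: str
    s3_endpoint_url: str
    google_api_key: str
    google_client_id: str
    session_secret_key: str
    app_url: str

    @property
    def search_enabled(self) -> bool:
        # Per the table, a Gemini API key is what enables search.
        return bool(self.google_api_key)

    @property
    def login_enabled(self) -> bool:
        return bool(self.google_client_id)

    @property
    def uses_default_secret(self) -> bool:
        # True means the placeholder secret was never changed.
        return self.session_secret_key == "change-me"


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    backend = env.get("STORAGE_BACKEND", "local")
    if backend not in ("local", "s3"):
        raise ValueError(f"STORAGE_BACKEND must be 'local' or 's3', got {backend!r}")
    return Settings(
        database_url=env.get("DATABASE_URL"),
        storage_backend=backend,
        s3_endpoint_url=env.get("S3_ENDPOINT_URL", ""),
        google_api_key=env.get("GOOGLE_API_KEY", ""),
        google_client_id=env.get("GOOGLE_CLIENT_ID", ""),
        session_secret_key=env.get("SESSION_SECRET_KEY", "change-me"),
        app_url=env.get("APP_URL", "http://localhost:8000"),
    )
```

A deployment script could refuse to start when uses_default_secret is true, which mirrors the "change in prod" note in the table.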


Architecture

┌─────────────────────────┐
│     Web UI / CLI        │
└────────────┬────────────┘
             │ HTTP API
┌────────────┴────────────┐
│     FastAPI Server      │
│  (stateless, scalable)  │
└────────────┬────────────┘
             │
      ┌──────┴──────┐
      │             │
┌─────┴──────┐ ┌────┴─────────────┐
│ PostgreSQL │ │ Object Storage   │
│ (pgvector) │ │ (Local/S3/MinIO) │
└────────────┘ └──────────────────┘
  • PostgreSQL + pgvector — metadata, users, embeddings, search
  • Object storage — file content (local filesystem, S3, MinIO, GCS, etc.)
  • Gemini — embeddings + multimodal indexing (optional)
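
Conceptually, semantic search compares a query embedding against stored file embeddings and returns the closest matches. The toy sketch below does this in memory with cosine similarity. It stands in for the pgvector query, which this README does not show, and the three-dimensional vectors are made-up placeholders rather than real Gemini embeddings. The function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query: Sequence[float], files: Dict[str, Sequence[float]],
         top_k: int = 3) -> List[str]:
    """Return up to top_k file paths, most similar to the query first."""
    scored = [(cosine_similarity(query, vector), path)
              for path, vector in files.items()]
    scored.sort(reverse=True)
    return [path for _, path in scored[:top_k]]


if __name__ == "__main__":
    # Placeholder vectors; real embeddings have far more dimensions.
    files = {
        "/docs/reports/q1.pdf": [0.9, 0.1, 0.0],
        "/notes/todo.md": [0.0, 0.2, 0.9],
    }
    print(rank([1.0, 0.0, 0.0], files, top_k=1))
```

In the architecture above this comparison would run inside PostgreSQL rather than in application code, which keeps the FastAPI server stateless.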

Project Structure

src/
├── main.py           # FastAPI app, routing, middleware
├── config.py         # Settings from environment
├── models.py         # SQLAlchemy models
├── auth.py           # Bearer token authentication
├── storage.py        # Storage backend (local/S3)
├── embeddings.py     # Gemini embeddings + text extraction
├── database.py       # Database connection
├── cli.py            # CLI tool (clawbox command)
├── routes/
│   ├── tokens.py     # Token management + settings
│   ├── files.py      # File CRUD + sharing + folders
│   ├── search.py     # Semantic search
│   └── oauth.py      # Google OAuth
└── static/
    └── index.html    # Web UI

Contributing

Contributions are welcome! Please open an issue or submit a PR.


License

MIT
