arkiv

Universal personal data format. JSONL in, SQL out, MCP to LLMs.

The Format

Every record is a single JSON object on its own line. All fields are optional.

{"mimetype": "text/plain", "content": "I think the key insight is...", "uri": "https://chatgpt.com/c/abc", "timestamp": "2023-05-14T10:30:00Z", "metadata": {"role": "user", "conversation_id": "abc"}}
{"mimetype": "audio/wav", "uri": "file://media/podcast.wav", "timestamp": "2024-01-15", "metadata": {"transcript": "Welcome to...", "duration": 45.2}}
{"mimetype": "image/jpeg", "uri": "file://media/photo.jpg", "metadata": {"caption": "My talk at MIT"}}
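As a rough illustration of the format, here is a minimal Python sketch that writes records like the ones above to a JSONL file and reads them back. The helper names `write_jsonl` and `read_jsonl` are hypothetical and not part of arkiv; only the standard library is used.

```python
import json
import os
import tempfile

# Example records mirroring the shapes shown above; every field is optional.
records = [
    {"mimetype": "text/plain", "content": "I think the key insight is...",
     "timestamp": "2023-05-14T10:30:00Z", "metadata": {"role": "user"}},
    {"mimetype": "image/jpeg", "uri": "file://media/photo.jpg",
     "metadata": {"caption": "My talk at MIT"}},
]


def write_jsonl(path, rows):
    """Write one compact JSON object per line (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Yield one dict per non-blank line (hypothetical helper)."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)


# Round-trip the records through a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.jsonl")
    write_jsonl(path, records)
    loaded = list(read_jsonl(path))
```

Because each line is an independent JSON object, files can be appended to, concatenated, or inspected with ordinary text tools.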

The Stack

JSONL files (canonical, portable, human-readable)
    ↓ arkiv import
SQLite database (queryable, efficient, standard SQL)
    ↓ arkiv mcp
MCP server (3 read-only tools, plus an optional write tool → any LLM)

Quick Start

pip install arkiv

# Import JSONL to SQLite
arkiv import conversations.jsonl --db archive.db

# Query
arkiv query archive.db "SELECT content FROM records WHERE metadata->>'role' = 'user' LIMIT 5"

# Serve to LLMs via MCP
arkiv mcp archive.db
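To make the import and query steps concrete, the sketch below loads JSONL lines into an in-memory SQLite database and runs the same kind of metadata filter. It is an assumption-laden illustration, not arkiv's implementation: the `records` table layout, the `import_lines` helper, and the second sample line are all made up for this example. It uses `json_extract`, which for simple string values behaves like the `->>` operator in the query above (that operator requires SQLite 3.38 or newer).

```python
import json
import sqlite3

# Hypothetical schema: columns mirror the record fields; arkiv's real schema may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS records (
    id        INTEGER PRIMARY KEY,
    mimetype  TEXT,
    content   TEXT,
    uri       TEXT,
    timestamp TEXT,
    metadata  TEXT  -- JSON object stored as text
)
"""


def import_lines(conn, lines):
    """Insert each non-blank JSONL line as one row; missing fields become NULL."""
    conn.execute(SCHEMA)
    count = 0
    for line in lines:
        if not line.strip():
            continue
        rec = json.loads(line)
        meta = rec.get("metadata")
        conn.execute(
            "INSERT INTO records (mimetype, content, uri, timestamp, metadata) "
            "VALUES (?, ?, ?, ?, ?)",
            (
                rec.get("mimetype"),
                rec.get("content"),
                rec.get("uri"),
                rec.get("timestamp"),
                json.dumps(meta) if meta is not None else None,
            ),
        )
        count += 1
    conn.commit()
    return count


# Illustrative input lines (the second one is invented for this example).
lines = [
    '{"mimetype": "text/plain", "content": "I think the key insight is...", "metadata": {"role": "user"}}',
    '{"mimetype": "text/plain", "content": "That makes sense.", "metadata": {"role": "assistant"}}',
    '{"mimetype": "image/jpeg", "uri": "file://media/photo.jpg"}',
]

conn = sqlite3.connect(":memory:")
imported = import_lines(conn, lines)

# Filter on a key inside the JSON metadata column.
rows = conn.execute(
    "SELECT content FROM records "
    "WHERE json_extract(metadata, '$.role') = 'user' LIMIT 5"
).fetchall()
```

Storing metadata as JSON text keeps the table schema fixed while still letting SQL reach into arbitrary per-record keys.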

MCP Tools

The server is read-only by default. Start it with arkiv mcp --writable db to enable the write tool, write_record.

Tool                     | Description                                              | Mode
get_manifest()           | What collections exist, their descriptions and schemas   | read-only
get_schema(collection?)  | What metadata keys can be queried                        | read-only
sql_query(query)         | Run read-only SQL                                        | read-only
write_record(...)        | Append a single record to a collection                   | writable
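One way a read-only SQL tool could be enforced is to open the database through SQLite's read-only URI mode, so that any write statement fails at the database layer. The sketch below shows that technique with Python's standard `sqlite3` module. Whether arkiv does this internally is not stated here; `open_read_only` is a hypothetical helper and the one-column table is a stand-in.

```python
import os
import sqlite3
import tempfile
from pathlib import Path


def open_read_only(db_path):
    """Open an existing SQLite file so that any write attempt fails (hypothetical helper)."""
    # mode=ro is a standard SQLite URI parameter; uri=True enables URI parsing.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "archive.db")

    # Build a small database with a normal read-write connection first.
    rw = sqlite3.connect(db_path)
    rw.execute("CREATE TABLE records (content TEXT)")
    rw.execute("INSERT INTO records VALUES ('hello')")
    rw.commit()
    rw.close()

    ro = open_read_only(db_path)

    # Reads work as usual.
    rows = ro.execute("SELECT content FROM records").fetchall()

    # Writes are rejected by SQLite itself.
    try:
        ro.execute("INSERT INTO records VALUES ('blocked')")
        write_blocked = False
    except sqlite3.OperationalError:
        write_blocked = True

    ro.close()
```

Enforcing the restriction at the connection level avoids having to parse or filter the SQL text that an LLM submits.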

Why

  • Your data lives in silos (ChatGPT, email, bookmarks, photos, voice memos)
  • Source toolkits (memex, mtk, btk, ptk, ebk) export it as JSONL
  • arkiv gives you one format, one database, one query interface
  • Any LLM can query it via MCP
  • JSONL is human-readable and durable. SQLite is the most widely deployed database engine in the world.

Spec

See SPEC.md for the full technical specification.
