Universal personal data format. JSONL in, SQL out, MCP to LLMs.

arkiv

The Format

Every record is a JSON object on its own line. All fields are optional.

{"mimetype": "text/plain", "content": "I think the key insight is...", "uri": "https://chatgpt.com/c/abc", "timestamp": "2023-05-14T10:30:00Z", "metadata": {"role": "user", "conversation_id": "abc"}}
{"mimetype": "audio/wav", "uri": "file://media/podcast.wav", "timestamp": "2024-01-15", "metadata": {"transcript": "Welcome to...", "duration": 45.2}}
{"mimetype": "image/jpeg", "uri": "file://media/photo.jpg", "metadata": {"caption": "My talk at MIT"}}
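Because every field is optional, a reader of this format should not assume any key is present. A minimal sketch in Python, using only the standard library; the helper name `read_records` is hypothetical and not part of arkiv:

```python
import io
import json

def read_records(lines):
    """Yield one dict per non-blank JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)

# Two sample records in the shape shown above, held in memory for the demo.
sample = io.StringIO(
    '{"mimetype": "text/plain", "content": "I think the key insight is...", '
    '"metadata": {"role": "user"}}\n'
    '{"mimetype": "image/jpeg", "uri": "file://media/photo.jpg", '
    '"metadata": {"caption": "My talk at MIT"}}\n'
)
records = list(read_records(sample))

# Every field is optional, so read with .get() rather than indexing.
contents = [r.get("content") for r in records]
```

Here the image record has no `content` key, so its entry in `contents` is `None` rather than an error.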

The Stack

JSONL files (canonical, portable, human-readable)
    ↓ arkiv import
SQLite database (queryable, efficient, standard SQL)
    ↓ arkiv mcp
MCP server (3 tools → any LLM)
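To make the first arrow concrete, here is a rough sketch of what loading JSONL into SQLite could look like with Python's built-in `sqlite3` module. The table layout and the `import_jsonl` helper are assumptions for illustration only; the actual schema that `arkiv import` creates is defined in SPEC.md and may differ.

```python
import json
import sqlite3

# Top-level record fields taken from the format examples.
FIELDS = ("mimetype", "content", "uri", "timestamp")

def import_jsonl(lines, conn):
    """Load JSONL lines into a hypothetical `records` table; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records "
        "(mimetype TEXT, content TEXT, uri TEXT, timestamp TEXT, metadata TEXT)"
    )
    rows = []
    for line in lines:
        if not line.strip():
            continue
        rec = json.loads(line)
        # Missing fields become NULL; metadata is stored as JSON text.
        rows.append(
            tuple(rec.get(f) for f in FIELDS)
            + (json.dumps(rec.get("metadata", {})),)
        )
    conn.executemany("INSERT INTO records VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)

conn = sqlite3.connect(":memory:")
lines = [
    '{"mimetype": "text/plain", "content": "hello", "metadata": {"role": "user"}}',
    '{"mimetype": "audio/wav", "uri": "file://media/podcast.wav", "metadata": {"duration": 45.2}}',
]
count = import_jsonl(lines, conn)

# json_extract needs SQLite's JSON functions, which most builds include.
user_rows = conn.execute(
    "SELECT content FROM records WHERE json_extract(metadata, '$.role') = 'user'"
).fetchall()
```

Storing `metadata` as JSON text keeps the table schema fixed while still letting SQL reach into arbitrary keys.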

Quick Start

pip install arkiv

# Import JSONL to SQLite
arkiv import conversations.jsonl --db archive.db

# Query
arkiv query archive.db "SELECT content FROM records WHERE metadata->>'role' = 'user' LIMIT 5"

# Serve to LLMs via MCP
arkiv mcp archive.db

MCP Tools

Tool                      Description
get_manifest()            What collections exist, their descriptions and schemas
get_schema(collection?)   What metadata keys can be queried
sql_query(query)          Run read-only SQL
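One common way to get read-only SQL is to open the SQLite file in read-only mode, so that writes are rejected by the database engine itself rather than by inspecting the query text. The sketch below shows that technique with Python's `sqlite3`; it is an assumption about how such a tool could be built, not a description of arkiv's implementation, and `open_read_only` is a hypothetical helper.

```python
import os
import sqlite3
import tempfile

def open_read_only(path):
    """Open an existing SQLite file so that writes fail at the engine level."""
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)

# Build a throwaway database to demonstrate against.
path = os.path.join(tempfile.mkdtemp(), "archive.db")
rw = sqlite3.connect(path)
rw.execute("CREATE TABLE records (content TEXT, metadata TEXT)")
rw.execute("INSERT INTO records VALUES ('hello', '{\"role\": \"user\"}')")
rw.commit()
rw.close()

ro = open_read_only(path)
rows = ro.execute("SELECT content FROM records").fetchall()

# Any write through the read-only connection raises OperationalError.
try:
    ro.execute("DELETE FROM records")
    write_blocked = False
except sqlite3.OperationalError:
    write_blocked = True
```

Reads succeed as usual, while the `DELETE` is refused, which is the property an LLM-facing query tool needs.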

Why

  • Your data lives in silos (ChatGPT, email, bookmarks, photos, voice memos)
  • Source toolkits (memex, mtk, btk, ptk, ebk) export it as JSONL
  • arkiv gives you one format, one database, one query interface
  • Any LLM can query it via MCP
  • JSONL is human-readable and durable. SQLite is the most widely deployed database engine in the world.

Spec

See SPEC.md for the full technical specification.

