
arkiv

Universal personal data format. JSONL in, SQL out, MCP to LLMs.

The Format

Every record is a JSON object. All fields are optional.

{"mimetype": "text/plain", "content": "I think the key insight is...", "uri": "https://chatgpt.com/c/abc", "timestamp": "2023-05-14T10:30:00Z", "metadata": {"role": "user", "conversation_id": "abc"}}
{"mimetype": "audio/wav", "uri": "file://media/podcast.wav", "timestamp": "2024-01-15", "metadata": {"transcript": "Welcome to...", "duration": 45.2}}
{"mimetype": "image/jpeg", "uri": "file://media/photo.jpg", "metadata": {"caption": "My talk at MIT"}}
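Because every field is optional, a reader has to tolerate missing keys on any record. A minimal sketch of parsing arkiv JSONL with the standard library (the field names follow the examples above; the normalization into a fixed-shape dict is just one reasonable convention, not part of the spec):

```python
import json

def read_records(lines):
    """Parse arkiv JSONL lines into dicts with a uniform shape.

    Missing fields become None (or {} for metadata), since the format
    makes every field optional.
    """
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        obj = json.loads(line)
        records.append({
            "mimetype": obj.get("mimetype"),
            "content": obj.get("content"),
            "uri": obj.get("uri"),
            "timestamp": obj.get("timestamp"),
            "metadata": obj.get("metadata", {}),
        })
    return records

sample = [
    '{"mimetype": "text/plain", "content": "hello", "metadata": {"role": "user"}}',
    '{"mimetype": "image/jpeg", "uri": "file://media/photo.jpg"}',
]
```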

The Stack

JSONL files (canonical, portable, human-readable)
    ↓ arkiv import
SQLite database (queryable, efficient, standard SQL)
    ↓ arkiv mcp
MCP server (3 tools → any LLM)
    ↑ arkiv export

Quick Start

pip install arkiv

# Import JSONL to SQLite
arkiv import conversations.jsonl --db archive.db

# Query
arkiv query archive.db "SELECT content FROM records WHERE metadata->>'role' = 'user' LIMIT 5"

# Export with temporal slicing
arkiv export archive.db --output 2024/ --since 2024-01-01 --until 2024-12-31

# Serve to LLMs via MCP
arkiv mcp archive.db
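Since the database is plain SQLite, you can also query it directly from Python without the CLI. A hedged sketch using only the standard library: the table and column names (`records`, `content`, `metadata`) follow the query example above, but the exact schema is defined in SPEC.md; `json_extract` is used instead of the `->>` operator for compatibility with SQLite builds older than 3.38.

```python
import sqlite3

def user_messages(db_path, limit=5):
    """Return the content of up to `limit` records whose metadata
    role is 'user', mirroring the `arkiv query` example above."""
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute(
            "SELECT content FROM records "
            "WHERE json_extract(metadata, '$.role') = ? LIMIT ?",
            ("user", limit),
        ).fetchall()
        return [row[0] for row in rows]
    finally:
        con.close()
```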

MCP Tools

The server is read-only by default; start it with arkiv mcp --writable archive.db to enable the write tool.

Tool                     Description                                        Mode
get_manifest()           List collections, their descriptions and schemas   read-only
get_schema(collection?)  List metadata keys that can be queried             read-only
sql_query(query)         Run read-only SQL                                  read-only
write_record(...)        Append a single record to a collection             writable
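MCP tool calls travel as JSON-RPC 2.0 messages. An illustrative sketch of the envelope a client sends to invoke a tool: the "tools/call" method and params shape come from the MCP specification, while the tool name and query here are just example values.

```python
import json

def make_tool_call(request_id, tool, arguments):
    """Build a JSON-RPC 2.0 request invoking an MCP tool by name."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

msg = make_tool_call(1, "sql_query", {"query": "SELECT COUNT(*) FROM records"})
```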

Why

  • Your data lives in silos (ChatGPT, email, bookmarks, photos, voice memos)
  • Source toolkits (memex, mtk, btk, ptk, ebk) export it as JSONL
  • arkiv gives you one format, one database, one query interface
  • Any LLM can query it via MCP
  • JSONL is human-readable and durable. SQLite is one of the most widely deployed database engines in the world.

Spec

See SPEC.md for the full technical specification.
