Universal MCP server for cross-format document CRUD (text, JSON, YAML, CSV, XML, DOCX, XLSX, PPTX, PDF) with versioning, a sandboxed multi-root workspace, search, batch operations, and optional semantic search.
Dokumen-Pintar
Universal MCP server for cross-format document CRUD, lint, and template authoring
Read, write, search, lint, and author text, Office, and PDF files from any AI agent that supports the Model Context Protocol.
Features
- Multi-root Sandbox: Define multiple workspace roots with per-root writable control. All paths outside the sandbox are rejected.
- 10 Formats: Plain text, Markdown, LaTeX, JSON / YAML, CSV / TSV, XML / SVG, DOCX, XLSX, PPTX, PDF, plus image EXIF.
- 62 MCP Tools: File and content CRUD, structured access, batch operations, search, versioning, metadata, authoring, image extraction, sections, templates, TOC, bibliography, document compare, and lint, all exposed as callable tools.
- Automatic Versioning: Copy-on-write snapshots on every write operation. Undo, diff, restore, and purge anytime.
- Structured Access: JSONPath for JSON / YAML (incl. list indices), XPath for XML, cell / range / sheet for XLSX, paragraph / paragraph_runs / table cells for DOCX, slide for PPTX, page for PDF.
- Authoring: Generate DOCX or PDF from a JSON spec or Markdown, render Jinja2-style DOCX templates, convert DOCX → Markdown.
- Document Lint: Pluggable rule registry with built-in presets (default, academic_id, academic_id_kp, academic_id_skripsi) for Indonesian academic documents.
- Indonesian Stemming (optional): Sastrawi-based morphological matching so mengatakan, berkata, and perkataan collapse during search.
- Semantic Search (optional): Vector search powered by sentence-transformers; enable via config.
- Audit Trail: Every mutation is logged to JSONL with timestamp and operation details.
- 2 Transports: stdio (Claude Desktop, Cursor, VS Code, Windsurf) and HTTP / SSE.
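The multi-root sandbox rule above (reject anything outside a root, honour per-root write control) can be illustrated with a short sketch. This is not the package's implementation: `SandboxError`, `resolve_in_sandbox`, and the `roots` mapping layout are hypothetical names chosen for the example, and only the standard library is used.

```python
from pathlib import Path


class SandboxError(Exception):
    """Raised when a path escapes its root or targets a read-only root."""


def resolve_in_sandbox(roots, root_name, relative, write=False):
    """Resolve `relative` inside the named root, rejecting escapes.

    `roots` maps a root name to a (directory, writable) pair. This layout
    is hypothetical and only mirrors the idea of per-root write control.
    """
    if root_name not in roots:
        raise SandboxError(f"unknown root: {root_name}")
    base, writable = roots[root_name]
    base = Path(base).resolve()
    # Resolving first collapses "../" segments and follows symlinks.
    target = (base / relative).resolve()
    # is_relative_to requires Python 3.9 or newer.
    if not target.is_relative_to(base):
        raise SandboxError(f"path escapes root {root_name!r}: {relative}")
    if write and not writable:
        raise SandboxError(f"root {root_name!r} is read-only")
    return target
```

Resolving before comparing is the important design choice here: a plain string prefix check would accept a path such as `docs/../../etc/passwd`.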
Supported Formats
| Format | Read | Write | Structured Query | Search |
|---|---|---|---|---|
| Plain text / Markdown / LaTeX | Y | Y | - | Y |
| JSON / JSONC / JSON5 | Y | Y | JSONPath $.key, $.array[N] | Y |
| YAML | Y | Y | JSONPath $.key | Y |
| CSV / TSV | Y | Y | row:N · col:NAME · cell:row:N,col:NAME | Y |
| XML / SVG | Y | Y | XPath //node | Y |
| DOCX | Y | Y | paragraph:N · paragraph_runs:N · table:N!A1 | Y |
| XLSX | Y | Y | cell:Sheet!A1 · range: · sheet: | Y |
| PPTX | Y | Y | slide:N · slide_title:N | Y |
| PDF | Y | - | page:N · outline · metadata | Y |
| Images (JPG / PNG / TIFF / WEBP) | Y | Y meta | EXIF tags | - |
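To make the JSONPath entries in the Structured Query column concrete, here is a minimal sketch of how selectors such as `$.key` and `$.array[N]` can be resolved against parsed JSON or YAML. It covers only that tiny subset and is an illustration, not the server's resolver; `get_path` is a hypothetical helper name.

```python
import re

# One path step: either ".key" or "[N]".
_TOKEN = re.compile(r"\.([A-Za-z_][A-Za-z0-9_]*)|\[(\d+)\]")


def get_path(doc, path):
    """Resolve a tiny JSONPath subset: `$`, `.key`, and `[N]` only."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    rest = path[1:]
    pos = 0
    current = doc
    while pos < len(rest):
        match = _TOKEN.match(rest, pos)
        if match is None:
            raise ValueError(f"unsupported path syntax at offset {pos + 1}")
        key, index = match.groups()
        # Dict lookup for ".key", list lookup for "[N]".
        current = current[key] if key is not None else current[int(index)]
        pos = match.end()
    return current
```

For example, with `doc = {"authors": [{"name": "A"}, {"name": "B"}]}`, the call `get_path(doc, "$.authors[1].name")` walks one key, one list index, and one more key.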
Quick Start
1. Install
pip install dokumen-pintar
With Indonesian stemming (quoted so that shells such as zsh do not expand the brackets):
pip install "dokumen-pintar[indonesian]"
With semantic search:
pip install "dokumen-pintar[semantic]"
2. Create a Config
dokumen-pintar-init
Or create one manually:
{
"roots": [
{ "name": "documents", "path": "~/Documents", "writable": true },
{ "name": "projects", "path": "~/Projects", "writable": true }
]
}
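If you prefer to generate the config from a script, the following sketch builds the same `roots` structure and writes it as JSON. Only the three fields shown above (`name`, `path`, `writable`) come from this README; `build_config` and `write_config` are hypothetical helpers, and the duplicate-name check is a sanity check of the sketch rather than a documented server rule.

```python
import json
from pathlib import Path


def build_config(roots):
    """Return a config dict containing a `roots` list.

    `roots` is an iterable of (name, path, writable) tuples.
    """
    seen = set()
    entries = []
    for name, path, writable in roots:
        if name in seen:
            raise ValueError(f"duplicate root name: {name}")
        seen.add(name)
        entries.append(
            {"name": name, "path": str(path), "writable": bool(writable)}
        )
    return {"roots": entries}


def write_config(config, destination):
    """Write the config as indented JSON and return the destination path."""
    destination = Path(destination)
    destination.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return destination
```

A call such as `write_config(build_config([("documents", "~/Documents", True)]), "dokumen-pintar.config.json")` produces a file that can be passed to `--config`.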
Six pre-tuned profiles ship in docs/profiles/ — copy personal.json for daily desktop use, research.json for thesis libraries, or team-server.json for HTTP deployment.
3. Run
dokumen-pintar --config dokumen-pintar.config.json
4. Connect to an AI Client
Claude Desktop — Add to claude_desktop_config.json:
{
"mcpServers": {
"dokumen-pintar": {
"command": "dokumen-pintar",
"args": ["--config", "/path/to/dokumen-pintar.config.json"]
}
}
}
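When claude_desktop_config.json already lists other MCP servers, the entry has to be merged rather than pasted over the file. The sketch below does that merge with the standard library. `register_server` is a hypothetical helper, and the location of the client config file is left as a parameter because it differs between platforms.

```python
import json
from pathlib import Path


def register_server(client_config_path, server_config_path, command="dokumen-pintar"):
    """Add or update the dokumen-pintar entry, keeping other servers intact."""
    client_config_path = Path(client_config_path)
    if client_config_path.exists():
        data = json.loads(client_config_path.read_text(encoding="utf-8"))
    else:
        data = {}
    # Create the "mcpServers" section only if it is missing.
    servers = data.setdefault("mcpServers", {})
    servers["dokumen-pintar"] = {
        "command": command,
        "args": ["--config", str(server_config_path)],
    }
    client_config_path.write_text(json.dumps(data, indent=2) + "\n", encoding="utf-8")
    return data
```

Back up the client config before running a script like this, since it rewrites the whole file.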
Cursor / VS Code / Windsurf — Use the same stdio transport. Point your IDE's MCP settings to the dokumen-pintar command and config path.
Tools Overview
62 MCP tools organised by category:
| Category | Tools |
|---|---|
| Workspace | workspace_list_roots · workspace_stat · workspace_tree · workspace_diagnose |
| File CRUD | file_create · file_delete · file_rename · file_copy · file_move |
| Content | content_read · content_write · content_append · content_insert · content_replace · content_delete_range · content_patch · content_diff |
| Structured | struct_get · struct_set · struct_delete · struct_meta |
| Metadata | metadata_read · metadata_write · metadata_delete · metadata_strip · metadata_read_batch |
| Authoring | validate_spec · compose_docx · compose_pdf · compose_from_markdown · compose_to_markdown |
| Sections | section_extract · section_merge |
| Images | image_list · image_extract · image_extract_all · image_replace |
| Templates | template_list · template_install · template_render · template_render_named |
| TOC & Bibliography | toc_generate · bibliography_check · bibliography_format |
| Compare & Lint | document_compare · document_lint · document_lint_fix |
| Batch | batch_rename · batch_replace_content · batch_replace_structured · batch_delete |
| Search | search_filename · search_content · search_in_format |
| Versioning | version_list · version_diff · version_restore · version_undo · version_purge |
| Semantic* | search_semantic · semantic_index_path · semantic_stats |
*Only registered when semantic_search.enabled = true and [semantic] extras are installed.
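MCP clients invoke these tools with JSON-RPC 2.0 `tools/call` requests. The sketch below builds such a request and encodes it as one line for the stdio transport. The tool name `content_read` comes from the table above, but the argument name `path` and its value are assumptions for illustration; see TOOLS.md for each tool's real parameters. A real session also has to complete the MCP `initialize` handshake before any tool call.

```python
import itertools
import json

# Monotonic request ids, as JSON-RPC expects a unique id per request.
_ids = itertools.count(1)


def tool_call(name, arguments):
    """Build a JSON-RPC 2.0 `tools/call` request in the shape MCP defines."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def encode_for_stdio(message):
    """Serialize one message as a single newline-terminated JSON line."""
    return json.dumps(message, ensure_ascii=False) + "\n"


# Hypothetical arguments: the real parameter names may differ.
request = tool_call("content_read", {"path": "documents/notes.md"})
line = encode_for_stdio(request)
```

In practice an MCP client library handles this framing for you; the sketch only shows what travels over the wire.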
Bundled templates
- academic_id/kp_basic: generic skeleton for an Indonesian Kerja Praktik (internship) report, with cover, approval sheet (lembar pengesahan), foreword (kata pengantar), chapters I/II (BAB I/II), log book, and bibliography (daftar pustaka).
Documentation
Full docs on GitHub: github.com/firdausmntp/Dokumen-Pintar
- USAGE.md — Workspace URIs, every tool with JSON examples, recipes
- CONFIG.md — All config fields with types, defaults, and notes
- TOOLS.md — Full reference for all 62 tools
- ARCHITECTURE.md — Module map, request flow, versioning, safety
- BENCHMARK.md — Performance baselines and methodology
- profiles/ — Six pre-tuned config presets
- AGENTS.md — Contributor guide
License
MIT — 2026 firdausmntp
File details
Details for the file dokumen_pintar-1.1.0.tar.gz.
File metadata
- Download URL: dokumen_pintar-1.1.0.tar.gz
- Upload date:
- Size: 262.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ee0b19481e7709ba742849f4be7cf3262e5add898a5ea45afa8a3f78a873a033 |
| MD5 | 1fa5a70aecb659407bc870325d759dd9 |
| BLAKE2b-256 | 9ef85b17605248db9ed15e1c72df9a0d6915893e6d90294cba3eee5d7340c689 |
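A downloaded file can be checked against the SHA256 digest listed above with a few lines of standard-library Python. `sha256_of` and `verify` are hypothetical helper names, and the file path is whatever location you downloaded the archive to.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops at the first empty read (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Compare a file's digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `verify("dokumen_pintar-1.1.0.tar.gz", "<SHA256 from the table>")` should return `True` for an intact download.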
Provenance
The following attestation bundles were made for dokumen_pintar-1.1.0.tar.gz:
Publisher: publish.yml on firdausmntp/Dokumen-Pintar
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: dokumen_pintar-1.1.0.tar.gz
- Subject digest: ee0b19481e7709ba742849f4be7cf3262e5add898a5ea45afa8a3f78a873a033
- Sigstore transparency entry: 1560317397
- Sigstore integration time:
- Permalink: firdausmntp/Dokumen-Pintar@928788f87de3f711c95de438b6c33bb1af16b1fd
- Branch / Tag: refs/tags/v1.1.0
- Owner: https://github.com/firdausmntp
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@928788f87de3f711c95de438b6c33bb1af16b1fd
- Trigger Event: push
File details
Details for the file dokumen_pintar-1.1.0-py3-none-any.whl.
File metadata
- Download URL: dokumen_pintar-1.1.0-py3-none-any.whl
- Upload date:
- Size: 188.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 06171eb87a0ff03b5c1f856443c9f93a5c0dc74adb96e6012d6240a234ea1813 |
| MD5 | 5036b7177334b6d639bc109a105b277f |
| BLAKE2b-256 | 663be1a6e28c962d0914fc33860bdc2e984ffd0fb1076df0035210396f1528be |
Provenance
The following attestation bundles were made for dokumen_pintar-1.1.0-py3-none-any.whl:
Publisher: publish.yml on firdausmntp/Dokumen-Pintar
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: dokumen_pintar-1.1.0-py3-none-any.whl
- Subject digest: 06171eb87a0ff03b5c1f856443c9f93a5c0dc74adb96e6012d6240a234ea1813
- Sigstore transparency entry: 1560317699
- Sigstore integration time:
- Permalink: firdausmntp/Dokumen-Pintar@928788f87de3f711c95de438b6c33bb1af16b1fd
- Branch / Tag: refs/tags/v1.1.0
- Owner: https://github.com/firdausmntp
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@928788f87de3f711c95de438b6c33bb1af16b1fd
- Trigger Event: push