SuperDoc SDK (CLI-backed)

superdoc-sdk

Programmatic SDK for deterministic DOCX operations through SuperDoc's Document API.

Alpha — The API surface matches the Document API and will evolve alongside it.

Install

pip install superdoc-sdk

The CLI is bundled with the SDK — no separate install needed. A platform-specific CLI companion package is installed automatically via PEP 508 environment markers.

| Platform | Architecture |
| --- | --- |
| macOS | Apple Silicon (arm64), Intel (x64) |
| Linux | x64, ARM64 |
| Windows | x64 |
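
If an install resolves without a CLI companion package, a quick check of the interpreter's platform against the supported list can help narrow things down. The following is a minimal sketch, not part of the SDK: `SUPPORTED` and `is_supported` are hypothetical names, and the machine strings are the values `platform.machine()` commonly reports on each system, which is an assumption to verify on your own hosts.

```python
import platform

# Supported (system, machine) pairs, transcribed from the platform list above.
# Assumption: these are the strings platform.system() / platform.machine()
# typically return on each platform; adjust if your environment differs.
SUPPORTED = {
    ("Darwin", "arm64"),    # macOS Apple Silicon
    ("Darwin", "x86_64"),   # macOS Intel
    ("Linux", "x86_64"),    # Linux x64
    ("Linux", "aarch64"),   # Linux ARM64
    ("Windows", "AMD64"),   # Windows x64
}


def is_supported(system: str, machine: str) -> bool:
    """Return True when the (system, machine) pair appears in SUPPORTED."""
    return (system, machine) in SUPPORTED


# Report whether the current interpreter's platform is in the list.
print(is_supported(platform.system(), platform.machine()))
```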

Quick start

import asyncio

from superdoc import AsyncSuperDocClient


async def main():
    async with AsyncSuperDocClient(default_change_mode="tracked") as client:
        # Open a document
        await client.doc.open({"doc": "./contract.docx"})

        # Find and replace text with query + mutation plan
        match = await client.doc.query.match(
            {
                "select": {"type": "text", "pattern": "ACME Corp"},
                "require": "first",
            }
        )

        items = match.get("items") or []
        first_item = items[0] if items else {}
        ref = first_item.get("handle", {}).get("ref")
        if ref:
            await client.doc.mutations.apply(
                {
                    "expectedRevision": match["evaluatedRevision"],
                    "atomic": True,
                    "steps": [
                        {
                            "id": "replace-acme",
                            "op": "text.rewrite",
                            "where": {"by": "ref", "ref": ref},
                            "args": {"replacement": {"text": "NewCo Inc."}},
                        }
                    ],
                }
            )

        # Save and close
        await client.doc.save({"inPlace": True})
        await client.doc.close({})


asyncio.run(main())

Set default_change_mode="tracked" to make mutations use tracked changes by default. If you pass changeMode on a specific call, that explicit value overrides the default.
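
The precedence rule above can be pictured with a small stand-alone function. This is an illustrative sketch only: `effective_change_mode` is a hypothetical helper that restates the rule, not the SDK's internal implementation.

```python
from typing import Any, Mapping, Optional


def effective_change_mode(
    default_change_mode: Optional[str],
    params: Mapping[str, Any],
) -> Optional[str]:
    """Hypothetical helper: the change mode a mutation call would end up with.

    An explicit "changeMode" key on the call wins; otherwise the client-level
    default applies (which may itself be None).
    """
    if "changeMode" in params:
        return params["changeMode"]
    return default_change_mode


# With a "tracked" default and no per-call value, the default applies.
print(effective_change_mode("tracked", {"content": "hello"}))
```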

The SDK also exposes a synchronous SuperDocClient with the same doc.* operations when you prefer non-async code paths.

Sync

from superdoc import SuperDocClient

with SuperDocClient() as client:
    client.doc.open({"doc": "./contract.docx"})

    info = client.doc.info({})
    print(info["counts"])

    client.doc.save({"inPlace": True})
    client.doc.close({})

User identity

By default the SDK attributes edits to a generic "CLI" user. Set user on the client to identify your automation in comments, tracked changes, and collaboration presence:

client = AsyncSuperDocClient(user={"name": "Review Bot", "email": "bot@example.com"})

The user is injected into every doc.open call. If you pass userName or userEmail on a specific doc.open, those per-call values take precedence.

Client lifecycle

The SDK uses a persistent host process for all operations. The host is started on first use and reused across calls, avoiding per-operation subprocess overhead.

Context managers (recommended)

# Sync
with SuperDocClient() as client:
    client.doc.find({"query": "test"})

# Async
async with AsyncSuperDocClient() as client:
    await client.doc.find({"query": "test"})

The context manager calls connect() on entry and dispose() on exit (including on exception).

Explicit lifecycle

client = SuperDocClient()
client.connect()      # Optional: the first operation auto-connects
result = client.doc.find({"query": "test"})
client.dispose()      # Shuts down the host process

connect() is optional. If not called explicitly, the first operation triggers a lazy connection to the host process.
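
When managing the lifecycle explicitly, wrapping the work in try/finally keeps the host process from outliving an exception. A minimal sketch, assuming only that the client exposes `dispose()` as described above; `run_with_dispose` is a hypothetical helper name, and the usage comment is illustrative.

```python
from typing import Any, Callable


def run_with_dispose(
    client_factory: Callable[[], Any],
    work: Callable[[Any], Any],
) -> Any:
    """Hypothetical helper: build a client, run `work`, always call dispose()."""
    client = client_factory()
    try:
        # The first operation inside `work` triggers the lazy connection.
        return work(client)
    finally:
        # Runs on success and on exception, so the host is shut down either way.
        client.dispose()


# Possible usage with the SDK (not executed here):
#   from superdoc import SuperDocClient
#   result = run_with_dispose(SuperDocClient, lambda c: c.doc.find({"query": "test"}))
```

This mirrors what the context manager does on exit, so prefer `with SuperDocClient() as client:` unless you need manual control.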

Configuration

client = SuperDocClient(
    startup_timeout_ms=10_000,    # Max time for host handshake (default: 5000)
    shutdown_timeout_ms=5_000,    # Max time for graceful shutdown (default: 5000)
    request_timeout_ms=60_000,    # Per-operation timeout passed to CLI (default: None)
    watchdog_timeout_ms=30_000,   # Client-side safety timer per request (default: 30000)
    default_change_mode="tracked", # Auto-inject changeMode for mutations (default: None)
    user={"name": "Bot", "email": "bot@example.com"},  # User identity for attribution
    env={"SUPERDOC_CLI_BIN": "/path/to/superdoc"},  # Environment overrides
)

Thread safety

Client instances are serialized: one operation at a time per client. For parallelism, use multiple client instances. Do not share a single client across threads.
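
One way to follow this rule is to give every unit of work its own client. The sketch below is a possible pattern rather than an SDK feature: `map_with_own_client` is a hypothetical helper, and it assumes only that the client works as a context manager, as shown earlier.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Iterable, List


def map_with_own_client(
    items: Iterable[Any],
    client_factory: Callable[[], Any],
    work: Callable[[Any, Any], Any],
    max_workers: int = 4,
) -> List[Any]:
    """Hypothetical helper: run work(client, item) in parallel, one client per item."""

    def task(item: Any) -> Any:
        # Each task creates and tears down its own client, so no client
        # instance is ever shared between threads.
        with client_factory() as client:
            return work(client, item)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order in the returned results.
        return list(pool.map(task, items))


# Possible usage with the SDK (not executed here):
#   from superdoc import SuperDocClient
#   def word_counts(client, path):
#       client.doc.open({"doc": path})
#       try:
#           return client.doc.info({})["counts"]
#       finally:
#           client.doc.close({})
#   results = map_with_own_client(["a.docx", "b.docx"], SuperDocClient, word_counts)
```

Each client starts its own host process, so keep `max_workers` modest.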

Collaboration sessions

Use this when your app already has a live collaboration room (Liveblocks, Hocuspocus, or SuperDoc Yjs).

Join an existing room

Pass collabUrl and collabDocumentId to doc.open:

import asyncio

from superdoc import AsyncSuperDocClient


async def main():
    async with AsyncSuperDocClient() as client:
        await client.doc.open({
            "collabUrl": "ws://localhost:4000",
            "collabDocumentId": "my-doc-room",
        })

        await client.doc.insert({
            "target": {"type": "end"},
            "content": "Added by the SDK",
        })


asyncio.run(main())

Start an empty room from a local .docx

If the room is empty, pass doc together with collaboration params:

await client.doc.open({
    "doc": "./starting-template.docx",
    "collabUrl": "ws://localhost:4000",
    "collabDocumentId": "my-doc-room",
})

What happens depends on the room state and on whether you pass doc:

| Room state | Result |
| --- | --- |
| Room already has content | SDK joins the room. doc is ignored. |
| Room is empty and doc is provided | SDK seeds the room from doc, then joins. |
| Room is empty and no doc is provided | SDK starts a blank document. |

Control empty-room behavior

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| collabUrl | string | | WebSocket URL for your collaboration provider. |
| collabDocumentId | string | session ID | Room/document ID on the provider. |
| doc | string | | Local .docx used only when the room is empty. |
| onMissing | string | seedFromDoc | seedFromDoc, blank, or error. |
| bootstrapSettlingMs | number | 1500 | Wait time (ms) before seeding to avoid race conditions. |

If you only want to join rooms that already exist, set onMissing to "error":

await client.doc.open({
    "collabUrl": "ws://localhost:4000",
    "collabDocumentId": "my-doc-room",
    "onMissing": "error",
})
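
To reason about which outcome to expect before calling doc.open, the room-state and onMissing rules can be written down as a small decision function. This is one reading of the tables above, not the SDK's actual logic: `expected_bootstrap` and its return labels are hypothetical, and the handling of onMissing "blank" when a doc is also supplied is an assumption.

```python
from typing import Optional

VALID_ON_MISSING = {"seedFromDoc", "blank", "error"}


def expected_bootstrap(
    room_has_content: bool,
    doc: Optional[str] = None,
    on_missing: str = "seedFromDoc",
) -> str:
    """Hypothetical helper: predict the outcome as 'join', 'seed', 'blank', or 'error'."""
    if on_missing not in VALID_ON_MISSING:
        raise ValueError(f"unknown onMissing value: {on_missing!r}")
    if room_has_content:
        # An existing room is always joined; doc is ignored.
        return "join"
    if on_missing == "error":
        # Empty room and the caller only wants pre-existing rooms.
        return "error"
    if on_missing == "seedFromDoc" and doc:
        return "seed"
    # Empty room with no doc, or onMissing set to "blank" (assumed to win over doc).
    return "blank"


print(expected_bootstrap(room_has_content=False, doc="./starting-template.docx"))
```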

Check if the SDK seeded or joined

doc.open returns bootstrap details in collaboration mode:

result = await client.doc.open({
    "doc": "./starting-template.docx",
    "collabUrl": "ws://localhost:4000",
    "collabDocumentId": "my-doc-room",
})

print(result.get("bootstrap"))
# { roomState, bootstrapApplied, bootstrapSource }
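
A small accessor keeps callers from depending on the exact shape of the result. The sketch below assumes that `bootstrapApplied` is truthy when the SDK seeded the room; the source lists the key but not its type, so treat that as an assumption. `was_seeded` is a hypothetical helper name.

```python
from typing import Any, Mapping, Optional


def was_seeded(open_result: Optional[Mapping[str, Any]]) -> bool:
    """Hypothetical helper: True when the open result reports an applied bootstrap.

    Tolerates a missing result, a missing "bootstrap" key (for example when
    not in collaboration mode), and a missing "bootstrapApplied" key.
    """
    bootstrap = (open_result or {}).get("bootstrap") or {}
    return bool(bootstrap.get("bootstrapApplied"))


# Example with a hand-written result dict (not real SDK output):
print(was_seeded({"bootstrap": {"bootstrapApplied": True}}))
```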

Available operations

The SDK exposes all operations from the Document API plus lifecycle and session commands.

Lifecycle

| Operation | Description |
| --- | --- |
| doc.open | Open a document and create a persistent editing session. Optionally override the document body with contentOverride + overrideType (markdown, html, or text). |
| doc.save | Save the current session to the original file or a new path. |
| doc.close | Close the active editing session and clean up resources. |

Query

| Operation | Description |
| --- | --- |
| doc.find | Search the document for nodes matching type, text, or attribute criteria. |
| doc.get_node | Retrieve a single node by target position. |
| doc.get_node_by_id | Retrieve a single node by its unique ID. |
| doc.get_text | Extract the plain-text content of the document. |
| doc.get_markdown | Extract the document content as a Markdown string. |
| doc.info | Return document metadata including revision, node count, and capabilities. |
| doc.query.match | Deterministic selector-based search with cardinality contracts for mutation targeting. |
| doc.mutations.preview | Dry-run a mutation plan, returning resolved targets without applying changes. |

Mutation

| Operation | Description |
| --- | --- |
| doc.insert | Insert content at a target position, or at the end of the document when target is omitted. Supports text (default), markdown, and html content types via the type field. |
| doc.replace | Replace content at a target position with new text or inline content. |
| doc.delete | Delete content at a target position. |
| doc.mutations.apply | Execute a mutation plan atomically against the document. |
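
Building the mutation plan in a separate function makes it easy to inspect or unit-test before it reaches doc.mutations.apply. The sketch below reuses the plan shape from the quick start (a `text.rewrite` step targeted by ref); `build_rewrite_plan` is a hypothetical helper, and passing the same plan to doc.mutations.preview first is an assumption about that operation's input.

```python
from typing import Any, Dict, Mapping, Optional


def build_rewrite_plan(
    match: Mapping[str, Any],
    replacement: str,
    step_id: str = "rewrite-1",
) -> Optional[Dict[str, Any]]:
    """Hypothetical helper: turn a doc.query.match result into a rewrite plan.

    Returns None when the match has no items or the first item has no ref.
    """
    items = match.get("items") or []
    ref = (items[0].get("handle") or {}).get("ref") if items else None
    if not ref:
        return None
    return {
        # Pin the plan to the revision the match was evaluated against.
        "expectedRevision": match["evaluatedRevision"],
        "atomic": True,
        "steps": [
            {
                "id": step_id,
                "op": "text.rewrite",
                "where": {"by": "ref", "ref": ref},
                "args": {"replacement": {"text": replacement}},
            }
        ],
    }


# Possible usage with the SDK (not executed here):
#   plan = build_rewrite_plan(match, "NewCo Inc.", step_id="replace-acme")
#   if plan:
#       await client.doc.mutations.apply(plan)
```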

Format

| Operation | Description |
| --- | --- |
| doc.format.apply | Apply inline run-property patch changes to the target range with explicit set/clear semantics. |
| doc.format.bold | Set or clear bold on the target text range. |
| doc.format.italic | Set or clear italic on the target text range. |
| doc.format.strike | Set or clear strikethrough on the target text range. |
| doc.format.underline | Set or clear underline on the target text range. |
| doc.format.highlight | Set or clear highlight on the target text range. |
| doc.format.color | Set or clear text color on the target text range. |
| doc.format.font_size | Set or clear font size on the target text range. |
| doc.format.font_family | Set or clear font family on the target text range. |

And 30+ additional formatting operations (letter spacing, vertical alignment, small caps, shading, borders, and more).

Create

| Operation | Description |
| --- | --- |
| doc.create.paragraph | Create a new paragraph at the target position. |
| doc.create.heading | Create a new heading at the target position. |
| doc.create.section_break | Create a section break at the target location. |
| doc.create.table | Create a new table at the target position. |
| doc.create.table_of_contents | Insert a new table of contents at the target position. |

Blocks

| Operation | Description |
| --- | --- |
| doc.blocks.delete | Delete an entire block node (paragraph, heading, list item, table, image, or sdt). |

Lists

| Operation | Description |
| --- | --- |
| doc.lists.list | List all list nodes in the document, optionally filtered by scope. |
| doc.lists.get | Retrieve a specific list node by target. |
| doc.lists.insert | Insert a new list at the target position. |
| doc.lists.create | Create a new list from one or more paragraphs. |
| doc.lists.attach | Convert non-list paragraphs to list items under an existing list sequence. |
| doc.lists.detach | Remove numbering properties from list items, converting them to plain paragraphs. |
| doc.lists.indent | Increase the indentation level of a list item. |
| doc.lists.outdent | Decrease the indentation level of a list item. |
| doc.lists.join | Merge two adjacent list sequences into one. |
| doc.lists.can_join | Check whether two adjacent list sequences can be joined. |
| doc.lists.separate | Split a list sequence at the target item. |
| doc.lists.set_level | Set the absolute nesting level (0..8) of a list item. |
| doc.lists.set_value | Set an explicit numbering value at the target item. |
| doc.lists.continue_previous | Continue numbering from the nearest compatible previous list sequence. |
| doc.lists.can_continue_previous | Check whether the target sequence can continue numbering from a previous sequence. |
| doc.lists.set_level_restart | Set the restart behavior for a specific list level. |
| doc.lists.convert_to_text | Convert list items to plain paragraphs, optionally prepending the rendered marker text. |

Comments

| Operation | Description |
| --- | --- |
| doc.comments.create | Create a new comment thread (or reply when parentCommentId is given). |
| doc.comments.patch | Patch fields on an existing comment (text, target, status, or isInternal). |
| doc.comments.delete | Remove a comment or reply by ID. |
| doc.comments.get | Retrieve a single comment thread by ID. |
| doc.comments.list | List all comment threads in the document. |

Track changes

| Operation | Description |
| --- | --- |
| doc.track_changes.list | List all tracked changes in the document. |
| doc.track_changes.get | Retrieve a single tracked change by ID. |
| doc.track_changes.decide | Accept or reject a tracked change (by ID or scope: all). |

History

| Operation | Description |
| --- | --- |
| doc.history.get | Query the current undo/redo history state of the active editor. |
| doc.history.undo | Undo the most recent history-safe mutation in the active editor. |
| doc.history.redo | Redo the most recently undone action in the active editor. |

Session

| Operation | Description |
| --- | --- |
| doc.session.list | List all active editing sessions. |
| doc.session.save | Persist the current session state. |
| doc.session.close | Close a specific editing session by ID. |
| doc.session.set_default | Set the default session for subsequent commands. |

Introspection

| Operation | Description |
| --- | --- |
| doc.status | Show the current session status and document metadata. |
| doc.describe | List all available CLI operations and contract metadata. |
| doc.describe_command | Show detailed metadata for a single CLI operation. |

Troubleshooting

Custom CLI binary

If you need to use a custom-built CLI binary (e.g. a newer version or a patched build), set the SUPERDOC_CLI_BIN environment variable:

export SUPERDOC_CLI_BIN=/path/to/superdoc

Debug logging

Enable transport-level debug logging to diagnose connectivity issues:

export SUPERDOC_DEBUG=1
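
The same two variables can also be assembled in Python and handed to the client's env option shown under Configuration. A minimal sketch: `build_env_overrides` is a hypothetical helper, and it assumes that env honors SUPERDOC_DEBUG in the same way as a shell export, which the source does not state explicitly.

```python
from typing import Dict, Optional


def build_env_overrides(
    cli_bin: Optional[str] = None,
    debug: bool = False,
) -> Dict[str, str]:
    """Hypothetical helper: build the dict passed as env= to the client."""
    env: Dict[str, str] = {}
    if cli_bin:
        # Point the SDK at a custom-built CLI binary.
        env["SUPERDOC_CLI_BIN"] = cli_bin
    if debug:
        # Enable transport-level debug logging.
        env["SUPERDOC_DEBUG"] = "1"
    return env


# Possible usage with the SDK (not executed here):
#   from superdoc import SuperDocClient
#   client = SuperDocClient(env=build_env_overrides(debug=True))
print(build_env_overrides(cli_bin="/path/to/superdoc", debug=True))
```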

Air-gapped / private index environments

Mirror both superdoc-sdk and the superdoc-sdk-cli-* package for your platform to your private index. For example, on macOS ARM64:

pip download superdoc-sdk superdoc-sdk-cli-darwin-arm64
# Upload both wheels to your private index

Related

  • Document API — the in-browser API that defines the operation set
  • CLI — use the same operations from the terminal
  • Collaboration guides — set up Liveblocks, Hocuspocus, or SuperDoc Yjs

Part of SuperDoc

This SDK is part of SuperDoc — an open source document editor bringing Microsoft Word to the web.

License

AGPL-3.0 · Enterprise license available

Download files

Source Distribution

superdoc_sdk-1.0.0a43.tar.gz (364.5 kB)

Uploaded Source

Built Distribution

superdoc_sdk-1.0.0a43-py3-none-any.whl (385.8 kB)

Uploaded Python 3

File details

Details for the file superdoc_sdk-1.0.0a43.tar.gz.

File metadata

  • Download URL: superdoc_sdk-1.0.0a43.tar.gz
  • Upload date:
  • Size: 364.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for superdoc_sdk-1.0.0a43.tar.gz
Algorithm Hash digest
SHA256 cd0c27c938561a9590d96b35e077570cb47f24d8cf0270f61b36160ed80856cd
MD5 823c014babcfc44cbefda66967b0a748
BLAKE2b-256 a59cb9c7364fd9812da2340bd69e797f1b5fc15e5c3dcb5b2ca256c6932ab152

Provenance

The following attestation bundles were made for superdoc_sdk-1.0.0a43.tar.gz:

Publisher: release-sdk.yml on superdoc-dev/superdoc

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file superdoc_sdk-1.0.0a43-py3-none-any.whl.

File metadata

  • Download URL: superdoc_sdk-1.0.0a43-py3-none-any.whl
  • Upload date:
  • Size: 385.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for superdoc_sdk-1.0.0a43-py3-none-any.whl
Algorithm Hash digest
SHA256 f02672b3e302633c52c4dcad662b5b4d21109f63e74147261e5e0e6a0b256857
MD5 ad2036e00b53739eece547c1954d3577
BLAKE2b-256 10990c0b0adbdebe520b55325be633f09cf1c8fb0664f82db2ff4c2181457e78

Provenance

The following attestation bundles were made for superdoc_sdk-1.0.0a43-py3-none-any.whl:

Publisher: release-sdk.yml on superdoc-dev/superdoc

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
