Skip to main content

SuperDoc SDK (CLI-backed)

Project description

superdoc-sdk

Programmatic SDK for deterministic DOCX operations through SuperDoc's Document API.

Alpha — The API surface matches the Document API and will evolve alongside it.

Install

pip install superdoc-sdk

The CLI is bundled with the SDK — no separate install needed. A platform-specific CLI companion package is installed automatically via PEP 508 environment markers.

Platform Architecture
macOS Apple Silicon (arm64), Intel (x64)
Linux x64, ARM64
Windows x64

Quick start

import asyncio

from superdoc import AsyncSuperDocClient


async def main():
    async with AsyncSuperDocClient(default_change_mode="tracked") as client:
        # Open a document
        doc = await client.open({"doc": "./contract.docx"})

        # Find and replace text with query + mutation plan
        match = await doc.query.match(
            {
                "select": {"type": "text", "pattern": "ACME Corp"},
                "require": "first",
            }
        )

        items = match.get("items") or []
        first_item = items[0] if items else {}
        ref = first_item.get("handle", {}).get("ref")
        if ref:
            await doc.mutations.apply(
                {
                    "expectedRevision": match["evaluatedRevision"],
                    "atomic": True,
                    "steps": [
                        {
                            "id": "replace-acme",
                            "op": "text.rewrite",
                            "where": {"by": "ref", "ref": ref},
                            "args": {"replacement": {"text": "NewCo Inc."}},
                        }
                    ],
                }
            )

        # Save and close
        await doc.save({"inPlace": True})
        await doc.close({})


asyncio.run(main())

Set default_change_mode="tracked" to make mutations use tracked changes by default. If you pass changeMode on a specific call, that explicit value overrides the default.

The SDK also exposes a synchronous SuperDocClient with the same document-handle methods when you prefer non-async code paths.

Encrypted documents

Pass password when opening a password-protected .docx:

doc = await client.open({"doc": "./secret.docx", "password": "mypassword"})

The password is forwarded only for the initial open and is not persisted. If the password is missing or wrong, the error includes a machine-readable code (DOCX_PASSWORD_REQUIRED, DOCX_PASSWORD_INVALID).

Sync

from superdoc import SuperDocClient

with SuperDocClient() as client:
    doc = client.open({"doc": "./contract.docx"})

    info = doc.info({})
    print(info["counts"])

    doc.save({"inPlace": True})
    doc.close({})

User identity

By default the SDK attributes edits to a generic "CLI" user. Set user on the client to identify your automation in comments, tracked changes, and collaboration presence:

client = AsyncSuperDocClient(user={"name": "Review Bot", "email": "bot@example.com"})

The user is injected into every client.open call. If you pass userName or userEmail on a specific client.open, those per-call values take precedence.

Client lifecycle

The SDK uses a persistent host process for all operations. The host is started on first use and reused across calls, avoiding per-operation subprocess overhead.

Context managers (recommended)

# Sync
with SuperDocClient() as client:
    doc = client.open({"doc": "./test.docx"})
    doc.find({"query": "test"})

# Async
async with AsyncSuperDocClient() as client:
    doc = await client.open({"doc": "./test.docx"})
    await doc.find({"query": "test"})

The context manager calls connect() on entry and dispose() on exit (including on exception).

Explicit lifecycle

client = SuperDocClient()
client.connect()      # Optional — first invoke() auto-connects
doc = client.open({"doc": "./test.docx"})
result = doc.find({"query": "test"})
client.dispose()      # Shuts down the host process

connect() is optional. If not called explicitly, the first operation triggers a lazy connection to the host process.

Configuration

client = SuperDocClient(
    startup_timeout_ms=10_000,    # Max time for host handshake (default: 5000)
    shutdown_timeout_ms=5_000,    # Max time for graceful shutdown (default: 5000)
    request_timeout_ms=60_000,    # Per-operation timeout passed to CLI (default: None)
    watchdog_timeout_ms=30_000,   # Client-side safety timer per request (default: 30000)
    default_change_mode="tracked", # Auto-inject changeMode for mutations (default: None)
    user={"name": "Bot", "email": "bot@example.com"},  # User identity for attribution
    env={"SUPERDOC_CLI_BIN": "/path/to/superdoc"},  # Environment overrides
)

Thread safety

Client instances are serialized: one operation at a time per client. For parallelism, use multiple client instances. Do not share a single client across threads.

Collaboration sessions

Use this when your app already has a live collaboration room. The SDK supports y-websocket, hocuspocus, and liveblocks providers. See the full collaboration docs for provider choice guidance and all configuration options.

Join an existing room (y-websocket shorthand)

The collabUrl + collabDocumentId shorthand defaults to the y-websocket provider. For Hocuspocus or Liveblocks, pass an explicit collaboration object.

import asyncio

from superdoc import AsyncSuperDocClient


async def main():
    async with AsyncSuperDocClient() as client:
        doc = await client.open({
            "collabUrl": "ws://localhost:4000",
            "collabDocumentId": "my-doc-room",
        })

        await doc.insert({
            "target": {"type": "end"},
            "content": "Added by the SDK",
        })


asyncio.run(main())

Connect to Hocuspocus explicitly

doc = await client.open({
    "collaboration": {
        "providerType": "hocuspocus",
        "url": "ws://localhost:1234",
        "documentId": "my-doc-room",
    }
})

Connect to Liveblocks

# Public API key
doc = await client.open({
    "collaboration": {
        "providerType": "liveblocks",
        "roomId": "my-room",
        "publicApiKey": "pk_live_xxx",
    }
})

# Auth endpoint (production)
doc = await client.open({
    "collaboration": {
        "providerType": "liveblocks",
        "roomId": "my-room",
        "authEndpoint": "https://app.example.com/api/liveblocks-auth",
    }
})

SDK authEndpoint values must be absolute URLs. Relative paths like /api/liveblocks-auth are not supported because the CLI host has no browser origin.

Start an empty room from a local .docx

doc = await client.open({
    "doc": "./starting-template.docx",
    "collabUrl": "ws://localhost:4000",
    "collabDocumentId": "my-doc-room",
})

Control empty-room behavior

Parameter Type Default Description
collabUrl string WebSocket URL. Shorthand for y-websocket only.
collabDocumentId string session ID Room/document ID on the provider.
doc string Local .docx used only when the room is empty.
onMissing string seedFromDoc seedFromDoc, blank, or error.
bootstrapSettlingMs number 1500 Wait time (ms) before seeding to avoid race conditions.

If you only want to join rooms that already exist, use onMissing: 'error':

doc = await client.open({
    "collabUrl": "ws://localhost:4000",
    "collabDocumentId": "my-doc-room",
    "onMissing": "error",
})

Check if the SDK seeded or joined

doc = await client.open({
    "doc": "./starting-template.docx",
    "collabUrl": "ws://localhost:4000",
    "collabDocumentId": "my-doc-room",
})

print(doc.open_result.get("bootstrap"))
# { roomState, bootstrapApplied, bootstrapSource }

Available operations

The SDK exposes all document-handle operations from the Document API plus client lifecycle and introspection methods.

Lifecycle

Operation Description
client.open Open a document and return a bound document handle. Optionally override the document body with contentOverride + overrideType (markdown, html, or text).
doc.save Save the current session to the original file or a new path.
doc.close Close the bound editing session and clean up resources.

Query

Operation Description
doc.find Search the document for nodes matching type, text, or attribute criteria.
doc.get_node Retrieve a single node by target position.
doc.get_node_by_id Retrieve a single node by its unique ID.
doc.get_text Extract the plain-text content of the document.
doc.get_markdown Extract the document content as a Markdown string.
doc.info Return document metadata including revision, node count, and capabilities.
doc.query.match Deterministic selector-based search with cardinality contracts for mutation targeting.
doc.mutations.preview Dry-run a mutation plan, returning resolved targets without applying changes.

Mutation

Operation Description
doc.insert Insert content at a target position, or at the end of the document when target is omitted. Supports text (default), markdown, and html content types via the type field.
doc.replace Replace content at a target position with new text or inline content.
doc.delete Delete content at a target position.
doc.mutations.apply Execute a mutation plan atomically against the document.

Format

Operation Description
doc.format.apply Apply inline run-property patch changes to the target range with explicit set/clear semantics.
doc.format.bold Set or clear bold on the target text range.
doc.format.italic Set or clear italic on the target text range.
doc.format.strike Set or clear strikethrough on the target text range.
doc.format.underline Set or clear underline on the target text range.
doc.format.highlight Set or clear highlight on the target text range.
doc.format.color Set or clear text color on the target text range.
doc.format.font_size Set or clear font size on the target text range.
doc.format.font_family Set or clear font family on the target text range.

And 30+ additional formatting operations (letter spacing, vertical alignment, small caps, shading, borders, and more).

Create

Operation Description
doc.create.paragraph Create a new paragraph at the target position.
doc.create.heading Create a new heading at the target position.
doc.create.section_break Create a section break at the target location.
doc.create.table Create a new table at the target position.
doc.create.table_of_contents Insert a new table of contents at the target position.

Blocks

Operation Description
doc.blocks.delete Delete an entire block node (paragraph, heading, list item, table, image, or sdt).

Lists

Operation Description
doc.lists.list List all list nodes in the document, optionally filtered by scope.
doc.lists.get Retrieve a specific list node by target.
doc.lists.insert Insert a new list at the target position.
doc.lists.create Create a new list from one or more paragraphs.
doc.lists.attach Convert non-list paragraphs to list items under an existing list sequence.
doc.lists.detach Remove numbering properties from list items, converting them to plain paragraphs.
doc.lists.indent Increase the indentation level of a list item.
doc.lists.outdent Decrease the indentation level of a list item.
doc.lists.join Merge two adjacent list sequences into one.
doc.lists.can_join Check whether two adjacent list sequences can be joined.
doc.lists.separate Split a list sequence at the target item.
doc.lists.set_level Set the absolute nesting level (0..8) of a list item.
doc.lists.set_value Set an explicit numbering value at the target item.
doc.lists.continue_previous Continue numbering from the nearest compatible previous list sequence.
doc.lists.can_continue_previous Check whether the target sequence can continue numbering from a previous sequence.
doc.lists.set_level_restart Set the restart behavior for a specific list level.
doc.lists.convert_to_text Convert list items to plain paragraphs, optionally prepending the rendered marker text.

Comments

Operation Description
doc.comments.create Create a new comment thread (or reply when parentCommentId is given).
doc.comments.patch Patch fields on an existing comment (text, target, status, or isInternal).
doc.comments.delete Remove a comment or reply by ID.
doc.comments.get Retrieve a single comment thread by ID.
doc.comments.list List all comment threads in the document.

Track changes

Operation Description
doc.track_changes.list List all tracked changes in the document.
doc.track_changes.get Retrieve a single tracked change by ID.
doc.track_changes.decide Accept or reject a tracked change (by ID or scope: all).

History

Operation Description
doc.history.get Query the current undo/redo history state of the active editor.
doc.history.undo Undo the most recent history-safe mutation in the active editor.
doc.history.redo Redo the most recently undone action in the active editor.

Client methods

Operation Description
client.describe List all available CLI operations and contract metadata.
client.describe_command Show detailed metadata for a single CLI operation.

Troubleshooting

Custom CLI binary

If you need to use a custom-built CLI binary (e.g. a newer version or a patched build), set the SUPERDOC_CLI_BIN environment variable:

export SUPERDOC_CLI_BIN=/path/to/superdoc

Debug logging

Enable transport-level debug logging to diagnose connectivity issues:

export SUPERDOC_DEBUG=1

Air-gapped / private index environments

Mirror both superdoc-sdk and the superdoc-sdk-cli-* package for your platform to your private index. For example, on macOS ARM64:

pip download superdoc-sdk superdoc-sdk-cli-darwin-arm64
# Upload both wheels to your private index

Related

  • Document API — the in-browser API that defines the operation set
  • CLI — use the same operations from the terminal
  • Collaboration guides — set up Liveblocks, Hocuspocus, or SuperDoc Yjs

Part of SuperDoc

This SDK is part of SuperDoc — open-source DOCX editing and tooling. Renders, edits, and automates .docx in the browser and on the server.

License

AGPL-3.0 · Enterprise license available

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

superdoc_sdk-1.3.0.dev64.tar.gz (335.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

superdoc_sdk-1.3.0.dev64-py3-none-any.whl (338.2 kB view details)

Uploaded Python 3

File details

Details for the file superdoc_sdk-1.3.0.dev64.tar.gz.

File metadata

  • Download URL: superdoc_sdk-1.3.0.dev64.tar.gz
  • Upload date:
  • Size: 335.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for superdoc_sdk-1.3.0.dev64.tar.gz
Algorithm Hash digest
SHA256 054839a6dfce3a45038c2de8750f1c5551558befec86f62a022ac76c7b80e853
MD5 71da81fd1e11ba091821641a6dc30870
BLAKE2b-256 e9d395be8c6d9572225dd4bfabb927ab65a7e8278c1595e4f9b83c4da5086da1

See more details on using hashes here.

Provenance

The following attestation bundles were made for superdoc_sdk-1.3.0.dev64.tar.gz:

Publisher: release-sdk.yml on superdoc-dev/superdoc

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file superdoc_sdk-1.3.0.dev64-py3-none-any.whl.

File metadata

File hashes

Hashes for superdoc_sdk-1.3.0.dev64-py3-none-any.whl
Algorithm Hash digest
SHA256 52b2ab1c4e6aad8680dd902275713e2e1c4d7811a2508997e3661c2b2e87d172
MD5 56af8a22febf5ee9aedce468f4a13737
BLAKE2b-256 3417108fc94986217293bc1234957f4843ed8a0c57c85592866b304ac3e3960e

See more details on using hashes here.

Provenance

The following attestation bundles were made for superdoc_sdk-1.3.0.dev64-py3-none-any.whl:

Publisher: release-sdk.yml on superdoc-dev/superdoc

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page