Morphik Python Client

Morphik

A Python client for the Morphik API that supports document ingestion, semantic search, and retrieval-augmented generation (RAG).

🚨 Upgrading to v1.0

Breaking Change: list_documents() now returns a ListDocsResponse object instead of a list.

Quick Migration:

# Change this:
for doc in db.list_documents():
    process(doc)

# To this:
for doc in db.list_documents().documents:
    process(doc)

See CHANGELOG.md for full details and new features.
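
During a staged upgrade, the same code may need to run against both pre-1.0 and 1.0+ clients. The following is a minimal sketch of a compatibility shim. It assumes only what the note above states: the new return value exposes a .documents attribute, and the old one is a plain list. iter_documents is a hypothetical helper, not part of the SDK, and the SimpleNamespace object stands in for a ListDocsResponse so the snippet runs without a server.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def iter_documents(result: Any) -> Iterable[Any]:
    """Return the documents from either return shape of list_documents()."""
    # v1.0+: a response object that carries the documents in `.documents`.
    # Pre-1.0: the return value is already the list of documents.
    return getattr(result, "documents", result)


# Stand-ins for the two return shapes (not real SDK objects).
old_style = ["doc-a", "doc-b"]
new_style = SimpleNamespace(documents=["doc-a", "doc-b"])

for doc in iter_documents(new_style):
    print(doc)
```

With a real client, the call would look like iter_documents(db.list_documents()).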


Installation

pip install morphik

Usage

The SDK provides both synchronous and asynchronous clients:

Synchronous Usage

from morphik import Morphik

# Initialize client - connects to localhost:8000 by default
db = Morphik()

# You can also use a direct HTTP(S) base URL for self-hosted deployments
# db = Morphik("http://morphik:8000")

# Or with authentication URI (for production)
# db = Morphik("morphik://owner_id:token@api.morphik.ai")

# Ingest a text document
doc = db.ingest_text(
    content="Your document content",
    metadata={"title": "Example Document"}
)

# Ingest a file
doc = db.ingest_file(
    file="path/to/document.pdf",
    metadata={"category": "reports"}
)

# Run a Morphik On-the-Fly document query
doc_query = db.query_document(
    file="path/to/document.pdf",
    prompt="Extract the parties and effective date.",
    ingestion_options={"ingest": True, "metadata": {"source": "contracts"}}
)
print(doc_query.structured_output)

# Retrieve relevant chunks
chunks = db.retrieve_chunks(
    query="Your search query",
    filters={"category": "reports"}
)

# Query with RAG
response = db.query(
    query="Summarize the key points in the document",
    filters={"category": "reports"}
)

print(response.completion)

# Migrate this app's documents into another Morphik deployment.
# Run this from a machine that can reach both source and target, such as
# inside a customer's VPN for on-prem targets.
result = db.migrate(target_uri="morphik://owner_id:token@onprem.example.com", target_is_local=True)
print(result.created_count, result.skipped_count, result.failed_count)
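
The morphik://owner_id:token@host URIs shown above embed a secret token, so it can be worth masking them before they reach logs or error messages. The sketch below uses only the standard library. redact_uri is a hypothetical helper, not an SDK function, and it makes no claim about how the client itself parses the URI.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_uri(uri: str) -> str:
    """Replace the password/token part of a URI with '***' for safe logging."""
    parts = urlsplit(uri)
    if parts.password is None:
        # Nothing secret in the URI (e.g. a plain http://host:port base URL).
        return uri
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    netloc = f"{parts.username}:***@{host}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(redact_uri("morphik://owner_id:token@api.morphik.ai"))
print(redact_uri("http://morphik:8000"))
```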

Nested Folders & Folder Depth

# Create a nested folder (parents are auto-created server-side)
folder = db.create_folder(full_path="/projects/alpha/specs", description="Specs folder")

# Move or rename folder paths
moved = db.move_folder("/projects/alpha/specs", "/projects/archive/specs")
renamed = moved.rename("specs-v2")

# Scope queries to a path and include descendants with folder_depth=-1
chunks = folder.retrieve_chunks(query="design notes", folder_depth=-1)
docs = db.list_documents(folder_name="/projects/alpha", folder_depth=-1)

# List only the fields you need. The server reads and returns just those columns, so
# the full document text is never downloaded, which keeps listing fast for large corpora.
for doc in db.list_documents(fields=["metadata"]).documents:
    print(doc.external_id, doc.metadata)

Folder.full_path is exposed on folder objects, and Document.folder_path mirrors server responses for tracing scope.
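
To make the scoping idea concrete, the sketch below models which folder paths fall under a scope when descendants are included, as with folder_depth=-1 above. It is a local illustration only: in_scope is a hypothetical function, the real matching happens server-side, and the behavior of other folder_depth values is not covered here.

```python
from pathlib import PurePosixPath


def in_scope(folder_path: str, scope: str) -> bool:
    """True if folder_path is the scope folder itself or any descendant of it."""
    path, root = PurePosixPath(folder_path), PurePosixPath(scope)
    return path == root or root in path.parents


paths = [
    "/projects/alpha",
    "/projects/alpha/specs",
    "/projects/alphabet",        # shares a string prefix, but is a sibling folder
    "/projects/archive/specs",
]
matched = [p for p in paths if in_scope(p, "/projects/alpha")]
print(matched)
```

Comparing path components rather than raw string prefixes keeps a sibling such as /projects/alphabet out of the /projects/alpha scope.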

Asynchronous Usage

import asyncio
from morphik.async_ import AsyncMorphik

async def main():
    # Initialize async client - connects to localhost:8000 by default.
    # Alternatives:
    #   a direct HTTP(S) base URL for self-hosted deployments:
    #     async with AsyncMorphik("http://morphik:8000") as db:
    #   an authentication URI (for production):
    #     async with AsyncMorphik("morphik://owner_id:token@api.morphik.ai") as db:
    async with AsyncMorphik() as db:
        # Ingest a text document
        doc = await db.ingest_text(
            content="Your document content",
            metadata={"title": "Example Document"}
        )

        # Run a Morphik On-the-Fly document query
        doc_query = await db.query_document(
            file="path/to/document.pdf",
            prompt="Extract the parties and effective date.",
            ingestion_options={"ingest": True, "metadata": {"source": "contracts"}}
        )
        print(doc_query.structured_output)

        # Query with RAG
        response = await db.query(
            query="Summarize the key points in the document",
        )

        print(response.completion)

# Run the async function
asyncio.run(main())
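
One reason to choose the async client is to issue several requests concurrently. The sketch below fans out multiple questions with asyncio.gather, using only the query(query=...) call and the .completion attribute shown above. StubClient is a stand-in so the snippet runs without a server, and ask_all is a hypothetical helper. Whether a single AsyncMorphik instance handles concurrent calls well is an assumption to verify against your deployment.

```python
import asyncio
from types import SimpleNamespace


async def ask_all(db, questions):
    """Send every question concurrently and return completions in input order."""
    responses = await asyncio.gather(*(db.query(query=q) for q in questions))
    return [r.completion for r in responses]


class StubClient:
    """Stand-in for AsyncMorphik so the sketch runs offline (not an SDK class)."""

    async def query(self, query: str):
        await asyncio.sleep(0)  # yield control, as a network call would
        return SimpleNamespace(completion=f"answer to: {query}")


if __name__ == "__main__":
    print(asyncio.run(ask_all(StubClient(), ["What changed?", "Who signed?"])))
```

With the real client, ask_all would be awaited inside the async with AsyncMorphik() as db: block and given db in place of the stub.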

Features

  • Document ingestion (text, files, directories)
  • Semantic search and retrieval
  • Retrieval-augmented generation (RAG)
  • Morphik On-the-Fly document querying with optional ingestion follow-up
  • Multi-user and multi-folder scoping
  • Metadata filtering
  • Document management

Development

Running Tests

To run the tests, first install the development dependencies:

pip install -r test_requirements.txt

Then run the tests:

# Run all tests (requires a running Morphik server)
pytest morphik/tests/ -v

# Run specific test modules
pytest morphik/tests/test_sync.py -v
pytest morphik/tests/test_async.py -v

# Skip tests if you don't have a running server
SKIP_LIVE_TESTS=1 pytest morphik/tests/ -v

# Specify a custom server URL for tests
MORPHIK_TEST_URL=http://custom-server:8000 pytest morphik/tests/ -v
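
For readers writing their own live tests, the sketch below shows one way a test module could read the two environment variables used above. It is an assumption about usage, not the SDK test suite's actual code: live_test_settings is a hypothetical helper, treating the value "1" as the skip switch mirrors the command above, and the localhost:8000 fallback is borrowed from the client's documented default.

```python
import os
from typing import Mapping, Tuple

# Assumption: fall back to the same default address the client uses.
DEFAULT_URL = "http://localhost:8000"


def live_test_settings(env: Mapping[str, str] = os.environ) -> Tuple[bool, str]:
    """Return (skip_live_tests, server_url) derived from environment variables."""
    skip = env.get("SKIP_LIVE_TESTS") == "1"
    url = env.get("MORPHIK_TEST_URL", DEFAULT_URL)
    return skip, url


skip, url = live_test_settings({"MORPHIK_TEST_URL": "http://custom-server:8000"})
print(skip, url)
```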

Example Usage Script

The SDK comes with an example script that demonstrates basic usage:

# Run synchronous example
python -m morphik.tests.example_usage

# Run asynchronous example
python -m morphik.tests.example_usage --async

The example script demonstrates:

  • Text and file ingestion
  • Creating folders and user scopes
  • Retrieving chunks and documents
  • Generating completions using RAG
  • Batch operations and cleanup
