Morphik Python Client

Morphik

A Python client for the Morphik API that provides document ingestion, semantic search, and retrieval-augmented generation (RAG).

🚨 Upgrading to v1.0

Breaking Change: list_documents() now returns a ListDocsResponse object instead of a list.

Quick Migration:

# Change this:
for doc in db.list_documents():
    process(doc)

# To this:
for doc in db.list_documents().documents:
    process(doc)

See CHANGELOG.md for full details and new features.
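
For code that has to run against both pre-1.0 and 1.0+ releases during a migration, a small compatibility shim may help. The sketch below relies only on what the note above states: the new return value exposes a `.documents` attribute, and the old one is a plain list. The helper name `iter_documents` is hypothetical and not part of the SDK.

```python
def iter_documents(result):
    """Iterate over documents from either return shape of list_documents().

    Hypothetical helper for illustration; not part of the morphik package.
    """
    # v1.0+: the ListDocsResponse object exposes the documents via `.documents`.
    # pre-1.0: the result is already a plain list, so fall back to it unchanged.
    return iter(getattr(result, "documents", result))
```

With this in place, `for doc in iter_documents(db.list_documents()): process(doc)` should behave the same on either side of the upgrade.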


Installation

pip install morphik

Usage

The SDK provides both synchronous and asynchronous clients:

Synchronous Usage

from morphik import Morphik

# Initialize client - connects to localhost:8000 by default
db = Morphik()

# You can also use a direct HTTP(S) base URL for self-hosted deployments
# db = Morphik("http://morphik:8000")

# Or with authentication URI (for production)
# db = Morphik("morphik://owner_id:token@api.morphik.ai")

# Ingest a text document
doc = db.ingest_text(
    content="Your document content",
    metadata={"title": "Example Document"}
)

# Ingest a file
doc = db.ingest_file(
    file="path/to/document.pdf",
    metadata={"category": "reports"}
)

# Run a Morphik On-the-Fly document query
doc_query = db.query_document(
    file="path/to/document.pdf",
    prompt="Extract the parties and effective date.",
    ingestion_options={"ingest": True, "metadata": {"source": "contracts"}}
)
print(doc_query.structured_output)

# Retrieve relevant chunks
chunks = db.retrieve_chunks(
    query="Your search query",
    filters={"category": "reports"}
)

# Query with RAG
response = db.query(
    query="Summarize the key points in the document",
    filters={"category": "reports"}
)

print(response.completion)
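
The authentication URI shown above has the shape `morphik://owner_id:token@host`. As a pre-flight check before handing a URI to the client, something like the following sketch could catch a malformed value early. It uses only the standard library; the function name `describe_morphik_uri` is hypothetical, and how the SDK itself parses the URI is not stated here, so treat this as an assumption about the format rather than the client's own logic.

```python
from urllib.parse import urlsplit


def describe_morphik_uri(uri):
    """Split a morphik:// URI into its visible parts without exposing the token.

    Hypothetical helper; assumes the `morphik://owner_id:token@host` shape
    shown in the usage example.
    """
    parts = urlsplit(uri)
    if parts.scheme != "morphik":
        raise ValueError("expected a morphik:// URI")
    if not (parts.username and parts.password and parts.hostname):
        raise ValueError("expected morphik://owner_id:token@host")
    # Report only whether a token is present, never the token itself.
    return {"owner_id": parts.username, "host": parts.hostname, "has_token": True}
```

Plain HTTP(S) base URLs such as `http://morphik:8000` are rejected by this check on purpose, since they carry no credentials.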

Nested Folders & Folder Depth

# Create a nested folder (parents are auto-created server-side)
folder = db.create_folder(full_path="/projects/alpha/specs", description="Specs folder")

# Move or rename folder paths
moved = db.move_folder("/projects/alpha/specs", "/projects/archive/specs")
renamed = moved.rename("specs-v2")

# Scope queries to a path and include descendants with folder_depth=-1
chunks = folder.retrieve_chunks(query="design notes", folder_depth=-1)
docs = db.list_documents(folder_name="/projects/alpha", folder_depth=-1)

Folder.full_path is exposed on folder objects, and Document.folder_path mirrors server responses for tracing scope.
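
One way to use `Document.folder_path` for tracing scope is to group the results of a recursive listing by the folder each document came from. This is a minimal sketch: the helper name `group_by_folder_path` is hypothetical, and it assumes only that each document object carries a `folder_path` attribute as described above.

```python
from collections import defaultdict


def group_by_folder_path(documents):
    """Group document objects by their `folder_path` attribute.

    Hypothetical helper; documents without the attribute land under None.
    """
    grouped = defaultdict(list)
    for doc in documents:
        grouped[getattr(doc, "folder_path", None)].append(doc)
    return dict(grouped)
```

For example, `group_by_folder_path(db.list_documents(folder_name="/projects/alpha", folder_depth=-1).documents)` would show which descendant folders contributed documents to the listing.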

Asynchronous Usage

import asyncio
from morphik.async_ import AsyncMorphik

async def main():
    # Initialize async client - connects to localhost:8000 by default.
    # For self-hosted deployments, pass a direct HTTP(S) base URL instead:
    #     async with AsyncMorphik("http://morphik:8000") as db:
    # For production, pass an authentication URI:
    #     async with AsyncMorphik("morphik://owner_id:token@api.morphik.ai") as db:
    async with AsyncMorphik() as db:
        # Ingest a text document
        doc = await db.ingest_text(
            content="Your document content",
            metadata={"title": "Example Document"}
        )

        # Run a Morphik On-the-Fly document query
        doc_query = await db.query_document(
            file="path/to/document.pdf",
            prompt="Extract the parties and effective date.",
            ingestion_options={"ingest": True, "metadata": {"source": "contracts"}}
        )
        print(doc_query.structured_output)

        # Query with RAG
        response = await db.query(
            query="Summarize the key points in the document",
        )

        print(response.completion)

# Run the async function
asyncio.run(main())
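
A common reason to reach for the asynchronous client is to issue several requests concurrently. The sketch below fans out multiple RAG queries with `asyncio.gather`; it assumes only the `query(query=...)` call and the `.completion` attribute shown above. The helper name `query_many` is hypothetical, and whether the server handles concurrent requests efficiently is not something this README states.

```python
import asyncio


async def query_many(db, questions):
    """Run several RAG queries concurrently on one async client.

    Hypothetical helper; `db` is any object with an awaitable
    `query(query=...)` method, such as an open AsyncMorphik client.
    """
    # gather() returns results in the same order as the inputs.
    responses = await asyncio.gather(*(db.query(query=q) for q in questions))
    return [response.completion for response in responses]
```

Inside `main()` above, this could be called as `answers = await query_many(db, ["First question", "Second question"])`.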

Features

  • Document ingestion (text, files, directories)
  • Semantic search and retrieval
  • Retrieval-augmented generation (RAG)
  • Morphik On-the-Fly document querying with optional ingestion follow-up
  • Multi-user and multi-folder scoping
  • Metadata filtering
  • Document management

Development

Running Tests

To run the tests, first install the development dependencies:

pip install -r test_requirements.txt

Then run the tests:

# Run all tests (requires a running Morphik server)
pytest morphik/tests/ -v

# Run specific test modules
pytest morphik/tests/test_sync.py -v
pytest morphik/tests/test_async.py -v

# Skip tests if you don't have a running server
SKIP_LIVE_TESTS=1 pytest morphik/tests/ -v

# Specify a custom server URL for tests
MORPHIK_TEST_URL=http://custom-server:8000 pytest morphik/tests/ -v
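
When writing additional tests that honor the same environment variables, the settings could be resolved along these lines. This is a sketch under stated assumptions, not the test suite's actual implementation: the helper name `live_test_settings` is hypothetical, `SKIP_LIVE_TESTS` is treated as enabled only when set to `1`, and the fallback URL `http://localhost:8000` is assumed from the client's documented default.

```python
import os


def live_test_settings(env=None):
    """Return (skip_live_tests, server_url) from environment variables.

    Hypothetical helper; the default URL is an assumption based on the
    client's default of localhost:8000.
    """
    env = os.environ if env is None else env
    skip = env.get("SKIP_LIVE_TESTS", "") == "1"
    url = env.get("MORPHIK_TEST_URL", "http://localhost:8000")
    return skip, url
```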

Example Usage Script

The SDK comes with an example script that demonstrates basic usage:

# Run synchronous example
python -m morphik.tests.example_usage

# Run asynchronous example
python -m morphik.tests.example_usage --async

The example script demonstrates:

  • Text and file ingestion
  • Creating folders and user scopes
  • Retrieving chunks and documents
  • Generating completions using RAG
  • Batch operations and cleanup
