
Morphik Python Client


Morphik

A Python client for the Morphik API that supports document ingestion, semantic search, and retrieval-augmented generation (RAG).

🚨 Upgrading to v1.0

Breaking Change: list_documents() now returns a ListDocsResponse object instead of a list.

Quick Migration:

# Change this:
for doc in db.list_documents():
    process(doc)

# To this:
for doc in db.list_documents().documents:
    process(doc)

See CHANGELOG.md for full details and new features.
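
If the same code has to run against both pre-1.0 and 1.0+ clients during a migration, a small compatibility shim can hide the difference. The sketch below is a hypothetical helper (`iter_documents` is not part of the SDK). It assumes only what the note above states: older versions return a plain list, newer ones return an object with a `documents` attribute.

```python
from types import SimpleNamespace
from typing import Any, Iterable, Iterator


def iter_documents(result: Any) -> Iterator[Any]:
    """Yield documents from either return shape of list_documents().

    Hypothetical helper, not part of the Morphik SDK.
    """
    # v1.0+ response objects carry the documents in `.documents`;
    # a pre-1.0 plain list has no such attribute and is used as-is.
    docs: Iterable[Any] = getattr(result, "documents", result)
    yield from docs


# Stand-ins for the two return shapes, so the sketch runs without a server
old_style = ["doc-a", "doc-b"]
new_style = SimpleNamespace(documents=["doc-a", "doc-b"])
```

With a real client this would be called as `for doc in iter_documents(db.list_documents()): process(doc)`.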


Installation

pip install morphik

Usage

The SDK provides both synchronous and asynchronous clients:

Synchronous Usage

from morphik import Morphik

# Initialize client - connects to localhost:8000 by default
db = Morphik()

# You can also use a direct HTTP(S) base URL for self-hosted deployments
# db = Morphik("http://morphik:8000")

# Or with authentication URI (for production)
# db = Morphik("morphik://owner_id:token@api.morphik.ai")

# Ingest a text document
doc = db.ingest_text(
    content="Your document content",
    metadata={"title": "Example Document"}
)

# Ingest a file
doc = db.ingest_file(
    file="path/to/document.pdf",
    metadata={"category": "reports"}
)

# Run a Morphik On-the-Fly document query
doc_query = db.query_document(
    file="path/to/document.pdf",
    prompt="Extract the parties and effective date.",
    ingestion_options={"ingest": True, "metadata": {"source": "contracts"}}
)
print(doc_query.structured_output)

# Retrieve relevant chunks
chunks = db.retrieve_chunks(
    query="Your search query",
    filters={"category": "reports"}
)

# Query with RAG
response = db.query(
    query="Summarize the key points in the document",
    filters={"category": "reports"}
)

print(response.completion)

# Migrate this app's documents into another Morphik deployment.
# Run this from a machine that can reach both source and target, such as
# inside a customer's VPN for on-prem targets.
result = db.migrate(target_uri="morphik://owner_id:token@onprem.example.com", target_is_local=True)
print(result.created_count, result.skipped_count, result.failed_count)
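
The authentication URI above embeds a token, so it is usually better kept out of source code. The following is a minimal sketch of assembling it from environment variables. The helper name and the variable names `MORPHIK_OWNER_ID`, `MORPHIK_TOKEN`, and `MORPHIK_HOST` are assumptions made for illustration; the SDK does not define them.

```python
import os
from typing import Mapping

# Hypothetical variable names; choose whatever fits your deployment.
REQUIRED_VARS = ("MORPHIK_OWNER_ID", "MORPHIK_TOKEN", "MORPHIK_HOST")


def morphik_uri_from_env(env: Mapping[str, str] = os.environ) -> str:
    """Build a morphik://owner_id:token@host URI from environment variables."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return (
        f"morphik://{env['MORPHIK_OWNER_ID']}:{env['MORPHIK_TOKEN']}"
        f"@{env['MORPHIK_HOST']}"
    )


# Demonstration with an explicit mapping instead of the real environment
example_env = {
    "MORPHIK_OWNER_ID": "owner_id",
    "MORPHIK_TOKEN": "token",
    "MORPHIK_HOST": "api.morphik.ai",
}
uri = morphik_uri_from_env(example_env)
```

The resulting string can then be passed to the constructor in place of the literal, as in `db = Morphik(morphik_uri_from_env())`.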

Nested Folders & Folder Depth

# Create a nested folder (parents are auto-created server-side)
folder = db.create_folder(full_path="/projects/alpha/specs", description="Specs folder")

# Move or rename folder paths
moved = db.move_folder("/projects/alpha/specs", "/projects/archive/specs")
renamed = moved.rename("specs-v2")

# Scope queries to a path and include descendants with folder_depth=-1
chunks = folder.retrieve_chunks(query="design notes", folder_depth=-1)
docs = db.list_documents(folder_name="/projects/alpha", folder_depth=-1)

# List only the fields you need. The server reads and returns just those columns, so
# the full document text is never downloaded, which keeps listing fast for large corpora.
for doc in db.list_documents(fields=["metadata"]).documents:
    print(doc.external_id, doc.metadata)

Folder objects expose Folder.full_path, and Document.folder_path mirrors the path returned by the server, so you can trace which folder scope a document came from.
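
A metadata-only listing is convenient to turn into a lookup table. The sketch below uses stand-in objects that carry only the two attributes shown in the listing above (`external_id` and `metadata`), so it runs without a server. `metadata_by_id` is a hypothetical helper, not an SDK function.

```python
from types import SimpleNamespace
from typing import Any, Dict, Iterable


def metadata_by_id(documents: Iterable[Any]) -> Dict[str, dict]:
    """Map external_id -> metadata for quick lookups (hypothetical helper)."""
    return {doc.external_id: doc.metadata for doc in documents}


# Stand-ins for the items in list_documents(fields=["metadata"]).documents
docs = [
    SimpleNamespace(external_id="doc-1", metadata={"category": "reports"}),
    SimpleNamespace(external_id="doc-2", metadata={"category": "specs"}),
]
index = metadata_by_id(docs)
```

With a real client the input would be `db.list_documents(fields=["metadata"]).documents`.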

Asynchronous Usage

import asyncio
from morphik.async_ import AsyncMorphik

async def main():
    # You can also use a direct HTTP(S) base URL for self-hosted deployments:
    #     async with AsyncMorphik("http://morphik:8000") as db:
    # Or an authentication URI (for production):
    #     async with AsyncMorphik("morphik://owner_id:token@api.morphik.ai") as db:

    # Initialize async client - connects to localhost:8000 by default
    async with AsyncMorphik() as db:
        # Ingest a text document
        doc = await db.ingest_text(
            content="Your document content",
            metadata={"title": "Example Document"}
        )

        # Run a Morphik On-the-Fly document query
        doc_query = await db.query_document(
            file="path/to/document.pdf",
            prompt="Extract the parties and effective date.",
            ingestion_options={"ingest": True, "metadata": {"source": "contracts"}}
        )
        print(doc_query.structured_output)

        # Query with RAG
        response = await db.query(
            query="Summarize the key points in the document",
        )

        print(response.completion)

# Run the async function
asyncio.run(main())
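
One reason to choose the async client is to overlap several requests. The sketch below shows the general pattern with `asyncio.gather`, using a stand-in class in place of `AsyncMorphik` so it runs without a server. `FakeAsyncClient` and `ingest_many` are hypothetical names. The only SDK call assumed is `ingest_text(content=..., metadata=...)` as shown above, and how many concurrent ingestions a server handles well is not something this README states.

```python
import asyncio
from typing import Any, Dict, List


class FakeAsyncClient:
    """Stand-in for AsyncMorphik so the sketch runs without a server."""

    async def ingest_text(self, content: str, metadata: Dict[str, Any]) -> Dict[str, Any]:
        await asyncio.sleep(0)  # yield control, as a real network call would
        return {"content": content, "metadata": metadata}


async def ingest_many(db: Any, texts: List[str]) -> List[Any]:
    """Start all ingest_text calls and wait for them together."""
    calls = [
        db.ingest_text(content=text, metadata={"position": i})
        for i, text in enumerate(texts)
    ]
    # gather() returns results in the same order as the awaitables passed in
    return await asyncio.gather(*calls)


results = asyncio.run(ingest_many(FakeAsyncClient(), ["first", "second"]))
```

Inside `async with AsyncMorphik() as db:` the same function would be called as `await ingest_many(db, texts)`.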

Features

  • Document ingestion (text, files, directories)
  • Semantic search and retrieval
  • Retrieval-augmented generation (RAG)
  • Morphik On-the-Fly document querying with optional ingestion follow-up
  • Multi-user and multi-folder scoping
  • Metadata filtering
  • Document management

Development

Running Tests

To run the tests, first install the development dependencies:

pip install -r test_requirements.txt

Then run the tests:

# Run all tests (requires a running Morphik server)
pytest morphik/tests/ -v

# Run specific test modules
pytest morphik/tests/test_sync.py -v
pytest morphik/tests/test_async.py -v

# Skip tests if you don't have a running server
SKIP_LIVE_TESTS=1 pytest morphik/tests/ -v

# Specify a custom server URL for tests
MORPHIK_TEST_URL=http://custom-server:8000 pytest morphik/tests/ -v
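
For readers writing their own live tests, the two environment variables above can be read in one place. This is a sketch of one possible interpretation, not the test suite's actual logic. It assumes that any non-empty value other than "0" for SKIP_LIVE_TESTS means "skip", and that the fallback URL is http://localhost:8000 (the host and port match the client's documented default; the http scheme is an assumption).

```python
import os
from typing import Mapping, Tuple

# Assumed fallback: host and port follow the client's documented default
DEFAULT_URL = "http://localhost:8000"


def live_test_settings(env: Mapping[str, str] = os.environ) -> Tuple[bool, str]:
    """Return (skip_live_tests, server_url) from environment variables."""
    # Assumption: unset, empty, or "0" means "run the live tests"
    skip = env.get("SKIP_LIVE_TESTS", "") not in ("", "0")
    url = env.get("MORPHIK_TEST_URL", DEFAULT_URL)
    return skip, url


skip_live, server_url = live_test_settings({"SKIP_LIVE_TESTS": "1"})
```

A test module could then skip itself when `skip_live` is true and otherwise connect to `server_url`.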

Example Usage Script

The SDK comes with an example script that demonstrates basic usage:

# Run synchronous example
python -m morphik.tests.example_usage

# Run asynchronous example
python -m morphik.tests.example_usage --async

The example script demonstrates:

  • Text and file ingestion
  • Creating folders and user scopes
  • Retrieving chunks and documents
  • Generating completions using RAG
  • Batch operations and cleanup
