
amsdal_ml plugin for AMSDAL Framework

Reason this release was yanked:

license issue


AMSDAL ML


Machine learning plugin for the AMSDAL Framework, providing embeddings, vector search, semantic retrieval, and AI agents with support for OpenAI models.

Features

  • Vector Embeddings: Generate and store embeddings for any AMSDAL model with automatic chunking
  • Semantic Search: Query your data using natural language with tag-based filtering
  • AI Agents: Build Q&A systems with streaming support and citation tracking
  • Async-First: Optimized for high-performance async operations
  • MCP Integration: Expose and consume tools via Model Context Protocol (stdio/HTTP)
  • File Attachments: Process and embed documents with built-in loaders
  • Extensible: Abstract base classes for custom models, retrievers, and ingesters

Installation

pip install amsdal-ml

Requirements

  • Python 3.11 or higher
  • AMSDAL Framework 0.5.6+
  • OpenAI API key (for default implementations)

Quick Start

1. Configuration

Create a .env file in your project root:

OPENAI_API_KEY=sk-your-api-key-here
async_mode=true
ml_model_class=amsdal_ml.ml_models.openai_model.OpenAIModel
ml_retriever_class=amsdal_ml.ml_retrievers.openai_retriever.OpenAIRetriever
ml_ingesting_class=amsdal_ml.ml_ingesting.openai_ingesting.OpenAIIngesting

Create a config.yml for AMSDAL connections:

application_name: my-ml-app
async_mode: true
connections:
  - name: sqlite_state
    backend: sqlite-state-async
    credentials:
      db_path: ./warehouse/state.sqlite3
      check_same_thread: false
  - name: lock
    backend: amsdal_data.lock.implementations.thread_lock.ThreadLock
resources_config:
  repository:
    default: sqlite_state
  lock: lock

2. Generate Embeddings

from amsdal_ml.ml_ingesting.openai_ingesting import OpenAIIngesting

# MyModel is your AMSDAL model with an 'embedding' vector field
ingester = OpenAIIngesting(
    model=MyModel,
    embedding_field='embedding',
)

# Generate and persist embeddings for an instance
instance = MyModel(content='Your text here')
embeddings = await ingester.agenerate_embeddings(instance)
await ingester.asave(embeddings, instance)

3. Semantic Search

from amsdal_ml.ml_retrievers.openai_retriever import OpenAIRetriever

retriever = OpenAIRetriever()

# Search for relevant content
results = await retriever.asimilarity_search(
    query='What is machine learning?',
    k=5,
    include_tags=['documentation']
)

for chunk in results:
    print(f'{chunk.object_class}:{chunk.object_id} - {chunk.raw_text}')
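Under the hood, a similarity search ranks stored embedding vectors by closeness to the query embedding. The sketch below illustrates that ranking with cosine similarity in pure Python; the plugin's actual storage backend and scoring are internal, and the tiny 3-dimensional vectors here only stand in for the real 1536-dimensional embeddings.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec: list[float], stored: list[tuple[str, list[float]]], k: int = 5) -> list[str]:
    """Return the ids of the k stored vectors most similar to query_vec."""
    ranked = sorted(stored, key=lambda item: cosine_similarity(query_vec, item[1]), reverse=True)
    return [obj_id for obj_id, _ in ranked[:k]]

# Toy corpus: doc:1 and doc:3 point in nearly the same direction as the query.
stored = [
    ("doc:1", [1.0, 0.0, 0.0]),
    ("doc:2", [0.0, 1.0, 0.0]),
    ("doc:3", [0.9, 0.1, 0.0]),
]
print(top_k([1.0, 0.0, 0.0], stored, k=2))  # → ['doc:1', 'doc:3']
```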

4. Build an AI Agent

from amsdal_ml.agents.default_qa_agent import DefaultQAAgent

agent = DefaultQAAgent()

# Ask questions
output = await agent.arun('Explain vector embeddings')
print(output.answer)
print(f'Used tools: {output.used_tools}')

# Stream responses
async for chunk in agent.astream('What is semantic search?'):
    print(chunk, end='', flush=True)
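astream follows the standard async-generator pattern: chunks are yielded as they arrive rather than after the full response is ready. A self-contained sketch of consuming such a stream, with a toy generator standing in for the agent:

```python
import asyncio

async def fake_stream(text: str):
    """Stand-in for agent.astream: yields the response word by word."""
    for word in text.split():
        await asyncio.sleep(0)  # simulates network latency between chunks
        yield word + " "

async def main() -> str:
    collected = []
    async for chunk in fake_stream("semantic search ranks by meaning"):
        collected.append(chunk)  # in a UI you would render each chunk immediately
    return "".join(collected).strip()

print(asyncio.run(main()))  # → semantic search ranks by meaning
```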

5. Functional Calling Agent with Python Tools

from amsdal_ml.agents.functional_calling_agent import FunctionalCallingAgent
from amsdal_ml.agents.python_tool import PythonTool
from amsdal_ml.ml_models.openai_model import OpenAIModel

# search_tool and render_tool are PythonTool instances wrapping your own functions
llm = OpenAIModel()
agent = FunctionalCallingAgent(model=llm, tools=[search_tool, render_tool])
result = await agent.arun(user_query="Find products with price > 100", history=[])

6. Natural Language Query Retriever

from amsdal_ml.ml_retrievers.query_retriever import NLQueryRetriever

# llm is an MLModel instance (e.g. OpenAIModel()); Product is your AMSDAL model
retriever = NLQueryRetriever(llm=llm, queryset=Product.objects.all())
documents = await retriever.invoke("Show me red products", limit=10)

7. Document Ingestion Pipeline

from amsdal_ml.ml_ingesting import ModelIngester
from amsdal_ml.ml_ingesting.pipeline import DefaultIngestionPipeline
from amsdal_ml.ml_ingesting.loaders.pdf_loader import PdfLoader
from amsdal_ml.ml_ingesting.processors.text_cleaner import TextCleaner
from amsdal_ml.ml_ingesting.splitters.token_splitter import TokenSplitter
from amsdal_ml.ml_ingesting.embedders.openai_embedder import OpenAIEmbedder
from amsdal_ml.ml_ingesting.stores.embedding_data import EmbeddingDataStore

pipeline = DefaultIngestionPipeline(
    loader=PdfLoader(),  # Uses pymupdf for PDF processing
    cleaner=TextCleaner(),
    splitter=TokenSplitter(max_tokens=800, overlap_tokens=80),
    embedder=OpenAIEmbedder(),
    store=EmbeddingDataStore(),
)

ingester = ModelIngester(
    pipeline=pipeline,
    base_tags=["document"],
    base_metadata={"source": "pdf"},
)

Architecture

Core Components

  • MLModel: Abstract interface for LLM inference (invoke, stream, with attachments)
  • MLIngesting: Generate text and embeddings from data objects with chunking
  • MLRetriever: Semantic similarity search with tag-based filtering
  • Agent: Q&A and task-oriented agents with streaming and citations
  • EmbeddingModel: Database model storing 1536-dimensional vectors linked to source objects
  • PythonTool: Tool for executing Python functions within agents
  • FunctionalCallingAgent: Agent specialized in function calling with configurable tools
  • NLQueryRetriever: Retriever for natural language queries on AMSDAL querysets
  • DefaultIngestionPipeline: Pipeline for document ingestion including loader, cleaner, splitter, embedder, and store
  • ModelIngester: High-level ingester for processing models with customizable pipelines and metadata
  • PdfLoader: Document loader using pymupdf for PDF processing
  • TextCleaner: Processor for cleaning and normalizing text
  • TokenSplitter: Splitter for dividing text into chunks based on token count
  • OpenAIEmbedder: Embedder for generating embeddings via OpenAI API
  • EmbeddingDataStore: Store for saving embedding data linked to source objects
  • MCP Server/Client: Expose retrievers as tools or consume external MCP services

Configuration

All settings are managed via MLConfig in .env:

# Model Configuration
llm_model_name=gpt-4o
llm_temperature=0.0
embed_model_name=text-embedding-3-small

# Chunking Parameters
embed_max_depth=2
embed_max_chunks=10
embed_max_tokens_per_chunk=800

# Retrieval Settings
retriever_default_k=8
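embed_max_tokens_per_chunk caps each chunk's size, and the splitter overlaps adjacent chunks so context is not lost at chunk boundaries. The sketch below shows that sliding-window semantics; the plugin's real tokenizer is model-specific, so this uses synthetic tokens and numbers scaled down from the 800/80 defaults.

```python
def split_with_overlap(tokens: list[str], max_tokens: int, overlap_tokens: int) -> list[list[str]]:
    """Slide a max_tokens window forward by (max_tokens - overlap_tokens) per step."""
    step = max_tokens - overlap_tokens
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_tokens])
        if start + max_tokens >= len(tokens):
            break  # last window already reaches the end
    return chunks

tokens = [f"t{i}" for i in range(20)]
chunks = split_with_overlap(tokens, max_tokens=8, overlap_tokens=2)
print([len(c) for c in chunks])  # → [8, 8, 8]
```

Each chunk repeats the last overlap_tokens tokens of its predecessor, so a sentence cut at one boundary still appears whole in the next chunk.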

Development

Setup

# Install dependencies
pip install --upgrade uv hatch==1.14.2
hatch env create
hatch run sync

Testing

# Run all tests with coverage
hatch run cov

# Run specific tests
hatch run test tests/test_openai_model.py

# Run tests directly with pytest
pytest tests/ -v

Code Quality

# Run all checks (style + typing)
hatch run all

# Format code
hatch run fmt

# Type checking
hatch run typing

AMSDAL CLI

# Generate a new model
amsdal generate model MyModel --format py

# Generate property
amsdal generate property --model MyModel embedding_field

# Generate transaction
amsdal generate transaction ProcessEmbeddings

# Generate hook
amsdal generate hook --model MyModel on_create

MCP Server

Run the retriever as an MCP server for integration with Claude Desktop or other MCP clients:

python -m amsdal_ml.mcp_server.server_retriever_stdio \
  --amsdal-config "$(echo '{"async_mode": true, ...}' | base64)"

The server exposes a search tool for semantic search in your knowledge base.
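The --amsdal-config flag takes the AMSDAL config as base64-encoded JSON. If you prefer building that value in Python rather than the shell pipeline above, a stdlib sketch (the config dict here contains only the keys shown in this README; your full config will include connections and resources_config):

```python
import base64
import json

# Minimal config for illustration; extend with your real connection settings.
config = {"async_mode": True}

# Serialize to JSON, then base64-encode for the --amsdal-config flag.
encoded = base64.b64encode(json.dumps(config).encode()).decode()
print(encoded)

# Decoding round-trips back to the original dict.
assert json.loads(base64.b64decode(encoded)) == config
```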

License

See amsdal_ml/Third-Party Materials - AMSDAL Dependencies - License Notices.md for dependency licenses.
