A Jupyter-compatible plugin that detects risky ML model and dataset loads.

MAIS - ML Model Audit & Inspection System

A Python notebook plugin that watches for potentially risky model or dataset loads in Jupyter notebooks. MAIS analyzes code in real time to detect when you're trying to load models or datasets that might require special permissions or licensing.

Detection Architecture - V1 vs V2

MAIS offers two detection architectures, selectable through a feature flag:

🔄 V1: Legacy Baseline Detection (Default)

Production-safe default for backward compatibility
Uses configuration-based pattern matching
Watches predefined function lists in config.py
Best for: Stable production environments
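
The configuration-based matching described above can be sketched roughly as follows. This is a minimal illustration, not MAIS's actual implementation: the WATCHED_FUNCTIONS set and both helper functions are hypothetical, and the real watch list in config.py may look quite different.

```python
import ast

# Hypothetical watch list; the real list lives in MAIS's config.py and may differ.
WATCHED_FUNCTIONS = {"torch.load", "AutoModel.from_pretrained", "load_dataset"}


def dotted_name(node: ast.AST) -> str:
    """Return a dotted name such as 'torch.load' for a Name/Attribute chain."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return ""  # e.g. a call on the result of another call; not resolved here


def find_watched_calls(source: str) -> list[tuple[int, str]]:
    """Return (line number, function name) for each call on the watch list."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in WATCHED_FUNCTIONS:
                hits.append((node.lineno, name))
    return sorted(hits)


sample = "import torch\nmodel = torch.load('weights.pt')\nprint('done')\n"
hits = find_watched_calls(sample)
print(hits)
```

A fixed list like this is easy to audit, which fits the "production-safe" framing, but it only catches calls that are spelled exactly as listed.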

🚀 V2: Provider-Based Detection (Enhanced)

Specialized detectors for major ML/AI providers
Comprehensive coverage including patterns V1 misses
Provider-specific intelligence for better accuracy
Best for: Development and comprehensive model monitoring

Provider	V1 Detection	V2 Detection
HuggingFace	✅ Basic patterns	✅ Advanced + Hub integration
OpenAI	❌ Missed patterns	✅ Full API coverage
PyTorch	✅ torch.load	✅ Extended patterns
Anthropic	❌ Not detected	✅ Claude API detection
LangChain	❌ Framework blind	✅ Full framework support
LlamaIndex	❌ Not detected	✅ Document processing

Architecture Overview

MAIS uses a flexible, strategy-based architecture with multiple specialized components:

[Figure: MAIS architecture diagram]

Additional Architecture Views

View	Purpose	Link
📊 Dependencies	Component relationships & data flow	MAIS_DEPENDENCY.svg
⚡ Process Flow	End-to-end analysis workflow	MAIS_PROCESS.svg
🏗️ DDD Layers	Domain-driven design structure	MAIS_ARCHITECTURE.svg

Core Components

📥 Input Layer

Processes various types of source code inputs:

Source Code: Direct Python code analysis
Notebooks: Jupyter notebook cell analysis
Requirements: Dependency file scanning
Python Files: Static file analysis
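
As an illustration of the notebook input path, the sketch below pulls code cells out of a notebook file using only the standard library. It assumes the nbformat v4 JSON layout; the code_cells helper is hypothetical and is not part of the MAIS API.

```python
import json


def code_cells(notebook_json: str) -> list[str]:
    """Return the source of each code cell in an nbformat v4 notebook."""
    notebook = json.loads(notebook_json)
    cells = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # nbformat allows `source` to be a string or a list of line strings.
        cells.append(source if isinstance(source, str) else "".join(source))
    return cells


# A tiny in-memory notebook with one markdown cell and one code cell.
example = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {"cell_type": "code", "metadata": {}, "outputs": [],
         "execution_count": None,
         "source": ["import torch\n", "m = torch.load('w.pt')"]},
    ],
})
extracted = code_cells(example)
print(extracted)
```

Each extracted string can then be handed to whatever analysis step runs on plain Python source.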

🔍 Provider-Specific Detectors

Specialized detectors for different ML/AI providers and frameworks:

OpenAI: Detects GPT, DALL-E, and OpenAI API usage
HuggingFace: Identifies Transformers, Datasets, and Hub model loads
Anthropic: Catches Claude API integrations
LangChain: Finds LangChain components and chains
LlamaIndex: Detects LlamaIndex document processing
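
One simple signal a provider-specific detector could use is which packages a cell imports. The sketch below is an assumption-laden illustration: the PROVIDER_MODULES mapping and the providers_in function are hypothetical, and MAIS's real detectors likely inspect much more than imports.

```python
import ast

# Assumed mapping from top-level import names to providers (illustrative only).
PROVIDER_MODULES = {
    "openai": "OpenAI",
    "transformers": "HuggingFace",
    "datasets": "HuggingFace",
    "huggingface_hub": "HuggingFace",
    "anthropic": "Anthropic",
    "langchain": "LangChain",
    "llama_index": "LlamaIndex",
}


def providers_in(source: str) -> set[str]:
    """Return the providers whose packages are imported in `source`."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules = [node.module]  # absolute `from x import y` only
        else:
            continue
        for module in modules:
            provider = PROVIDER_MODULES.get(module.split(".")[0])
            if provider:
                found.add(provider)
    return found


cell = "import anthropic\nfrom transformers import AutoModel\nimport numpy as np\n"
detected = providers_in(cell)
print(sorted(detected))
```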

⚙️ Detection Strategies

Pluggable analysis approaches that detectors can use:

AST Strategy: Advanced parsing with variable resolution for complex code analysis
Regex Strategy: Fast pattern matching for simple detection scenarios
LLM-based Strategy: AI-powered code understanding (planned, not yet available)
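
To make the difference between the regex and AST strategies concrete, here is a small sketch under stated assumptions. Both functions are hypothetical and only look for from_pretrained calls; the "variable resolution" shown is deliberately naive (module-wide, last string assignment wins) and is not a description of how MAIS resolves variables.

```python
import ast
import re

# Regex strategy: fast, but only sees a string literal written inline.
FROM_PRETRAINED = re.compile(r"from_pretrained\(\s*['\"]([^'\"]+)['\"]")


def regex_model_ids(source: str) -> list[str]:
    return FROM_PRETRAINED.findall(source)


def ast_model_ids(source: str) -> list[str]:
    """AST strategy: resolve simple `name = "literal"` assignments first."""
    tree = ast.parse(source)
    constants = {}
    for node in ast.walk(tree):
        if (isinstance(node, ast.Assign)
                and len(node.targets) == 1
                and isinstance(node.targets[0], ast.Name)
                and isinstance(node.value, ast.Constant)
                and isinstance(node.value.value, str)):
            constants[node.targets[0].id] = node.value.value

    found = []
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "from_pretrained"
                and node.args):
            arg = node.args[0]
            if isinstance(arg, ast.Constant) and isinstance(arg.value, str):
                found.append(arg.value)
            elif isinstance(arg, ast.Name) and arg.id in constants:
                found.append(constants[arg.id])
    return found


code = 'model_id = "bert-base-uncased"\nmodel = AutoModel.from_pretrained(model_id)\n'
print(regex_model_ids(code))  # the regex cannot follow the variable
print(ast_model_ids(code))    # the AST pass resolves it to the literal
```

The regex misses the model name here because it is passed through a variable, which is the kind of case the AST strategy is meant to cover.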

📊 Intermediate Output

Analysis results from provider detectors:

Model Findings: Detected model usage with metadata
Risk Assessment: Security and compliance evaluation
Inventory Mapping: Model-to-provider relationship mapping

📋 JSON Schema Standardization

Converts findings into structured format:

AI Detection JSON Schema: Standardized detection results format
Provider Attribution: Links findings to specific ML providers
Risk Categorization: Security and compliance classifications
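
A standardized finding might be serialized along these lines. The Finding dataclass and every field name in it are illustrative assumptions; the actual AI Detection JSON Schema is not shown in this document and may use different keys.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Finding:
    # All field names are illustrative assumptions, not the MAIS schema.
    provider: str
    model: str
    line: int
    risk: str


def to_json(findings: list[Finding]) -> str:
    """Serialize findings with sorted keys so the output is diff-friendly."""
    return json.dumps({"findings": [asdict(f) for f in findings]},
                      indent=2, sort_keys=True)


report = to_json([Finding("HuggingFace", "bert-base-uncased", 2, "low")])
print(report)
```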

📦 SBOM Generation

Creates comprehensive software bills of materials:

manifest-cli Integration: Uses external SBOM generation tools
SBOM Builder: Internal component for SBOM creation
Dependency Analysis: Maps AI/ML dependencies

📤 Output Formats

Multiple standard formats for integration:

CycloneDX JSON: Industry-standard SBOM format
SPDX JSON: Open-source license compliance format
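
For orientation, a minimal CycloneDX JSON document listing a model as a component could look like the output of the sketch below. The minimal_cyclonedx helper is hypothetical, the model name and version are placeholders, and the SBOMs MAIS actually emits will carry far more metadata. The "machine-learning-model" component type is taken from CycloneDX 1.5.

```python
import json


def minimal_cyclonedx(models: list[tuple[str, str]]) -> dict:
    """Build a minimal CycloneDX 1.5 BOM from (name, version) pairs."""
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "version": 1,
        "components": [
            {"type": "machine-learning-model", "name": name, "version": version}
            for name, version in models
        ],
    }


# Placeholder model name and version, for illustration only.
bom = minimal_cyclonedx([("bert-base-uncased", "1.0")])
print(json.dumps(bom, indent=2))
```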

Installation

# Install from PyPI (run in a shell)
pip install mais

# Then, in a notebook cell, import and initialize the MAIS plugin
from mais import MAIS

# Choose one of the following:
# V1: Default legacy detection (production-safe)
m = MAIS(api_token="<manifest-api-token>")

# V2: Enhanced provider-based detection (recommended for dev/comprehensive monitoring)
m = MAIS(api_token="<manifest-api-token>", use_v2_detectors=True)

# Now run your notebook as normal
# MAIS will monitor for potentially risky model loads

Detection Architecture Configuration

Constructor Parameter (Per Instance)

from mais import MAIS

# Enable V2 provider-based detection
m = MAIS(api_token="token", use_v2_detectors=True)

# Use legacy V1 detection (the default when use_v2_detectors is omitted)
m = MAIS(api_token="token")

# Explicitly request legacy V1 detection
m = MAIS(api_token="token", use_v2_detectors=False)

Google Colab Usage

Useful in environments such as Google Colab where you can't set environment variables. Read the token from Colab secrets and pass it to the constructor:

from google.colab import userdata
api_token = userdata.get('MANIFEST_API_KEY')

from mais import MAIS
# Use V2 for comprehensive OpenAI + HuggingFace detection
m = MAIS(api_token=api_token, use_v2_detectors=True)

Advanced Usage

MAIS supports different detection strategies and provider combinations:

from mais.application.services.ast_analyzer import ASTAnalyzer
from mais.domain.model_analysis.detectors.baseline_detector import BaselineDetector

# Use default baseline detection (backward compatible)
analyzer = ASTAnalyzer()

# Or pass custom detectors explicitly
analyzer = ASTAnalyzer(detectors=[BaselineDetector()])

# Analyze a string of Python source for model usage
your_code = "import torch\nmodel = torch.load('model.pt')"
findings = analyzer.analyze_code(your_code)

SBOM Generation

# Generate an SBOM for your project or notebook environment.
m.create_sbom(path=".", publish=False)

SBOM Publishing

# Generate the SBOM and publish it.
m.create_sbom(path=".", publish=True)

Environment Variables

MAIS supports configuration through environment variables:

Core Configuration

MANIFEST_API_TOKEN - API token for MOSAIC/Manifest integration
MAIS_MOSAIC_API_URL - Override default API URL
MAIS_DEFAULT_VERBOSITY - Set default logging level
MAIS_API_TIMEOUT - API request timeout in seconds

Any configuration value can be overridden by setting an environment variable with the MAIS_ prefix.
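
If you want to check what a session would pick up from these variables, a small helper like the one below can collect them. This is a sketch, not MAIS internals: load_settings is hypothetical, and the 30-second timeout fallback is an assumed value rather than a documented default.

```python
import os


def load_settings(environ=os.environ) -> dict:
    """Collect the documented variables; the timeout default is an assumption."""
    return {
        "api_token": environ.get("MANIFEST_API_TOKEN"),
        "api_url": environ.get("MAIS_MOSAIC_API_URL"),
        "verbosity": environ.get("MAIS_DEFAULT_VERBOSITY"),
        "api_timeout": float(environ.get("MAIS_API_TIMEOUT", "30")),
    }


# Pass a plain dict instead of os.environ to try it without touching the process.
settings = load_settings({"MANIFEST_API_TOKEN": "example-token", "MAIS_API_TIMEOUT": "10"})
print(settings)
```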

Detection Mode Information

from mais import MAIS

m = MAIS(api_token="token", use_v2_detectors=True)

# Check current detection mode
print(m.get_detection_mode())  # "new" or "legacy"

# Get detailed detection information
info = m.get_detection_info()
print(info["detection_mode"])      # Current mode
print(info["source"])              # "constructor parameter" or "config/environment"
print(info["feature_flag"])        # Environment variable name
print(info["current_value"])       # Boolean value of feature flag
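
The two possible values of info["source"] suggest that an explicit constructor argument takes precedence over configuration or environment settings. A rough sketch of that kind of precedence rule follows. Everything here is an assumption: resolve_mode is hypothetical, and MAIS_USE_V2_DETECTORS is a made-up flag name (the real name is whatever info["feature_flag"] reports).

```python
import os
from typing import Optional

# Hypothetical flag name; the real one is reported by info["feature_flag"].
FEATURE_FLAG = "MAIS_USE_V2_DETECTORS"


def resolve_mode(use_v2: Optional[bool] = None, environ=os.environ) -> tuple[str, str]:
    """Return (detection_mode, source), preferring an explicit argument."""
    if use_v2 is not None:
        return ("new" if use_v2 else "legacy", "constructor parameter")
    enabled = environ.get(FEATURE_FLAG, "").strip().lower() in {"1", "true", "yes"}
    return ("new" if enabled else "legacy", "config/environment")


# An explicit False wins even when the environment flag is set.
print(resolve_mode(False, {FEATURE_FLAG: "true"}))
print(resolve_mode(None, {FEATURE_FLAG: "true"}))
```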
