Python toolkit for ML, CV, NLP and multimodal AI development

maque (麻雀)


Features

  • MLLM Processing - Batch image analysis with OpenAI/Gemini compatible APIs
  • LLM Server - Local LLM inference with Transformers backend
  • Model Quantization - Supports auto-round, AWQ, GPTQ, and BNB quantization methods
  • Embedding Service - Text/multimodal embedding API server
  • Clustering Pipeline - UMAP + HDBSCAN for vector clustering and visualization
  • Async Executor - Priority queue-based concurrent task execution with retry
  • Rich CLI - Modular command groups for various tasks

Installation

# Basic installation
pip install maque

# With specific feature sets (quotes keep shells such as zsh from expanding the brackets)
pip install "maque[torch,nlp,cv]"          # ML/NLP/CV features
pip install "maque[clustering,embedding]"  # ML pipeline features
pip install "maque[quant]"                 # Model quantization support
pip install "maque[dev,test]"              # Development setup

# From source
pip install -e .
pip install -e ".[dev,test]"

CLI Usage

Commands are organized into groups and invoked as maque <group> <command>. A short alias, mq, is also available.

Config Management

maque config show                 # Show current configuration
maque config edit                 # Open config in editor
maque config init                 # Initialize config file

MLLM (Multimodal LLM)

# Process images from a table
maque mllm call-table data.xlsx --image_col="image_path" --model="gpt-4o"

# Process images from a folder
maque mllm call-images ./photos --recursive=True --output_file="results.csv"

LLM Server

# Start LLM inference server
maque llm serve Qwen/Qwen2.5-7B-Instruct --port=8000

# AWQ quantized model (requires: pip install maque[quant])
maque llm serve Qwen2.5-VL-3B-Instruct-AWQ

# Interactive chat
maque llm chat --model="gpt-4o"

Embedding Service

# Start embedding API server
maque embedding serve --model=BAAI/bge-m3 --port=8001

# Test embedding endpoint
maque embedding test --text="Hello world"

Data Processing

# Interactive table viewer (Streamlit)
maque data table-viewer data.csv --port=8501

# Convert between formats
maque data convert input.json output.csv

System Utilities

# Kill processes on ports
maque system kill 8000 8001

# Pack directory
maque system pack ./folder

# Split large file
maque system split large_file.dat --chunk_size=1GB
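For readers who want to see what splitting a file into fixed-size chunks involves, here is a minimal standard-library sketch. It is not maque's implementation: the function name `split_file` and the `.partNNN` naming scheme are hypothetical, and the chunk size is given in bytes rather than as a string such as 1GB.

```python
from pathlib import Path


def split_file(path, chunk_size):
    """Split `path` into numbered parts of at most `chunk_size` bytes."""
    source = Path(path)
    parts = []
    with source.open("rb") as f:
        index = 0
        while True:
            chunk = f.read(chunk_size)  # reads up to chunk_size bytes
            if not chunk:
                break  # end of file reached
            # Hypothetical naming scheme: <name>.part000, <name>.part001, ...
            part = source.with_name(f"{source.name}.part{index:03d}")
            part.write_bytes(chunk)
            parts.append(part)
            index += 1
    return parts
```

Each chunk is held in memory while it is written, so a production tool would likely stream very large chunks in smaller buffers.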

Claude Code Skill

# Install maque skill to Claude Code
maque install-skill

# Check installation status
maque skill-status

# Uninstall skill
maque uninstall-skill

After installation, use /maque in Claude Code to access maque documentation.

Git Helpers

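# GitHub mirror proxy (faster access from mainland China)
maque git mirror-set                      # Set a global mirror (default: ghproxy)
maque git mirror-set --mirror=ghproxy-cdn # Use the CDN mirror
maque git mirror-status                   # Show the current mirror configuration
maque git mirror-unset                    # Remove the mirror and restore direct connections

# Once a mirror is set, native git commands go through it automatically
git clone https://github.com/user/repo    # Automatically accelerated via the mirror

# List available mirrors
maque git mirrors

# One-off clone through a mirror (global configuration is not modified)
maque git clone-mirror https://github.com/user/repo ./repo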

Python API

IO Utilities

from maque import yaml_load, yaml_dump, json_load, json_dump, jsonl_load, jsonl_dump

# Load/save YAML
config = yaml_load("config.yaml")
yaml_dump(config, "output.yaml")

# Load/save JSONL
records = jsonl_load("data.jsonl")
jsonl_dump(records, "output.jsonl")
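JSONL stores one JSON object per line. The sketch below shows, using only the standard library, what a load/dump pair for this format typically does. The names `read_jsonl` and `write_jsonl` are hypothetical and this is not maque's implementation, which may handle encodings or edge cases differently.

```python
import json
from pathlib import Path


def write_jsonl(records, path):
    """Write an iterable of JSON-serializable objects, one per line."""
    with Path(path).open("w", encoding="utf-8") as f:
        for record in records:
            # ensure_ascii=False keeps non-ASCII text readable in the file
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read a JSONL file into a list, skipping blank lines."""
    with Path(path).open("r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```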

MLLM Client

from flexllm import MllmClient

client = MllmClient(
    base_url="https://api.openai.com/v1",
    api_key="your-api-key",
    model="gpt-4o"
)

# Single image
response = client.call("Describe this image", image_path="photo.jpg")

# Batch processing
from flexllm import MllmTableProcessor
processor = MllmTableProcessor(client)
results = processor.process("data.xlsx", image_col="image_path", prompt="Describe the image")

Async Executor

from flexllm.async_api import ConcurrentExecutor

async def process_item(item):
    # Your async processing logic; this placeholder returns the item unchanged
    result = item
    return result

executor = ConcurrentExecutor(
    max_concurrent=10,
    max_qps=5,
    max_retries=3
)

# Call this from inside an async function (or an environment with top-level await)
results = await executor.run(
    process_item,
    items,
    progress=True
)
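To illustrate the idea behind bounded concurrency with retries, here is a minimal asyncio sketch. It is not how ConcurrentExecutor is implemented: the helper `run_concurrently` is hypothetical, it omits rate limiting (max_qps) and priorities, and the backoff delay is an arbitrary choice.

```python
import asyncio


async def run_concurrently(func, items, max_concurrent=10, max_retries=3):
    """Apply async `func` to every item with bounded concurrency and retries."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def worker(item):
        async with semaphore:  # at most `max_concurrent` calls in flight
            for attempt in range(max_retries + 1):
                try:
                    return await func(item)
                except Exception:
                    if attempt == max_retries:
                        raise  # retries exhausted: surface the error
                    await asyncio.sleep(0.01 * 2 ** attempt)  # brief backoff

    # gather returns results in the same order as the inputs
    return await asyncio.gather(*(worker(item) for item in items))


async def double(x):
    return x * 2


if __name__ == "__main__":
    print(asyncio.run(run_concurrently(double, [1, 2, 3])))  # [2, 4, 6]
```

A semaphore caps how many calls run at once, while the retry loop sits inside the semaphore so a retrying task keeps its slot.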

Embedding & Retrieval

from maque.embedding import TextEmbedding
from maque.retriever import ChromaRetriever, Document

# Initialize
embedding = TextEmbedding(base_url="http://localhost:8001/v1", model="bge-m3")
retriever = ChromaRetriever(
    embedding,
    persist_dir="./chroma_db",
    collection_name="my_data"
)

# Insert documents
documents = [Document(id="1", content="text...", metadata={"source": "file1"})]
retriever.upsert_batch(documents, batch_size=32, skip_existing=True)

# Search
results = retriever.search("query text", top_k=10)
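Conceptually, a top-k search ranks stored vectors by their similarity to the query vector. The brute-force sketch below shows that idea with cosine similarity in plain Python. It is an illustration only: the function names are hypothetical, and ChromaRetriever's actual index and distance metric are not described here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # define similarity with a zero vector as 0
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=10):
    """Return (doc_id, score) pairs for the k most similar documents."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)  # best match first
    return scored[:k]
```

A real vector store replaces the full scan with an approximate nearest-neighbor index so that search stays fast on large collections.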

Clustering Pipeline

from maque.clustering import ClusterAnalyzer

analyzer = ClusterAnalyzer(algorithm="hdbscan", min_cluster_size=15)

# Analyze from ChromaDB
result = analyzer.analyze_chroma(
    persist_dir="./chroma_db",
    collection_name="my_data",
    output_dir="./results",
    sample_size=10000,
    visualize=True
)

# Access results
print(f"Found {result.n_clusters} clusters")
print(result.labels)
print(result.cluster_stats)
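The exact contents of `result.cluster_stats` are not documented above. As a rough idea of what such statistics can be derived from, the sketch below summarizes a list of cluster labels, assuming the usual HDBSCAN convention that the label -1 marks noise points. The function `summarize_labels` and its output keys are hypothetical.

```python
from collections import Counter


def summarize_labels(labels):
    """Count clusters and their sizes; label -1 is treated as noise."""
    sizes = Counter(label for label in labels if label != -1)
    return {
        "n_clusters": len(sizes),
        "n_noise": sum(1 for label in labels if label == -1),
        "sizes": dict(sizes),
    }
```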

Performance Measurement

from maque import MeasureTime

with MeasureTime("model inference", gpu=True):
    output = model(input)
# Prints: model inference took 0.123s (GPU: 0.089s)
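A timing context manager of this kind can be sketched with the standard library as follows. This is a simplified illustration, not MeasureTime itself: the name `measure_time` is hypothetical, and it measures wall-clock time only. GPU timing additionally requires synchronizing the device before reading the clock, which is not shown.

```python
import time
from contextlib import contextmanager


@contextmanager
def measure_time(label):
    """Print the wall-clock time spent inside the `with` block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even if the block raises, so the timing is always reported
        elapsed = time.perf_counter() - start
        print(f"{label} took {elapsed:.3f}s")
```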

Configuration

maque reads configuration from several locations, listed here from highest to lowest priority:

  1. ./maque_config.yaml (current directory)
  2. Project root config
  3. ~/.maque/config.yaml (user config)
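One simple way to honor such a priority order is to take the first candidate file that exists. The sketch below assumes that behavior. Whether maque picks a single file or merges all layers is not stated here, and the helper `first_existing` is hypothetical. The project root config is left out of the usage example because its filename is not given above.

```python
from pathlib import Path


def first_existing(candidates):
    """Return the first path that exists as a file, or None if none do."""
    for candidate in candidates:
        path = Path(candidate).expanduser()  # resolves a leading "~"
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    # Highest priority first, mirroring the list above
    print(first_existing(["./maque_config.yaml", "~/.maque/config.yaml"]))
```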

Example configuration:

mllm:
  model: gpt-4o
  base_url: https://api.openai.com/v1
  api_key: ${OPENAI_API_KEY}

embedding:
  model: BAAI/bge-m3
  base_url: http://localhost:8001/v1

llm:
  default_port: 8000
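The `${OPENAI_API_KEY}` placeholder suggests that environment variables are substituted when the config is loaded. How maque does this is not described here. The sketch below shows one way to expand such placeholders in an already parsed config, using the standard library; the helper `expand_env` is hypothetical.

```python
import os
from string import Template


def expand_env(value, env=None):
    """Recursively replace ${VAR} placeholders in strings, dicts, and lists."""
    env = os.environ if env is None else env
    if isinstance(value, str):
        # safe_substitute leaves unknown placeholders untouched
        return Template(value).safe_substitute(env)
    if isinstance(value, dict):
        return {key: expand_env(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item, env) for item in value]
    return value  # numbers, booleans, None pass through unchanged
```

Leaving unknown placeholders in place makes a missing variable easy to spot later, instead of silently becoming an empty string.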

Initialize config:

maque config init

Development

# Install development dependencies
pip install -e ".[dev,test]"

# Run tests
pytest
pytest -m "not slow"  # Skip slow tests

# Format code
black .
isort .

License

MIT License - see LICENSE for details.
