An easy-to-use AI vision toolkit (lazy-loading optimized edition) built on MediaPipe and OpenCV, with a claimed 26,000x faster startup.

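AIToolkit Base - A MediaPipe-Based AI Vision Toolkit

An easy-to-use AI vision toolkit that bundles face detection, depth estimation, style transfer, OCR, and more, and supports one-step training of custom models.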
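
The package summary describes this release as a lazy-loading edition. The source does not show how the toolkit defers its imports, so the following is only a minimal sketch of the general technique, assuming a hypothetical `LazyModule` proxy that imports a dependency on first attribute access instead of at startup. The standard-library `json` module stands in for a heavy dependency such as MediaPipe.

```python
import importlib


class LazyModule:
    """Hypothetical proxy (not part of aitoolkit_base) that defers an import."""

    def __init__(self, name):
        self._name = name
        self._module = None  # nothing is imported yet

    @property
    def loaded(self):
        return self._module is not None

    def __getattr__(self, attr):
        # Reached only for names not defined on the proxy itself,
        # i.e. attributes that belong to the wrapped module.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


# `json` stands in for a slow-to-import dependency.
lazy_json = LazyModule("json")
print(lazy_json.loaded)           # False: creating the proxy is cheap
print(lazy_json.dumps({"a": 1}))  # first use triggers the real import
print(lazy_json.loaded)           # True
```

With this pattern, the cost of importing a large library is paid only by the code paths that actually use it.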

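What's New in Version 3.1 ✨

🎯 Train a model in 2 lines of code: image classification and object detection model training
🧠 Enhanced depth estimation: based on the MiDaS deep learning model, with greatly improved accuracy
🇨🇳 Chinese OCR optimization: integrates cnocr, tuned for Chinese text
🎨 High-quality style transfer: rewritten algorithm with better artistic results

Quick Install 🚀

Method 1: Smart install wizard (recommended)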

python install_guide.py

Method 2: Manual install

pip install -r requirements.txt

Method 3: Minimal install

pip install mediapipe opencv-python numpy Pillow

Data Preparation 📁

Directory layout for classification training data

dataset/classification/
├── train/
│   ├── cat/
│   │   ├── cat1.jpg
│   │   └── cat2.jpg
│   ├── dog/
│   └── bird/
└── val/
    ├── cat/
    ├── dog/
    └── bird/
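
Before training, it can help to confirm that `train/` and `val/` contain the same class folders. The helper below is a minimal sketch of such a check using only the standard library; `check_classification_layout` is a hypothetical name and is not part of aitoolkit_base.

```python
from pathlib import Path


def check_classification_layout(root):
    """Return the sorted class names if train/ and val/ list the same classes."""
    root = Path(root)
    splits = {}
    for split in ("train", "val"):
        split_dir = root / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing directory: {split_dir}")
        # Each sub-directory of a split is treated as one class label.
        splits[split] = sorted(p.name for p in split_dir.iterdir() if p.is_dir())
    if splits["train"] != splits["val"]:
        raise ValueError(f"class folders differ: {splits}")
    return splits["train"]
```

Calling `check_classification_layout("dataset/classification")` on the layout above would return the list of class folder names, or raise if a split is missing or the two splits disagree.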

Directory layout for detection training data

dataset/detection/
├── images/
│   ├── img1.jpg
│   └── img2.jpg
└── labels/
    ├── img1.txt  # YOLO format: class_id center_x center_y width height
    └── img2.txt
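
To make the label format concrete, here is a small sketch that converts one YOLO label line into a pixel-space bounding box. It assumes the coordinates are normalized to the range 0 to 1 relative to the image size, which is the usual YOLO convention but is not stated in the source; the function name and the sample line are illustrative only.

```python
def yolo_line_to_pixel_box(line, img_w, img_h):
    """Convert one YOLO label line to (class_id, (x_min, y_min, x_max, y_max))."""
    class_id, cx, cy, w, h = line.split()
    # Assumes coordinates are normalized to [0, 1] (the usual YOLO convention).
    cx, w = float(cx) * img_w, float(w) * img_w
    cy, h = float(cy) * img_h, float(h) * img_h
    return int(class_id), (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


# Hypothetical label line for a 640x480 image.
print(yolo_line_to_pixel_box("0 0.5 0.5 0.25 0.5", 640, 480))
# (0, (240.0, 120.0, 400.0, 360.0))
```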

Create the directory structure quickly

python data_preparation_guide.py

One-Step Training ⚡

from aitoolkit_base import train_image_classifier, train_object_detector

# Train an image classification model (2 lines of code)
train_image_classifier("dataset/classification", "my_classifier.pth")

# Train an object detection model (2 lines of code)
train_object_detector("dataset/detection", "my_detector.pth")

Core Feature Examples

import cv2
from aitoolkit_base import (
    FaceDetector, DepthEstimator, StyleTransfer, 
    OCRDetector, PoseLandmarker, ImageSegmenter
)

# Read the image
image = cv2.imread("example.jpg")
if image is None:  # cv2.imread returns None when the file cannot be read
    raise FileNotFoundError("example.jpg could not be read")

# Face detection
face_detector = FaceDetector()
faces = face_detector.run(image)
print(f"Detected {len(faces)} face(s)")

# Depth estimation (based on the MiDaS deep learning model)
depth_estimator = DepthEstimator(method="midas")
depth_result = depth_estimator.run(image)
depth_map = depth_result['depth_map']

# Artistic style transfer
style_transfer = StyleTransfer()
oil_painting = style_transfer.apply_style(image, "oil_painting")
watercolor = style_transfer.apply_style(image, "watercolor")

# Chinese OCR
ocr_detector = OCRDetector(use_cnocr=True)
text_results = ocr_detector.run(image)
for result in text_results:
    print(f"Text: {result['text']}, position: {result['bbox']}")

# Pose detection
pose_detector = PoseLandmarker()
pose_landmarks = pose_detector.run(image)

# Image segmentation
segmenter = ImageSegmenter()
segments = segmenter.run(image)
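
The OCR loop above prints each result separately. As a small follow-up sketch, the helper below joins the recognized text into one string. It relies only on the `text` key used in the example; the helper name, the sample data, and the `bbox` tuple layout are hypothetical.

```python
def collect_text(results, separator="\n"):
    """Join the non-empty 'text' fields of OCR results into one string."""
    lines = [r["text"].strip() for r in results if r.get("text", "").strip()]
    return separator.join(lines)


# Hypothetical results shaped like the dictionaries printed above.
sample = [
    {"text": "Hello ", "bbox": (10, 10, 80, 30)},
    {"text": "   ", "bbox": (10, 40, 80, 60)},
    {"text": "world", "bbox": (10, 70, 80, 90)},
]
print(collect_text(sample, separator=" "))  # Hello world
```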

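Features

🔍 Computer Vision Basics

Face detection: MediaPipe FaceDetection
Pose estimation: MediaPipe Pose
Hand detection: MediaPipe Hands
Image segmentation: MediaPipe Selfie Segmentation

🎨 Artistic Effects

Style transfer: oil painting, watercolor, sketch, cartoon, and other artistic styles
Filter effects: vintage, black and white, warm tone, and more

📊 Deep Learning Enhancements

Smart depth estimation: automatic fallback from MiDaS to DPT to traditional methods
Chinese OCR: multi-engine support, trying cnocr, then Tesseract, then OpenCV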
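
The fallback order described above can be pictured as trying each engine in turn and keeping the first one that succeeds. The source does not show how the toolkit implements this, so the following is only a generic sketch; `run_with_fallback` and the two stub engines are hypothetical stand-ins, not aitoolkit_base functions.

```python
def run_with_fallback(engines, image):
    """Try each (name, callable) in order; return (name, result) of the first success."""
    errors = {}
    for name, engine in engines:
        try:
            return name, engine(image)
        except Exception as exc:  # e.g. an optional dependency is missing
            errors[name] = exc
    raise RuntimeError(f"all engines failed: {errors}")


# Hypothetical stand-ins: the first engine is "not installed", the second works.
def midas_stub(image):
    raise ImportError("torch is not installed")


def traditional_stub(image):
    return "depth map from a traditional method"


engines = [("midas", midas_stub), ("traditional", traditional_stub)]
print(run_with_fallback(engines, None))
# ('traditional', 'depth map from a traditional method')
```

Recording the error from each failed engine keeps the final exception informative when nothing in the chain is available.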

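🤖 Model Training

Image classification: one-step training of a ResNet classification model
Object detection: one-step training of a YOLO detection model
Data preparation: automated data validation and preprocessing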

Test the Installation

python test_all_functions.py

Advanced Usage

See examples_improved.py for detailed usage of every feature.

Troubleshooting

Common Issues

protobuf version conflict

pip install "protobuf>=3.20.0,<5.0.0" --force-reinstall
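
To see whether the installed protobuf already falls inside that range, a quick check with the standard-library `importlib.metadata` module can be run first. This is a simplified sketch: the helper names are hypothetical, and the version parsing only handles plain dotted versions rather than the full packaging rules.

```python
from importlib import metadata


def version_tuple(version):
    """Turn '4.25.3' into (4, 25, 3); non-numeric suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:  # pad '3.20' to (3, 20, 0)
        parts.append(0)
    return tuple(parts)


def protobuf_in_range(version, low=(3, 20, 0), high=(5, 0, 0)):
    """True if low <= version < high, matching the pip constraint above."""
    return low <= version_tuple(version) < high


try:
    installed = metadata.version("protobuf")
    print(installed, "OK" if protobuf_in_range(installed) else "out of range")
except metadata.PackageNotFoundError:
    print("protobuf is not installed")
```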

Installing PyTorch on Windows

pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu

cnocr installation fails

pip install cnocr --no-deps
pip install onnxruntime opencv-python pillow numpy

Getting Help

Run the install wizard for personalized help:

python install_guide.py

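Version History

v3.1: project cleanup, simplified installation, stability improvements
v3.0: MediaPipe integration, style transfer, basic training features
v2.0: basic OpenCV features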

System Requirements

Python 3.8+
Windows/macOS/Linux
4GB+ RAM (8GB+ for training)
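
The Python requirement can be checked from a script before installing. The snippet below is a minimal sketch; `MIN_PYTHON` and `python_ok` are illustrative names, and the RAM requirement is not checked because the standard library has no portable way to query it.

```python
import sys

MIN_PYTHON = (3, 8)  # taken from the requirement list above


def python_ok(version_info=sys.version_info):
    """True if the interpreter meets the minimum Python version."""
    return tuple(version_info[:2]) >= MIN_PYTHON


print("Python version OK" if python_ok() else "Python 3.8 or newer is required")
```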

Enjoy the convenience of AI vision processing! 🎉

Release history

3.1.1 (this version): Jun 12, 2025
2.0: Apr 15, 2025
1.0.0: Mar 17, 2025

Download files

Source Distribution

aitoolkit_base-3.1.1.tar.gz (51.9 MB)

Uploaded Jun 12, 2025 Source

Built Distribution

aitoolkit_base-3.1.1-py3-none-any.whl (52.0 MB)

Uploaded Jun 12, 2025 Python 3

File details

Details for the file aitoolkit_base-3.1.1.tar.gz.

File metadata

Download URL: aitoolkit_base-3.1.1.tar.gz
Upload date: Jun 12, 2025
Size: 51.9 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.14

File hashes

Hashes for aitoolkit_base-3.1.1.tar.gz
Algorithm	Hash digest
SHA256	`dd62e9d26acb15a17de96d0ba2c4f849797b7be94ded19808e01ca0708e22ce0`
MD5	`d40f78a42a9424bf93a20ba21fac01d5`
BLAKE2b-256	`90b980c9f0e99afd695ea2d3585712bbeb50404fa436796dd8aa2b389126669a`
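
A downloaded archive can be compared against the SHA256 digest listed above using the standard-library `hashlib` module. The following is a minimal sketch; `sha256_of` and `matches` are illustrative helper names.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Passing the path of the downloaded `aitoolkit_base-3.1.1.tar.gz` and the SHA256 value from the table to `matches` returns `True` only when the file is intact.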

File details

Details for the file aitoolkit_base-3.1.1-py3-none-any.whl.

File metadata

Download URL: aitoolkit_base-3.1.1-py3-none-any.whl
Upload date: Jun 12, 2025
Size: 52.0 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.14

File hashes

Hashes for aitoolkit_base-3.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`3a1badac4910dcf66b10e5de42ad76a0d982c919421a654ea210688468ba33de`
MD5	`13165cf49df9bb8e5cfcb351daf4b0b3`
BLAKE2b-256	`aa97b7e58e5231cfea8b7fbedaa1c19d6f145b6670694a3380d9bfa152e8e312`
