High-performance MTCNN face detection optimized for Apple Neural Engine

PyMTCNN

High-performance MTCNN face detection optimized for Apple Neural Engine, achieving 34.26 FPS on Apple Silicon.

Overview

PyMTCNN is a pure Python implementation of MTCNN (Multi-task Cascaded Convolutional Networks) that uses CoreML and the Apple Neural Engine for hardware-accelerated face detection. It reports a 175.7x speedup over its pure-Python baseline (0.195 FPS to 34.26 FPS) while maintaining 95% mean IoU agreement with the C++ OpenFace baseline.

Key Features

High Performance: 34.26 FPS with batch processing on Apple Silicon
Accurate: 95% IoU agreement with C++ OpenFace baseline
Easy to Use: Simple, clean Python API
Hardware Accelerated: Leverages Apple Neural Engine (ANE)
Flexible: Single-frame or batch processing modes
Production Ready: Optimized for real-time video analysis

Performance

Method	FPS	ms/frame	Use Case
`detect()`	31.88	31.4	Single-frame real-time
`detect_batch(4)`	34.26	29.2	Batch video processing

Speedup: 175.7x faster than the pure-Python baseline (0.195 FPS)
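The FPS and ms/frame columns are two views of the same measurement. The short sketch below shows the conversion; the helper name `fps_to_ms` is hypothetical and not part of PyMTCNN.

```python
def fps_to_ms(fps: float) -> float:
    """Convert throughput in frames per second to milliseconds per frame."""
    return 1000.0 / fps


# FPS figures copied from the table above.
for name, fps in [("detect()", 31.88), ("detect_batch(4)", 34.26)]:
    print(f"{name}: {fps_to_ms(fps):.1f} ms/frame")
```

For `detect_batch(4)` the ms/frame figure is an average over the batch, not the time until the first result is available.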

Requirements

macOS: macOS 13.0 or later
Hardware: Apple Silicon (M1, M2, M3) recommended
Python: 3.8 or later

Installation

From Source

git clone https://github.com/your-org/PyMTCNN.git
cd PyMTCNN
pip install -e .

From PyPI

pip install pymtcnn

Quick Start

Single Frame Detection

import cv2
from pymtcnn import CoreMLMTCNN

# Initialize detector
detector = CoreMLMTCNN()

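# Load image (cv2.imread returns None if the file cannot be read)
img = cv2.imread("image.jpg")
if img is None:
    raise FileNotFoundError("Could not read image.jpg")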

# Detect faces
bboxes, landmarks = detector.detect(img)

# Process results
print(f"Detected {len(bboxes)} faces")
for i, bbox in enumerate(bboxes):
    x, y, w, h, conf = bbox
    print(f"Face {i+1}: ({x:.0f}, {y:.0f}) {w:.0f}×{h:.0f} (confidence: {conf:.3f})")

Batch Video Processing

import cv2
from pymtcnn import CoreMLMTCNN

# Initialize detector
detector = CoreMLMTCNN()

# Load up to 4 video frames (one batch)
cap = cv2.VideoCapture("video.mp4")
frames = []
for _ in range(4):  # Process 4 frames at a time
    ret, frame = cap.read()
    if not ret:  # End of video or read error
        break
    frames.append(frame)
cap.release()

# Batch detection (cross-frame batching for maximum throughput)
results = detector.detect_batch(frames)

# Process results
for i, (bboxes, landmarks) in enumerate(results):
    print(f"Frame {i+1}: {len(bboxes)} faces detected")

API Reference

`CoreMLMTCNN`

Main face detector class.

Constructor

CoreMLMTCNN(
    min_face_size=60,
    thresholds=[0.6, 0.7, 0.7],
    factor=0.709,
    coreml_dir=None,
    verbose=False
)

Parameters:

min_face_size (int): Minimum face size in pixels. Default: 60
thresholds (list): Detection thresholds for [PNet, RNet, ONet]. Default: [0.6, 0.7, 0.7]
factor (float): Image pyramid scale factor. Default: 0.709
coreml_dir (str): Path to CoreML models directory. Default: bundled models
verbose (bool): Enable verbose logging. Default: False
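To see how min_face_size and factor interact, the sketch below computes image pyramid scales the way MTCNN implementations conventionally do: the first scale maps min_face_size onto PNet's 12×12 input, and each further level shrinks the image by factor until the shorter side falls below 12 pixels. This is an illustration of the general technique, not PyMTCNN's internal code; the function name `pyramid_scales` and the `net_input` parameter are assumptions.

```python
def pyramid_scales(height, width, min_face_size=60, factor=0.709, net_input=12):
    """Return the list of scales for a conventional MTCNN image pyramid.

    Assumption: this mirrors the common MTCNN recipe; PyMTCNN's own
    implementation may differ in detail.
    """
    scales = []
    scale = net_input / min_face_size      # min_face_size maps to net_input pixels
    min_side = min(height, width) * scale  # shorter side at the first level
    while min_side >= net_input:
        scales.append(scale)
        scale *= factor
        min_side *= factor
    return scales


# 1920x1080 frame with the default parameters
print(len(pyramid_scales(1080, 1920)))                     # 9 levels
# A larger min_face_size yields fewer levels, hence less PNet work
print(len(pyramid_scales(1080, 1920, min_face_size=100)))  # 7 levels
```

Fewer pyramid levels mean fewer PNet passes, which is one plausible reason a larger min_face_size runs faster.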

Methods

`detect(image)`

Detect faces in a single image using within-frame batching.

Parameters:

image (numpy.ndarray): Input image (BGR format, H×W×3)

Returns:

bboxes (numpy.ndarray): Bounding boxes (N×5), format: [x, y, w, h, confidence]
landmarks (numpy.ndarray): Facial landmarks (N×5×2), 5 points per face: left eye, right eye, nose, left mouth, right mouth

Performance: 31.88 FPS (31.4 ms/frame)
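The returned arrays can be unpacked with plain Python. The sketch below converts one [x, y, w, h, confidence] row to corner coordinates (the form drawing functions such as cv2.rectangle expect) and labels the five landmark points in the documented order. The helper names are hypothetical, and the code assumes each landmark is stored as an (x, y) pair, which the reference above does not state explicitly.

```python
# Landmark order as documented for detect()
LANDMARK_NAMES = ("left_eye", "right_eye", "nose", "left_mouth", "right_mouth")


def xywh_to_corners(bbox):
    """Convert [x, y, w, h, confidence] to integer (x1, y1, x2, y2) corners."""
    x, y, w, h, _conf = bbox
    return int(round(x)), int(round(y)), int(round(x + w)), int(round(y + h))


def name_landmarks(points):
    """Label the 5 points of one face (assumed to be (x, y) pairs)."""
    return {name: tuple(pt) for name, pt in zip(LANDMARK_NAMES, points)}


# Made-up values for illustration only
print(xywh_to_corners([10.0, 20.0, 30.0, 40.0, 0.99]))
print(name_landmarks([(1, 2), (3, 4), (5, 6), (7, 8), (9, 10)])["nose"])
```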

`detect_batch(frames)`

Detect faces in multiple frames using cross-frame batching.

Parameters:

frames (list): List of images (each BGR format, H×W×3)

Returns:

results (list): List of (bboxes, landmarks) tuples, one per frame

Performance: 34.26 FPS (29.2 ms/frame) with batch_size=4

Recommended batch size: 4 frames for optimal throughput
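For a whole video, frames have to be grouped into batches of the recommended size before each call. A minimal sketch of such a grouping helper follows; `batched` is a hypothetical name and works on any iterable of frames, including a final short batch at the end of the video.

```python
from itertools import islice


def batched(frames, batch_size=4):
    """Yield consecutive lists of up to batch_size items from frames."""
    it = iter(frames)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


# Integers stand in for frames here
for batch in batched(range(10), batch_size=4):
    print(batch)
```

Each yielded list could then be passed to detector.detect_batch(batch).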

Performance Guide

When to Use Each Method

detect(): Use for real-time per-frame processing, webcam feeds, or when you need lowest latency
detect_batch(): Use for offline batch video processing, maximum throughput, or when processing multiple frames simultaneously
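The latency difference can be estimated with a little arithmetic. The sketch below is a rough model, not a measurement: it assumes frames arrive from a live source at a fixed rate, that a batch is submitted only once its last frame has arrived, and that the ms/frame figures from the Performance table apply. The function name `batch_latency_ms` is hypothetical.

```python
def batch_latency_ms(batch_size, ms_per_frame, camera_fps):
    """Estimate the delay between the first frame of a batch arriving
    and the results for that batch being available."""
    wait_ms = (batch_size - 1) * 1000.0 / camera_fps  # waiting for later frames
    process_ms = batch_size * ms_per_frame            # processing the whole batch
    return wait_ms + process_ms


# 30 FPS camera, figures from the Performance table
print(f"detect():        {batch_latency_ms(1, 31.4, 30.0):.1f} ms")
print(f"detect_batch(4): {batch_latency_ms(4, 29.2, 30.0):.1f} ms")
```

Under these assumptions batching gains a little throughput but delays the oldest frame in each batch considerably, which is why detect() suits live feeds.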

Optimization Tips

Batch Size: Use 4 frames for optimal throughput
- Larger batches (8, 16) are slower due to overhead
Frame Resolution: Performance tested on 1920×1080
- Lower resolution → faster processing
- Higher resolution → more candidates, may require batch splitting
Min Face Size: Increase min_face_size for better performance
- Default: 60 pixels
- 80-100 pixels: 1.2-1.5x faster (may miss smaller faces)
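If frames are downscaled before detection to gain speed, the returned boxes are in the coordinates of the smaller frame and need to be mapped back. A minimal sketch, assuming a uniform resize by `scale` (for example 0.5 for half resolution); the helper name `rescale_bbox` is hypothetical.

```python
def rescale_bbox(bbox, scale):
    """Map [x, y, w, h, confidence] from a frame resized by `scale`
    back to the coordinates of the original frame."""
    x, y, w, h, conf = bbox
    return [x / scale, y / scale, w / scale, h / scale, conf]


# A box found in a half-resolution frame, mapped back to full resolution
print(rescale_bbox([100, 50, 40, 40, 0.9], 0.5))
```

Note that faces shrink along with the frame: at half resolution, a min_face_size of 60 corresponds to faces of roughly 120 pixels in the original.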

Examples

See the examples/ directory for complete examples:

single_frame_detection.py: Basic single-frame face detection
batch_processing.py: Batch video processing
s1_integration_example.py: Integration with S1 video pipeline

Accuracy

PyMTCNN maintains high accuracy while achieving exceptional performance:

Mean IoU: 95% vs C++ OpenFace baseline
Detection Agreement: 100% (same faces detected)
Validation: Tested on 30 frames from real-world patient videos
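For reference, IoU (intersection over union) between two boxes in the [x, y, w, h] format used by this package can be computed as below. This is the standard definition; the function name `iou_xywh` is hypothetical, and the validation script used for the figures above is not shown in this document.

```python
def iou_xywh(a, b):
    """Intersection over union of two boxes given as [x, y, w, h, ...]."""
    ax, ay, aw, ah = a[:4]
    bx, by, bw, bh = b[:4]
    # Overlap along each axis, clamped at zero for disjoint boxes
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


print(iou_xywh([0, 0, 10, 10], [0, 0, 10, 10]))  # identical boxes
print(iou_xywh([0, 0, 10, 10], [5, 0, 10, 10]))  # half-overlapping boxes
```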

Architecture

PyMTCNN uses a three-stage cascade architecture:

PNet (Proposal Network): Fast candidate generation using image pyramid
RNet (Refinement Network): Candidate refinement with batching
ONet (Output Network): Final bbox regression and landmark prediction

All networks are converted to CoreML FP32 format with flexible batch dimensions (1-50) for optimal ANE utilization.
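A batch dimension capped at 50 implies that a stage producing more than 50 candidates has to run in several chunks. The sketch below shows one straightforward way to split candidate indices into such chunks; whether PyMTCNN splits candidates exactly this way is an assumption, and `split_ranges` is a hypothetical helper.

```python
def split_ranges(n_candidates, max_batch=50):
    """Return (start, end) index ranges covering n_candidates,
    each spanning at most max_batch items."""
    return [
        (start, min(start + max_batch, n_candidates))
        for start in range(0, n_candidates, max_batch)
    ]


# 120 candidates would need three inference calls
print(split_ranges(120))
```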

Optimization Journey

PyMTCNN achieved a 175.7x speedup through multiple optimization phases:

Phase	Implementation	FPS	Speedup	Status
Baseline	Pure Python CNN	0.195	1.0x	✅
Phase 1	Vectorized NumPy	0.910	4.7x	✅
Phase 2	ONNX Runtime CPU	5.870	30.1x	✅
Phase 3	CoreML + ANE	13.56	69.5x	✅
Phase 4	Within-Frame Batching	31.88	163.5x	✅
Phase 5	Cross-Frame Batching	34.26	175.7x	✅

See docs/OPTIMIZATION_JOURNEY.md for the complete story.
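Each speedup in the table is the phase's FPS divided by the baseline FPS. The short sketch below recomputes the column from the FPS values listed above; the variable and function names are illustrative only.

```python
BASELINE_FPS = 0.195  # "Pure Python CNN" row

# FPS values copied from the table above
PHASE_FPS = {
    "Vectorized NumPy": 0.910,
    "ONNX Runtime CPU": 5.870,
    "CoreML + ANE": 13.56,
    "Within-Frame Batching": 31.88,
    "Cross-Frame Batching": 34.26,
}


def speedup(fps, baseline=BASELINE_FPS):
    """Speedup of a phase relative to the baseline throughput."""
    return fps / baseline


for phase, fps in PHASE_FPS.items():
    print(f"{phase}: {speedup(fps):.1f}x")
```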

License

This project is licensed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.

You are free to:

Share: Copy and redistribute the material
Adapt: Remix, transform, and build upon the material

Under the following terms:

Attribution: You must give appropriate credit
NonCommercial: You may not use the material for commercial purposes

See LICENSE for full terms.

Citation

If you use PyMTCNN in your research, please cite:

@software{pymtcnn2025,
  title={PyMTCNN: High-Performance MTCNN Face Detection for Apple Silicon},
  author={SplitFace},
  year={2025},
  url={https://github.com/your-org/PyMTCNN}
}

Acknowledgments

Original MTCNN paper: Zhang et al., "Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks"
C++ OpenFace implementation: Tadas Baltrušaitis et al.
Apple Neural Engine optimization insights from the CoreML community

Support

For issues, questions, or contributions, please visit the GitHub repository.
