Image-related machine learning methods for use by Ruurd Photos.

Ruurd Photos ML

A Python package providing a suite of machine learning tools for image analysis, designed to be the backbone of the Ruurd Photos project, a self-hosted Google Photos alternative. This package is intended to be called from Rust using PyO3.

✨ Features

This library offers a selection of pre-trained models for various image analysis tasks:

Image Captioning

Generate descriptive captions for images and ask questions about their content.

InstructBLIP: A powerful model for both generating detailed descriptions and answering questions about an image.
Salesforce BLIP: A robust model for generating high-quality image captions.

😀 Facial Recognition

Detect and analyze faces within images.

InsightFace: A comprehensive toolkit for face analysis that can:
- Detect multiple faces in an image.
- Estimate age and gender.
- Identify key facial landmarks (eyes, nose, mouth).
- Generate facial embeddings for clustering and recognition.
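
Embeddings like these are commonly compared with cosine similarity. The sketch below is a minimal, pure-Python illustration of that idea, not this package's own implementation: the helper names, the made-up 4-dimensional vectors, and the `0.5` threshold are all assumptions chosen for the example.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_person(a: Sequence[float], b: Sequence[float], threshold: float = 0.5) -> bool:
    """Treat two faces as a match when their similarity reaches the threshold.

    The default threshold is illustrative only and would need tuning on real data.
    """
    return cosine_similarity(a, b) >= threshold


# Made-up vectors standing in for real face embeddings, which are much longer.
face_a = [0.1, 0.9, 0.2, 0.4]
face_b = [0.1, 0.8, 0.3, 0.4]
face_c = [0.9, -0.1, -0.4, 0.0]

print(same_person(face_a, face_b))  # similar direction
print(same_person(face_a, face_c))  # dissimilar direction
```

A clustering step could apply the same comparison pairwise and group faces whose similarity stays above the chosen threshold.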

🖼️ Object Detection

Identify and locate various objects within an image.

ResNet: Utilizes a ResNet-based model to detect a wide range of common objects, returning their labels and bounding boxes.

🔤 Optical Character Recognition (OCR)

Detect and extract text from images.

ResNet & Tesseract: A two-stage process that first uses a ResNet model to determine if an image contains legible text, and then employs Tesseract to extract the text and its bounding boxes.
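
The two-stage idea can be sketched as a simple gate: run the legibility check first and call the extractor only when it passes. The function and the stand-in callables below are hypothetical and exist only to show the control flow; they are not part of this package.

```python
from typing import Callable, List, Optional, TypeVar

ImageT = TypeVar("ImageT")


def extract_text_if_legible(
    image: ImageT,
    has_text: Callable[[ImageT], bool],
    read_text: Callable[[ImageT], str],
) -> Optional[str]:
    """Return extracted text, or None when the first stage finds no legible text."""
    if not has_text(image):
        return None
    return read_text(image)


# Stand-ins for the classifier and the OCR engine (assumptions for this sketch).
extractor_calls: List[str] = []


def fake_classifier(image: str) -> bool:
    return "text" in image


def fake_extractor(image: str) -> str:
    extractor_calls.append(image)
    return f"contents of {image}"


print(extract_text_if_legible("photo_of_cat", fake_classifier, fake_extractor))
print(extract_text_if_legible("scan_with_text", fake_classifier, fake_extractor))
```

In this arrangement the extractor never runs for images that fail the first check, which is the point of splitting the work into two stages.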

🚀 Installation

This package is available on PyPI. You can install it using pip:

pip install ruurd-photos-ml

💻 Usage

The library is designed to be simple to use. The examples below cover each of the main features.

First, you'll need to load an image using Pillow:

from PIL import Image

# Load your image
image = Image.open("path/to/your/image.jpg")

Image Captioning

from ruurd_photos_ml import get_captioner, CaptionerProvider

# Initialize the captioner
captioner = get_captioner(CaptionerProvider.BLIP_INSTRUCT)

# Generate a simple caption
caption = captioner.caption(image)
print(f"Caption: {caption}")

# Ask a question about the image
question = "What color is the main object?"
answer = captioner.caption(image, instruction=question)
print(f"Answer: {answer}")

Facial Recognition

from ruurd_photos_ml import get_facial_recognition, FacialRecognitionProvider

# Initialize the facial recognition model
face_detector = get_facial_recognition(FacialRecognitionProvider.INSIGHT)

# Get faces from the image
faces = face_detector.get_faces(image)

for face in faces:
    print(f"Found a face at position {face.position} with confidence {face.confidence}")
    print(f"  - Age: {face.age}")
    print(f"  - Gender: {face.sex}")
    print(f"  - Embedding: {face.embedding[:5]}...")  # Showing first 5 values

Object Detection

from ruurd_photos_ml import get_object_detection, ObjectDetectionProvider

# Initialize the object detector
object_detector = get_object_detection(ObjectDetectionProvider.RESNET)

# Detect objects in the image
objects = object_detector.detect_objects(image)

for obj in objects:
    print(f"Detected '{obj.label}' with confidence {obj.confidence}")
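
Detections often need filtering before they are stored or shown. The following is a minimal sketch of thresholding on confidence; the `Detection` dataclass is a hypothetical stand-in that only mirrors the `label` and `confidence` attributes used above, and the `0.5` cutoff is an arbitrary example value.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Detection:
    """Hypothetical stand-in exposing the two attributes used in the example above."""

    label: str
    confidence: float


def confident_labels(detections: Iterable[Detection], min_confidence: float = 0.5) -> List[str]:
    """Return labels at or above the threshold, most confident first."""
    kept = [d for d in detections if d.confidence >= min_confidence]
    kept.sort(key=lambda d: d.confidence, reverse=True)
    return [d.label for d in kept]


sample = [Detection("dog", 0.62), Detection("person", 0.97), Detection("kite", 0.12)]
print(confident_labels(sample))  # ['person', 'dog']
```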

Optical Character Recognition (OCR)

from ruurd_photos_ml import get_ocr, OCRProvider

# Initialize the OCR model
ocr = get_ocr(OCRProvider.RESNET_TESSERACT)

# Check for legible text
if ocr.has_legible_text(image):
    # Extract text (specify languages for better accuracy)
    text = ocr.get_text(image, languages=("eng", "nld"))
    print(f"Extracted Text: {text}")

    # Get text with bounding boxes
    boxes = ocr.get_boxes(image, languages=("eng", "nld"))
    for box in boxes:
        print(f"Found text: '{box.text}' at position {box.position}")
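
For context on the `languages` tuple: the Tesseract command line accepts several languages in one `-l` argument joined with `+` (for example `eng+nld`), and each language needs its trained data installed. The helper below sketches that conversion; whether this package performs it the same way internally is an assumption, and `to_tesseract_lang` is a hypothetical name.

```python
from typing import Sequence


def to_tesseract_lang(languages: Sequence[str]) -> str:
    """Join language codes into the 'a+b' form Tesseract's -l option accepts."""
    if not languages:
        raise ValueError("at least one language code is required")
    return "+".join(languages)


print(to_tesseract_lang(("eng", "nld")))  # eng+nld
```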

🛠️ Development

To contribute to this project, you can set up a local development environment.

Clone the repository:

git clone https://github.com/RuurdBijlsma/ruurd-photos-ml.git
cd ruurd-photos-ml

Install dependencies using uv:
```
uv sync --all-extras --dev
```

Run tests:

uv run pytest

Run quality checks:

pre-commit run -a

🔗 Project Links

**Homepage**: https://github.com/RuurdBijlsma/ruurd-photos-ml
**Repository**: https://github.com/RuurdBijlsma/ruurd-photos-ml
**Documentation**: https://ruurdbijlsma.github.io/ruurd-photos-ml

📜 License

This project is licensed under the MIT License.

Project details

Maintainers

RuteNL

Release history

- 0.2.2: Nov 6, 2025
- 0.2.1: Oct 11, 2025
- 0.2.0: Oct 11, 2025
- 0.1.2: Oct 9, 2025 (this version)
- 0.1.1: Oct 9, 2025

Download files

Source Distribution

ruurd_photos_ml-0.1.2.tar.gz (114.2 kB)

Uploaded Oct 9, 2025 Source

Built Distribution

ruurd_photos_ml-0.1.2-py3-none-any.whl (14.6 kB)

Uploaded Oct 9, 2025 Python 3

File details

Details for the file ruurd_photos_ml-0.1.2.tar.gz.

File metadata

Download URL: ruurd_photos_ml-0.1.2.tar.gz
Upload date: Oct 9, 2025
Size: 114.2 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for ruurd_photos_ml-0.1.2.tar.gz:
- SHA256: `f7c4ec9fef6f3304e3833472777509ebde82acb5cdccae9e952560c27e908360`
- MD5: `2604feaca0a967021f4340df1f2acf86`
- BLAKE2b-256: `6154317d93d394d7fe18ec3d1c6b84e81ec0c9ed4e344f452189fc34bc7466a9`

Provenance

The following attestation bundles were made for ruurd_photos_ml-0.1.2.tar.gz:

Publisher: publish-to-pypi.yml on RuurdBijlsma/ruurd-photos-ml

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: ruurd_photos_ml-0.1.2.tar.gz
- Subject digest: f7c4ec9fef6f3304e3833472777509ebde82acb5cdccae9e952560c27e908360
- Sigstore transparency entry: 598001917
- Sigstore integration time: Oct 9, 2025
Source repository:
- Permalink: RuurdBijlsma/ruurd-photos-ml@3e7a2fcc56c1d69f9803f5efb6dedf4ebb302a78
- Branch / Tag: refs/tags/0.1.2
- Owner: https://github.com/RuurdBijlsma
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-to-pypi.yml@3e7a2fcc56c1d69f9803f5efb6dedf4ebb302a78
- Trigger Event: push

File details

Details for the file ruurd_photos_ml-0.1.2-py3-none-any.whl.

File metadata

Download URL: ruurd_photos_ml-0.1.2-py3-none-any.whl
Upload date: Oct 9, 2025
Size: 14.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for ruurd_photos_ml-0.1.2-py3-none-any.whl:
- SHA256: `0ccf5836256c5b19f978f1fe408bfa82c05a222099cad6422b1ad5d82e87c83c`
- MD5: `ecbc4806995c968131751dadc56b02b3`
- BLAKE2b-256: `08a12f06b520ffd4bec2133404fa96e219c61bffeea6043214680b2af3530030`

Provenance

The following attestation bundles were made for ruurd_photos_ml-0.1.2-py3-none-any.whl:

Publisher: publish-to-pypi.yml on RuurdBijlsma/ruurd-photos-ml

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: ruurd_photos_ml-0.1.2-py3-none-any.whl
- Subject digest: 0ccf5836256c5b19f978f1fe408bfa82c05a222099cad6422b1ad5d82e87c83c
- Sigstore transparency entry: 598001929
- Sigstore integration time: Oct 9, 2025
Source repository:
- Permalink: RuurdBijlsma/ruurd-photos-ml@3e7a2fcc56c1d69f9803f5efb6dedf4ebb302a78
- Branch / Tag: refs/tags/0.1.2
- Owner: https://github.com/RuurdBijlsma
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-to-pypi.yml@3e7a2fcc56c1d69f9803f5efb6dedf4ebb302a78
- Trigger Event: push
