A PyPI package for object detection using advanced vision models

Spatial Reasoning

A Python package for object detection using vision and reasoning models, including OpenAI's models and Google's Gemini.

Figure: Example results. Comparison of detection results across different models, showing the stronger performance of the advanced reasoning model.

Features

Multiple Detection Models:
- Advanced Reasoning Model (OpenAI) - Reasoning model that leverages tools and other foundation models to perform object detection
- Vanilla Reasoning Model - Directly using a reasoning model to perform object detection
- Vision Model - GroundingDino + SAM
- Gemini Model (Google) - Fine-tuned LMM for object detection
Tool-Use Reasoning: The advanced model uses grid-based reasoning for precise object detection
Figure: How the advanced reasoning model works under the hood, using grid cells for precise localization.
Simple API: One function for all your detection needs
CLI Support: Command-line interface for quick testing
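To make the idea of grid-based localization concrete, here is a minimal sketch of how a set of selected grid cells could be turned into a pixel bounding box. This is only an illustration of the general idea, not the package's implementation: the function name `cells_to_bbox`, the zero-based `(row, col)` cell convention, and the `(x_min, y_min, x_max, y_max)` output format are all assumptions made for this example.

```python
from typing import Iterable, Tuple

BBox = Tuple[float, float, float, float]


def cells_to_bbox(
    cells: Iterable[Tuple[int, int]],
    image_size: Tuple[int, int],
    grid_size: Tuple[int, int],
) -> BBox:
    """Return the pixel box (x_min, y_min, x_max, y_max) covering the given cells.

    Hypothetical helper: cells are zero-based (row, col) pairs, image_size is
    (width, height) in pixels, and grid_size is (rows, cols).
    """
    cell_list = list(cells)
    if not cell_list:
        raise ValueError("at least one grid cell is required")

    width, height = image_size
    rows, cols = grid_size
    cell_w = width / cols   # width of one grid cell in pixels
    cell_h = height / rows  # height of one grid cell in pixels

    min_row = min(r for r, _ in cell_list)
    max_row = max(r for r, _ in cell_list)
    min_col = min(c for _, c in cell_list)
    max_col = max(c for _, c in cell_list)

    # The box spans from the top-left corner of the first cell
    # to the bottom-right corner of the last cell.
    return (min_col * cell_w, min_row * cell_h,
            (max_col + 1) * cell_w, (max_row + 1) * cell_h)
```

A coarse grid gives a rough box; repeating the same step on a crop of the selected region would narrow it further, which is one plausible reading of "precise localization" here.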

Installation

pip install spatial-reasoning

Or install from source:

git clone https://github.com/QasimWani/spatial-reasoning.git
cd spatial-reasoning
pip install -e .

Optional: Flash Attention (for better performance)

For improved performance with transformer models, you can optionally install Flash Attention:

pip install flash-attn --no-build-isolation

Note: Flash Attention requires CUDA development tools and must be compiled for your specific PyTorch/CUDA version. The package will work without it, just with slightly reduced performance.
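Because Flash Attention is optional, it can be useful to check whether it is available in the current environment. The sketch below uses only the standard library; the helper name `is_installed` is made up for this example, and `flash_attn` is the import name of the `flash-attn` distribution.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("flash_attn"):
    print("Flash Attention is available.")
else:
    print("Flash Attention not found; continuing without it.")
```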

Setup

Create a .env file in your project root:

# .env
OPENAI_API_KEY=your-openai-api-key-here
GEMINI_API_KEY=your-google-gemini-api-key-here

Get your API keys:

OpenAI: https://platform.openai.com/api-keys
Gemini: https://makersuite.google.com/app/apikey
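Before running a detection, it can help to confirm that the keys in .env are actually set. The following is a minimal standard-library sketch, not part of the package: `parse_env_text` and `missing_api_keys` are hypothetical helpers, and treating both keys as required is an assumption (you may only need the key for the model you use).

```python
from typing import Dict, Iterable, List

# Key names taken from the .env example above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "GEMINI_API_KEY")


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_api_keys(
    env: Dict[str, str], required: Iterable[str] = REQUIRED_KEYS
) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not env.get(key)]
```

For example, `missing_api_keys(parse_env_text(open(".env").read()))` returns an empty list when both keys are filled in.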

Quick Start

Python API

from spatial_reasoning import detect

# Detect objects in an image
result = detect(
    image_path="https://ix-cdn.b2e5.com/images/27094/27094_3063d356a3a54cc3859537fd23c5ba9d_1539205710.jpeg",  # URL or local file path
    object_of_interest="farthest scooter in the image",
    task_type="advanced_reasoning_model"
)

# Access results
bboxes = result['bboxs']
visualized_image = result['visualized_image']
print(f"Found {len(bboxes)} objects")

# Save the result
visualized_image.save("output.jpg")

Command Line

# Basic usage
spatial-reasoning --image-path "image.jpg" --object-of-interest "person"  # "advanced_reasoning_model" used by default

# With specific model
spatial-reasoning --image-path "image.jpg" --object-of-interest "cat" --task-type "gemini"

# From URL with custom parameters
spatial-reasoning \
  --image-path "https://example.com/image.jpg" \
  --object-of-interest "text in image" \
  --task-type "advanced_reasoning_model" \
  --task-kwargs '{"nms_threshold": 0.7}'
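When calling the CLI from a script, the JSON passed to --task-kwargs is easy to mis-quote by hand. The sketch below builds the argument list programmatically using only the flags shown above; `build_cli_args` is a hypothetical helper, not something the package provides.

```python
import json
import shlex
from typing import Any, Dict, List, Optional


def build_cli_args(
    image_path: str,
    object_of_interest: str,
    task_type: Optional[str] = None,
    task_kwargs: Optional[Dict[str, Any]] = None,
) -> List[str]:
    """Build an argument list for the spatial-reasoning command."""
    args = [
        "spatial-reasoning",
        "--image-path", image_path,
        "--object-of-interest", object_of_interest,
    ]
    if task_type is not None:
        args += ["--task-type", task_type]
    if task_kwargs:
        # Serialize to JSON so the value matches the form used in the example above.
        args += ["--task-kwargs", json.dumps(task_kwargs)]
    return args


args = build_cli_args(
    "image.jpg", "text in image",
    task_type="advanced_reasoning_model",
    task_kwargs={"nms_threshold": 0.7},
)
# shlex.join produces a shell-safe string for logging or copy-pasting.
print(shlex.join(args))
```

The list can be passed directly to `subprocess.run(args, check=True)`, which avoids shell quoting altogether.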

Available Models

advanced_reasoning_model (default) - Best accuracy, uses tool-use reasoning
vanilla_reasoning_model - Faster, standard detection
vision_model - Uses GroundingDino + (optional) SAM2 for segmentation
gemini - Google's Gemini model
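To compare the models on the same image, one option is to loop over the task types listed above and count the boxes each returns. This is a sketch under assumptions: `count_detections` is a hypothetical helper, the detection function is passed in as an argument, and the result is assumed to expose the 'bboxs' key as in the Quick Start example.

```python
from typing import Any, Callable, Dict, Iterable

# Task type names as listed under "Available Models".
TASK_TYPES = (
    "advanced_reasoning_model",
    "vanilla_reasoning_model",
    "vision_model",
    "gemini",
)


def count_detections(
    detect_fn: Callable[..., Dict[str, Any]],
    image_path: str,
    object_of_interest: str,
    task_types: Iterable[str] = TASK_TYPES,
) -> Dict[str, int]:
    """Run detect_fn once per task type and return the number of boxes found."""
    counts: Dict[str, int] = {}
    for task_type in task_types:
        result = detect_fn(
            image_path=image_path,
            object_of_interest=object_of_interest,
            task_type=task_type,
        )
        counts[task_type] = len(result["bboxs"])
    return counts
```

With the package installed, this would be called as `count_detections(detect, "image.jpg", "person")` after `from spatial_reasoning import detect`. Each call may use paid API requests, so restrict `task_types` to the models you want to compare.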

License

MIT License

Release history

0.2.1 - Aug 19, 2025
0.2.0 - Aug 19, 2025
0.1.9 - Aug 4, 2025
0.1.8 - Aug 4, 2025
0.1.7 - Aug 3, 2025
0.1.6 - Aug 3, 2025
0.1.5 - Aug 3, 2025 (this version)
0.1.4 - Aug 3, 2025
0.1.3 - Aug 2, 2025
0.1.2 - Jul 29, 2025
0.1.1 - Jul 29, 2025

Download files

Source Distribution

spatial_reasoning-0.1.5.tar.gz (38.2 kB view details)

Uploaded Aug 3, 2025 Source

Built Distribution

spatial_reasoning-0.1.5-py3-none-any.whl (47.0 kB view details)

Uploaded Aug 3, 2025 Python 3

File details

Details for the file spatial_reasoning-0.1.5.tar.gz.

File metadata

Download URL: spatial_reasoning-0.1.5.tar.gz
Upload date: Aug 3, 2025
Size: 38.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.16

File hashes

Hashes for spatial_reasoning-0.1.5.tar.gz
Algorithm	Hash digest
SHA256	`2114c728e00032c475c7f323dda2075404f68eca84f7586617dc7de3108a5d26`
MD5	`79d714cee865384af7eeb2dd901315d7`
BLAKE2b-256	`82ae62393deefa8c1f948d22fd07756352f4abd8efc86ceeaf2c8da3740625df`

File details

Details for the file spatial_reasoning-0.1.5-py3-none-any.whl.

File metadata

Download URL: spatial_reasoning-0.1.5-py3-none-any.whl
Upload date: Aug 3, 2025
Size: 47.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.10.16

File hashes

Hashes for spatial_reasoning-0.1.5-py3-none-any.whl
Algorithm	Hash digest
SHA256	`0f5f63073f81e7d463872ad194f1ba04b8d373b3f53788243752f03973b2e967`
MD5	`fc4a2ede007fc787f1e11bff93959b40`
BLAKE2b-256	`e6412f9db5dd863dec6b9d5048af7790d41543d5adac5d2cf8313b06a3d64a06`
