Oculus Vision-Language Model - Inference SDK for multimodal AI research

Oceanir

Oculus Vision-Language Model SDK for multimodal AI research.

Installation

pip install oceanir

Quick Start

from oceanir import Oculus

# Load the model
model = Oculus.from_pretrained("OceanirAI/Oculus-0.1-Instruct")

# Visual Question Answering
answer = model.ask("photo.jpg", "What is the person doing?")
print(answer)  # "The person is riding a bicycle."

# Image Captioning
caption = model.caption("photo.jpg")
print(caption)  # "A dog playing in the park with a frisbee."

# Object Detection
results = model.detect("photo.jpg")
for box, label, conf in zip(results['boxes'], results['labels'], results['confidences']):
    print(f"{label}: {conf:.2f} at {box}")

# Counting Objects
count = model.count("crowd.jpg", "people")
print(f"Found {count} people")
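
The detection loop above suggests that `detect` returns a dict of parallel lists under `boxes`, `labels`, and `confidences`. Assuming that layout holds (the box coordinate format is not documented here), a small helper could drop low-confidence detections before further processing. The name `filter_detections`, the 0.5 threshold, and the sample values are illustrative and not part of the SDK.

```python
def filter_detections(results, min_conf=0.5):
    """Keep only detections whose confidence is at least min_conf.

    Assumes `results` is a dict of parallel lists keyed by 'boxes',
    'labels', and 'confidences', as in the Quick Start loop.
    """
    kept = [
        (box, label, conf)
        for box, label, conf in zip(
            results["boxes"], results["labels"], results["confidences"]
        )
        if conf >= min_conf
    ]
    return {
        "boxes": [box for box, _, _ in kept],
        "labels": [label for _, label, _ in kept],
        "confidences": [conf for _, _, conf in kept],
    }


# Hand-written stand-in for model.detect("photo.jpg"); the values are made up.
sample = {
    "boxes": [[10, 20, 110, 220], [5, 5, 50, 50]],
    "labels": ["dog", "frisbee"],
    "confidences": [0.91, 0.32],
}
filtered = filter_detections(sample, min_conf=0.5)
print(filtered["labels"])
```

With a real model, the same helper would presumably be called as `filter_detections(model.detect("photo.jpg"))`.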

Models

Model                             Description
OceanirAI/Oculus-0.1-Instruct     Instruction-tuned for general VQA and captioning
OceanirAI/Oculus-0.1-Reasoning    Enhanced with chain-of-thought reasoning
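
One way to choose between the two checkpoints in code is a small lookup keyed by task type. This is only a sketch: the model identifiers come from the list above, while `MODELS` and `pick_model_id` are hypothetical names introduced for illustration.

```python
# Model identifiers copied from the Models list above.
MODELS = {
    "instruct": "OceanirAI/Oculus-0.1-Instruct",
    "reasoning": "OceanirAI/Oculus-0.1-Reasoning",
}


def pick_model_id(needs_reasoning):
    """Return the reasoning checkpoint when requested, else the instruct one."""
    return MODELS["reasoning" if needs_reasoning else "instruct"]


model_id = pick_model_id(needs_reasoning=True)
print(model_id)
```

The resulting string would then be passed to `Oculus.from_pretrained(model_id)` as in the Quick Start.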

Reasoning Mode

Pass think=True to model.ask to enable thinking traces for complex questions:

# With reasoning
answer = model.ask(
    "complex_scene.jpg",
    "How many red cars are parked on the left side?",
    think=True
)
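
If reasoning should only be switched on for some questions, a thin wrapper could set `think` from a simple keyword heuristic. The sketch below assumes `ask` accepts `think` as a boolean keyword, as in the call above; `REASONING_CUES`, `should_think`, `ask_auto`, and `FakeModel` are hypothetical names, and `FakeModel` exists only so the example runs without the SDK installed.

```python
# Illustrative cue phrases; tune these for your own questions.
REASONING_CUES = ("how many", "why", "compare")


def should_think(question):
    """Return True when the question contains a cue that suggests multi-step reasoning."""
    q = question.lower()
    return any(cue in q for cue in REASONING_CUES)


def ask_auto(model, image_path, question):
    """Call model.ask, enabling think only for questions that match a cue."""
    return model.ask(image_path, question, think=should_think(question))


class FakeModel:
    """Stand-in for a loaded model; it only reports the think flag it received."""

    def ask(self, image_path, question, think=False):
        return f"think={think}"


hard = ask_auto(
    FakeModel(),
    "complex_scene.jpg",
    "How many red cars are parked on the left side?",
)
easy = ask_auto(FakeModel(), "photo.jpg", "What is the person doing?")
print(hard)
print(easy)
```

A keyword list is a crude trigger; it is used here only to keep the sketch short.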

Features

  • Visual Question Answering (VQA) - Answer questions about images
  • Image Captioning - Generate natural language descriptions
  • Object Detection - Detect and localize objects with bounding boxes
  • Object Counting - Count specific objects in images
  • Semantic Segmentation - Pixel-level scene understanding
  • Chain-of-Thought Reasoning - Step-by-step reasoning for complex tasks

Architecture

Oculus combines:

  • DINOv2 - Self-supervised vision transformer for semantic understanding
  • SigLIP - Vision-language alignment for text understanding
  • Trained Projector - Maps vision features to language space
  • BLIP - Language model for text generation
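
The list above does not say how the two vision encoders are combined. One common pattern in vision-language models, shown here purely as an assumption, is to concatenate per-patch features from both encoders and apply a learned linear projection into the language model's embedding space. The sketch uses toy numbers and plain Python lists; none of the names or shapes below come from the Oculus code.

```python
def concat_features(a, b):
    """Concatenate two per-patch feature lists along the feature axis."""
    return [fa + fb for fa, fb in zip(a, b)]


def linear_project(features, weight, bias):
    """Apply y = W x + b to each patch vector."""
    return [
        [sum(w * x for w, x in zip(row, vec)) + b_i for row, b_i in zip(weight, bias)]
        for vec in features
    ]


# Toy inputs: 2 patches, 2 dims from one encoder and 1 dim from the other.
enc_a = [[1.0, 2.0], [3.0, 4.0]]
enc_b = [[0.5], [1.5]]
fused = concat_features(enc_a, enc_b)  # 2 patches x 3 dims

# Toy projector mapping 3 fused dims to a 2-dim "language" space.
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 2.0]]
b = [0.0, 1.0]
tokens = linear_project(fused, W, b)
print(tokens)
```

In a real model the projector weights are learned and the projected vectors are fed to the text decoder as prefix tokens; the hand-picked matrix here only makes the arithmetic easy to follow.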

License

This software is released under the Oceanir Research License.

Permitted Uses:

  • Academic research
  • Educational purposes
  • Publishing papers with results
  • Personal experimentation

Prohibited Uses:

  • Commercial applications
  • Training commercial models
  • Integration into commercial products

For commercial licensing, contact: licensing@oceanir.ai

Citation

If you use Oceanir in your research, please cite:

@software{oculus2026,
  title={Oculus Vision-Language Model},
  author={OceanirAI},
  year={2026},
  url={https://github.com/OceanirAI/oceanir}
}
