A tool for programmatic and customizable generation of Raven's Progressive Matrices in the style of the RAVEN dataset

raven-gen

This repo contains a rewrite of the data-generation code originating from the CVPR paper:


RAVEN: A Dataset for Relational and Analogical Visual rEasoNing
Chi Zhang*, Feng Gao*, Baoxiong Jia, Yixin Zhu, Song-Chun Zhu
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019
(* indicates equal contribution.)

Dramatic progress has been witnessed in basic vision tasks involving low-level perception, such as object recognition, detection, and tracking. Unfortunately, there is still an enormous performance gap between artificial vision systems and human intelligence in terms of higher-level vision problems, especially ones involving reasoning. Earlier attempts in equipping machines with high-level reasoning have hovered around Visual Question Answering (VQA), one typical task associating vision and language understanding. In this work, we propose a new dataset, built in the context of Raven's Progressive Matrices (RPM) and aimed at lifting machine intelligence by associating vision with structural, relational, and analogical reasoning in a hierarchical representation. Unlike previous works in measuring abstract reasoning using RPM, we establish a semantic link between vision and reasoning by providing structure representation. This addition enables a new type of abstract reasoning by jointly operating on the structure representation. Machine reasoning ability using modern computer vision is evaluated in this newly proposed dataset. Additionally, we also provide human performance as a reference. Finally, we show consistent improvement across all models by incorporating a simple neural module that combines visual understanding and structure reasoning.

@inproceedings{zhang2019raven, 
    title={RAVEN: A Dataset for Relational and Analogical Visual rEasoNing}, 
    author={Zhang, Chi and Gao, Feng and Jia, Baoxiong and Zhu, Yixin and Zhu, Song-Chun}, 
    booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}, 
    year={2019}
}

Scientific Replication

I do not offer assurances that the code in this repository functions identically to the code in the original authors' repository, or that the performance of models on data generated with this code may be directly and fairly compared to the performance of models on the published RAVEN dataset of Zhang et al. (2019). Authors who use this code to generate data and wish to compare model performance against that of models trained with the original RAVEN dataset should offer support for the fairness of this comparison and/or replicate previous results as appropriate.

Goals

The original code used to generate the RAVEN dataset did not offer convenient abstractions for calling RPM generation routines from within other programs. The Matrix class makes it easy to generate RPM problems on demand and to customize the specification of an RPM problem. For example:

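>>> # Assumes MatrixType is importable from the same module as Matrix.
>>> from matrix import Matrix, MatrixType
>>> # BRANCH is a placeholder for one of the MatrixType members listed below;
>>> # rulesets may be None, and N_ALTERNATIVES is an integer of your choosing.
>>> rpm = Matrix.make(MatrixType.BRANCH, rulesets)
>>> rpm.make_alternatives(N_ALTERNATIVES)
>>> rpm.save("path/to/data/dir", "PUZZLENAME", background_color=255)
>>> print(rpm)
>>> print(rpm.rules)
>>> # Open in write mode; the default read mode would make f.write fail.
>>> with open("path/to/meta/dir/PUZZLENAME_rpm.txt", "w") as f:
...     f.write(str(rpm))
>>> with open("path/to/meta/dir/PUZZLENAME_rules.txt", "w") as f:
...     f.write(str(rpm.rules))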

The MatrixType enumeration includes the expected seven branches:

  • CENTER_SINGLE
  • DISTRIBUTE_FOUR
  • DISTRIBUTE_NINE
  • LEFT_CENTER_SINGLE_RIGHT_CENTER_SINGLE
  • UP_CENTER_SINGLE_DOWN_CENTER_SINGLE
  • IN_CENTER_SINGLE_OUT_CENTER_SINGLE
  • IN_DISTRIBUTE_FOUR_OUT_CENTER_SINGLE

The rulesets parameter of Matrix.make may be None or may be omitted. Custom rulesets of type List[List[Tuple[RuleType, AttributeType]]] are expected in the following format:

[[..., (RuleType.*, AttributeType.{NUMBER,POSITION,CONFIGURATION}), ...],
 [..., (RuleType.{CONSTANT, PROGRESSION, DISTRIBUTE_THREE}, AttributeType.TYPE)],
 [..., (RuleType.*, AttributeType.SIZE)],
 [..., (RuleType.*, AttributeType.COLOR)]].

where no inner list is empty.
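
The shape described above can be checked before a custom ruleset is passed to Matrix.make. The following is a minimal sketch, not part of this package: the RuleType and AttributeType classes are hypothetical stand-ins that carry only the member names mentioned in this README, check_rulesets is an illustrative helper name, and the format is read as requiring exactly four inner lists in the order shown.

```python
from enum import Enum, auto
from typing import List, Tuple


# Hypothetical stand-ins for the library's enumerations, so the sketch runs
# without the package installed. Only member names mentioned in this README
# are included; the real RuleType may define more members.
class RuleType(Enum):
    CONSTANT = auto()
    PROGRESSION = auto()
    DISTRIBUTE_THREE = auto()


class AttributeType(Enum):
    NUMBER = auto()
    POSITION = auto()
    CONFIGURATION = auto()
    TYPE = auto()
    SIZE = auto()
    COLOR = auto()


Rulesets = List[List[Tuple[RuleType, AttributeType]]]

# Attributes permitted in each of the four inner lists, in order.
ALLOWED_ATTRIBUTES = [
    {AttributeType.NUMBER, AttributeType.POSITION, AttributeType.CONFIGURATION},
    {AttributeType.TYPE},
    {AttributeType.SIZE},
    {AttributeType.COLOR},
]
# Rules the documented format permits for AttributeType.TYPE.
TYPE_RULES = {RuleType.CONSTANT, RuleType.PROGRESSION, RuleType.DISTRIBUTE_THREE}


def check_rulesets(rulesets: Rulesets) -> None:
    """Raise ValueError if rulesets does not match the documented shape."""
    if len(rulesets) != len(ALLOWED_ATTRIBUTES):
        raise ValueError("expected exactly four inner lists")
    for index, (inner, allowed) in enumerate(zip(rulesets, ALLOWED_ATTRIBUTES)):
        if not inner:
            raise ValueError(f"inner list {index} is empty")
        for rule, attribute in inner:
            if attribute not in allowed:
                raise ValueError(f"{attribute} not allowed in inner list {index}")
            if attribute is AttributeType.TYPE and rule not in TYPE_RULES:
                raise ValueError(f"{rule} not allowed for AttributeType.TYPE")


example: Rulesets = [
    [(RuleType.PROGRESSION, AttributeType.NUMBER),
     (RuleType.CONSTANT, AttributeType.POSITION)],
    [(RuleType.DISTRIBUTE_THREE, AttributeType.TYPE)],
    [(RuleType.CONSTANT, AttributeType.SIZE)],
    [(RuleType.PROGRESSION, AttributeType.COLOR)],
]
check_rulesets(example)  # returns without raising
```

With the package installed, the same list literal would presumably be built from the library's own enumerations and passed as the rulesets argument.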

To generate at most N_ALTERNATIVES wrong answers (fewer if no further modifications are possible), call Matrix.make_alternatives. Matrix.save may be called with or without a prior call to Matrix.make_alternatives. Matrix.make_alternatives overwrites the results of previous calls and may be called any number of times.

The call to Matrix.save results in PUZZLENAME_answer.png and PUZZLENAME_alternative_{i}.png files being created in the specified directory. These images show a completed Raven's progressive matrix with either the correct answer or the ith alternative (incorrect) answer as the ninth (bottom-right) panel. This data format is intended for a binary prediction task rather than the task of picking the right completion out of a lineup. As such, we do not implement the alternative sampling mechanisms of the balanced RAVEN dataset or similar improvement efforts.
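
For the binary prediction task, the saved images can be paired with labels by file name alone. The sketch below is an illustration under stated assumptions rather than part of this package: label_puzzle_images is a hypothetical helper, and it relies only on the PUZZLENAME_answer.png and PUZZLENAME_alternative_{i}.png naming described above. It globs for alternatives instead of counting them, since fewer than N_ALTERNATIVES may have been written.

```python
from pathlib import Path
from typing import List, Tuple


def label_puzzle_images(data_dir: str, puzzle_name: str) -> List[Tuple[Path, int]]:
    """Pair each image saved for one puzzle with a binary label.

    Label 1 marks the correctly completed matrix, label 0 an alternative.
    """
    root = Path(data_dir)
    labeled: List[Tuple[Path, int]] = []
    answer = root / f"{puzzle_name}_answer.png"
    if answer.exists():
        labeled.append((answer, 1))
    # Glob rather than iterate over a fixed range: the number of alternatives
    # on disk may be smaller than the number requested.
    for path in sorted(root.glob(f"{puzzle_name}_alternative_*.png")):
        labeled.append((path, 0))
    return labeled
```

Calling label_puzzle_images("path/to/data/dir", "PUZZLENAME") after Matrix.save would then yield one positive example followed by the available negatives.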

Serialization and Deserialization

The Matrix object does not currently offer full serialization and deserialization methods to/from human-readable formats. To generate a human-readable, but not machine-readable, report on a particular Matrix instance, one may call str on the instance or its rules attribute. The resulting outputs provide a full view of the patterns expressed in a puzzle, though an understanding of the relevant implementations may be necessary to understand all of their conventions. These outputs cannot be used to regain the originating Matrix object if it has not been otherwise saved; please pickle your Matrix objects if you anticipate inspecting them programmatically beyond the lifetime of the program/interpreter session that originated them.
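
A minimal sketch of the pickling suggested above, using only the standard library. The helper names save_object and load_object are hypothetical, and the round trip has not been verified here against actual Matrix instances; it assumes they pickle like ordinary Python objects.

```python
import pickle
from pathlib import Path
from typing import Any


def save_object(obj: Any, path: str) -> None:
    """Write obj to path in pickle format."""
    with Path(path).open("wb") as f:
        pickle.dump(obj, f)


def load_object(path: str) -> Any:
    """Read back an object written by save_object."""
    # Only unpickle files you created yourself: loading a pickle can
    # execute arbitrary code.
    with Path(path).open("rb") as f:
        return pickle.load(f)
```

For example, save_object(rpm, "path/to/meta/dir/PUZZLENAME.pkl") in one session and load_object on the same path in a later one. Loading requires that the class definitions used by the pickled object are importable in the loading session.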