A tool for programmatic and customizable generation of Raven's Progressive Matrices in the style of the RAVEN dataset

raven-gen

This repo contains a rewrite of the data-generation code originating from the CVPR paper:


RAVEN: A Dataset for Relational and Analogical Visual rEasoNing
Chi Zhang*, Feng Gao*, Baoxiong Jia, Yixin Zhu, Song-Chun Zhu
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019
(* indicates equal contribution.)

Dramatic progress has been witnessed in basic vision tasks involving low-level perception, such as object recognition, detection, and tracking. Unfortunately, there is still an enormous performance gap between artificial vision systems and human intelligence in terms of higher-level vision problems, especially ones involving reasoning. Earlier attempts in equipping machines with high-level reasoning have hovered around Visual Question Answering (VQA), one typical task associating vision and language understanding. In this work, we propose a new dataset, built in the context of Raven's Progressive Matrices (RPM) and aimed at lifting machine intelligence by associating vision with structural, relational, and analogical reasoning in a hierarchical representation. Unlike previous works in measuring abstract reasoning using RPM, we establish a semantic link between vision and reasoning by providing structure representation. This addition enables a new type of abstract reasoning by jointly operating on the structure representation. Machine reasoning ability using modern computer vision is evaluated in this newly proposed dataset. Additionally, we also provide human performance as a reference. Finally, we show consistent improvement across all models by incorporating a simple neural module that combines visual understanding and structure reasoning.

@inproceedings{zhang2019raven, 
    title={RAVEN: A Dataset for Relational and Analogical Visual rEasoNing}, 
    author={Zhang, Chi and Gao, Feng and Jia, Baoxiong and Zhu, Yixin and Zhu, Song-Chun}, 
    booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}, 
    year={2019}
}

Scientific Replication

I do not offer assurances that the code in this repository functions identically to the code in the original authors' repository, or that the performance of models on data generated with this code may be directly and fairly compared to the performance of models on the published RAVEN dataset of Zhang et al. (2019). Authors who use this code to generate data and wish to compare model performance against that of models trained with the original RAVEN dataset should offer support for the fairness of this comparison and/or replicate previous results as appropriate.

Goals

The original code used to generate the RAVEN dataset did not offer convenient abstractions for calling RPM generation routines from within other programs. The Matrix class makes it easy to generate RPM problems on demand and to customize the specification of an RPM problem. For example:

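>>> # MatrixType is assumed to be importable from the same module as Matrix.
>>> from matrix import Matrix, MatrixType
>>> # BRANCH, rulesets, and N_ALTERNATIVES are placeholders for your own values.
>>> rpm = Matrix.make(MatrixType.BRANCH, rulesets)
>>> rpm.make_alternatives(N_ALTERNATIVES)
>>> rpm.save("path/to/data/dir", "PUZZLENAME")
>>> print(rpm)
>>> print(rpm.rules)
>>> # Open in write mode; the default read mode would make f.write fail.
>>> with open("path/to/meta/dir/PUZZLENAME_rpm.txt", "w") as f:
...     f.write(str(rpm))
>>> with open("path/to/meta/dir/PUZZLENAME_rules.txt", "w") as f:
...     f.write(str(rpm.rules))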

The MatrixType enumeration includes the expected seven branches:

  • CENTER_SINGLE
  • DISTRIBUTE_FOUR
  • DISTRIBUTE_NINE
  • LEFT_CENTER_SINGLE_RIGHT_CENTER_SINGLE
  • UP_CENTER_SINGLE_DOWN_CENTER_SINGLE
  • IN_CENTER_SINGLE_OUT_CENTER_SINGLE
  • IN_DISTRIBUTE_FOUR_OUT_CENTER_SINGLE

The rulesets parameter of Matrix.make may be None or may be omitted. Custom rulesets of type List[List[Tuple[RuleType, AttributeType]]] are expected in the following format:

[[..., (RuleType.*, AttributeType.{NUMBER,POSITION,CONFIGURATION}), ...],
 [..., (RuleType.{CONSTANT, PROGRESSION, DISTRIBUTE_THREE}, AttributeType.TYPE)],
 [..., (RuleType.*, AttributeType.SIZE)],
 [..., (RuleType.*, AttributeType.COLOR)]]

where no inner list is empty.
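
As a rough illustration of the format above, the sketch below checks a candidate rulesets value before it is passed to Matrix.make. The RuleType and AttributeType enums defined here are stand-ins that only mirror the member names mentioned in this README; they are not the package's own classes. The ARITHMETIC member is an assumption taken from the rule set of the RAVEN paper, and check_rulesets is a hypothetical helper, not part of this package.

```python
from enum import Enum, auto
from typing import List, Tuple


class RuleType(Enum):
    """Stand-in for the package's RuleType (names only)."""
    CONSTANT = auto()
    PROGRESSION = auto()
    ARITHMETIC = auto()  # assumed member, taken from the RAVEN paper
    DISTRIBUTE_THREE = auto()


class AttributeType(Enum):
    """Stand-in for the package's AttributeType (names only)."""
    NUMBER = auto()
    POSITION = auto()
    CONFIGURATION = auto()
    TYPE = auto()
    SIZE = auto()
    COLOR = auto()


Rulesets = List[List[Tuple[RuleType, AttributeType]]]

# Attributes permitted in each of the four inner lists, in order.
ALLOWED_ATTRIBUTES = [
    {AttributeType.NUMBER, AttributeType.POSITION, AttributeType.CONFIGURATION},
    {AttributeType.TYPE},
    {AttributeType.SIZE},
    {AttributeType.COLOR},
]

# Rules the format above permits for AttributeType.TYPE.
TYPE_RULES = {RuleType.CONSTANT, RuleType.PROGRESSION, RuleType.DISTRIBUTE_THREE}


def check_rulesets(rulesets: Rulesets) -> None:
    """Raise ValueError if rulesets does not follow the documented layout."""
    if len(rulesets) != len(ALLOWED_ATTRIBUTES):
        raise ValueError(f"expected {len(ALLOWED_ATTRIBUTES)} inner lists")
    for index, (group, allowed) in enumerate(zip(rulesets, ALLOWED_ATTRIBUTES)):
        if not group:
            raise ValueError(f"inner list {index} is empty")
        for rule, attribute in group:
            if attribute not in allowed:
                raise ValueError(f"{attribute.name} not allowed in inner list {index}")
            if attribute is AttributeType.TYPE and rule not in TYPE_RULES:
                raise ValueError(f"{rule.name} not allowed for TYPE")


example = [
    [(RuleType.PROGRESSION, AttributeType.NUMBER)],
    [(RuleType.DISTRIBUTE_THREE, AttributeType.TYPE)],
    [(RuleType.CONSTANT, AttributeType.SIZE)],
    [(RuleType.ARITHMETIC, AttributeType.COLOR)],
]
check_rulesets(example)  # passes silently
```

With the package's real enums substituted for the stand-ins, a list shaped like example is the kind of value the rulesets parameter expects.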

Matrix.make_alternatives must be called to generate wrong answers. It produces at most N_ALTERNATIVES of them (fewer if no further modifications are possible). Matrix.save may be called with or without a prior call to Matrix.make_alternatives. Matrix.make_alternatives overwrites the results of previous calls and may be called any number of times.

The call to Matrix.save results in PUZZLENAME_answer.png and PUZZLENAME_alternative_{i}.png files being created in the specified directory. These images show a completed Raven's progressive matrix with either the correct answer or the ith alternative (incorrect) answer as the ninth (bottom-right) panel. This data format is intended for a binary prediction task rather than the task of picking the right completion out of a lineup. As such, we do not implement the alternative sampling mechanisms of the balanced RAVEN dataset or similar improvement efforts.
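
For the binary prediction task described above, the saved images can be paired with labels by file name alone. The following is a minimal sketch that relies only on the naming scheme stated in this README; labelled_images is a hypothetical helper, not part of this package, and it makes no assumption about which index the alternatives start from.

```python
from pathlib import Path
from typing import List, Tuple


def labelled_images(data_dir: str, puzzle_name: str) -> List[Tuple[Path, int]]:
    """Return (image path, label) pairs: 1 for the answer, 0 for alternatives."""
    root = Path(data_dir)
    pairs: List[Tuple[Path, int]] = []

    # The completed matrix with the correct ninth panel.
    answer = root / f"{puzzle_name}_answer.png"
    if answer.exists():
        pairs.append((answer, 1))

    # Completed matrices with an incorrect ninth panel, in sorted name order.
    for alternative in sorted(root.glob(f"{puzzle_name}_alternative_*.png")):
        pairs.append((alternative, 0))

    return pairs
```

Calling labelled_images("path/to/data/dir", "PUZZLENAME") after Matrix.save would then give one positive example and up to N_ALTERNATIVES negative examples for that puzzle.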

Serialization and Deserialization

The Matrix object does not currently offer full serialization and deserialization methods to/from human-readable formats. To generate a human-readable, but not machine-readable, report on a particular Matrix instance, one may call str on the instance or its rules attribute. The resulting outputs provide a full view of the patterns expressed in a puzzle, though an understanding of the relevant implementations may be necessary to understand all of their conventions. These outputs cannot be used to regain the originating Matrix object if it has not been otherwise saved; please pickle your Matrix objects if you anticipate inspecting them programmatically beyond the lifetime of the program/interpreter session that originated them.
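
A minimal sketch of the pickling advice above, using only the standard library. The helper names save_pickle and load_pickle are hypothetical and not part of this package; they work on any picklable object, so a plain dict is used in the demonstration in place of a Matrix instance.

```python
import pickle
from pathlib import Path
from typing import Any


def save_pickle(obj: Any, path: str) -> None:
    """Write obj to path in pickle format, creating parent directories."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("wb") as f:  # pickle data is binary
        pickle.dump(obj, f)


def load_pickle(path: str) -> Any:
    """Read back an object written by save_pickle."""
    with Path(path).open("rb") as f:
        return pickle.load(f)
```

For a puzzle, this would look like save_pickle(rpm, "path/to/meta/dir/PUZZLENAME.pkl") in the generating session and rpm = load_pickle("path/to/meta/dir/PUZZLENAME.pkl") in a later one. Unpickling requires that the same class definitions be importable in the later session, and pickle files should only be loaded from sources you trust.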
