
A tool for programmatic and customizable generation of Raven's Progressive Matrices in the style of the RAVEN dataset


raven-gen

This repo contains a rewrite of the data-generation code originating from the CVPR paper:


RAVEN: A Dataset for Relational and Analogical Visual rEasoNing
Chi Zhang*, Feng Gao*, Baoxiong Jia, Yixin Zhu, Song-Chun Zhu
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019
(* indicates equal contribution.)

Dramatic progress has been witnessed in basic vision tasks involving low-level perception, such as object recognition, detection, and tracking. Unfortunately, there is still an enormous performance gap between artificial vision systems and human intelligence in terms of higher-level vision problems, especially ones involving reasoning. Earlier attempts in equipping machines with high-level reasoning have hovered around Visual Question Answering (VQA), one typical task associating vision and language understanding. In this work, we propose a new dataset, built in the context of Raven's Progressive Matrices (RPM) and aimed at lifting machine intelligence by associating vision with structural, relational, and analogical reasoning in a hierarchical representation. Unlike previous works in measuring abstract reasoning using RPM, we establish a semantic link between vision and reasoning by providing structure representation. This addition enables a new type of abstract reasoning by jointly operating on the structure representation. Machine reasoning ability using modern computer vision is evaluated in this newly proposed dataset. Additionally, we also provide human performance as a reference. Finally, we show consistent improvement across all models by incorporating a simple neural module that combines visual understanding and structure reasoning.

@inproceedings{zhang2019raven, 
    title={RAVEN: A Dataset for Relational and Analogical Visual rEasoNing}, 
    author={Zhang, Chi and Gao, Feng and Jia, Baoxiong and Zhu, Yixin and Zhu, Song-Chun}, 
    booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}, 
    year={2019}
}

Scientific Replication

I do not offer assurances that the code in this repository functions identically to the code in the original authors' repository, or that the performance of models on data generated with this code may be directly and fairly compared to the performance of models on the published RAVEN dataset of Zhang et al. (2019). Authors who use this code to generate data and wish to compare model performance against that of models trained with the original RAVEN dataset should offer support for the fairness of this comparison and/or replicate previous results as appropriate.

Goals

The original code used to generate the RAVEN dataset did not offer convenient abstractions for calling RPM generation routines from within other programs. The Matrix class makes it easy to generate RPM problems on demand and to customize the specification of an RPM problem. For example:

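>>> from matrix import Matrix, MatrixType  # import location of MatrixType is assumed
>>> rulesets = None  # or a custom list in the format described below
>>> N_ALTERNATIVES = 8  # illustrative value
>>> rpm = Matrix.make(MatrixType.BRANCH, rulesets)  # BRANCH stands for any member listed below
>>> rpm.make_alternatives(N_ALTERNATIVES)
>>> rpm.save("path/to/data/dir", "PUZZLENAME")
>>> print(rpm)
>>> print(rpm.rules)
>>> # Open in write mode; the default read mode would make f.write fail.
>>> with open("path/to/meta/dir/PUZZLENAME_rpm.txt", "w") as f:
...     f.write(str(rpm))
>>> with open("path/to/meta/dir/PUZZLENAME_rules.txt", "w") as f:
...     f.write(str(rpm.rules))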

The MatrixType enumeration includes the expected seven branches:

  • CENTER_SINGLE
  • DISTRIBUTE_FOUR
  • DISTRIBUTE_NINE
  • LEFT_CENTER_SINGLE_RIGHT_CENTER_SINGLE
  • UP_CENTER_SINGLE_DOWN_CENTER_SINGLE
  • IN_CENTER_SINGLE_OUT_CENTER_SINGLE
  • IN_DISTRIBUTE_FOUR_OUT_CENTER_SINGLE

The rulesets parameter of Matrix.make may be None or may be omitted. Custom rulesets of type List[List[Tuple[RuleType, AttributeType]]] are expected in the following format:

[[..., (RuleType.*, AttributeType.{NUMBER,POSITION,CONFIGURATION}), ...],
 [..., (RuleType.{CONSTANT, PROGRESSION, DISTRIBUTE_THREE}, AttributeType.TYPE)],
 [..., (RuleType.*, AttributeType.SIZE)],
 [..., (RuleType.*, AttributeType.COLOR)]].

where no inner list is empty.
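
As a rough illustration of the layout above, the following sketch builds one custom rulesets value and checks it against the stated constraints. The RuleType and AttributeType enums defined here are stand-ins so that the snippet runs on its own; in real use they would be imported from the package. The ARITHMETIC member and the check_rulesets helper are assumptions made for this example and are not part of the package.

```python
from enum import Enum, auto
from typing import List, Tuple


# Stand-in enums for illustration only; import the real ones from the package.
# Member names not quoted in the text (e.g. ARITHMETIC) are assumptions.
class RuleType(Enum):
    CONSTANT = auto()
    PROGRESSION = auto()
    ARITHMETIC = auto()
    DISTRIBUTE_THREE = auto()


class AttributeType(Enum):
    NUMBER = auto()
    POSITION = auto()
    CONFIGURATION = auto()
    TYPE = auto()
    SIZE = auto()
    COLOR = auto()


Ruleset = List[Tuple[RuleType, AttributeType]]

# Attributes permitted in each inner list, in the order shown above.
SLOT_ATTRIBUTES = [
    {AttributeType.NUMBER, AttributeType.POSITION, AttributeType.CONFIGURATION},
    {AttributeType.TYPE},
    {AttributeType.SIZE},
    {AttributeType.COLOR},
]
# Rules the format above permits for AttributeType.TYPE.
TYPE_RULES = {RuleType.CONSTANT, RuleType.PROGRESSION, RuleType.DISTRIBUTE_THREE}


def check_rulesets(rulesets: List[Ruleset]) -> None:
    """Hypothetical helper: raise ValueError if the layout is violated."""
    if len(rulesets) != len(SLOT_ATTRIBUTES):
        raise ValueError("expected exactly four inner lists")
    for slot, (options, allowed) in enumerate(zip(rulesets, SLOT_ATTRIBUTES)):
        if not options:
            raise ValueError(f"inner list {slot} is empty")
        for rule, attr in options:
            if attr not in allowed:
                raise ValueError(f"{attr.name} not allowed in inner list {slot}")
            if attr is AttributeType.TYPE and rule not in TYPE_RULES:
                raise ValueError(f"{rule.name} cannot apply to TYPE")


rulesets = [
    [(RuleType.PROGRESSION, AttributeType.NUMBER),
     (RuleType.CONSTANT, AttributeType.POSITION)],
    [(RuleType.DISTRIBUTE_THREE, AttributeType.TYPE)],
    [(RuleType.CONSTANT, AttributeType.SIZE)],
    [(RuleType.PROGRESSION, AttributeType.COLOR)],
]
check_rulesets(rulesets)  # passes silently for a well-formed value
```

A list built this way would then be passed as the second argument to Matrix.make, provided the package's own enums are used in place of the stand-ins.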

To generate at most N_ALTERNATIVES wrong answers (fewer if no further modifications are possible), Matrix.make_alternatives must be called. Matrix.save may be called with or without a prior call to Matrix.make_alternatives. Matrix.make_alternatives overwrites the results of previous calls and may be called any number of times.

The call to Matrix.save results in PUZZLENAME_answer.png and PUZZLENAME_alternative_{i}.png files being created in the specified directory. These images show a completed Raven's progressive matrix with either the correct answer or the ith alternative (incorrect) answer as the ninth (bottom-right) panel. This data format is intended for a binary prediction task rather than the task of picking the right completion out of a lineup. As such, we do not implement the alternative sampling mechanisms of the balanced RAVEN dataset or similar improvement efforts.
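
One possible way to turn the saved files into examples for the binary prediction task is sketched below. The labeled_images helper is hypothetical and not part of the package; it relies only on the file-naming pattern described above and uses a glob so that it assumes neither the number of alternatives nor the index they start from.

```python
from pathlib import Path
from typing import List, Tuple


def labeled_images(data_dir: str, puzzle_name: str) -> List[Tuple[Path, int]]:
    """Pair each image saved for one puzzle with a binary label.

    Label 1 marks the correct completion, 0 an incorrect alternative.
    """
    root = Path(data_dir)
    pairs: List[Tuple[Path, int]] = []

    answer = root / f"{puzzle_name}_answer.png"
    if answer.exists():
        pairs.append((answer, 1))

    # Glob instead of counting: fewer alternatives than requested may exist.
    for alt in sorted(root.glob(f"{puzzle_name}_alternative_*.png")):
        pairs.append((alt, 0))

    return pairs
```

Calling labeled_images("path/to/data/dir", "PUZZLENAME") after Matrix.save would return the answer image with label 1 followed by any alternative images with label 0.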

Serialization and Deserialization

The Matrix object does not currently offer full serialization and deserialization methods to/from human-readable formats. To generate a human-readable, but not machine-readable, report on a particular Matrix instance, one may call str on the instance or its rules attribute. The resulting outputs provide a full view of the patterns expressed in a puzzle, though an understanding of the relevant implementations may be necessary to understand all of their conventions. These outputs cannot be used to regain the originating Matrix object if it has not been otherwise saved; please pickle your Matrix objects if you anticipate inspecting them programmatically beyond the lifetime of the program/interpreter session that originated them.
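
A minimal sketch of the pickling step, using only the standard library, might look as follows. The dump_puzzle and load_puzzle helpers are hypothetical names introduced for this example, and the sketch assumes that Matrix instances pickle cleanly, as the recommendation above implies.

```python
import pickle
from typing import Any


def dump_puzzle(obj: Any, path: str) -> None:
    """Write obj (for example a Matrix instance) to path with pickle."""
    with open(path, "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_puzzle(path: str) -> Any:
    """Read back an object written by dump_puzzle.

    Only unpickle files you created yourself: pickle can execute
    arbitrary code while loading.
    """
    with open(path, "rb") as f:
        return pickle.load(f)
```

Loading a pickled object requires that the class it was created from can be imported in the loading session, so the package should be installed there as well.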
