Skip to main content

GPT solutions for Brazilian university entrance exams.

Project description

gpt-resolve

Can GPT solve Brazilian university entrance exams?

This project is an implementation of how to use LLMs to solve challenging Brazilian university entrance exams.

We'll use o1-preview, which is the best OpenAI model so far with reasoning capabilities, and gpt-4o to describe the exam images so that o1-preview can solve them on question at a time (as it does not have image capabilities yet). Results are saved as txt files with LaTeX formatting, and you can optionally convert them to a nice PDF or using some LaTeX editor.

The project begins with the ITA (Instituto Tecnológico de Aeronáutica) 2025 exam, focusing first on the Math essay section. This section, from the recent exam on November 5, 2024, demands deep subject understanding and step-by-step solutions. More details are in the report documentation. Spoiler: o1-preview scored 90% in the Math essay exam.

After the first ITA 2025 exam is fully solved, the project will try to expand to other sections and eventually other exams. Feel free to contribute with ideas and implementations of other exams!

Table of exams to be solved:

Exam Year Model Status Score Report
ITA 2025 o1-preview 🚧 In Progress - Report

Installation and How to use

gpt-resolve is distributed in pypi:

pip install gpt-resolve

gpt-resolve provides a simple CLI with two main commands: resolve for solving exam questions and compile-solutions for generating PDFs from the solutions:

Solve exams with resolve

To generate solutions for an exam:

  • save the exam images in the exam folder exam_path, one question per image file
  • add OPENAI_API_KEY to your global environment variables or to a .env file in the current directory
  • run gpt-resolve resolve -p exam_path and grab a coffee while it runs.

If you want to test the process without making real API calls, you can use the --dry-run flag. See gpt-resolve resolve --help for more details about solving only a subset of questions or controlling token usage.

Compile solutions with compile-solutions

Once you have the solutions in your exam folder exam_path, you can compile them into a single PDF:

  • run gpt-resolve compile-solutions -p exam_path --title "Your Exam Title"

For that command to work, you'll need a LaTeX distribution in your system. See some guidelines here (MacTeX for MacOS was used to start this project).

Troubleshooting

Sometimes, it was observed that the output from o1-preview produced invalid LaTeX code when nesting display math environments (such as \[...\] and \begin{align*} ... \end{align*} together). The current prompt for o1-preview adds an instruction to avoid this, which works most of the time. If that happens, you can try to solve the question again by running gpt-resolve resolve -p exam_path -q <question_number>, or making more adjustments to the prompt, or fixing the output LaTeX code manually.

Costs

The o1-preview model is so far available only for Tiers 3, 4 and 5. It is 6x more expensive than gpt-4o, and also consumes much more tokens to "reason" (see more here), so be mindful about the number of questions you are solving and how many max tokens you're allowing gpt-resolve to use (see gpt-resolve resolve --help to control max-tokens-question-answer, which drives the cost). You can roughly estimate an upper bound for costs of solving an exam by

(number of questions) * (max_tokens_question_answer / 1_000_000) * (price per 1M tokens)

For the current price for o1-preview of $15/$60 per 1M tokens for input/output tokens, an 10 question exam with 10000 max tokens per question would cost less than $6.

Contributing

There are several ways you can contribute to this project:

  • Add solutions for new exams or sections
  • Improve existing solutions or model prompts
  • Add automatic evaluation metrics for answers
  • Create documentations for exams
  • Report issues or suggest improvements

To contribute, simply fork the repository, create a new branch for your changes, and submit a pull request. Please ensure your PR includes:

  • Clear description of the changes
  • Updated table entry (if adding new exam solutions)
  • Any relevant documentation

Feel free to open an issue first to discuss major changes or new ideas!

Sponsors

Buser Logo

This project is proudly sponsored by Buser, Brazil's leading bus travel platform.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

gpt_resolve-0.2.0.tar.gz (8.2 kB view details)

Uploaded Source

Built Distribution

gpt_resolve-0.2.0-py3-none-any.whl (9.6 kB view details)

Uploaded Python 3

File details

Details for the file gpt_resolve-0.2.0.tar.gz.

File metadata

  • Download URL: gpt_resolve-0.2.0.tar.gz
  • Upload date:
  • Size: 8.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.5.1 CPython/3.9.17 Darwin/23.6.0

File hashes

Hashes for gpt_resolve-0.2.0.tar.gz
Algorithm Hash digest
SHA256 a0c83fad9f533553e209371d4f972dca35dabbd04225094320c86f5c25209a2c
MD5 75f58dc2da7374768e7dc9069e5a2d2a
BLAKE2b-256 a352bc9b75894e980499414cabb110a7ffc5beb3730e3617a0f972392b0b90dc

See more details on using hashes here.

File details

Details for the file gpt_resolve-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: gpt_resolve-0.2.0-py3-none-any.whl
  • Upload date:
  • Size: 9.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.5.1 CPython/3.9.17 Darwin/23.6.0

File hashes

Hashes for gpt_resolve-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 94b1a9e41cfec48ddfc3caafc9f3b42056e8fdeeacc3663942fb01b2f9941b79
MD5 cf1cd30e7e6905b86669086527d0dee1
BLAKE2b-256 4b186521461fbf7ca224598e0b932a7e77999a60ec5f7f438bdfd76d8e982c9d

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page