Awesome myugpt created by Cardinal-Robo-Taxi

Project description

MyuGPT

MyuZero Paper: https://arxiv.org/abs/1911.08265

MyuZero uses AI-guided Monte Carlo tree search to make good decisions and hence play games like Atari, Go, Chess and Shogi at a superhuman level. Tesla has shown that it recently applied a similar approach, AI-guided tree search, to path planning. The difference is that, at the moment, Tesla likely uses its hard-coded simulator for training (along with its large dataset of user data). LLMs can take a programming problem statement as input, along with the current code and its output, and produce new code as output.

There is potential to build a superhuman coding agent using LLMs and MyuZero.

Inspiration

MyuZero

To summarise the MyuZero Paper, there are three neural networks:

  • h(img) -> S : Environment Encoder takes an image as input and provides a latent space representation as output
  • f(S) -> P, V : Policy-Value Function takes the environment state as input and produces a policy P (a distribution over the actions to take) and the corresponding expected future reward value V.
  • g(Si, Ai) -> Ri, Si+1 : Dynamics Model takes a state-action pair (Si, Ai) for a given frame i as input and produces the next state Si+1 along with the reward Ri for the action Ai.

The Environment Encoder converts the sensor reading to a latent space. The Policy-Value Function produces good candidate branches to explore further in the Monte Carlo Tree Search. The Dynamics Model lets the system look into the future. Thus the networks, together with the Monte Carlo Tree Search framework, are able to make an informed decision by looking down branches with potential and picking the one with the highest reward.
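
The way the three functions plug into the tree search can be sketched on a toy problem. The sketch below is not the paper's implementation and not this package's code: the hand-written h, f and g stand in for the neural networks, and the 1-D environment, TARGET, DISCOUNT, c_puct and the Node class are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field

ACTIONS = [-1, +1]  # toy action space: step left or step right
TARGET = 3          # toy goal position (assumption for this example)
DISCOUNT = 0.9      # discount applied to future rewards


def h(observation):
    """Environment encoder: observation -> latent state S (identity here)."""
    return int(observation)


def f(state):
    """Policy-value function: S -> (P, V). Uniform prior, distance-based value."""
    prior = {action: 1.0 / len(ACTIONS) for action in ACTIONS}
    value = 1.0 / (1 + abs(TARGET - state))
    return prior, value


def g(state, action):
    """Dynamics model: (S_i, A_i) -> (R_i, S_{i+1})."""
    next_state = state + action
    closer = abs(TARGET - next_state) < abs(TARGET - state)
    return (1.0 if closer else -1.0), next_state


@dataclass
class Node:
    state: int
    prior: float
    reward: float = 0.0
    visits: int = 0
    value_sum: float = 0.0
    children: dict = field(default_factory=dict)

    def value(self):
        return self.value_sum / self.visits if self.visits else 0.0


def expand(node):
    """Create one child per action using g, and return f's value estimate."""
    prior, value = f(node.state)
    for action, p in prior.items():
        reward, next_state = g(node.state, action)
        node.children[action] = Node(next_state, p, reward)
    return value


def select_child(node, c_puct=1.5):
    """Pick the child with the best value estimate plus exploration bonus."""
    def score(child):
        explore = c_puct * child.prior * math.sqrt(node.visits) / (1 + child.visits)
        return child.reward + DISCOUNT * child.value() + explore
    return max(node.children.items(), key=lambda item: score(item[1]))


def plan(observation, simulations=50):
    """Run the search from an observation and return the most visited action."""
    root = Node(h(observation), prior=1.0)
    for _ in range(simulations):
        node, path = root, [root]
        while node.children:              # walk down to a leaf
            _, node = select_child(node)
            path.append(node)
        value = expand(node)              # evaluate the leaf with f and g
        for visited in reversed(path):    # back the value up the path
            visited.visits += 1
            visited.value_sum += value
            value = visited.reward + DISCOUNT * value
    return max(root.children, key=lambda a: root.children[a].visits)


print(plan(0))  # starting left of TARGET, the search prefers stepping right
```

The search never touches a real environment after the first call to h: every step into the future goes through g, which is what lets the same loop work with a learned model.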

In the context of LLMs as coding agents, this is how it would translate:

  • h(env) -> S : Environment Encoder takes the problem statement, current code written and the output of the compiled code and wraps it all up into a text prompt for GPT
  • f(S) -> P,V : Policy-Value Function is an LLM. We would have to prompt it to produce a value as well (ask it to score itself). By varying the temperature of the model, we can sample multiple possible chains of thought and follow the most likely path
  • g(Si, Ai) -> Ri, Si+1 : Dynamics Model is the code interpreter, which takes the code requested from GPT as input, runs the code and updates the environment (new code and output).

Monte Carlo Tree Search, guided by these three components, will be used to explore potential trajectories, and the trajectory with the highest reward is picked.
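
A rough sketch of this mapping, without the tree search, is shown below. Everything in it is hypothetical: the Env class, the prompt wording, and stub_llm (a stand-in for a real LLM call that would be sampled at different temperatures) are assumptions for illustration, not the myugpt API. The dynamics step simply runs the candidate program in a subprocess; real use would need a sandbox.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Env:
    problem: str
    code: str = ""
    output: str = ""


def h(env):
    """Environment encoder: pack problem, current code and output into a prompt."""
    return (
        f"Problem:\n{env.problem}\n\n"
        f"Current code:\n{env.code or '(none)'}\n\n"
        f"Output:\n{env.output or '(none)'}\n\n"
        "Reply with improved Python code and a score from 0 to 1."
    )


def f(prompt, llm, samples=3):
    """Policy-value function: sample (code, self-assigned score) candidates."""
    return [llm(prompt) for _ in range(samples)]


def g(env, code, timeout=5):
    """Dynamics model: run the candidate code and return the updated environment."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True, text=True, timeout=timeout,
        )
        output = result.stdout + result.stderr
    except subprocess.TimeoutExpired:
        output = "timed out"
    return Env(env.problem, code, output)


def stub_llm(prompt):
    """Hypothetical stand-in for an LLM call; returns (code, score)."""
    return "print(sum(range(1, 11)))", 0.9


env = Env("Print the sum of the integers from 1 to 10.")
candidates = f(h(env), stub_llm)
best_code, _ = max(candidates, key=lambda candidate: candidate[1])
env = g(env, best_code)
print(env.output)
```

Here a single greedy step picks the candidate with the highest self-assigned score; in the full idea, the tree search would expand several candidates and compare the rewards of whole trajectories instead.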

What would we have to look into:

  1. Prompt engineering to translate the environment to a prompt
  2. We will have to look into how the reward is calculated
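
One possible answer to the reward question, offered only as an assumption: score a candidate program by the fraction of input/output test cases it passes. The reward function and the (stdin, expected stdout) test-case format below are hypothetical and are not taken from the project; other choices (partial credit, runtime penalties, a learned value) are equally plausible.

```python
import subprocess
import sys


def reward(code, test_cases, timeout=5):
    """Fraction of (stdin, expected stdout) pairs the program gets right."""
    if not test_cases:
        return 0.0
    passed = 0
    for stdin_text, expected in test_cases:
        try:
            result = subprocess.run(
                [sys.executable, "-c", code],
                input=stdin_text, capture_output=True, text=True, timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            continue  # a timeout counts as a failed test
        if result.returncode == 0 and result.stdout.strip() == expected.strip():
            passed += 1
    return passed / len(test_cases)


doubler = "print(int(input()) * 2)"
print(reward(doubler, [("2\n", "4"), ("5\n", "10")]))
```

A reward in the range 0 to 1 keeps trajectories comparable across problems with different numbers of tests.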

Datasets

AlphaCode's Code Contests Dataset

CodeForces Dataset

LeetCode Dataset

Usage

$ python -m myugpt
#or
$ myugpt

Development

Read the CONTRIBUTING.md file.
