Awesome myugpt created by Cardinal-Robo-Taxi

Project description

MyuGPT

MyuZero uses AI-guided Monte Carlo tree search to make good decisions and hence play games such as Atari, Go, Chess and Shogi at a superhuman level. Tesla has shown that it recently applied a similar AI-guided tree search approach to path planning. The difference is that, at the moment, Tesla likely uses its hard-coded simulator for training (along with its large dataset of user data). An LLM can take a programming problem statement as input, along with the current code and its output, and produce new code as output.

There is potential to build a superhuman coding agent using LLMs and MyuZero.

Inspiration

MyuZero

To summarise the MyuZero Paper, there are three neural networks:

  • h(img) -> S : Environment Encoder takes an image as input and provides a latent space representation as output
  • f(S) -> P, V : Policy-Value Function takes the environment state as input and produces a policy P (a distribution over the actions to take) and the corresponding future reward value V
  • g(Si, Ai) -> Ri, Si+1 : Dynamics Model takes a state-action pair (Si, Ai) for a given frame i as input and produces the next state Si+1 along with the reward Ri for the action Ai

The Environment Encoder converts the sensor reading to a latent space. The Policy-Value Function produces good candidate branches to explore further in the Monte Carlo Tree Search. The Dynamics Model lets the system look into the future. Together with the Monte Carlo Tree Search framework, the networks can make an informed decision by looking down branches with potential and picking the one with the highest reward.
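The search loop summarised above can be sketched in a few lines of Python. This is a toy under stated assumptions, not the MyuZero implementation: the number-line environment, the `ACTIONS`, `TARGET` and `GAMMA` constants, the `Node` class and the `plan` function are all hypothetical, `h`, `f` and `g` are hand-written stand-ins for the three networks, and the value normalisation used in the paper is left out.

```python
import math
from dataclasses import dataclass, field

ACTIONS = (-1, +1)   # hypothetical toy action space: step left or right
TARGET = 3           # hypothetical goal position on a number line
GAMMA = 0.5          # toy discount factor, chosen only to keep values small


def h(observation: int) -> int:
    """Environment encoder stand-in: the latent state is the position itself."""
    return observation


def f(state: int) -> tuple[dict[int, float], float]:
    """Policy-value stand-in: uniform prior over actions, uninformed value."""
    return {a: 1.0 / len(ACTIONS) for a in ACTIONS}, 0.0


def g(state: int, action: int) -> tuple[float, int]:
    """Dynamics stand-in: reward 1 for moving closer to TARGET, else 0."""
    next_state = state + action
    reward = 1.0 if abs(TARGET - next_state) < abs(TARGET - state) else 0.0
    return reward, next_state


@dataclass
class Node:
    state: int
    prior: float
    reward: float = 0.0          # reward received on the edge into this node
    visits: int = 0
    value_sum: float = 0.0
    children: dict = field(default_factory=dict)

    def value(self) -> float:
        return self.value_sum / self.visits if self.visits else 0.0


def expand(node: Node) -> float:
    """Create one child per action with g and return f's value for the node."""
    prior, value = f(node.state)
    for action, p in prior.items():
        reward, next_state = g(node.state, action)
        node.children[action] = Node(state=next_state, prior=p, reward=reward)
    return value


def select_child(node: Node, c: float = 1.25) -> Node:
    """Pick the child maximising Q plus a prior-weighted exploration bonus."""
    def score(child: Node) -> float:
        q = child.reward + GAMMA * child.value()
        bonus = c * child.prior * math.sqrt(node.visits) / (1 + child.visits)
        return q + bonus
    return max(node.children.values(), key=score)


def plan(observation: int, simulations: int = 50) -> int:
    """Run the search from an observation and return the most visited action."""
    root = Node(state=h(observation), prior=1.0)
    for _ in range(simulations):
        node, path = root, [root]
        while node.children:                 # selection
            node = select_child(node)
            path.append(node)
        value = expand(node)                 # expansion and evaluation
        for visited in reversed(path):       # backup of the discounted return
            visited.value_sum += value
            visited.visits += 1
            value = visited.reward + GAMMA * value
    return max(root.children, key=lambda a: root.children[a].visits)


best_action = plan(observation=0)
```

In this toy the dynamics stand-in supplies the only learning signal, so the search ends up favouring the step that moves towards `TARGET`. With trained networks the prior and value from `f` would steer the search as well.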

In the context of LLMs as coding agents, this is how it would translate:

  • h(env) -> S : Environment Encoder takes the problem statement, current code written and the output of the compiled code and wraps it all up into a text prompt for GPT
  • f(S) -> P,V : Policy-Value Function is an LLM. We would have to prompt it to produce a value as well (ask it to score itself). By varying the temperature of the model, we can sample multiple possible chains of thought and follow the most likely path
  • g(Si, Ai) -> Ri, Si+1 : Dynamics Model is the code interpreter, which takes the code requested from GPT as input, runs the code and updates the environment (new code and output)

Monte Carlo Tree Search, guided by these three components, will be used to explore the potential trajectories the coding agent can take in the near future, and the trajectory with the highest reward is picked.
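As a rough illustration of the mapping above, the encoder and dynamics steps might look like the sketch below. Everything named here (`CodingState`, `encode`, `step`, the prompt wording) is hypothetical and not taken from the myugpt package. The LLM call itself is left out, and the reward is not computed here.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class CodingState:
    """Hypothetical environment state: problem, current code, last output."""
    problem: str
    code: str = ""
    output: str = ""


def encode(state: CodingState) -> str:
    """h(env) -> S: wrap the problem, current code and its output in a prompt."""
    return (
        f"Problem statement:\n{state.problem}\n\n"
        f"Current code:\n{state.code or '(none yet)'}\n\n"
        f"Output of current code:\n{state.output or '(not run yet)'}\n\n"
        "Reply with an improved program."
    )


def step(state: CodingState, new_code: str, stdin: str = "",
         timeout: float = 5.0) -> CodingState:
    """g(Si, Ai) -> Si+1: run the proposed code and record what it printed.

    Assumption: the code runs in a plain subprocess. Code produced by an LLM
    is untrusted, so a real system would need a proper sandbox.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", new_code],
            input=stdin, capture_output=True, text=True, timeout=timeout,
        )
        output = result.stdout + result.stderr
    except subprocess.TimeoutExpired:
        output = "Timed out"
    return CodingState(problem=state.problem, code=new_code, output=output)


# Usage: one transition with a hand-written candidate standing in for the LLM.
start = CodingState(problem="Read an integer and print its double.")
after = step(start, "print(int(input()) * 2)", stdin="21\n")
prompt = encode(after)
```

Keeping the state immutable means each tree node can hold its own `CodingState` without one branch overwriting another's code.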

What we would have to look into:

  1. Prompt engineering to translate the environment into a prompt
  2. How the reward is calculated
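One possible reward, offered only as an assumption since the project leaves this question open, is the fraction of test cases whose output matches the expected output after whitespace normalisation. The helper names `normalize` and `fraction_passed` are hypothetical.

```python
def normalize(text: str) -> list[str]:
    """Split on any whitespace so trailing newlines and spacing are ignored."""
    return text.split()


def fraction_passed(actual_outputs: list[str], expected_outputs: list[str]) -> float:
    """Return the share of test cases whose output matches, between 0.0 and 1.0."""
    if len(actual_outputs) != len(expected_outputs):
        raise ValueError("need exactly one actual output per expected output")
    if not expected_outputs:
        return 0.0  # no tests means no evidence that the code is correct
    passed = sum(
        normalize(actual) == normalize(expected)
        for actual, expected in zip(actual_outputs, expected_outputs)
    )
    return passed / len(expected_outputs)


# Usage: two test cases, the first passes and the second fails.
reward = fraction_passed(["42\n", "7"], ["42", "8"])
```

A graded reward like this gives the tree search a signal on partially correct programs, whereas a pass/fail reward would be zero for almost every branch.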

Datasets

AlphaCode's Code Contests Dataset

CodeForces Dataset

LeetCode Dataset

Setup datasets:

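mkdir -p ~/Datasets
cd ~/Datasets
# "git lfs clone" is deprecated; with Git LFS installed, a plain clone fetches the LFS files
git lfs install
git clone https://huggingface.co/datasets/deepmind/code_contests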

Usage

$ python -m myugpt
# or
$ myugpt

Development

Read the CONTRIBUTING.md file.
