Reinforcement learning in tinygrad

tinygym

tinygym reimplements flashrl using tinygrad instead of torch.

🛠️ pip install tinygym or clone the repo & pip install -r requirements.txt

  • If you cloned the repo (or changed the envs), compile with: python setup.py build_ext --inplace

The flashrl README is mostly valid for tinygym. The biggest difference:

  • tinygym is not fast (yet): it learns Pong in ~5 minutes instead of 5 seconds (on an RTX 3090)

Just like in flashrl, python train.py should look like this (with the progress bar moving ~60x slower):

Check out the onefile branch if you want to make it fast (i.e. try to make TinyJit work)!

Implementation differences to flashrl

The most important difference (enabled RL after 2 hours of debugging):

  • Use .abs().clip(min_=1e-8) in ppo to avoid close to zero values in (value - ret)

Without this the optimizer step can result in NaNs and "RL doesn't work" 😜
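
The floor can be illustrated without tinygrad. The snippet below is a minimal plain-Python sketch, not the project's ppo code: `floored_abs_error` and `value_loss` are hypothetical helpers, and using the floored error inside a squared-error loss is an assumption made for illustration. Only the `1e-8` floor and the `.abs().clip(min_=1e-8)` expression come from the text above.

```python
import math

EPS = 1e-8  # floor taken from the README's .clip(min_=1e-8)


def floored_abs_error(value, ret, eps=EPS):
    """Plain-Python analogue of (value - ret).abs().clip(min_=eps)."""
    return max(abs(value - ret), eps)


def value_loss(values, rets, eps=EPS):
    """Mean squared error on the floored absolute error (hypothetical helper)."""
    errs = [floored_abs_error(v, r, eps) for v, r in zip(values, rets)]
    return sum(e * e for e in errs) / len(errs)


values = [0.5, 1.0, -0.25]
rets = [0.5, 0.75, -0.25]  # two predictions match their returns exactly

loss = value_loss(values, rets)

# An exact match no longer produces a literal 0.0, so later operations that
# are undefined at zero (a log or a division, for example) stay finite.
safe_log = math.log(floored_abs_error(0.5, 0.5))
```

The floor is small enough that it barely changes the loss value; it only keeps exact zeros out of the computation.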

To potentially enable tinygrad.TinyJit (does not work yet, hence the slowness):

  • Learner does not call .setup_data, and
  • rollout is a function (instead of a Learner method) that fills a list with Tensors and stacks them at the end
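
The "fill a list, stack at the end" shape of rollout can be sketched in plain Python. This is an illustrative sketch only, not tinygym's rollout: `ToyEnv`, `always_one`, and the list-based `stack` helper are hypothetical stand-ins, and nested lists take the place of Tensors. In tinygrad the last step would be a stack over the collected Tensors instead.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for a vectorized environment."""

    def __init__(self, n_envs=4, seed=0):
        self.n_envs = n_envs
        self.rng = random.Random(seed)

    def reset(self):
        return [0.0] * self.n_envs

    def step(self, acts):
        obs = [self.rng.random() for _ in acts]
        rewards = [1.0 if a == 1 else 0.0 for a in acts]
        return obs, rewards


def stack(rows):
    """Turn [steps][n_envs] into [n_envs][steps] (stand-in for a tensor stack)."""
    return [list(col) for col in zip(*rows)]


def rollout(env, policy, steps):
    """Free function: no buffers are pre-allocated on a Learner object."""
    obs = env.reset()
    obs_buf, act_buf, rew_buf = [], [], []
    for _ in range(steps):
        acts = policy(obs)
        next_obs, rewards = env.step(acts)
        # Append one entry per step instead of writing into a fixed buffer.
        obs_buf.append(obs)
        act_buf.append(acts)
        rew_buf.append(rewards)
        obs = next_obs
    # Stack once, after the loop has finished.
    return stack(obs_buf), stack(act_buf), stack(rew_buf)


def always_one(obs):
    """Hypothetical policy that picks action 1 in every environment."""
    return [1] * len(obs)


obs, acts, rews = rollout(ToyEnv(n_envs=3), always_one, steps=5)
```

Keeping rollout free of object state means its inputs and outputs are explicit, which is presumably what makes it a candidate for wrapping in a JIT later.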

Since it somehow performs better:

  • .uniform (tinygrad default) instead of .kaiming_uniform (torch default) weight initialization for nn.Linear

Custom tinygrad rewrites of torch.nn.init.orthogonal_ & torch.nn.utils.clip_grad_norm_ are used.
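
For reference, the logic behind gradient-norm clipping is short. The following is a plain-Python sketch of what torch.nn.utils.clip_grad_norm_ computes for the L2 norm, operating on nested lists of floats. It is not tinygym's rewrite: the function name `clip_grad_norm` and the list-based gradients are assumptions for illustration, and unlike torch it returns new lists instead of modifying gradients in place.

```python
import math


def clip_grad_norm(grads, max_norm, eps=1e-6):
    """Scale all gradients so their combined L2 norm is at most max_norm.

    grads is a list of per-parameter gradients, each a flat list of floats.
    Returns (clipped_grads, total_norm_before_clipping).
    """
    # One global norm over every gradient element of every parameter.
    total_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    # Never scale up: gradients already inside the limit are left unchanged.
    scale = min(1.0, max_norm / (total_norm + eps))
    clipped = [[g * scale for g in grad] for grad in grads]
    return clipped, total_norm


grads = [[3.0], [4.0]]  # global L2 norm is 5.0
clipped, norm_before = clip_grad_norm(grads, max_norm=0.5)
norm_after = math.sqrt(sum(g * g for grad in clipped for g in grad))

small, _ = clip_grad_norm([[0.1]], max_norm=0.5)  # already within the limit
```

All parameters share one scale factor, so the direction of the overall update is preserved and only its length shrinks.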

You'll find a .detach() here and a .contiguous() there, but other than that tinygym = flashrl 🤝

Acknowledgements 🙌

I want to thank

and last but not least...

Download files

Source Distributions

No source distribution files available for this release.

Built Distributions

  • tinygym-0.1.0-cp313-cp313-win_amd64.whl (693.8 kB): CPython 3.13, Windows x86-64
  • tinygym-0.1.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.7 MB): CPython 3.13, manylinux (glibc 2.17+), x86-64
  • tinygym-0.1.0-cp313-cp313-macosx_10_13_universal2.whl (960.5 kB): CPython 3.13, macOS 10.13+, universal2 (ARM64, x86-64)
  • tinygym-0.1.0-cp312-cp312-win_amd64.whl (693.9 kB): CPython 3.12, Windows x86-64
  • tinygym-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.7 MB): CPython 3.12, manylinux (glibc 2.17+), x86-64
  • tinygym-0.1.0-cp312-cp312-macosx_10_13_universal2.whl (964.8 kB): CPython 3.12, macOS 10.13+, universal2 (ARM64, x86-64)
  • tinygym-0.1.0-cp311-cp311-win_amd64.whl (688.2 kB): CPython 3.11, Windows x86-64
  • tinygym-0.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.6 MB): CPython 3.11, manylinux (glibc 2.17+), x86-64
  • tinygym-0.1.0-cp311-cp311-macosx_10_9_universal2.whl (959.6 kB): CPython 3.11, macOS 10.9+, universal2 (ARM64, x86-64)
  • tinygym-0.1.0-cp310-cp310-win_amd64.whl (688.2 kB): CPython 3.10, Windows x86-64
  • tinygym-0.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.6 MB): CPython 3.10, manylinux (glibc 2.17+), x86-64
  • tinygym-0.1.0-cp310-cp310-macosx_10_9_universal2.whl (959.0 kB): CPython 3.10, macOS 10.9+, universal2 (ARM64, x86-64)
