SwiftTD: Fast and Robust TD Learning

SwiftTD: A Fast and Robust Algorithm for Temporal Difference Learning

SwiftTD is an algorithm for learning value functions. It combines step-size adaptation with a bound on the rate of learning. The implementations in this repository use linear function approximation.
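
To make the idea of a bound on the rate of learning concrete, here is a minimal sketch of linear TD(lambda) with accumulating traces in which the update is scaled down whenever the sum of alpha_i * x_i^2 exceeds eta. This is not the SwiftTD update: the step sizes are fixed (there is no step-size adaptation), the exact form of the bound is an assumption made for illustration, and the class name BoundedLinearTD is hypothetical.

```python
from typing import List, Optional, Sequence


class BoundedLinearTD:
    """Illustrative linear TD(lambda) with a capped effective step size.

    NOT the SwiftTD algorithm: step sizes are fixed here, and the cap
    rule (sum of alpha_i * x_i^2 <= eta) is an assumption for this sketch.
    """

    def __init__(self, num_features: int, alpha: float = 1e-2,
                 gamma: float = 0.99, lambda_: float = 0.95,
                 eta: float = 0.1) -> None:
        self.w: List[float] = [0.0] * num_features         # weights
        self.z: List[float] = [0.0] * num_features         # eligibility traces
        self.alphas: List[float] = [alpha] * num_features  # per-feature step sizes
        self.gamma = gamma
        self.lambda_ = lambda_
        self.eta = eta
        self.prev_x: Optional[List[float]] = None          # features of the previous step

    def predict(self, x: Sequence[float]) -> float:
        """Linear value estimate: dot product of weights and features."""
        return sum(w_i * x_i for w_i, x_i in zip(self.w, x))

    def step(self, x: Sequence[float], reward: float) -> float:
        """Update from the transition into x and return the prediction for x."""
        if self.prev_x is not None:
            # TD error for the transition prev_x -> x.
            delta = reward + self.gamma * self.predict(x) - self.predict(self.prev_x)
            # Accumulating trace: decay, then add the previous features.
            decay = self.gamma * self.lambda_
            self.z = [decay * z_i + x_i for z_i, x_i in zip(self.z, self.prev_x)]
            # Effective learning rate on the previous features; shrink if above eta.
            rate = sum(a_i * x_i * x_i for a_i, x_i in zip(self.alphas, self.prev_x))
            scale = min(1.0, self.eta / rate) if rate > 0.0 else 1.0
            self.w = [w_i + scale * a_i * delta * z_i
                      for w_i, a_i, z_i in zip(self.w, self.alphas, self.z)]
        self.prev_x = list(x)
        return self.predict(x)


# One always-active feature, reward 1 per step, no bootstrapping (gamma=0).
learner = BoundedLinearTD(num_features=1, alpha=0.5, gamma=0.0, eta=0.1)
for _ in range(200):
    value = learner.step([1.0], reward=1.0)
print("Learned value:", value)
```

In this toy run the requested step size (0.5) exceeds the cap (0.1), so every update is scaled down to an effective rate of 0.1 and the estimate moves toward 1 without overshooting.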

Installation

pip install SwiftTD
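
To confirm that the package is importable after installation, a quick check along these lines may help. The helper name is_installed is hypothetical and not part of SwiftTD; it relies only on the standard library's importlib.util.find_spec and on the import name swifttd used in the usage example.

```python
import importlib.util


def is_installed(module_name: str = "swifttd") -> bool:
    """Return True if the named top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


print("swifttd importable:", is_installed())
```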

Usage

After installation, the three implementations of SwiftTD can be used from Python as follows:

import swifttd

# Version of SwiftTD that expects the full feature vector as input.
# This should only be used if the feature representation is not sparse.
# Otherwise, the sparse versions are more efficient.
td_dense = swifttd.SwiftTDNonSparse(
    num_of_features=5,    # Number of input features
    lambda_=0.95,         # Lambda parameter for eligibility traces
    alpha=1e-2,           # Initial learning rate
    gamma=0.99,           # Discount factor
    epsilon=1e-5,         # Small constant for numerical stability
    eta=0.1,              # Maximum allowed step size (bound on rate of learning)
    decay=0.999,          # Step size decay rate
    meta_step_size=1e-3,  # Meta learning rate
    eta_min=1e-10,        # Minimum value of the step-size parameter
)

# Feature vector
features = [1.0, 0.0, 0.5, 0.2, 0.0]
reward = 1.0
prediction = td_dense.step(features, reward)
print("Dense prediction:", prediction)

# Version of SwiftTD that expects the feature indices as input.
# This version assumes that the features are binary (0 or 1).
# For learning, the indices of the features that are 1 are provided.
td_sparse = swifttd.SwiftTDBinaryFeatures(
    num_of_features=1000,  # Number of input features
    lambda_=0.95,          # Lambda parameter for eligibility traces
    alpha=1e-2,            # Initial learning rate
    gamma=0.99,            # Discount factor
    epsilon=1e-5,          # Small constant for numerical stability
    eta=0.1,               # Maximum allowed step size (bound on rate of learning)
    decay=0.999,           # Step size decay rate
    meta_step_size=1e-3,   # Meta learning rate
    eta_min=1e-10,         # Minimum value of the step-size parameter
)

# Specify the indices of the features that are 1.
active_features = [1, 42, 999]  # Indices of active features
reward = 1.0
prediction = td_sparse.step(active_features, reward)
print("Sparse binary prediction:", prediction)

# Version of SwiftTD that expects the feature indices and values as input.
# This version does not assume that the features are binary.
# For learning, it expects a list of (index, value) pairs.
# Only the indices of the features that are non-zero need to be provided.
td_sparse_nonbinary = swifttd.SwiftTD(
    num_of_features=1000,  # Number of input features
    lambda_=0.95,          # Lambda parameter for eligibility traces
    alpha=1e-2,            # Initial learning rate
    gamma=0.99,            # Discount factor
    epsilon=1e-5,          # Small constant for numerical stability
    eta=0.1,               # Maximum allowed step size (bound on rate of learning)
    decay=0.999,           # Step size decay rate
    meta_step_size=1e-3,   # Meta learning rate
    eta_min=1e-10,         # Minimum value of the step-size parameter
)

# Specify the indices and values of the features that are non-zero.
feature_values = [(1, 0.8), (42, 0.3), (999, 1.2)]  # (index, value) pairs
reward = 1.0
prediction = td_sparse_nonbinary.step(feature_values, reward)
print("Sparse non-binary prediction:", prediction)
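
If features start out as a dense vector, they can be converted into the inputs the two sparse versions expect. The sketch below shows one way to do that in plain Python. The helper names to_index_value_pairs and to_active_indices are hypothetical and not part of the swifttd package; only the input shapes (a list of indices, or a list of (index, value) pairs) come from the usage example above.

```python
from typing import List, Sequence, Tuple


def to_index_value_pairs(features: Sequence[float]) -> List[Tuple[int, float]]:
    """Keep only the non-zero entries as (index, value) pairs."""
    return [(i, v) for i, v in enumerate(features) if v != 0]


def to_active_indices(features: Sequence[float]) -> List[int]:
    """Return the indices of features equal to 1; reject non-binary input."""
    active = []
    for i, v in enumerate(features):
        if v == 1:
            active.append(i)
        elif v != 0:
            raise ValueError(f"feature {i} has non-binary value {v!r}")
    return active


dense = [1.0, 0.0, 0.5, 0.2, 0.0]
print(to_index_value_pairs(dense))      # input shape for the (index, value) version
print(to_active_indices([0, 1, 1, 0]))  # input shape for the binary version
```

Raising on non-binary values is a design choice for this sketch: silently dropping a value such as 0.5 would change the feature vector the learner sees.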

Download files

Source Distribution

swifttd-0.2.0.tar.gz (6.1 kB; source)

Built Distributions

swifttd-0.2.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (114.3 kB; CPython 3.13, manylinux: glibc 2.17+, x86-64)
swifttd-0.2.0-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (107.0 kB; CPython 3.13, manylinux: glibc 2.17+, ARM64)
swifttd-0.2.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (114.5 kB; CPython 3.12, manylinux: glibc 2.17+, x86-64)
swifttd-0.2.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (106.1 kB; CPython 3.12, manylinux: glibc 2.17+, ARM64)
swifttd-0.2.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (114.9 kB; CPython 3.11, manylinux: glibc 2.17+, x86-64)
swifttd-0.2.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (108.0 kB; CPython 3.11, manylinux: glibc 2.17+, ARM64)
swifttd-0.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (113.3 kB; CPython 3.10, manylinux: glibc 2.17+, x86-64)
swifttd-0.2.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (106.3 kB; CPython 3.10, manylinux: glibc 2.17+, ARM64)
swifttd-0.2.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (113.6 kB; CPython 3.9, manylinux: glibc 2.17+, x86-64)
swifttd-0.2.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (106.6 kB; CPython 3.9, manylinux: glibc 2.17+, ARM64)
swifttd-0.2.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (113.2 kB; CPython 3.8, manylinux: glibc 2.17+, x86-64)
swifttd-0.2.0-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (106.3 kB; CPython 3.8, manylinux: glibc 2.17+, ARM64)
