A package for creating IL datasets

Project description

IL Datasets

Hi, welcome to Imitation Learning (IL) Datasets. Something that has always bothered me is how difficult it is to find good weights for an expert, or to create a dataset for different state-of-the-art methods. For this reason, I've created this repository in an effort to make it more accessible for researchers to create datasets using experts from Hugging Face.


How does it work?

This project works with multithreading, which should accelerate dataset creation. It consists of one Controller class, which requires two functions to work: (i) an enjoy function (for the agent to play and record an episode); and (ii) a collate function (for putting all episodes together).


The enjoy function receives 3 parameters and returns 1 value:

  • path: str - where the episode is going to be recorded

  • experiment: Context - A class for recording all information (so you don't have to use print, keeping the console clear)

  • expert: Policy - A model based on the StableBaselines3 BaseAlgorithm.

  • returns: bool - Whether recording the episode was successful or not

Note: To use the model you can call predict; the Policy class already implements the correct way of using it (i.e., the same way StableBaselines3 does).
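A minimal enjoy function might look like the sketch below. The .npz episode layout, the fixed 10-step rollout, and the DummyExpert stand-in are all assumptions made for illustration; a real enjoy function would step a Gym environment with an expert loaded from Hugging Face and log progress through the experiment Context:

```python
import numpy as np


class DummyExpert:
    """Stand-in for the Policy wrapper; only `predict` matters here."""

    def predict(self, obs, state=None, deterministic=True):
        # Mirrors the SB3 convention: return (action, next_state).
        return 0, state


def enjoy(path: str, experiment, expert) -> bool:
    """Play one (simulated) episode and record it to `path` as .npz."""
    obs = np.zeros(4, dtype=np.float32)  # fake initial observation
    observations, actions, rewards = [], [], []
    state = None
    for _ in range(10):  # fake 10-step episode instead of a real env loop
        action, state = expert.predict(obs, state=state, deterministic=True)
        observations.append(obs)
        actions.append(action)
        rewards.append(1.0)  # fake reward signal
    np.savez(path, obs=np.stack(observations),
             actions=np.array(actions), rewards=np.array(rewards))
    return True  # report success to the Controller
```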


The collate function will receive 2 parameters and return 1:

  • path: str - where it should save the final dataset

  • episodes: list[str] - A list of paths for each file

  • returns: bool - Whether building the dataset was successful or not
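A collate function matching that contract could be sketched as follows. The .npz keys ("obs", "actions") and the concatenate-along-axis-0 layout are assumptions for this sketch, not a documented format of the library:

```python
import numpy as np


def collate(path: str, episodes: list) -> bool:
    """Merge per-episode .npz files into one dataset file at `path`."""
    obs, actions = [], []
    for episode in episodes:
        data = np.load(episode)           # one recorded episode per file
        obs.append(data["obs"])
        actions.append(data["actions"])
    # Stack all episodes into a single flat dataset.
    np.savez(path, obs=np.concatenate(obs), actions=np.concatenate(actions))
    return True
```

With both callbacks defined, they would then be handed to the Controller; the exact constructor signature is not shown in this description, so check the package source before relying on it.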


Requirements

I used Python 3.9 during development.
All other requirements are listed in requirements.txt.


Registering new experts

If you would like to add new experts locally, you can use the Experts class. It uses the following structure:

  • identifier: str - A name for calling the expert.
  • policy: Policy - A dataclass with:
    • name: str - Gym Environment name
    • repo_id: str - Hugging Face repo identification
    • filename: str - Weights file name
    • threshold: float - How much reward should the episode accumulate to be considered good
    • algo: BaseAlgorithm - The class from StableBaselines3
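Registering an expert could look like the sketch below. The Policy dataclass is redefined locally so the snippet stands alone, the register function is a hypothetical stand-in for whatever method the Experts class actually exposes, and the repo id, filename, and threshold values are assumptions:

```python
from dataclasses import dataclass


@dataclass
class Policy:
    """Local stand-in mirroring the fields listed above."""
    name: str          # Gym environment name
    repo_id: str       # Hugging Face repo identification
    filename: str      # weights file name
    threshold: float   # reward an episode must reach to be kept
    algo: type         # StableBaselines3 algorithm class (e.g. PPO)


# Hypothetical registry: the real Experts class may expose a
# different method name for registration.
EXPERTS: dict = {}


def register(identifier: str, policy: Policy) -> None:
    EXPERTS[identifier] = policy


register(
    "cartpole",
    Policy(
        name="CartPole-v1",
        repo_id="sb3/ppo-CartPole-v1",   # assumed Hub repo id
        filename="ppo-CartPole-v1.zip",  # assumed weights filename
        threshold=475.0,                 # assumed success threshold
        algo=object,                     # stands in for stable_baselines3.PPO
    ),
)
```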

Note: If you are not using StableBaselines3, the expert must have a predict function that receives:

  • obs: Tensor - Current environment state
  • state: Tensor - Model's internal state
  • deterministic: bool - If it should explore or not
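In other words, any object exposing that signature can stand in for an SB3 model. A toy hand-written expert illustrating the protocol (the CartPole-style observation layout and the (action, next_state) return convention are assumptions mirroring how SB3's predict is used):

```python
import numpy as np


class ScriptedExpert:
    """A non-StableBaselines expert: any object with an SB3-style
    predict(obs, state, deterministic) method should work."""

    def predict(self, obs, state=None, deterministic=True):
        # Toy policy for a CartPole-like observation vector:
        # push right (action 1) when the pole angle (obs[2]) is positive.
        action = int(obs[2] > 0)
        return action, state  # SB3 convention: (action, next_state)
```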

This repository is not complete

Here is a list of features planned for upcoming releases:

  • Collate function support
  • Support for installing as a dependency
  • Module for downloading trajectories from a Hugging Face dataset
  • Create actual documentation
  • Create some examples
  • Create tests

If you like this repository, be sure to check out my other projects:

Development

Academic
