A re-implementation of NACE as a PyPI package, with a cleaner, more general interface.
An observational learner that builds a model of the world from successive observations, resolves
conflicting information, and plans many steps ahead, in an extremely sample-efficient manner.
Background
This project builds upon an implementation of X's NACE observational learner (paper under review), which in turn was based on Berick Cook's AIRIS, adding support for partial observability, the ability to handle non-deterministic and non-stationary environments, and changes external to the agent. X achieved this by incorporating relevant components of Non-Axiomatic Logic (NAL).
The aim of this project is to convert the above work into a foundation on which further experiments can be performed.
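Installation
The package can be installed from PyPI in the usual way (the distribution name nace is assumed here from the package metadata):

pip install nace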
Examples
import sys

import nace

print("Welcome to NACE!")

# This example uses the code from the original nace.world_module, which hard codes
# the effects of actions on the 'world'. This complicates the example code, but
# ensures that the use of global variables does not let the planning code 'cheat'.

if __name__ == "__main__":
    # Configure hypotheses to use Euclidean space properties if desired.
    nace.hypothesis.Hypothesis_UseMovementOpAssumptions(
        nace.world_module.left,
        nace.world_module.right,
        nace.world_module.up,
        nace.world_module.down,
        nace.world_module.drop,
        "DisableOpSymmetryAssumption" in sys.argv,
    )

    nace.world_module.set_traversable_board_value(' ')

    # Set the mapping of the movement actions; the rest are expected to be learnt.
    # (These could be learnt from watching gym actions together with the current
    # and previous worlds.)
    nace.world_module.set_full_action_list(
        [nace.world_module.up, nace.world_module.right, nace.world_module.down,
         nace.world_module.left])

    view_dist_x = 3
    view_dist_y = 2
    num_time_steps = 300

    print(
        """
        (1) Food collecting +1 for food (f)
        (2) cup on table challenge
        (3) doors and keys +1 for battery (b) max score==2
        (4) food collecting with moving object
        (5) pong
        (6) bring eggs to chicken
        (7) soccer +1 per goal
        (8) shock world
        (9) interactive world """)
    _challenge = input()

    if _challenge == "1":
        view_dist_x = 3
        view_dist_y = 2
    if _challenge == "2":
        nace.world_module.World_objective = nace.world_module.World_CupIsOnTable
        num_time_steps = 1000
    if _challenge == "6":
        nace.world_module.set_full_action_list(
            [nace.world_module.up, nace.world_module.right, nace.world_module.down,
             nace.world_module.left, nace.world_module.pick,
             ])

    external_world_nace_format, _, __, ___ = nace.world_module.build_initial_world_object(
        _challenge=_challenge,
        unobserved_code="."
    )
    external_npworld = nace.world_module_numpy.NPWorld(
        with_observed_time=False,
        name="external_npworld",
        view_dist_x=100,
        view_dist_y=100)

    agent_xy_loc, modified_count, _pre_action_world = external_npworld.update_world_from_ground_truth_nace_format(
        external_world_nace_format[nace.world_module.BOARD])  # pass in only the board
    external_npworld.multiworld_print([{"World": external_npworld}])

    global_agent = nace.agent_module.Agent(agent_xy_loc, 0, [])
    stepper = nace.stepper_v4.StepperV4()
    status = {"score": {"v": 0}}
    last_score = 0.0
    print_workings = True

    for time_counter in range(num_time_steps):
        action, behaviour = stepper.get_next_action(
            None,
            agent_xy_loc,
            print_debug_info=print_workings,
            available_actions=nace.world_module.get_full_action_list(),
            view_dist_x=view_dist_x,
            view_dist_y=view_dist_y
        )
        print("About to enact action ", action, behaviour)

        agent_xy_loc, external_world_nace_format, _ = nace.world_module._act(
            agent_xy_loc,
            external_world_nace_format,
            action,
            inject_key=None,
            external_reward_for_last_action=None)

        # Copy state from the nace format into the NPWorld format.
        new_xy_loc, ____, _____ = external_npworld.update_world_from_ground_truth_nace_format(
            external_world_nace_format[nace.world_module.BOARD])  # pass in only the board

        # Let the stepper update its internal world state.
        stepper.set_world_ground_truth_state(external_npworld, new_xy_loc, time_counter)

        # Let the stepper pick up the latest agent state.
        status = stepper.set_agent_ground_truth_state(
            xy_loc=agent_xy_loc,
            score=external_world_nace_format[nace.world_module.VALUES][0],
            values_exc_score=external_world_nace_format[nace.world_module.VALUES][1:]
        )

        if status["score"]["v"] > last_score:
            print("Status:", status, "on task", _challenge, "time", time_counter)
        last_score = status["score"]["v"]  # place a breakpoint here to observe when the score increases

        stepper.predict_and_observe(print_out_world_and_plan=print_workings)

    print("Status:", status, "on task", _challenge, "time", time_counter)
Data Structures
Rule Object:
A rule is a pair: (Action_Value_Precondition, Prediction).

Action_Value_Precondition: (action, values_excl_score, precondition0, precondition1, ...)
    action            - the action taken
    values_excl_score - the agent's state values, excluding the score
    preconditionN     - a (y, x, board_value) cell of the old world that must match,
                        relative to the focus point
Prediction: (y, x, board_value, state_value_deltas)
    y, x               - where the predicted board value will appear, relative to the focus point
    board_value        - the board value predicted to appear there
    state_value_deltas - deltas to the state values, e.g. (score_delta, key_delta)

For example:
((left, (0,), (0, 0, ' '), (0, 1, 'x'), (0, 2, 'u')), (0, 0, 'x', (0, 0))),
((right, (0,), (0, -1, 'x'), (0, 0, 'o')), (0, 0, 'o', (0, 0))),
The following Action_Value_Precondition:
(right, (0,), (0, -1, 'x'), (0, 0, 'o'))
can be read as: match if there is an 'o' at the focus point, an 'x' one cell to the left of it, and the action is right.
The following Action_Value_Precondition, Prediction pair:
((left, (0,), (0, 0, ' '), (0, 1, 'x'), (0, 2, 'u')), (0, 0, 'x', (0, 0))),
can be read as: match if there is a ' ' at the focus point,
an 'x' to the right of it,
a 'u' to the right of the 'x',
and the action is left.
The prediction after the action is:
the 'x' will appear at (0, 0) relative to the focus point,
and there is no change to the score.
The following Action_Value_Precondition, Prediction pair:
((right, (0,), (0, -1, 'x'), (0, 0, 'f')), (0, 0, 'x', (1, 0))),
can be read as: match if there is an 'f' at the focus point,
an 'x' to the left of it,
and the action is right.
The prediction after the action is:
the 'x' will appear at (0, 0) relative to the focus point,
the first state value delta (score) will be +1,
and the second state value delta (key) will be +0.
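To make the matching semantics concrete, here is a minimal sketch of how such a rule could be matched against a board and its prediction applied. The helper names (match_rule, apply_rule) and the board representation (a dict from (y, x) to a character) are illustrative assumptions, not the package's actual API:

# Hypothetical sketch only: match_rule/apply_rule and the dict-based board are
# illustrative, not part of the nace package API.
def match_rule(rule, action, board, focus_yx):
    """Return True if the rule's preconditions hold at focus_yx for this action."""
    (rule_action, _values_excl_score, *preconditions), _prediction = rule
    if action != rule_action:
        return False
    fy, fx = focus_yx
    # Each (y, x, board_value) precondition is an offset relative to the focus point.
    return all(board.get((fy + dy, fx + dx)) == value
               for (dy, dx, value) in preconditions)

def apply_rule(rule, board, focus_yx, state_values):
    """Write the predicted board value and apply the state value deltas."""
    _precondition, (dy, dx, value, deltas) = rule
    fy, fx = focus_yx
    board[(fy + dy, fx + dx)] = value
    return tuple(v + d for v, d in zip(state_values, deltas))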
Rule_Evidence Object Dictionary:
Maps each rule to a tuple of (positive evidence counter, negative evidence counter):

{ ((right, ... )): (1, 0) }
{ ((left, (), (0, 0, ' '), (0, 1, 'x')), (0, 0, 'x', (0,))): (1, 0) }
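A sketch of how these counters might be maintained as predictions are confirmed or refuted (update_evidence is an illustrative name, not the package's API):

def update_evidence(rule_evidence, rule, prediction_was_correct):
    """Increment the positive or negative evidence counter for a rule."""
    positive, negative = rule_evidence.get(rule, (0, 0))
    if prediction_was_correct:
        positive += 1
    else:
        negative += 1
    rule_evidence[rule] = (positive, negative)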
The positive and negative evidence counts can be used to calculate:
Frequency = positive_count / (positive_count + negative_count)
Confidence = (positive_count + negative_count) / (positive_count + negative_count + 1)
Truth_expectation = confidence * (frequency - 0.5) + 0.5
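As a runnable illustration of these formulas (the function name and the zero-evidence convention are assumptions; the formulas themselves are as given above):

def truth_values(evidence):
    """Compute frequency, confidence and truth expectation from a (positive, negative) tuple."""
    positive, negative = evidence
    total = positive + negative
    if total == 0:
        return 0.5, 0.0, 0.5  # assumed convention: no evidence means maximal uncertainty
    frequency = positive / total
    confidence = total / (total + 1)
    truth_expectation = confidence * (frequency - 0.5) + 0.5
    return frequency, confidence, truth_expectation

# For example, with one positive and zero negative observations:
# truth_values((1, 0)) -> (1.0, 0.5, 0.75)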
Location:
xy_loc is an (x, y) tuple; note that (0, 0) is the top left.
State Values:
A tuple of agent values whose first element is the score; the remaining elements (e.g. a key count) are passed as values_exc_score to set_agent_ground_truth_state in the example above.