Arcades

Class _ExperiencePool

An experience pool for agents.

This class memorizes the experiences of an agent and returns them on request, taking into account a possible history.

See also:
  • Dump

    Serializable dump of an _ExperiencePool.

  • InitArguments

    Table used as arguments for the ExperiencePool constructor.

    Fields:
    • number pool_size
      Size of the experience pool
    • table state_size
      Size of the states {d, w, h}
    • number history_length
      Length of the history for inference
    • string history_type
      Type of history
    • number history_spacing
      Spacing in history
  • Pool

    Tables used to record interactions/experiences.

  • function

    _convert_tensor

    Function to convert tensors if necessary.

    This function must be called to convert tensors/networks to the appropriate format (CUDA or the default Tensor type) to avoid computation errors caused by inconsistent types.

  • integer

    hashed_states

    The number of elements in the hash table.

  • hash.XXH64

    hasher

    Hasher object used to compute state hashes.

  • number

    history_length

    Number of states in a full historic state.

  • table

    history_offsets

    Offsets of the states to add to the last one when fetching a full historic state.

  • number

    history_spacing

    Parameter of the history_type function.

  • {number,...}

    history_stacked_state_size

    Size of a full historic state.

  • string

    history_type

    Name of the function used to compute the indexes of the historic state.

  • torch.Tensor

    nil_state

    A nil state (filled with zeros) used in history.

  • Pool

    pool

    Actual pool.

  • {Pool,...}

    pushed_pools

    Pushed pools that can be restored by successive calls to pop.

  • table

    states

    Hash table associating a hash (double) with a Tensor representing a state.

    Usage:
    • s = self.states[self.hasher:hash(s)]
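The deduplication idea behind this hash table can be sketched in Python; `StateStore` is an illustrative stand-in for the pool's state table, and Python's built-in `hash()` replaces `hash.XXH64` purely for illustration:

```python
# Illustrative sketch: states are stored once, keyed by their hash, and
# records only keep the hash. Overlapping history windows thus share the
# same stored state instead of duplicating it.
class StateStore:
    def __init__(self):
        self.states = {}          # hash -> state
        self.hashed_states = 0    # number of elements in the hash table

    def hash(self, state):
        # stand-in for hash.XXH64
        return hash(tuple(state))

    def add(self, state):
        h = self.hash(state)
        if h not in self.states:
            self.states[h] = state
            self.hashed_states += 1
        return h

    def get(self, h):
        return self.states[h]

store = StateStore()
h1 = store.add([0.0, 1.0])
h2 = store.add([0.0, 1.0])   # duplicate state is stored only once
```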
  • __init ( args, dump )

    Default constructor.

    Parameters:
    • InitArguments args
      Constructor arguments (see InitArguments)
    • Dump dump
      Dump to restore the pool from (see Dump)

Public Methods

  • clear ()

    Clear the current pool (forget all recorded experiences).

    Returns:
    • self
  • dump ( [cycles={}] )

    Dump current state of _ExperiencePool.

    Tensors are converted to CPU tensors if necessary.

    Parameters:
    • table cycles
      Already dumped components (default {})
    Returns:
  • get_action ( [index=1] )

    Get the action of a record.

    Parameters:
    • number index
      Index of the record to look for (1 is last recorded state) (default 1)
    Returns:
    • number Action executed/recorded
  • get_reward ( [index=1] )

    Get the reward of a record.

    Parameters:
    • number index
      Index of the record to look for (1 is last recorded state) (default 1)
    Returns:
    • number Reward obtained/recorded
  • get_state ( [index=1] )

    Return a full historic state.

    Parameters:
    • number index
      Index of the state to get (1 is last recorded state) (default 1)
    Returns:
    • torch.Tensor A full historic state, history stacked on first dimension
    • boolean Is the returned state terminal?
  • get_terminal ( [index=1] )

    Get the terminal signal of a record.

    Parameters:
    • number index
      Index of the record to look for (1 is last recorded state) (default 1)
    Returns:
    • boolean Is recorded state terminal?
  • pop ()

    Restore a saved pool.

    Restore a pool (if any) saved by a previous push call.

    Returns:
    • self
  • push ()

    Push the current pool.

    This allows you to save the current pool and restore it later using pop.

    Returns:
    • self
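The push/pop mechanism can be sketched in Python; `PoolStack` is an illustrative name, and the assumption here (saving a deep copy on push while the current pool stays in use) is one plausible reading of the documented behaviour, not the actual implementation:

```python
import copy

# Illustrative sketch of push/pop semantics: push saves a snapshot of the
# current pool onto pushed_pools; pop restores the most recent snapshot.
class PoolStack:
    def __init__(self):
        self.pool = []          # current pool
        self.pushed_pools = []  # snapshots saved by push, restored by pop

    def push(self):
        self.pushed_pools.append(copy.deepcopy(self.pool))
        return self

    def pop(self):
        if self.pushed_pools:
            self.pool = self.pushed_pools.pop()
        return self

p = PoolStack()
p.pool.append("experience-1")
p.push()                         # save current pool
p.pool.append("experience-2")
p.pop()                          # restore: "experience-2" is forgotten
```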
  • record_action ( a )

    Record an action in the pool.

    This function is intended to be called after record_state to record the action executed for the last recorded state.

    Parameters:
    • number a
      Action to record
    Returns:
    • self
  • record_reward ( r )

    Record a reward in the pool.

    This function is intended to be called after record_state and record_action to record the received reward for last action executed in last state.

    Parameters:
    • number r
      Reward to record
    Returns:
    • self
  • record_state ( s, t )

    Record a state in the pool.

    Parameters:
    • torch.Tensor s
      State to record
    • boolean t
      Is this state terminal?
    Returns:
    • self
  • sample ( [batch_size=1] )

    Return samples from the experience pool.

    Samples returned don't start with terminal states.

    Parameters:
    • integer batch_size
      Number of samples to return (default 1)
    Returns:
  • size ()

    Get the size of the experience pool.

    Todo:
    • Check which size is actually returned (current, max?)
    Returns:

Private Methods

  • _clean_states ()

    Remove useless states from states hash map to reduce memory size.

    Returns:
    • self
  • _compute_history_offsets ()

    Fill in history_offsets.

    This function will compute the offsets of the states to retrieve to build a full historic state. The offsets depend on the history_type and the history_spacing.

    if history_type is 'linear' :
    offset[i] = history_spacing * i, ∀ i ∈ [1, history_length-1]

    if history_type is 'exp' :
    offset[i] = history_spacing ^ i, ∀ i ∈ [1, history_length-1]

    Returns:
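The two formulas above can be sketched in Python; including offset 0 for the most recent state is an assumption made for illustration:

```python
# Sketch of the offset computation for 'linear' and 'exp' history types,
# following the formulas in the documentation:
#   linear: offset[i] = history_spacing * i
#   exp:    offset[i] = history_spacing ^ i
# for i in [1, history_length-1]; offset 0 (the last state itself) is
# prepended here as an assumption.
def compute_history_offsets(history_length, history_type, history_spacing):
    offsets = [0]
    for i in range(1, history_length):
        if history_type == 'linear':
            offsets.append(history_spacing * i)
        elif history_type == 'exp':
            offsets.append(history_spacing ** i)
        else:
            raise ValueError("unknown history_type: " + history_type)
    return offsets

compute_history_offsets(4, 'linear', 2)  # [0, 2, 4, 6]
compute_history_offsets(4, 'exp', 2)     # [0, 2, 4, 8]
```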
  • _convert_states ( states, f )

    Convert states according to the given function f.

    This is used to convert the states to GPU or CPU Tensors.

    Parameters:
    • table states
      States to convert
    • function f
      Conversion function applied to each state
    Returns:
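A minimal sketch of this conversion, with the states table modelled as a Python dict and a plain function standing in for the CPU/GPU tensor conversion:

```python
# Apply a conversion function f to every state in the hash table,
# replacing each stored state with its converted version.
def convert_states(states, f):
    for h, s in states.items():
        states[h] = f(s)
    return states

states = {1: [0, 1], 2: [2, 3]}
convert_states(states, lambda s: [2 * x for x in s])
# states is now {1: [0, 2], 2: [4, 6]}
```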
  • _get_sampling_index ()

    Return an index where to sample a non-terminal state.

    Returns:
    • number Index of the sample in the pool
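One plausible way to implement this, sketched in Python, is rejection sampling over the pool (an assumption for illustration, not necessarily the actual mechanism):

```python
import random

# Draw random indexes until one whose record is non-terminal is found.
# terminals[i] is True when record i is a terminal state.
def get_sampling_index(terminals, rng=random):
    while True:
        i = rng.randrange(len(terminals))
        if not terminals[i]:
            return i

terminals = [False, True, False, False]
i = get_sampling_index(terminals)
assert not terminals[i]
```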
  • _shift_index ( index )

    Shift a given index according to last_index.

    This function is used to manage a circular memory. history_offsets are computed according to a 0-indexed array, whereas pool arrays are circular and use a last_index to point to the initial element. A small computation is therefore needed to shift the offsets dynamically.

    Parameters:
    • number index
      Index to shift
    Returns:
    • number A last_index based index
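The shift can be sketched in Python; the mapping below (offset 0 lands on last_index, larger offsets walk backwards in time and wrap around a 1-indexed circular buffer, matching Lua's indexing) is an assumed convention for illustration:

```python
# Map a 0-indexed history offset onto a 1-indexed circular buffer whose
# most recent element sits at last_index; older elements wrap around.
def shift_index(offset, last_index, pool_size):
    return (last_index - 1 - offset) % pool_size + 1

shift_index(0, 3, 5)  # 3 (the last recorded element)
shift_index(3, 3, 5)  # 5 (wraps around the circular buffer)
```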