Class _ExperiencePool

An experience pool for agents.

This class memorizes the experiences of an agent and returns them when required, taking account of a possible history.

Inherits from: ArcadesComponent

Author: Alexis BRENON <alexis.brenon@imag.fr>
Data Types

Dump
    Serializable dump of an _ExperiencePool.

InitArguments
    Arguments expected by the constructor.

Pool
    Tables used to record interactions/experiences.
    Fields:
    - max_size (number): Maximum number of saved experiences
    - last_index (number): Index of the last saved experience
    - states ({number,...}): The hashes of the recorded states
    - terminals ({number,...}): Is the state terminal (1) or not (0)
    - actions ({number,...}): Actions executed
    - rewards ({number,...}): Rewards received
Fields

- _convert_tensor (function): Function to convert tensors if necessary. This function must be called to convert tensors/networks to the appropriate format (CUDA or default Tensor type) to avoid computation errors caused by inconsistent types.
- hashed_states (integer): The number of elements in the states hash table.
- hasher (hash.XXH64): Hasher object used to compute state hashes.
- history_length (number): Number of states in a full historic state.
- history_offsets (table): Offsets of the states to add to the last one when fetching a full historic state.
- history_spacing (number): Parameter of the history_type function. See also: _compute_history_offsets
- history_stacked_state_size ({number,...}): Size of a full historic state.
- history_type (string): Type of the function used to compute the indexes of the historic state ('linear' or 'exp'). See also: _compute_history_offsets
- nil_state (torch.Tensor): A nil state (full of 0) used in history.
- pool (Pool): Actual pool.
- pushed_pools ({Pool,...}): Stack of pools saved by push and restored by pop.
- states ({[number]=torch.Tensor}): Hash table associating a hash (number) to a Tensor representing a state.
  Usage: s = self.states[self.hasher:hash(s)]
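The states field keeps each distinct state Tensor only once, keyed by its hash. A minimal Python sketch of that lookup, using the built-in hash() as a hypothetical stand-in for the hash.XXH64 hasher:

```python
def store_state(states, hasher, s):
    """Store state s under its hash; identical states share one entry."""
    h = hasher(s)
    states[h] = s
    return h

def load_state(states, h):
    """Recover the full state from its recorded hash."""
    return states[h]
```

The pool's states/terminals arrays then only need to hold numbers (the hashes), which keeps the circular buffer small.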
Metamethods

__init ( args, dump )
    Default constructor.
    Parameters:
    - args (InitArguments)
    - dump (Dump)
Public Methods

clear ()
    Clear the current pool (forget everything experienced so far).
    Returns:
    - self
dump ( [cycles={}] )
    Dump the current state of the _ExperiencePool.
    Tensors are converted to CPU tensors if necessary.
    Overrides: ArcadesComponent:dump
    Parameters:
    - cycles (table): Already dumped components (default {})
    Returns:
    - table
get_action ( [index=1] )
    Return the recorded action at the given index.

get_reward ( [index=1] )
    Return the recorded reward at the given index.

get_state ( [index=1] )
    Return a full historic state.
    Parameters:
    - index (number): Index of the state to get (1 is the last recorded state) (default 1)
    Returns:
    - torch.Tensor: A full historic state, history stacked on the first dimension
    - boolean: Is the returned state terminal?
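A full historic state is the requested state plus the past states selected by history_offsets, padded with nil_state when an offset reaches before the first record. A hedged Python sketch (lists stand in for torch Tensors, and the function name is hypothetical):

```python
def get_historic_state(states, last_index, history_offsets, nil_state):
    """Assemble a full historic state: the state at last_index plus the
    states at each backward offset; offsets reaching before the first
    record fall back to nil_state (the all-zero padding state)."""
    frames = [states[last_index]]
    for off in history_offsets:
        i = last_index - off
        frames.append(states[i] if i >= 0 else nil_state)
    return frames  # torch would stack these on the first dimension
```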
get_terminal ( [index=1] )
    Return whether the state at the given index is terminal.

pop ()
    Restore the pool previously saved with push.

push ()
    Push the current pool.
    This allows you to save the current pool and restore it later using pop.
    Returns:
    - self
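push and pop behave like a snapshot stack over the current pool (the pushed_pools field). A minimal Python sketch, with a hypothetical PoolStack class and a plain dict standing in for the Pool tables:

```python
import copy

class PoolStack:
    """Sketch of push/pop: push snapshots the current pool onto
    pushed_pools; pop restores the most recent snapshot."""
    def __init__(self, pool):
        self.pool = pool
        self.pushed_pools = []

    def push(self):
        self.pushed_pools.append(copy.deepcopy(self.pool))
        return self

    def pop(self):
        self.pool = self.pushed_pools.pop()
        return self
```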
record_action ( a )
    Record an action in the pool.
    This function is intended to be called after record_state, to record the action executed in the last recorded state.
    Parameters:
    - a (number): The action index
    Returns:
    - self
record_reward ( r )
    Record a reward in the pool.
    This function is intended to be called after record_state and record_action, to record the reward received for the last action executed in the last state.
    Parameters:
    - r (number): The reward
    Returns:
    - self
record_state ( s, t )
    Record a state in the pool.
    Parameters:
    - s (torch.Tensor): The state
    - t (boolean): Is the state terminal?
    Returns:
    - self
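The record_state / record_action / record_reward call order writes one interaction into the circular Pool arrays, with last_index wrapping at max_size. A toy Python sketch of that mechanism (class and field names mirror the doc but the implementation is an assumption):

```python
class MiniPool:
    """Toy circular pool: record_state advances last_index (wrapping at
    max_size); record_action/record_reward attach to the last state."""
    def __init__(self, max_size):
        self.max_size = max_size
        self.states = [None] * max_size
        self.terminals = [0] * max_size
        self.actions = [None] * max_size
        self.rewards = [None] * max_size
        self.last_index = -1

    def record_state(self, s, terminal=False):
        self.last_index = (self.last_index + 1) % self.max_size
        self.states[self.last_index] = s
        self.terminals[self.last_index] = 1 if terminal else 0
        return self

    def record_action(self, a):
        self.actions[self.last_index] = a  # action taken in the last state
        return self

    def record_reward(self, r):
        self.rewards[self.last_index] = r  # reward for the last action
        return self
```

Returning self allows the chained usage pool:record_state(s):record_action(a):record_reward(r) that the Lua API suggests.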
sample ( [batch_size=1] )
    Return samples from the experience pool.
    Returned samples never start with a terminal state.
    Parameters:
    - batch_size (integer): Number of samples to return (default 1)
    Returns:
    - torch.Tensor: batch_size states
    - torch.Tensor: batch_size actions
    - torch.Tensor: batch_size rewards
    - torch.Tensor: batch_size final states
    - torch.Tensor: batch_size final-state terminal signals
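The no-terminal-start guarantee can be obtained by rejection sampling over the terminals array (as _get_sampling_index below suggests). A hedged Python sketch of that idea:

```python
import random

def get_sampling_index(terminals, rng=random):
    """Rejection-sample an index whose state is non-terminal
    (terminals[i] == 0), so a sampled transition never starts
    from a terminal state."""
    while True:
        i = rng.randrange(len(terminals))
        if terminals[i] == 0:
            return i
```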
size ()
    Get the size of the experience pool.
    Todo: Check which size is actually returned (current, max?)
    Returns:
    - number: self.pool.max_size
Private Methods

_clean_states ()
    Remove useless states from the states hash map to reduce memory size.
    Returns:
    - self
_compute_history_offsets ()
    Fill in history_offsets.
    This function computes the offsets of the states to retrieve to build a full historic state. The offsets depend on history_type and history_spacing:
    - if history_type is 'linear': offset[i] = history_spacing * i, ∀ i ∈ [1, history_length-1]
    - if history_type is 'exp': offset[i] = history_spacing ^ i, ∀ i ∈ [1, history_length-1]
    Returns:
    - self
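The two offset formulas translate directly to a short Python sketch (a hypothetical free function rather than the Lua method):

```python
def compute_history_offsets(history_type, history_spacing, history_length):
    """Backward offsets of the past states making up a full historic state:
    linear: offset[i] = spacing * i;  exp: offset[i] = spacing ^ i,
    for i in [1, history_length - 1]."""
    if history_type == 'linear':
        return [history_spacing * i for i in range(1, history_length)]
    if history_type == 'exp':
        return [history_spacing ** i for i in range(1, history_length)]
    raise ValueError("unknown history_type: " + history_type)
```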
_convert_states ( states, f )
    Convert states according to the given function f.
    This is used to convert the states to GPU or CPU Tensors.
    Parameters:
    - states ({[number]=torch.Tensor}): States hash map
    - f (function): Function used to convert Tensors
    Returns:
    - {[number]=torch.Tensor}: A new hash map with converted Tensors
_get_sampling_index ()
    Return an index where to sample a non-terminal state.
    Returns:
    - number: Index of the sample in the pool
_shift_index ( index )
    Shift a given index according to last_index.
    This function is used to manage a circular memory. history_offsets are computed for a 0-indexed array, while the Pool arrays are circular and use last_index to point to the initial element. A small computation is therefore needed to shift the offsets dynamically.
    Parameters:
    - index (number): 0-based index
    Returns:
    - number: A last_index-based index
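The shift amounts to anchoring the backward index at last_index and wrapping modulo max_size. A 0-based Python sketch of that computation (the Lua pool is 1-based, so this is an analogue rather than a transcription):

```python
def shift_index(index, last_index, max_size):
    """Map a 0-based backward index (0 = most recent entry) onto the
    circular pool array anchored at last_index."""
    return (last_index - index) % max_size
```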