getValue
Obtain estimated value from a critic given environment observations and actions
Since R2020a
Syntax
value = getValue(valueFcnAppx,obs)
value = getValue(vqValueFcnAppx,obs)
value = getValue(qValueFcnAppx,obs,act)
[value,state] = getValue(___)
___ = getValue(___,UseForward=useForward)
Description
Value Function Critic
value = getValue(valueFcnAppx,obs) evaluates the value function critic valueFcnAppx and returns the value corresponding to the observation obs. In this case, valueFcnAppx is an rlValueFunction approximator object.
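For instance, a minimal sketch (the observation dimension, network architecture, and layer sizes are illustrative assumptions, not part of this page):

% Assumed setup: a 4-dimensional continuous observation space and a
% small value network with a scalar output.
obsInfo = rlNumericSpec([4 1]);
net = dlnetwork([
    featureInputLayer(4)
    fullyConnectedLayer(16)
    reluLayer
    fullyConnectedLayer(1)]);
valueFcnAppx = rlValueFunction(net,obsInfo);

% Observations are passed as a cell array, one cell per observation
% channel; value is the scalar state-value estimate.
value = getValue(valueFcnAppx,{rand(4,1)})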
Q-Value Function Critics
value = getValue(vqValueFcnAppx,obs) evaluates the discrete-action-space Q-value function critic vqValueFcnAppx and returns the vector value, in which each element represents the estimated value given the state corresponding to the observation obs and the action corresponding to the element number of value. In this case, vqValueFcnAppx is an rlVectorQValueFunction approximator object.
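A comparable sketch for the discrete-action case (the observation dimension, action set, and network are again assumptions):

% Assumed setup: a 4-dimensional observation and three discrete
% actions; the network has one output element per possible action.
obsInfo = rlNumericSpec([4 1]);
actInfo = rlFiniteSetSpec([-1 0 1]);
net = dlnetwork([
    featureInputLayer(4)
    fullyConnectedLayer(16)
    reluLayer
    fullyConnectedLayer(3)]);
vqValueFcnAppx = rlVectorQValueFunction(net,obsInfo,actInfo);

% value is a 3-element vector; value(k) estimates the Q-value of the
% k-th action in actInfo for the given observation.
value = getValue(vqValueFcnAppx,{rand(4,1)})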
value = getValue(qValueFcnAppx,obs,act) evaluates the Q-value function critic qValueFcnAppx and returns the scalar value, representing the value given the observation obs and action act. In this case, qValueFcnAppx is an rlQValueFunction approximator object.
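A sketch for the single-output case follows; the two-path network and all dimensions are illustrative assumptions:

% Assumed setup: 4-dimensional observation and 2-dimensional
% continuous action, processed by separate input paths that are
% joined by an addition layer into a scalar Q-value output.
obsInfo = rlNumericSpec([4 1]);
actInfo = rlNumericSpec([2 1]);
lg = layerGraph([
    featureInputLayer(4,Name="obsIn")
    fullyConnectedLayer(16,Name="obsFC")]);
lg = addLayers(lg,[
    featureInputLayer(2,Name="actIn")
    fullyConnectedLayer(16,Name="actFC")]);
lg = addLayers(lg,[
    additionLayer(2,Name="add")
    reluLayer
    fullyConnectedLayer(1)]);
lg = connectLayers(lg,"obsFC","add/in1");
lg = connectLayers(lg,"actFC","add/in2");
net = dlnetwork(lg);
qValueFcnAppx = rlQValueFunction(net,obsInfo,actInfo, ...
    ObservationInputNames="obsIn",ActionInputNames="actIn");

% Observations and actions are each passed as cell arrays, one cell
% per channel; value is the scalar Q-value estimate.
value = getValue(qValueFcnAppx,{rand(4,1)},{rand(2,1)})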
Return Recurrent Neural Network State
[value,state] = getValue(___) also returns the updated state of the critic object when it contains a recurrent neural network.
Use Forward
___ = getValue(___,UseForward=useForward) allows you to explicitly call a forward pass when computing gradients.
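For instance, a sketch reusing the valueFcnAppx critic from above (whether state is nonempty depends on the network actually containing recurrent layers, which is an assumption here):

% For a critic containing a recurrent neural network, getValue also
% returns the updated network state.
[value,state] = getValue(valueFcnAppx,{rand(4,1)});

% Explicitly request a forward pass, which updates layer states such
% as those of recurrent layers, when computing gradients.
value = getValue(valueFcnAppx,{rand(4,1)},UseForward=true);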
Examples
Input Arguments
Output Arguments
Tips
The more general function evaluate behaves, for critic objects, similarly to getValue except that evaluate returns results inside a single-cell array.

When the elements of the cell array in inData are dlarray objects, the elements of the cell array returned in outData are also dlarray objects. This allows getValue to be used with automatic differentiation.

Specifically, you can write a custom loss function that directly uses getValue and dlgradient within it, and then use dlfeval and dlaccelerate with your custom loss function. For an example, see Train Reinforcement Learning Policy Using Custom Training Loop and Custom Training Loop with Simulink Action Noise.
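A minimal sketch of that pattern (the loss definition, target values, and batch sizes are illustrative assumptions; the cited examples show the complete, supported workflow):

% The observation batch must contain dlarray data so that getValue
% returns traced dlarray values suitable for dlgradient.
obs = {dlarray(rand(4,10),"CB")};     % assumed batch of 10 observations
targets = dlarray(rand(1,10),"CB");   % assumed target values
accelLoss = dlaccelerate(@criticLoss);
[loss,grads] = dlfeval(accelLoss,valueFcnAppx,obs,targets);

% Loss function that calls getValue and dlgradient directly.
function [loss,gradients] = criticLoss(critic,obs,targets)
    value = getValue(critic,obs,UseForward=true);
    loss = mse(value,targets);
    gradients = dlgradient(loss,getLearnableParameters(critic));
end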
Version History
Introduced in R2020a