setCritic
Set critic of reinforcement learning agent
Description
Examples
Modify Critic Parameter Values
Assume that you have an existing trained reinforcement learning agent. For this example, load the trained agent from Compare DDPG Agent to LQR Controller.
load("DoubleIntegDDPG.mat","agent")
Obtain the critic function approximator from the agent.
critic = getCritic(agent);
Obtain the learnable parameters from the critic.
params = getLearnableParameters(critic)
params=2×1 cell array
{[-5.0182 -1.5718 -0.3493 -0.1067 -0.0540 -0.0029]}
{[ 0]}
Modify the parameter values. For this example, simply multiply all of the parameters by 2.
modifiedParams = cellfun(@(x) x*2,params,"UniformOutput",false);
Set the parameter values of the critic to the new modified values.
critic = setLearnableParameters(critic,modifiedParams);
Set the critic in the agent to the new modified critic.
setCritic(agent,critic);
Display the new parameter values.
getLearnableParameters(getCritic(agent))
ans=2×1 cell array
{[-10.0364 -3.1436 -0.6987 -0.2135 -0.1080 -0.0059]}
{[ 0]}
Modify Deep Neural Networks in Reinforcement Learning Agent
Create an environment with a continuous action space and obtain its observation and action specifications. For this example, load the environment used in the example Compare DDPG Agent to LQR Controller.
Load the predefined environment.
env = rlPredefinedEnv("DoubleIntegrator-Continuous");
Obtain observation and action specifications.
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);
Create a PPO agent from the environment observation and action specifications. This agent uses default deep neural networks for its actor and critic.
agent = rlPPOAgent(obsInfo,actInfo);
To modify the deep neural networks within a reinforcement learning agent, you must first extract the actor and critic function approximators.
actor = getActor(agent);
critic = getCritic(agent);
Extract the deep neural networks from both the actor and critic function approximators.
actorNet = getModel(actor);
criticNet = getModel(critic);
Plot the actor network.
plot(actorNet)
To validate a network, use analyzeNetwork. For example, validate the critic network.
analyzeNetwork(criticNet)
You can modify the actor and critic networks and save them back to the agent. To modify the networks, you can use the Deep Network Designer app. To open the app for each network, use the following commands.
deepNetworkDesigner(criticNet)
deepNetworkDesigner(actorNet)
In Deep Network Designer, modify the networks. For example, you can add additional layers to your network. When you modify the networks, do not change the input and output layers of the networks returned by getModel. For more information on building networks, see Build Networks with Deep Network Designer.
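One way to confirm that the input and output layers are unchanged is to compare the input and output names of the original and edited networks. The following is a minimal sketch, assuming getModel returns a dlnetwork object; editedNet is a hypothetical stand-in for a network you edited, and here it is simply a copy of the original. The check compares names only, not layer sizes.

```matlab
% Create a default PPO agent and extract its critic network.
env = rlPredefinedEnv("DoubleIntegrator-Continuous");
agent = rlPPOAgent(getObservationInfo(env),getActionInfo(env));
criticNet = getModel(getCritic(agent));

% Hypothetical edited network. For illustration, use an unmodified copy.
editedNet = criticNet;

% Compare the input and output names of the two networks.
sameInputs = isequal(criticNet.InputNames,editedNet.InputNames);
sameOutputs = isequal(criticNet.OutputNames,editedNet.OutputNames);
assert(sameInputs && sameOutputs,"Input or output layers were changed.")
```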
To validate the modified network in Deep Network Designer, you must click on Analyze, under the Analysis section. To export the modified network structures to the MATLAB® workspace, generate code for creating the new networks and run this code from the command line. Do not use the exporting option in Deep Network Designer. For an example that shows how to generate and run code, see Create DQN Agent Using Deep Network Designer and Train Using Image Observations.
For this example, the code for creating the modified actor and critic networks is in the createModifiedNetworks helper script.
createModifiedNetworks
Each of the modified networks includes an additional fullyConnectedLayer and reluLayer in its main common path. Plot the modified actor network.
plot(modifiedActorNet)
After exporting the networks, insert the networks into the actor and critic function approximators.
actor = setModel(actor,modifiedActorNet);
critic = setModel(critic,modifiedCriticNet);
Finally, insert the modified actor and critic function approximators into the agent.
agent = setActor(agent,actor);
agent = setCritic(agent,critic);
Input Arguments
agent — Reinforcement learning agent
reinforcement learning agent object
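Reinforcement learning agent that contains a critic, specified as one of the following objects:
- rlQAgent
- rlSARSAAgent
- rlDQNAgent
- rlPGAgent (when using a critic to estimate a baseline value function)
- rlDDPGAgent
- rlTD3Agent
- rlACAgent
- rlSACAgent
- rlPPOAgent
- rlTRPOAgent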
Note
agent is a handle object. Therefore, setCritic updates it whether or not agent is returned as an output argument. For more information about handle objects, see Handle Object Behavior.
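The handle behavior described in the note can be sketched as follows. This is an illustrative example, assuming a default PPO agent for the predefined double integrator environment; the factor of 2 is an arbitrary change chosen only to make the update visible.

```matlab
% Create a default PPO agent for a predefined environment.
env = rlPredefinedEnv("DoubleIntegrator-Continuous");
agent = rlPPOAgent(getObservationInfo(env),getActionInfo(env));

% Extract the critic and scale its learnable parameters by 2.
critic = getCritic(agent);
before = getLearnableParameters(critic);
scaled = cellfun(@(p) 2*p,before,"UniformOutput",false);
critic = setLearnableParameters(critic,scaled);

% Call setCritic without an output argument.
setCritic(agent,critic);

% Read the parameters back from the same agent variable.
after = getLearnableParameters(getCritic(agent));
updatedInPlace = isequal(after,scaled);
```

If updatedInPlace is true, the agent variable reflects the new critic even though setCritic was called without an output argument.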
critic — Critic
rlValueFunction object | rlQValueFunction object | rlVectorQValueFunction object | two-element row vector of rlQValueFunction objects
Critic object, specified as one of the following:
- rlValueFunction object: Use when agent is an rlACAgent, rlPGAgent, or rlPPOAgent object.
- rlQValueFunction object: Use when agent is an rlQAgent, rlSARSAAgent, rlDQNAgent, rlDDPGAgent, or rlTD3Agent object with a single critic.
- rlVectorQValueFunction object: Use when agent is an rlQAgent, rlSARSAAgent, or rlDQNAgent object with a discrete action space and a vector Q-value function critic.
- Two-element row vector of rlQValueFunction objects: Use when agent is an rlTD3Agent or rlSACAgent object with two critics.
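To see which critic type an existing agent uses before building a replacement, you can inspect the object returned by getCritic. The following is a minimal sketch, assuming a default PPO agent, for which the critic is expected to be an rlValueFunction object.

```matlab
% Create a default PPO agent for a predefined environment.
env = rlPredefinedEnv("DoubleIntegrator-Continuous");
agent = rlPPOAgent(getObservationInfo(env),getActionInfo(env));

% Query the class of the critic currently stored in the agent.
critic = getCritic(agent);
criticClass = class(critic);
isValueFcn = isa(critic,"rlValueFunction");
disp(criticClass)
```

A replacement critic passed to setCritic should be of the same type as the one reported here.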
Output Arguments
agent — Updated reinforcement learning agent
rlQAgent | rlSARSAAgent | rlDQNAgent | rlPGAgent | rlDDPGAgent | rlTD3Agent | rlACAgent | rlSACAgent | rlPPOAgent | rlTRPOAgent
Updated agent, returned as an agent object. Note that agent is a handle object. Therefore, its critic is updated by setCritic whether or not agent is returned as an output argument. For more information about handle objects, see Handle Object Behavior.
Version History
Introduced in R2019a