generatePolicyFunction

Create function that evaluates trained policy of reinforcement learning agent

Description

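generatePolicyFunction(agent) creates a function that evaluates the learned policy of the specified agent using default function, policy, and data file names. After generating the policy evaluation function, you can, where code generation is supported, generate code for it (for example, using MATLAB Coder). For more information, see Deploy Trained Reinforcement Learning Policies.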

generatePolicyFunction(agent,Name,Value) specifies the function, policy, and data file names using one or more name-value pair arguments.

Examples

Create and train a reinforcement learning agent. For this example, load the PG agent trained in Train PG Agent to Balance Cart-Pole System.

load('MATLABCartpolePG.mat','agent')

Create a policy evaluation function for this agent using default names.

generatePolicyFunction(agent)

This command creates the evaluatePolicy.m file, which contains the policy function, and the agentData.mat file, which contains the trained deep neural network actor.

To view the generated function, type:

type evaluatePolicy.m

For a given observation, the policy function evaluates a probability for each potential action using the actor network. Then, the policy function randomly selects an action based on these probabilities.
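
The sampling step described above can be illustrated with a minimal sketch in base MATLAB. This is not the code that generatePolicyFunction emits. The action values and the probability vector are assumed placeholders that stand in for the actor network output.

```matlab
% Illustrative sketch only, not the generated evaluatePolicy.m code.
% Assumed action set and assumed actor output (probabilities sum to 1).
actions = [-10 10];    % hypothetical discrete actions
probs   = [0.3 0.7];   % hypothetical stand-in for the actor network output

rng(0);                % fix the seed so the sketch is reproducible
edges = cumsum(probs); % cumulative distribution over the actions
edges(end) = 1;        % guard against floating-point round-off in the last bin
u = rand;              % uniform random draw in (0,1)

% Select the first action whose cumulative probability covers the draw.
idx = find(u <= edges, 1, 'first');
action = actions(idx);
```

Under these assumptions, repeated draws select each action with a frequency that approaches its probability.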

Since the actor network for this PG agent has a single input layer and single output layer, you can generate code for this network using the Deep Learning Toolbox™ code generation functionality. For more information, see Deploy Trained Reinforcement Learning Policies.

Create and train a reinforcement learning agent. For this example, load the Q-Learning agent trained in Train Reinforcement Learning Agent in Basic Grid World.

load('basicGWQAgent.mat','qAgent')

Create a policy evaluation function for this agent, specifying the name of the agent data file.

generatePolicyFunction(qAgent,'MATFileName',"policyFile.mat")

This command creates the evaluatePolicy.m file, which contains the policy function, and the policyFile.mat file, which contains the trained Q table value function.

To view the generated function, type:

type evaluatePolicy.m

For a given observation, the policy function looks up the value function for each potential action using the Q table. Then, the policy function selects the action for which the value function is greatest.
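
As a rough illustration of this lookup, the following base MATLAB sketch selects a greedy action from a small Q table. The table contents, the action labels, and the observation index are all assumed values for illustration. They are not taken from the grid world agent or from the generated file.

```matlab
% Illustrative sketch only, not the generated evaluatePolicy.m code.
% Hypothetical Q table: one row per state, one column per action.
Q = [0.1 0.5;
     0.9 0.2;
     0.4 0.4];
actions = [1 2];       % assumed action labels
observation = 2;       % assumed state index

% Find the column with the largest Q value in the row for this state.
[~, idx] = max(Q(observation, :));
action = actions(idx);
```

When several actions share the largest value, max returns the first of them, so this sketch breaks ties in favor of the lowest action index.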

You can generate code for this policy function using MATLAB Coder. For more information, see Deploy Trained Reinforcement Learning Policies.

Create and train a reinforcement learning agent. For this example, load the DQN agent trained in Train DQN Agent to Balance Cart-Pole System.

load('MATLABCartpoleDQN.mat','agent')

Create a policy evaluation function for this agent, specifying the name of the generated function.

generatePolicyFunction(agent,'FunctionName',"computeAction")

This command creates the computeAction.m file, which contains the policy function, and the agentData.mat file, which contains the trained deep neural network critic.

To view the generated function, type:

type computeAction.m

For a given observation, the policy function evaluates the observation-action value function for each potential discrete action, using the critic network. Then, the policy function selects the action with the largest predicted value.
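
The selection logic can be sketched in base MATLAB as follows. The critic here is a hypothetical anonymous function that takes an observation and an action, standing in for the trained two-input critic network. The action set and the observation are also assumed values.

```matlab
% Illustrative sketch only, not the generated computeAction.m code.
% Hypothetical critic: maps an (observation, action) pair to a scalar value.
critic = @(obs, act) -(act - sum(obs)).^2;

actions = [-10 10];           % assumed discrete action set
observation = [1; 0; -1; 2];  % assumed observation vector

% Evaluate the critic once per candidate action.
values = arrayfun(@(a) critic(observation, a), actions);

% Select the action with the largest predicted value.
[~, idx] = max(values);
action = actions(idx);
```

Unlike a table lookup, this approach calls the critic once for every candidate action, so the cost of evaluating the policy grows with the number of discrete actions.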

The Deep Learning Toolbox code generation functionality supports only networks with a single input layer. Therefore, code generation is not supported for computeAction.m, since the critic in a DQN agent has two input layers, one for the observation and one for the action.

Input Arguments

Trained reinforcement learning agent, specified as an agent object, such as the PG and Q-learning agents loaded in the examples above.

Since Deep Learning Toolbox code generation and prediction functionality do not support deep neural networks with more than one input layer, generatePolicyFunction does not support the following agent configurations:

  • DQN agent with deep neural network critic representations.

  • Any agent with deep neural network actor or critic representations with multiple observation input layers.

To train your agent, use the train function.

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'FunctionName',"computeAction"

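Name of the generated function, specified as the name-value pair consisting of 'FunctionName' and a string or character vector. If you do not specify this argument, the function is named evaluatePolicy and is saved in evaluatePolicy.m.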

Name of the policy variable within the generated function, specified as the name-value pair consisting of 'PolicyName' and a string or character vector.

Name of the agent data file, specified as the name-value pair consisting of 'MATFileName' and a string or character vector. If you do not specify this argument, the file is named agentData.mat.

Introduced in R2019a