Specify Training Options in Reinforcement Learning Designer

To configure the training of an agent in the Reinforcement Learning Designer app, specify training options on the Train tab.

The Train tab, showing example training options.

Specify Basic Options

On the Train tab, you can specify the following basic training options.

Option: Description
Max Episodes: Maximum number of episodes to train the agent, specified as a positive integer.
Max Episode Length: Maximum number of steps to run per episode, specified as a positive integer.
Stopping Criteria

Training termination condition, specified as one of the following values.

  • AverageSteps — Stop training when the running average number of steps per episode equals or exceeds the critical value specified by Stopping Value.

  • AverageReward — Stop training when the running average reward equals or exceeds the critical value.

  • EpisodeReward — Stop training when the reward in the current episode equals or exceeds the critical value.

  • GlobalStepCount — Stop training when the total number of steps in all episodes (the total number of times the agent is invoked) equals or exceeds the critical value.

  • EpisodeCount — Stop training when the number of training episodes equals or exceeds the critical value.

Stopping Value: Critical value of the training termination condition in Stopping Criteria, specified as a scalar.
Average Window Length: Window length for averaging the scores, rewards, and number of steps for the agent. This value is used when either Stopping Criteria or Save agent criteria specifies an averaging condition.
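
The stopping criteria above can be illustrated with a short sketch. This is not the app's implementation; it is a minimal Python illustration of the comparisons described in the table. The function names `stopping_metric` and `should_stop` are hypothetical, and the sketch assumes that the running average covers the most recent Average Window Length episodes (or all finished episodes when fewer are available).

```python
def stopping_metric(criteria, episode_rewards, episode_steps, window_length):
    """Compute the quantity that is compared against Stopping Value.

    episode_rewards and episode_steps hold one entry per finished episode.
    """
    if window_length < 1:
        raise ValueError("window_length must be a positive integer")
    # Assumption: the running average uses the last `window_length` episodes,
    # or every finished episode when fewer than that are available.
    recent_rewards = episode_rewards[-window_length:]
    recent_steps = episode_steps[-window_length:]

    if criteria == "AverageSteps":
        return sum(recent_steps) / len(recent_steps)
    if criteria == "AverageReward":
        return sum(recent_rewards) / len(recent_rewards)
    if criteria == "EpisodeReward":
        return episode_rewards[-1]       # reward of the current episode
    if criteria == "GlobalStepCount":
        return sum(episode_steps)        # total steps in all episodes
    if criteria == "EpisodeCount":
        return len(episode_rewards)      # number of training episodes
    raise ValueError(f"unknown stopping criteria: {criteria!r}")


def should_stop(criteria, stopping_value, episode_rewards, episode_steps,
                window_length):
    """Return True when the metric equals or exceeds the stopping value."""
    if not episode_rewards:
        return False                     # nothing to evaluate yet
    metric = stopping_metric(criteria, episode_rewards, episode_steps,
                             window_length)
    return metric >= stopping_value
```

In this sketch, Max Episodes and Max Episode Length would bound the outer training loop and the per-episode step loop, while `should_stop` would be checked once after each episode.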

Specify Additional Options

To specify additional training options, on the Train tab, click More Options.

In the More Training Options dialog box, you can specify the following options.

Option: Description
Save agent criteria

Condition for saving agents during training, specified as one of the following values.

  • none — Do not save any agents during training.

  • AverageSteps — Save the agent when the running average number of steps per episode equals or exceeds the critical value specified by Save agent value.

  • AverageReward — Save the agent when the running average reward equals or exceeds the critical value.

  • EpisodeReward — Save the agent when the reward in the current episode equals or exceeds the critical value.

  • GlobalStepCount — Save the agent when the total number of steps in all episodes (the total number of times the agent is invoked) equals or exceeds the critical value.

  • EpisodeCount — Save the agent when the number of training episodes equals or exceeds the critical value.

Save agent value: Critical value of the save agent condition in Save agent criteria, specified as a scalar or "none".
Save directory

Folder for saved agents. If you specify a name and the folder does not exist, the app creates the folder in the current working directory.

To interactively select a folder, click Browse.

Show verbose output: Select this option to display training progress at the command line.
Stop on Error: Select this option to stop training when an error occurs during an episode.
Training plot

Option to graphically display the training progress in the app, specified as one of the following values.

  • training-progress — Show training progress

  • none — Do not show training progress
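
The save-agent behavior described above can be sketched as follows. This is a hedged Python illustration, not the app's code. The function names `resolve_save_directory` and `should_save` are hypothetical, and the sketch assumes that a value of "none" for either Save agent criteria or Save agent value disables saving.

```python
from pathlib import Path


def resolve_save_directory(name, working_dir=None):
    """Return the folder for saved agents, creating it if it does not exist.

    A relative name is resolved against the current working directory,
    or against `working_dir` when one is given (used here for testing).
    """
    base = Path(working_dir) if working_dir is not None else Path.cwd()
    folder = Path(name)
    if not folder.is_absolute():
        folder = base / folder
    folder.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    return folder


def should_save(criteria, save_value, metric):
    """Return True when the agent should be saved after the current episode.

    `metric` is the quantity selected by `criteria`, for example the
    running average reward when criteria is "AverageReward".
    """
    # Assumption: "none" in either field means no agents are saved.
    if criteria == "none" or save_value == "none":
        return False
    return metric >= save_value
```

For example, `should_save("AverageReward", 100, 120)` is true in this sketch, and the agent would then be written to the folder returned by `resolve_save_directory`.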

Specify Parallel Training Options

To train your agent using parallel computing, on the Train tab, click Use Parallel (the parallel computing icon). Training agents using parallel computing requires Parallel Computing Toolbox™ software. For more information, see Train Agents Using Parallel Computing and GPUs.

To specify options for parallel training, select Use Parallel > Parallel training options.

Parallel training options dialog box.

In the Parallel Training Options dialog box, you can specify the following training options.

Option: Description
Parallel computing mode

Parallel computing mode, specified as one of the following values.

  • sync — Use parpool to run synchronous training on the available workers. The parallel pool client (the process that starts the training) updates the parameters of its actor and critic, based on the results from all the workers, and sends the updated parameters to all workers. In this case, workers must pause execution until all workers are finished, and as a result the training only advances as fast as the slowest worker allows.

  • async — Use parpool to run asynchronous training on the available workers. In this case, workers send their data back to the client as soon as they finish and receive updated parameters from the client. The workers then continue with their task.

Transfer workspace variables to workers

Select this option to send model and workspace variables to parallel workers. When you select this option, the parallel pool client (the process that starts the training) sends variables used in models and defined in the MATLAB® workspace to the workers.

Random seed for workers

Randomizer initialization for workers, specified as one of the following values.

  • –1 — Assign a unique random seed to each worker. The value of the seed is the worker ID.

  • –2 — Do not assign a random seed to the workers.

  • Vector — Manually specify the random seed for each worker. The number of elements in the vector must match the number of workers.

Files to attach to parallel pool: Additional files to attach to the parallel pool. Specify names of files in the current working directory, with one name on each line.
Worker setup function: Function to run before training starts, specified as a handle to a function having no input arguments. This function is run once per worker before training begins. Write this function to perform any processing that you need prior to training.
Worker cleanup function: Function to run after training ends, specified as a handle to a function having no input arguments. You can write this function to clean up the workspace or perform other processing after training terminates.
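
The practical difference between the sync and async modes described above can be seen in a toy timing model. The Python sketch below is an illustration only: the function names are hypothetical, each worker is assumed to run the same number of tasks, and communication and parameter-update time are ignored.

```python
def sync_training_time(durations):
    """Total time when all workers wait for the slowest one in every round.

    durations[w][k] is the time worker w needs for its k-th task.
    """
    # zip(*durations) groups the k-th task of every worker into one round;
    # a round ends only when its slowest worker has finished.
    return sum(max(round_times) for round_times in zip(*durations))


def async_training_time(durations):
    """Total time when each worker continues as soon as it finishes a task."""
    # Workers never wait for each other, so the run ends when the worker
    # with the largest total workload is done.
    return max(sum(worker_times) for worker_times in durations)


# Two workers, two tasks each (arbitrary time units).
example = [[1, 3], [2, 1]]
sync_total = sync_training_time(example)    # rounds take 2 and 3 units
async_total = async_training_time(example)  # workers need 4 and 3 units
```

Under these assumptions the synchronous total can never be smaller than the asynchronous one, which matches the statement that synchronous training only advances as fast as the slowest worker allows.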

The following figure shows an example parallel training configuration for the following files and functions.

  • Data file attached to the parallel pool — workerData.mat

  • Worker setup function — mySetup.m

  • Worker cleanup function — myCleanup.m

Parallel training options dialog showing file and function information.
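
The example configuration above, together with the random seed option from the table, can be sketched in Python. This is a minimal illustration under stated assumptions, not the toolbox API: `ParallelTrainingOptions`, `resolve_worker_seeds`, and `run_parallel_training` are hypothetical names, worker IDs are assumed to start at 1, workers are simulated sequentially, and the cleanup function is assumed to run once per worker like the setup function.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence, Union


@dataclass
class ParallelTrainingOptions:
    """Hypothetical container mirroring the dialog box fields."""
    mode: str = "sync"                            # "sync" or "async"
    transfer_workspace_variables: bool = True
    worker_seeds: Union[int, Sequence[int]] = -1  # -1, -2, or a vector
    attached_files: List[str] = field(default_factory=list)
    setup_fcn: Optional[Callable[[], None]] = None
    cleanup_fcn: Optional[Callable[[], None]] = None


def resolve_worker_seeds(spec, num_workers):
    """Map each worker ID to its seed; None means no seed is assigned."""
    worker_ids = list(range(1, num_workers + 1))  # assumption: IDs start at 1
    if isinstance(spec, int):
        if spec == -1:
            return {w: w for w in worker_ids}     # seed is the worker ID
        if spec == -2:
            return {w: None for w in worker_ids}  # leave workers unseeded
        raise ValueError("scalar seed option must be -1 or -2")
    seeds = list(spec)
    if len(seeds) != num_workers:
        raise ValueError("seed vector length must match the number of workers")
    return dict(zip(worker_ids, seeds))


def run_parallel_training(options, num_workers, train_worker):
    """Run setup, training, and cleanup in the order described above."""
    if options.mode not in ("sync", "async"):
        raise ValueError("mode must be 'sync' or 'async'")
    seeds = resolve_worker_seeds(options.worker_seeds, num_workers)
    if options.setup_fcn is not None:
        for _ in seeds:                           # once per worker
            options.setup_fcn()
    try:
        return {w: train_worker(w, seed) for w, seed in seeds.items()}
    finally:
        if options.cleanup_fcn is not None:
            for _ in seeds:                       # assumed once per worker
                options.cleanup_fcn()


# Stand-ins for mySetup.m and myCleanup.m that only record being called.
calls = []
options = ParallelTrainingOptions(
    mode="async",
    attached_files=["workerData.mat"],
    setup_fcn=lambda: calls.append("mySetup"),
    cleanup_fcn=lambda: calls.append("myCleanup"),
)
results = run_parallel_training(options, 2, lambda worker_id, seed: seed)
```

With the default seed option of -1, each simulated worker receives its own ID as its seed, and the recorded calls show every setup call preceding every cleanup call.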
