Creating and Training Reinforcement Learning Agents Interactively
Design, train, and simulate reinforcement learning agents using a visual interactive workflow in the Reinforcement Learning Designer app. Use the app to set up a reinforcement learning problem in Reinforcement Learning Toolbox™ without writing MATLAB® code. Work through the entire reinforcement learning workflow to:
- Import an existing environment into the app
- Import an agent or create a new one for your environment, and select appropriate hyperparameters for it
- Use the default neural network architectures created by Reinforcement Learning Toolbox or import custom architectures
- Train the agent on a single worker or on multiple workers, and simulate the trained agent against the environment
- Analyze simulation results and refine agent parameters
- Export the final agent to the MATLAB workspace for further use and deployment
Published: 31 Jan 2021
As of the R2021a release of MATLAB, Reinforcement Learning Toolbox lets you interactively design, train, and simulate RL agents with the new Reinforcement Learning Designer app. You can open the app from the command line or from the MATLAB toolstrip. First, you need to create the environment object that your agent will train against. Reinforcement Learning Designer lets you import environment objects from the MATLAB workspace, select from several predefined environments, or create your own custom environment. For this example, let's create a predefined cart-pole MATLAB environment with a discrete action space, and also import a custom Simulink environment of a four-legged robot with a continuous action space from the MATLAB workspace. You can delete or rename environment objects from the Environments pane as needed, and you can view the dimensions of the observation and action spaces in the Preview pane.

To create an agent, click New in the Agent section on the Reinforcement Learning tab. Depending on the selected environment and the nature of the observation and action spaces, the app shows a list of compatible built-in training algorithms. For this demo, we will pick the DQN algorithm. The app generates a DQN agent with a default critic architecture. You can adjust some of the default values for the critic as needed before creating the agent. The new agent appears in the Agents pane, and the Agent Editor shows a summary view of the agent and the hyperparameters that can be tuned. For example, let's change the agent's sample time and the critic's learn rate. Here, we can also adjust the agent's exploration strategy and see how exploration will progress with respect to the number of training steps.

To view the critic's default network, click View Critic Model on the DQN Agent tab. The Deep Learning Network Analyzer opens and displays the critic structure. You can change the critic neural network by importing a different critic network from the workspace. You can also import a different set of agent options or a different critic representation object altogether.

Click Train to specify training options such as stopping criteria for the agent. Here, let's set the maximum number of episodes to 1000 and leave the rest at their default values. To parallelize training, click the Use Parallel button. Parallelization options include additional settings such as the type of data workers send back, whether data is sent synchronously, and more. After setting the training options, you can generate a MATLAB script with the specified settings to use outside the app if needed. To start training, click Train. During the training process, the app opens the Training Session tab and displays the training progress. If a visualization of the environment is available, you can also view how the environment responds during training. You can stop training at any time and choose to accept or discard the training results. Accepted results show up under the Results pane, and a new trained agent also appears under Agents.

To simulate an agent, go to the Simulate tab and select the appropriate agent and environment object from the drop-down list. For this task, let's import a pretrained agent for the four-legged robot environment we imported at the beginning. Double-click the agent object to open the Agent Editor. You can see that this is a DDPG agent that takes in 44 continuous observations and outputs 8 continuous torques.
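As an aside, the cart-pole DQN workflow described above can also be reproduced at the MATLAB command line (the app itself opens with the `reinforcementLearningDesigner` command). The following is a minimal sketch using Reinforcement Learning Toolbox functions; the specific option values (sample time, learn rate, epsilon decay) are illustrative assumptions, not the values used in the video, and the learn-rate syntax shown is from newer releases.

```matlab
% Create the predefined cart-pole environment with a discrete action space
env = rlPredefinedEnv("CartPole-Discrete");
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);

% Create a DQN agent with a default critic architecture
agent = rlDQNAgent(obsInfo, actInfo);

% Tune hyperparameters; these values are illustrative, not those from the video
agent.AgentOptions.SampleTime = 0.02;
agent.AgentOptions.EpsilonGreedyExploration.EpsilonDecay = 1e-3;

% Critic learn rate via optimizer options (R2022a+ syntax; earlier releases
% set LearnRate through rlRepresentationOptions when building the critic)
agent.AgentOptions.CriticOptimizerOptions.LearnRate = 1e-3;

% Train for at most 1000 episodes; set UseParallel to true to train on workers
trainOpts = rlTrainingOptions("MaxEpisodes", 1000, "UseParallel", false);
trainingStats = train(agent, env, trainOpts);
```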
In the Simulate tab, select the desired number of simulations and the simulation length. If you need to run a large number of simulations, you can run them in parallel. After clicking Simulate, the app opens the Simulation Session tab. If available, you can view the visualization of the environment at this stage as well. When the simulations are complete, you can see the reward for each simulation as well as the reward mean and standard deviation. Remember that the reward signal is provided as part of the environment. To analyze the simulation results, click Inspect Simulation Data. In the Simulation Data Inspector, you can view the saved signals for each simulation episode. If you want to keep the simulation results, click Accept.

When you finish your work, you can export any of the agents shown under the Agents pane. For convenience, you can also directly export the underlying actor or critic representations, actor or critic neural networks, and agent options. To save the app session for future use, click Save Session on the Reinforcement Learning tab. For more information, please refer to the Reinforcement Learning Toolbox documentation.
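Simulation and export can likewise be scripted outside the app. Here is a minimal sketch, assuming a trained agent and environment already in the workspace; the step count, simulation count, and file name are hypothetical.

```matlab
% Simulate the trained agent against the environment; set UseParallel to true
% in rlSimulationOptions to run large batches of simulations on workers
simOpts = rlSimulationOptions("MaxSteps", 500, "NumSimulations", 10);
experiences = sim(env, agent, simOpts);

% Total reward per episode (the reward signal is defined by the environment)
totalReward = arrayfun(@(e) sum(e.Reward.Data), experiences);
fprintf("Mean reward: %.2f (std: %.2f)\n", mean(totalReward), std(totalReward));

% After exporting an agent from the app to the MATLAB workspace,
% it can be saved to a MAT-file for later use or deployment
save("trainedAgent.mat", "agent");
```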
Featured Product
Reinforcement Learning Toolbox