How can I plot extra RL training data while using parallel computing?

Question

0 votes

I'm working on an RL project using the Reinforcement Learning Designer, the PPO agent, and a custom environment. To help with diagnostics during training, I set up some extra plots that track statistics related to my agent's performance (separate from the RL Designer's built-in plot that shows reward, average reward, and q0). These plots are intended to capture per-episode statistics about the final state of the environment as well as important moves made during the episode. The data is saved to environment properties inside the step() function during an episode, and then written to the plots at the end of the episode in the reset() function.

For example,

classdef custEnv < rl.env.MATLABEnvironment
    properties
        numKeyMoves = 0;
        numTotalMoves = 0;
        episodeNum = 0;
        Figure1
    end
    
    methods
        
        function [Observation, Reward, IsDone, LoggedSignals] = step(this, Action)
            % (environment dynamics and output assignments omitted)
            if 1 % placeholder: true if the move was a key move
                this.numKeyMoves = this.numKeyMoves + 1;
            end
            this.numTotalMoves = this.numTotalMoves + 1;
        end
        
        function InitialObservation = reset(this)
            % (initial observation assignment omitted)
            ha = gca(this.Figure1);
            hold(ha, 'on'); % keep points from earlier episodes
            % plot the fraction of key moves against episode num to track
            % progress (x = episode, y = fraction; marker so single points show)
            plot(ha, this.episodeNum, this.numKeyMoves / this.numTotalMoves, 'o')
            this.numKeyMoves = 0;
            this.numTotalMoves = 0;
            this.episodeNum = this.episodeNum + 1;
        end
        
        function plot(this)
            % create the diagnostics figure used by reset()
            this.Figure1 = figure('Visible', 'on', 'HandleVisibility', 'off');
        end
        
    end
    
end

This has worked well to date, but now I'd like to use parallel computing to speed up future training sessions. However, it seems that the parallel workers aren't interacting with the step() and reset() functions in the same way, so the plots I've created are never populated with any data once parallel training is enabled. Is there a way to recreate the functionality above with parallel training, or any other stable way to similarly track environment properties as episodes progress?


Answer 1

Shadaab Siddiqie on 4 Aug 2021

0 votes

From my understanding, you are not getting the correct plot when using parallel computing. I have heard that this issue is known, and the concerned parties may be investigating further.
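In the meantime, one general-purpose mechanism that might be worth trying is parallel.pool.DataQueue, which lets workers send data back to the client, where the plotting can happen. The following is only an untested sketch under assumptions: it needs Parallel Computing Toolbox, it uses a plain parfor loop as a stand-in for training episodes, and the move counts are made-up placeholder values rather than anything from your environment.

```matlab
% Sketch: stream per-episode statistics from workers to the client.
% Assumption: Parallel Computing Toolbox is installed.
q = parallel.pool.DataQueue;            % queue created on the client

fig = figure('Visible', 'on');
ax = axes(fig);
hold(ax, 'on');                         % keep one point per episode
xlabel(ax, 'Episode');
ylabel(ax, 'Fraction of key moves');

% Runs on the client each time a worker sends an [episode, fraction] pair
afterEach(q, @(d) plot(ax, d(1), d(2), 'o'));

parfor ep = 1:20
    % Stand-in for one episode; the counts here are placeholders
    numTotalMoves = 50;
    numKeyMoves = randi([0 numTotalMoves]);
    send(q, [ep, numKeyMoves / numTotalMoves]);
end
```

In the custom environment, the queue could be stored in a property and send() called from reset() in place of plot(). Whether the environment copies used by parallel training keep a usable queue is an assumption I have not verified, and each worker would keep its own episode counter, so the x-axis values would need some thought.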
