Why can't my DQN agent output the optimal solution during validation?
Hello everyone,
Topic: Reinforcement Learning, DQN Agent.
I trained an agent on my dataset (28 training samples in total) and then validated it on the same data. The problem is that validation does not give the optimal result for every sample: some results are good, but not all of them.
- Environment: I created a custom environment.
- I create the critic with this function: critic = rlVectorQValueFunction(nn,obsInfo,actInfo);
- I create the DQN agent from the critic: agent = rlDQNAgent(critic);
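For readers who want to reproduce the setup, the two calls above could be sketched as follows. The observation size, the action set, and the network layout are hypothetical placeholders, since the post does not show them; only rlVectorQValueFunction and rlDQNAgent come from the original.

```matlab
% Hypothetical specs (not from the post): 4-element observation, 3 discrete actions
obsInfo = rlNumericSpec([4 1]);
actInfo = rlFiniteSetSpec([1 2 3]);

% Hypothetical network: a vector Q-value critic needs one output per action
layers = [
    featureInputLayer(prod(obsInfo.Dimension))
    fullyConnectedLayer(64)
    reluLayer
    fullyConnectedLayer(numel(actInfo.Elements))
    ];
nn = dlnetwork(layers);

% The two calls from the post
critic = rlVectorQValueFunction(nn,obsInfo,actInfo);
agent  = rlDQNAgent(critic);
```

The output layer size has to match the number of elements in actInfo; a mismatch here is one thing worth checking when results look wrong.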
I also tried a new agent with only one sample. Training converged, and validation gave the right answer for that sample. But when I trained an agent on all 28 samples with the same hyperparameters, correctness was not guaranteed. I don't know the reason. Is the dataset too small, or did I choose the wrong hyperparameters?
Hyperparameters of the agent:
agent.AgentOptions.EpsilonGreedyExploration.EpsilonDecay = 0.9;
agent.AgentOptions.EpsilonGreedyExploration.Epsilon = 0.9;
agent.AgentOptions.EpsilonGreedyExploration.EpsilonMin = 0.001;
agent.AgentOptions.DiscountFactor = 0.99;
agent.AgentOptions.MiniBatchSize = 128;
agent.AgentOptions.CriticOptimizerOptions.LearnRate = 0.0008;
agent.AgentOptions.CriticOptimizerOptions.GradientThreshold = 1;
agent.AgentOptions.SaveExperienceBufferWithAgent=true;
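One thing that may be worth checking in the settings above is how quickly exploration dies out. The sketch below assumes the update rule described in the rlDQNAgentOptions documentation, Epsilon = Epsilon*(1-EpsilonDecay) applied at each training step while Epsilon is above EpsilonMin. The second decay value (0.005) is only an illustrative comparison and is not from the post.

```matlab
% Values from the post
epsilon0   = 0.9;     % initial Epsilon
epsilonMin = 0.001;   % EpsilonMin

% First entry is from the post; second is a hypothetical comparison value
decays = [0.9 0.005];

for k = 1:numel(decays)
    epsilon = epsilon0;
    steps = 0;
    % Assumed rule: Epsilon <- Epsilon*(1 - EpsilonDecay) while above EpsilonMin
    while epsilon > epsilonMin
        epsilon = epsilon*(1 - decays(k));
        steps = steps + 1;
    end
    stepsToMin(k) = steps; %#ok<SAGROW>
    fprintf('EpsilonDecay = %g: at or below EpsilonMin after %d steps\n', ...
        decays(k), steps);
end
```

Under that assumption, EpsilonDecay = 0.9 brings Epsilon from 0.9 down to EpsilonMin in 3 training steps, so the agent acts almost greedily for nearly all of training. That could explain why a single sample is learned while 28 are not, but it is a hypothesis to test, not a confirmed diagnosis.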
Thank you
Kun
2 comments
Emmanouil Tzorakoleftherakis
on 13 Jun 2023
Are you using an IsDone signal? What do you mean by 28 training data? Do you mean 28 episodes? If that's the case, this number is really small. You need to at least give it a few hundred episodes to get an idea of how training progresses.
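As a rough illustration of the "few hundred episodes" advice, training options could be set up along these lines. All numeric values are hypothetical placeholders, and the environment variable env and the agent variable agent are assumed to exist already, so the train and sim calls are left commented out.

```matlab
% Hypothetical values; tune them to the actual environment
trainOpts = rlTrainingOptions( ...
    'MaxEpisodes',500, ...                     % a few hundred episodes, not 28
    'MaxStepsPerEpisode',200, ...              % upper bound if IsDone never fires
    'ScoreAveragingWindowLength',20, ...       % window for the average reward
    'StopTrainingCriteria','AverageReward', ...
    'StopTrainingValue',100);                  % placeholder reward target

% Assumes env (custom environment) and agent (DQN agent) are already defined:
% trainStats = train(agent,env,trainOpts);

% Validation run after training, with a hypothetical step limit:
% simOpts    = rlSimulationOptions('MaxSteps',200);
% experience = sim(env,agent,simOpts);
```

If the custom environment's step function never sets IsDone to true, each episode runs until MaxStepsPerEpisode, which also changes what the agent learns, so the two points raised in the comment are related.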