I see a zero mean reward for the first agent in multi-agent RL Toolbox
Hello, I have extended the MATLAB PPO coverage path planning example to 5 agents. I can see that the first agent always receives a reward, but the toolbox always shows a zero mean reward for it, as in the following image, which is not correct. Do you have any idea what is happening there?

Answers (1)
TARUN
22 Apr 2025
I understand that you are experiencing an issue with the reward for the first agent in your multi-agent PPO setup.
Here are a few things you can check to resolve the issue:
- Reward Function: Inspect your environment's step function. Ensure that the reward vector (or structure) includes a non-zero value for the first agent (“rlPPOAgent”).
- Agent Configuration: Make sure “rlPPOAgent” is correctly associated with its environment and policy.
- Environment Setup: Double-check the environment setup to make sure all agents are interacting with it as intended.
- Training Parameters: Review the training parameters specific to the first agent, such as the learning rate and discount factor.
These are some checks that might help you fix the problem. If they do not, please provide the code you are working with so that I can take a deeper look.
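As a rough illustration of the reward-function check above, the following minimal sketch flags agents whose logged reward is zero at every step. It uses only base MATLAB functions and no Reinforcement Learning Toolbox calls. The variable `rewardLog` and its values are hypothetical: it assumes you have recorded the per-step rewards from your own environment into a matrix with one row per time step and one column per agent.

```matlab
% Hypothetical log of per-step rewards (made-up values for illustration):
% one row per time step, one column per agent (5 agents here).
rewardLog = [0 1.0 0.5 0   2.0;
             0 0   1.5 1.0 0;
             0 2.0 0   0.5 1.0];

% Mean reward per agent over the logged steps (1-by-5 row vector).
meanReward = mean(rewardLog, 1);

% Indices of agents whose logged reward is zero at every step.
zeroAgents = find(all(rewardLog == 0, 1));

fprintf('Mean reward per agent: %s\n', mat2str(meanReward, 3));
fprintf('Agents with all-zero reward: %s\n', mat2str(zeroAgents));
```

If the first column of such a log is non-zero while the training plot still reports a zero mean for the first agent, the mismatch may lie in how the agents are ordered or connected to the environment rather than in the reward function itself.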
Feel free to refer to this documentation on “Agents”: