MDP robot grid-world example

Applies value iteration to learn a policy for a robot in a grid world.

Aaron T. Becker's Robot Swarm Lab

Version 1.0.0.0 (7.72 KB)

813 Downloads

24 Nov 2015

Applies value iteration to learn a policy for a Markov Decision Process (MDP): a robot in a grid world.
Each cell of the world is either free space (0) or an obstacle (1). Each turn the robot can move in one of 8 directions or stay in place. A reward function gives one free cell, the goal location, a high reward. Every other free cell carries a small penalty, and obstacles carry a large negative reward. Value iteration is used to learn an optimal 'policy', a function that assigns a control input to every possible location.
Video: https://youtu.be/gThGerajccM
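
The description above can be illustrated with a minimal Python sketch of value iteration for the deterministic case. This is not the MATLAB code in the submission. The function name `value_iteration`, the discount factor, the reward magnitudes, the treatment of the goal as a terminal cell, and the rule that a move off the grid leaves the robot in place are all assumptions chosen for illustration.

```python
import numpy as np

# Eight moves ordered clockwise in 45-degree steps, then "stay" (index 8),
# written as (row, col) offsets.
MOVES = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (0, 0)]


def value_iteration(world, goal, gamma=0.95, r_goal=100.0, r_free=-1.0,
                    r_obstacle=-100.0, tol=1e-6, max_iters=1000):
    """Deterministic value iteration on a 0/1 grid (all constants are assumed)."""
    world = np.asarray(world)
    rows, cols = world.shape

    # Reward per cell: large negative for obstacles, small penalty for
    # free cells, high reward for the single goal cell.
    reward = np.where(world == 1, r_obstacle, r_free).astype(float)
    reward[goal] = r_goal

    value = np.zeros((rows, cols))
    policy = np.full((rows, cols), len(MOVES) - 1)  # default action: stay

    for _ in range(max_iters):
        new_value = np.empty_like(value)
        for r in range(rows):
            for c in range(cols):
                if (r, c) == goal:
                    # Assumption: the goal is terminal, so its value is its reward.
                    new_value[r, c] = reward[r, c]
                    continue
                best_value, best_action = -np.inf, len(MOVES) - 1
                for a, (dr, dc) in enumerate(MOVES):
                    nr, nc = r + dr, c + dc
                    if not (0 <= nr < rows and 0 <= nc < cols):
                        nr, nc = r, c  # assumption: leaving the grid means staying put
                    if value[nr, nc] > best_value:
                        best_value, best_action = value[nr, nc], a
                # Bellman backup: own reward plus discounted best successor value.
                new_value[r, c] = reward[r, c] + gamma * best_value
                policy[r, c] = best_action
        converged = np.max(np.abs(new_value - value)) < tol
        value = new_value
        if converged:
            break
    return value, policy
```

For a 3x3 world with an obstacle in the center and the goal at `(1, 2)`, calling `value_iteration([[0, 0, 0], [0, 1, 0], [0, 0, 0]], goal=(1, 2))` returns a value array and a policy array of indices into `MOVES`. Under these assumed rewards, the policy at `(1, 0)` steps diagonally around the obstacle rather than through it.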

This function compares a deterministic robot, which always executes movements perfectly, with a stochastic robot, which has a small probability of moving ±45 degrees from the commanded direction. The optimal policy for the stochastic robot avoids narrow passages and tries to move to the center of corridors.
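
A rough Python sketch of the stochastic motion model described above follows. It is not taken from the submission. The slip probability `p_slip`, the helper names `outcome_distribution` and `expected_next_value`, the assumption that "stay" never slips, and the assumption that a move off the grid leaves the robot in place are all hypothetical. Setting `p_slip=0` recovers the deterministic robot.

```python
import numpy as np

# Eight moves ordered clockwise in 45-degree steps, then "stay" (index 8),
# written as (row, col) offsets.
MOVES = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (0, 0)]
STAY = 8


def outcome_distribution(action, p_slip=0.1):
    """Return [(executed_move_index, probability), ...] for a commanded action."""
    if action == STAY:
        return [(STAY, 1.0)]  # assumption: staying in place never slips
    return [
        (action, 1.0 - 2.0 * p_slip),   # commanded direction
        ((action - 1) % 8, p_slip),     # 45 degrees counterclockwise
        ((action + 1) % 8, p_slip),     # 45 degrees clockwise
    ]


def expected_next_value(value, cell, action, p_slip=0.1):
    """Expected successor value of commanding `action` from `cell`."""
    rows, cols = value.shape
    r, c = cell
    total = 0.0
    for move, prob in outcome_distribution(action, p_slip):
        nr, nc = r + MOVES[move][0], c + MOVES[move][1]
        if not (0 <= nr < rows and 0 <= nc < cols):
            nr, nc = r, c  # assumption: leaving the grid means staying put
        total += prob * value[nr, nc]
    return total
```

In a value-iteration backup, this expectation would replace the single successor value used for the deterministic robot. Because a slip next to an obstacle mixes the obstacle's large negative value into the expectation, actions taken near walls score lower, which is consistent with the corridor-centering behavior described above.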

From Chapter 14 in 'Probabilistic Robotics', ISBN-13: 978-0262201629, http://www.probabilistic-robotics.org

Aaron Becker, March 11, 2015

Cite as

Aaron T. Becker's Robot Swarm Lab (2026). MDP robot grid-world example (https://es.mathworks.com/matlabcentral/fileexchange/49992-mdp-robot-grid-world-example), MATLAB Central File Exchange. Retrieved April 20, 2026.

Acknowledgements

Inspired: Markov Decision Process (MDP) Algorithm, Kilobot Swarm Control using Matlab + Arduino

Categories

Find more on Robotics System Toolbox in Help Center and MATLAB Answers.

MATLAB Release Compatibility

Compatible with any release

Platform Compatibility

Windows
macOS
Linux

Version	Published	Release Notes
1.0.0.0	24 Nov 2015	Added link to video https://youtu.be/gThGerajccM