Creating discrete observation space for Reinforcement Learning
11 visualizaciones (últimos 30 días)
Mostrar comentarios más antiguos
I am trying to solve a reinforcement learning problem, where I need to take two arrays of 1xn dimension as observation and each element can have discrete values [1 1000].
How can I implement my requirement using the rlFiniteSetSpec from the Reinforcement Learning toolbox?
0 comentarios
Respuestas (1)
Aditya
el 21 de Feb. de 2024
In reinforcement learning with MATLAB's Reinforcement Learning Toolbox, the `rlFiniteSetSpec` object is used to define a finite set of discrete observations or actions. However, this object is designed to handle a single array of discrete values, not multiple arrays.
If you have two arrays of 1xn dimension, and each element can take on any discrete value in the range [1, 1000], you might run into scalability issues since the observation space becomes extremely large. The total number of possible observations would be \(1000^n \times 1000^n\), which is impractical to handle for any non-trivial value of `n`.
To use `rlFiniteSetSpec` for your problem, you would need to define each unique combination of values in your two arrays as a separate observation. This is not feasible due to the combinatorial explosion of possibilities.
Instead, you should consider the following approaches:
1. Binning/Discretization: Reduce the resolution of your observation space by grouping ranges of values into bins. For example, instead of having each element range from 1 to 1000, you might define 10 bins representing ranges of values. This would drastically reduce the size of your observation space, but at the cost of losing some granularity.
2. Feature Engineering: Transform your raw observations into a set of features that meaningfully represent the state of your environment. This could involve extracting statistics or other properties from your arrays that capture the essential information for decision-making.
3. Use a Continuous Observation Space: If discretizing your observation space is not suitable, you might want to consider using a continuous observation space instead, which is represented by `rlNumericSpec` in MATLAB. Reinforcement learning algorithms that can handle continuous spaces, such as DDPG (Deep Deterministic Policy Gradient) or PPO (Proximal Policy Optimization), may be used in this case.
0 comentarios
Ver también
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!