
# rlACAgentOptions

Create options for AC agent

## Syntax

``opt = rlACAgentOptions``
``opt = rlACAgentOptions(Name,Value)``

## Description


`opt = rlACAgentOptions` creates an `rlACAgentOptions` object, with all options set to their default values, for use as an argument when creating an AC agent. You can modify the object properties using dot notation.

`opt = rlACAgentOptions(Name,Value)` creates an AC options object using the specified name-value pairs to override default property values.

## Examples


Create an AC agent options object, specifying the discount factor.

`opt = rlACAgentOptions('DiscountFactor',0.95)`
```
opt = 
  rlACAgentOptions with properties:

    NumStepsToLookAhead: 1
      EntropyLossWeight: 0
             SampleTime: 1
         DiscountFactor: 0.9500
```

You can modify options using dot notation. For example, set the agent sample time to `0.5`.

`opt.SampleTime = 0.5;`

## Input Arguments


### Name-Value Pair Arguments

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

Example: `'DiscountFactor',0.95`
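As a sketch of passing several pairs in one call, the following sets three of the options described on this page. It assumes Reinforcement Learning Toolbox is installed, and the numeric values are arbitrary illustrations rather than recommended settings.

```matlab
% Set three options in a single call; the order of the pairs does not matter.
% The numeric values below are illustrative assumptions only.
opt = rlACAgentOptions( ...
    'NumStepsToLookAhead',32, ...
    'EntropyLossWeight',0.01, ...
    'DiscountFactor',0.99);

% Options that are not named in the call keep their default values.
disp(opt.SampleTime)
```

Any option can still be changed afterwards with dot notation, for example `opt.EntropyLossWeight = 0.02;`.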

Sample time of agent, specified as the comma-separated pair consisting of `'SampleTime'` and a numeric value.

Discount factor applied to future rewards during training, specified as the comma-separated pair consisting of `'DiscountFactor'` and a positive numeric value less than or equal to 1.
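To make the effect of the discount factor concrete, here is a minimal sketch in plain MATLAB. It assumes the usual reinforcement learning convention that a reward received k steps in the future is weighted by the discount factor raised to the power k. The reward sequence and variable names are hypothetical.

```matlab
% Hypothetical rewards for three consecutive steps.
rewards = [1 1 1];
gamma = 0.95;                          % discount factor

% Weight the reward k steps ahead by gamma^k, then sum.
k = 0:numel(rewards)-1;
discountedReturn = sum((gamma.^k) .* rewards);   % 1 + 0.95 + 0.9025

disp(discountedReturn)
```

A value closer to 1 makes the agent weigh long-term rewards more heavily, while a smaller value emphasizes immediate rewards.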

Number of steps to look ahead in model training, specified as the comma-separated pair consisting of `'NumStepsToLookAhead'` and a numeric positive integer value. For AC agents, the number of steps to look ahead corresponds to the training episode length.

Entropy loss weight, specified as the comma-separated pair consisting of `'EntropyLossWeight'` and a scalar value between `0` and `1`. A higher loss weight value promotes agent exploration by applying a penalty for being too certain about which action to take. Doing so can help the agent move out of local optima.

The entropy loss function for episode step t is:

`$H_t = E \sum_{k=1}^{M} \mu_k(S_t|\theta_\mu) \ln \mu_k(S_t|\theta_\mu)$`

Here:

• E is the entropy loss weight.

• M is the number of possible actions.

• `$\mu_k(S_t|\theta_\mu)$` is the probability of taking action `$A_k$` following the current policy.

When gradients are computed during training, an additional gradient component is computed for minimizing this loss function.
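The entropy loss formula above can be evaluated by hand for a small action set. The sketch below uses plain MATLAB, with hypothetical action probabilities and an illustrative loss weight. It assumes every probability is strictly positive, since `log(0)` would otherwise produce `NaN` in the product.

```matlab
E = 0.01;                      % entropy loss weight (illustrative value)

muPeaked  = [0.7 0.2 0.1];     % hypothetical policy that strongly favors one action
muUniform = [1 1 1] / 3;       % hypothetical policy with no preference

% H = E * sum_k mu_k * ln(mu_k); log is the natural logarithm in MATLAB.
entropyLoss = @(mu) E * sum(mu .* log(mu));

hPeaked  = entropyLoss(muPeaked);
hUniform = entropyLoss(muUniform);

% The uniform policy yields the smaller (more negative) loss.
disp([hPeaked hUniform])
```

Because the uniform policy gives the lower loss, minimizing this term pushes the action probabilities away from near-certainty, which is how a larger weight encourages exploration. With a weight of `0`, the term vanishes.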

## Output Arguments


AC agent options, returned as an `rlACAgentOptions` object. The object properties are described in Name-Value Pair Arguments.