rlFunctionEnv

Create custom reinforcement learning environment using your reset and step functions

Description

Use rlFunctionEnv to create a custom reinforcement learning environment by supplying your own reset and step MATLAB® functions. This object is useful when you want to create an environment different from the built-in ones available with rlPredefinedEnv. To verify the operation of your environment, rlFunctionEnv automatically calls validateEnvironment after creating the environment.

Creation

Syntax

env = rlFunctionEnv(observationInfo,actionInfo,stepFcn,resetFcn)

Description

env = rlFunctionEnv(observationInfo,actionInfo,stepFcn,resetFcn) creates a reinforcement learning environment using the provided observation and action specifications, observationInfo and actionInfo, respectively. The stepFcn and resetFcn arguments are the names of your step and reset MATLAB functions, respectively, and they are used to set the StepFcn and ResetFcn properties of env.

Input Arguments

`observationInfo` — Observation specifications
`rlFiniteSetSpec` object | `rlNumericSpec` object | array

Observation specifications, specified as an rlFiniteSetSpec or rlNumericSpec object or an array containing a mix of such objects. Each element in the array defines the properties of an environment observation channel, such as its dimensions, data type, and name.
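As a minimal sketch of an observation specification with two channels, the following code combines one rlNumericSpec object and one rlFiniteSetSpec object into an array. The channel names ("states" and "mode"), the dimensions, and the set of discrete values are illustrative assumptions, not values required by rlFunctionEnv.

```matlab
% Channel 1: continuous 4-by-1 vector (for example, positions and velocities).
posVelSpec = rlNumericSpec([4 1]);
posVelSpec.Name = "states";        % hypothetical channel name

% Channel 2: discrete scalar that takes one of three values.
modeSpec = rlFiniteSetSpec([1 2 3]);
modeSpec.Name = "mode";            % hypothetical channel name

% Combine both channels into one specification array.
obsInfo = [posVelSpec modeSpec];
```

Each element of obsInfo then describes one observation channel of the environment.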

`actionInfo` — Action specifications
`rlNumericSpec` object | `rlFiniteSetSpec` object | vector containing one `rlFiniteSetSpec` followed by one `rlNumericSpec` object

Action specification, specified as one of the following:

One rlNumericSpec object (for continuous action spaces)
One rlFiniteSetSpec object (for discrete action spaces)
A vector consisting of one rlFiniteSetSpec followed by one rlNumericSpec object (for hybrid action spaces)

The action specification defines the properties of an environment action channel, such as its dimensions, data type, and name.

Note

For non-hybrid action spaces (either discrete or continuous) only one action channel is allowed. For hybrid action spaces, you must have two action channels, the first one for the discrete part of the action, the second one for the continuous part of the action.
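The three cases above might be written as follows. This is a sketch only: the limits, dimensions, and element values are assumptions chosen for illustration, and the hybrid form applies only to releases that support hybrid action spaces.

```matlab
% Continuous action space: one scalar force bounded to [-10, 10].
contAct = rlNumericSpec([1 1],'LowerLimit',-10,'UpperLimit',10);

% Discrete action space: two possible force values.
discAct = rlFiniteSetSpec([-10 10]);

% Hybrid action space: discrete channel first, continuous channel second.
hybridAct = [rlFiniteSetSpec([1 2 3]) rlNumericSpec([1 1])];
```

Pass exactly one of these variables as the actionInfo argument.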

Properties

`StepFcn` — Environment step function
function name | function handle | anonymous function handle

Environment step function, specified as a function name, function handle, or handle to an anonymous function. The sim and train functions call StepFcn to update the environment at every simulation or training step.

This function must have two inputs and four outputs, as illustrated by the following signature.

[NextObservation,Reward,IsDone,UpdatedInfo] = myStepFunction(Action,Info)

For a given action input, the step function returns the values of the next observation and reward, a logical value indicating whether the episode is terminated, and an updated environment information variable.

Specifically, the required input and output arguments are described as follows.

Action — Current action from the agent, which must match the dimensions and data type specified in actionInfo.
Info — Any data that you want to pass from one step to the next. This can be the environment state or a structure containing state and parameters. The simulation or training functions (train or sim) handle this variable by:
1. Initializing Info using the second output argument returned by ResetFcn, at the beginning of the episode
2. Passing Info as second input argument to StepFcn at each training or simulation step
3. Updating Info using the fourth output argument returned by StepFcn, UpdatedInfo
NextObservation — Next observation. This is the observation generated by the transition, caused by Action, from the current state to the next one. The returned value must match the dimensions and data types specified in observationInfo.
Reward — Reward generated by the transition, caused by Action, from the current state to the next one. The returned value must be a scalar.
IsDone — Logical value indicating whether to end the simulation or training episode.

To use additional input arguments beyond the allowed two, define your additional arguments in the MATLAB workspace, then specify stepFcn as an anonymous function that in turn calls your custom function with the additional arguments defined in the workspace, as shown in the example Create Custom Environment Using Step and Reset Functions.

Example: StepFcn="myStepFcn"
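The following sketch shows one way a step function with the required four outputs could look, and how an anonymous function can bind an extra argument while still exposing only the two inputs Action and Info. The function name pointMassStep, the params structure, its fields Ts and MaxPos, and the point-mass dynamics are all hypothetical and chosen only for illustration. The matching observation specification would be rlNumericSpec([2 1]).

```matlab
% Bind the extra argument with an anonymous function. The wrapper still
% exposes exactly two inputs (Action, Info).
params.Ts = 0.1;        % hypothetical sample time, in seconds
params.MaxPos = 5;      % hypothetical termination threshold
stepHandle = @(Action,Info) pointMassStep(Action,Info,params);

% Call the wrapper once by hand to check the four outputs.
info0.State = [0; 0];   % [position; velocity]
[obs,reward,isDone,info1] = stepHandle(1,info0);

% Local functions go at the end of a script, or in their own files.
function [NextObservation,Reward,IsDone,UpdatedInfo] = pointMassStep(Action,Info,params)
    % Hypothetical dynamics: unit mass driven by a force (forward Euler).
    x = Info.State;
    xNext = x + params.Ts*[x(2); Action];

    UpdatedInfo = Info;
    UpdatedInfo.State = xNext;                 % carried to the next step
    NextObservation = xNext;                   % 2-by-1, must match observationInfo
    IsDone = abs(xNext(1)) > params.MaxPos;    % logical scalar
    Reward = -xNext(1)^2;                      % scalar reward
end
```

You could then pass stepHandle as the stepFcn argument of rlFunctionEnv in place of a function name.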

`ResetFcn` — Environment reset function
function name | function handle | anonymous function handle

Environment reset function, specified as a function name, function handle, or handle to an anonymous function. The sim function calls your reset function to reset the environment at the start of each simulation, and the train function calls it at the start of each training episode.

The reset function that you provide must have no inputs and two outputs, as illustrated by the following signature.

[InitialObservation,Info] = myResetFunction

The reset function sets the environment to an initial state and computes the initial value of the observation. For example, you can create a reset function that randomizes certain state values, such that each training episode begins from different initial conditions. The InitialObservation output must match the dimensions and data type of observationInfo.

The Info output of ResetFcn initializes the Info property of your environment and contains any data that you want to pass from one step to the next. This can be the environment state or a structure containing state and parameters. The simulation or training function (train or sim) supplies the current value of Info as the second input argument of StepFcn, then uses the fourth output argument returned by StepFcn to update the value of Info.

The reset function takes no input arguments. To pass arguments to it anyway, define them in the MATLAB workspace, then specify resetFcn as an anonymous function with no inputs that in turn calls your custom function with the arguments defined in the workspace, as shown in the example Create Custom Environment Using Step and Reset Functions.

Example: ResetFcn="myResetFcn"
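As a sketch of a reset function that randomizes the initial condition, the following code returns the two required outputs and uses a zero-input anonymous function to bind one parameter. The function name pointMassReset, the parameter maxInitPos, and the two-element state are hypothetical and used only for illustration.

```matlab
% Bind the parameter with an anonymous function that takes no inputs.
maxInitPos = 0.5;   % hypothetical bound on the initial position
resetHandle = @() pointMassReset(maxInitPos);

% Call the wrapper once by hand to check the two outputs.
[obs0,info0] = resetHandle();

% Local functions go at the end of a script, or in their own files.
function [InitialObservation,Info] = pointMassReset(maxInitPos)
    % Draw the initial position uniformly from [-maxInitPos, maxInitPos]
    % and start at rest.
    x0 = [maxInitPos*(2*rand - 1); 0];

    Info.State = x0;            % initial value of the Info variable
    InitialObservation = x0;    % 2-by-1, must match observationInfo
end
```

You could then pass resetHandle as the resetFcn argument of rlFunctionEnv in place of a function name.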

`Info` — Information to pass to next step
any MATLAB data

Information to pass to the next step. This can be the environment state or a structure containing state and parameters. When ResetFcn is called, the Info output of ResetFcn initializes this property. At each step, the simulation or training function (train or sim) passes the current value of Info as the second input argument of StepFcn. Once StepFcn completes, the simulation or training function updates the value of Info using the fourth output argument returned by StepFcn.

Example: Info=[-1 0 2.2]
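The loop below imitates, in plain MATLAB, how Info is threaded through one episode. It illustrates the bookkeeping described above and is not the implementation used by train or sim. The toy counter environment, in which Info holds the number of steps taken and the episode ends after three steps, is a hypothetical example.

```matlab
% Toy counter environment built from anonymous functions.
% resetFcn returns [InitialObservation,Info].
resetFcn = @() deal(0,0);
% stepFcn returns [NextObservation,Reward,IsDone,UpdatedInfo].
stepFcn = @(Action,Info) deal(Info+1, -1, Info+1 >= 3, Info+1);

[obs,info] = resetFcn();     % 1. the reset function initializes Info
isDone = false;
numSteps = 0;
while ~isDone
    action = 0;              % placeholder for the action chosen by an agent
    % 2. pass the current Info in, 3. overwrite it with the fourth output
    [obs,reward,isDone,info] = stepFcn(action,info);
    numSteps = numSteps + 1;
end
```

Because the only state carried between iterations is info, everything the step function needs from earlier steps must be stored there.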

Object Functions

`getActionInfo`	Obtain action data specifications from reinforcement learning environment, agent, or experience buffer
`getObservationInfo`	Obtain observation data specifications from reinforcement learning environment, agent, or experience buffer
`train`	Train reinforcement learning agents within a specified environment
`sim`	Simulate trained reinforcement learning agents within specified environment
`validateEnvironment`	Validate custom reinforcement learning environment

Examples

Create Custom Function Environment

Create a reinforcement learning environment by supplying custom dynamic functions in MATLAB®. Using rlFunctionEnv, you can create a MATLAB reinforcement learning environment from an observation specification, action specification, and step and reset functions that you define.

For this example, create an environment that represents a system for balancing a pole on a cart. The observations from the environment are the cart position, cart velocity, pendulum angle, and pendulum angular velocity. For additional details about this environment, see Create Custom Environment Using Step and Reset Functions. Create an observation specification for these signals.

obsinfo = rlNumericSpec([4 1]);
obsinfo.Name = "CartPole States";
obsinfo.Description = 'x, dx, theta, dtheta';

The environment has a discrete action space where the agent can apply one of two possible force values to the cart, –10 N or 10 N. Create the action specification for these actions.

actInfo = rlFiniteSetSpec([-10 10]);
actInfo.Name = "CartPole Action";

Next, specify your step and reset functions. For this example, use the supplied functions myResetFunction.m and myStepFunction.m. For details about these functions and how they are constructed, see Create Custom Environment Using Step and Reset Functions.

The reset and step functions that you pass to rlFunctionEnv must have exactly zero and two input arguments, respectively. To work around this restriction, pass anonymous functions (with zero and two arguments, respectively) that in turn call your custom functions with additional arguments. For more details, see Create Custom Environment Using Step and Reset Functions.

Create the custom environment using the defined observation specification, action specification, and function names.

env = rlFunctionEnv(obsinfo,actInfo,"myStepFunction","myResetFunction")

env = 
  rlFunctionEnv with properties:

     StepFcn: "myStepFunction"
    ResetFcn: "myResetFunction"
        Info: [4×1 double]

You can now create agents for env and train or simulate them as you would for any other environment.

Version History

Introduced in R2019a

R2023b: The `LoggedSignals` property is no longer active

The LoggedSignals property of the rlFunctionEnv object is no longer active and will be removed in a future release. To pass information from one step to the next, use the Info property instead.
