dltranspconv

Deep learning transposed convolution

Syntax

Y = dltranspconv(X,weights,bias)

Y = dltranspconv(X,weights,bias,DataFormat=FMT)

Y = dltranspconv(___Name=Value)

Description

The transposed convolution operation upsamples feature maps.

The dltranspconv function applies the deep learning transposed convolution operation to dlarray data. Using dlarray objects makes working with high-dimensional data easier by allowing you to label the dimensions. For example, you can label which dimensions correspond to spatial, time, channel, and batch dimensions using the "S", "T", "C", and "B" labels, respectively. For unspecified and other dimensions, use the "U" label. For dlarray object functions that operate over particular dimensions, you can specify the dimension labels by formatting the dlarray object directly, or by using the DataFormat option.

Note

This function applies the deep learning transposed convolution operation to dlarray data. If you want to apply transposed convolution within a dlnetwork object, use a transposed convolution layer such as transposedConv2dLayer or transposedConv3dLayer.

Y = dltranspconv(X,weights,bias) computes the deep learning transposed convolution of the input X using the filters defined by weights, and adds the constant bias. The input X must be a formatted dlarray. The output Y is a formatted dlarray with the same dimension format as X.

The function, by default, convolves over up to three dimensions of X labeled "S" (spatial). To convolve over dimensions labeled "T" (time), specify weights with a "T" dimension using a formatted dlarray object or by using the WeightsFormat option.
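As a minimal sketch of convolving over time, the following code upsamples a batch of sequences along the "T" dimension. The array sizes are illustrative assumptions, and the sketch assumes a release in which dltranspconv accepts a "T" weights dimension as described above.

```matlab
% Sequence data: 8 channels, 16 observations, 20 time steps.
X = dlarray(rand(8,16,20),"CBT");

% 32 filters of length 5. The "T" label marks the filter (time) dimension,
% "C" the filters, and "U" the input channels.
weights = dlarray(rand(5,32,8),"TCU");
bias = zeros(1,32);

Y = dltranspconv(X,weights,bias);

% With the default stride and cropping, the time dimension is expected to
% grow from 20 to 20 + 5 - 1 = 24 steps.
size(Y)
dims(Y)
```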

For unformatted input data, use the DataFormat option.


Y = dltranspconv(X,weights,bias,DataFormat=FMT) applies the deep learning transposed convolution operation to the unformatted dlarray object X with format specified by FMT. The output Y is an unformatted dlarray object with dimensions in the same order as X.
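For illustration, this sketch passes an unformatted dlarray whose dimensions are ordered batch, channel, height, width, and describes that layout with DataFormat. The sizes are arbitrary assumptions chosen for the example.

```matlab
% Unformatted dlarray with dimensions ordered batch, channel, height, width.
X = dlarray(rand(16,3,28,28));

% Eight 3-by-3 filters over 3 input channels (default "SSCU" layout).
weights = rand(3,3,8,3);
bias = zeros(1,8);

Y = dltranspconv(X,weights,bias,DataFormat="BCSS");

size(Y)   % dimension order follows X: batch, channel, height, width
dims(Y)   % empty, because Y is unformatted
```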

Y = dltranspconv(___Name=Value) specifies options using one or more name-value pair arguments in addition to the input arguments in previous syntaxes. For example, Stride=3 sets the stride of the convolution operation.


Examples


Perform 2-D Transposed Convolution


Create a formatted dlarray object containing a batch of 128 28-by-28 images with 3 channels. Specify the format "SSCB" (spatial, spatial, channel, batch).

miniBatchSize = 128;
inputSize = [28 28];
numChannels = 3;
X = rand(inputSize(1),inputSize(2),numChannels,miniBatchSize);
X = dlarray(X,"SSCB");

View the size and format of the input data.

size(X)

ans = 1×4

    28    28     3   128

dims(X)

ans = 
'SSCB'

Initialize the weights and bias for 2-D transposed convolution. For the weights, specify 64 3-by-3 filters. For the bias, specify a vector of zeros.

filterSize = [3 3];
numFilters = 64;

weights = rand(filterSize(1),filterSize(2),numFilters,numChannels);
bias = zeros(1,numFilters);

Apply 2-D transposed convolution using the dltranspconv function.

Y = dltranspconv(X,weights,bias);

View the size and format of the output.

size(Y)

ans = 1×4

    30    30    64   128

dims(Y)

ans = 
'SSCB'

Perform Grouped Transposed Convolution


Create a formatted dlarray object containing a batch of 128 28-by-28 images with 16 channels. Specify the format "SSCB" (spatial, spatial, channel, batch).

miniBatchSize = 128;
inputSize = [28 28];
numChannels = 16;
X = rand(inputSize(1),inputSize(2),numChannels,miniBatchSize);
X = dlarray(X,"SSCB");

View the size and format of the input data.

size(X)

ans = 1×4

    28    28    16   128

dims(X)

ans = 
'SSCB'

Initialize the weights and bias for 2-D grouped transposed convolution. For the weights, specify two groups of 64 3-by-3 filters. For the bias, specify a vector of zeros.

The number of channels per group is given by the number of channels of the input data divided by the number of groups. The size of the bias vector is the number of filters per group multiplied by the number of groups.

filterSize = [3 3];
numFiltersPerGroup = 64;
numGroups = 2;
numChannelsPerGroup = numChannels / numGroups;

weights = rand(filterSize(1),filterSize(2),numFiltersPerGroup,numChannelsPerGroup,numGroups);
bias = zeros(1,numFiltersPerGroup*numGroups);

Apply 2-D grouped transposed convolution using the dltranspconv function.

Y = dltranspconv(X,weights,bias);

View the size and format of the output.

size(Y)

ans = 1×4

    30    30   128   128

dims(Y)

ans = 
'SSCB'

Input Arguments


`X` — Input data
`dlarray` | numeric array

Input data, specified as a formatted dlarray, an unformatted dlarray, or a numeric array.

If X is an unformatted dlarray or a numeric array, then you must specify the format using the DataFormat option. If X is a numeric array, then either weights or bias must be a dlarray object.
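The following sketch shows the numeric-array case: X is a plain numeric array, so the weights are wrapped in a dlarray and the layout of X is given through DataFormat. The sizes are illustrative assumptions.

```matlab
X = rand(28,28,3,16);                 % plain numeric array, no labels
weights = dlarray(rand(3,3,8,3));     % unformatted dlarray, default "SSCU" layout
bias = zeros(1,8);

Y = dltranspconv(X,weights,bias,DataFormat="SSCB");

class(Y)
size(Y)
```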

`weights` — Filters
`dlarray` | numeric array

Filters, specified as a formatted dlarray, an unformatted dlarray, or a numeric array.

The size and format of the weights depend on the type of task. If weights is an unformatted dlarray or a numeric array, then the size and shape of weights depend on the WeightsFormat option.

This table describes the size and format of the weights for various tasks. You can specify an array with the dimensions in any order using formatted dlarray objects or by using the WeightsFormat option. When the weights have multiple dimensions with the same label (for example, multiple dimensions labeled "S"), those dimensions must be ordered as described in this table.

The dimension labels of the format indicate the layout of the data:

The "S" (spatial) dimensions of the weights correspond to the spatial dimensions of the filters.
The "T" (time) dimension of the weights corresponds to the time dimension of the filters.
The "C" (channel) dimension of the weights corresponds to the channels of the filters.
The "U" (unspecified) dimensions of the weights correspond to the channels of the input data.

Task	Required Dimensions	Size	Example Weights	Example Format
1-D transposed convolution	`"S"` (spatial) or `"T"` (time)	Filter size	`filterSize`-by-`numFilters`-by-`numChannels` array, where `filterSize` is the size of the 1-D filters, `numFilters` is the number of filters, and `numChannels` is the number of channels of the input data.	`"SCU"` (spatial, channel, unspecified)
	`"C"` (channel)	Number of filters
	`"U"` (unspecified)	Number of input channels
1-D grouped transposed convolution	`"S"` (spatial) or `"T"` (time)	Filter size	`filterSize`-by-`numFiltersPerGroup`-by-`numChannelsPerGroup`-by-`numGroups` array, where `filterSize` is the size of the 1-D filters, `numFiltersPerGroup` is the number of filters per group, `numChannelsPerGroup` is the number of channels per group of the input data, and `numGroups` is the number of groups. `numChannelsPerGroup` must equal the number of channels of the input data divided by `numGroups`.	`"SCUU"` (spatial, channel, unspecified, unspecified)
	`"C"` (channel)	Number of filters per group
	First `"U"` (unspecified)	Number of input channels per group
	Second `"U"` (unspecified)	Number of groups
2-D transposed convolution	First `"S"` (spatial)	Filter height	`filterSize(1)`-by-`filterSize(2)`-by-`numFilters`-by-`numChannels` array, where `filterSize(1)` and `filterSize(2)` are the height and width of the 2-D filters, respectively, `numFilters` is the number of filters, and `numChannels` is the number of channels of the input data.	`"SSCU"` (spatial, spatial, channel, unspecified)
	Second `"S"` (spatial) or `"T"` (time)	Filter width
	`"C"` (channel)	Number of filters
	`"U"` (unspecified)	Number of input channels
2-D grouped transposed convolution	First `"S"` (spatial)	Filter height	`filterSize(1)`-by-`filterSize(2)`-by-`numFiltersPerGroup`-by-`numChannelsPerGroup`-by-`numGroups` array, where `filterSize(1)` and `filterSize(2)` are the height and width of the 2-D filters, respectively, `numFiltersPerGroup` is the number of filters per group, `numChannelsPerGroup` is the number of channels per group of the input data, and `numGroups` is the number of groups. `numChannelsPerGroup` must equal the number of channels of the input data divided by `numGroups`.	`"SSCUU"` (spatial, spatial, channel, unspecified, unspecified)
	Second `"S"` (spatial) or `"T"` (time)	Filter width
	`"C"` (channel)	Number of filters per group
	First `"U"` (unspecified)	Number of input channels per group
	Second `"U"` (unspecified)	Number of groups
3-D transposed convolution	First `"S"` (spatial)	Filter height	`filterSize(1)`-by-`filterSize(2)`-by-`filterSize(3)`-by-`numFilters`-by-`numChannels` array, where `filterSize(1)`, `filterSize(2)`, and `filterSize(3)` are the height, width, and depth of the 3-D filters, respectively, `numFilters` is the number of filters, and `numChannels` is the number of channels of the input data.	`"SSSCU"` (spatial, spatial, spatial, channel, unspecified)
	Second `"S"` (spatial)	Filter width
	Third `"S"` (spatial) or `"T"` (time)	Filter depth
	`"C"` (channel)	Number of filters
	`"U"` (unspecified)	Number of input channels


`bias` — Bias constant
`dlarray` vector | `dlarray` scalar | numeric vector | numeric scalar

Bias constant, specified as a formatted or unformatted dlarray vector or dlarray scalar, a numeric vector, or a numeric scalar.

If bias is a scalar or has only singleton dimensions, the same bias is applied to each entry of the output.
If bias has a nonsingleton dimension, each element of bias is the bias applied to the corresponding convolutional filter specified by weights. The number of elements of bias must match the number of filters specified by weights.

If bias is a formatted dlarray, the nonsingleton dimension must be a channel dimension labeled "C".
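To make the two bias cases concrete, this sketch uses all-zero filters so that the output contains only the bias. The sizes and bias values are assumptions chosen for illustration.

```matlab
X = dlarray(rand(28,28,3,16),"SSCB");
weights = zeros(3,3,4,3);     % all-zero filters, so Y contains only the bias

% Scalar bias: the same value is added to every output element.
Yscalar = dltranspconv(X,weights,0.5);

% Vector bias: element k is added to the output of filter k.
Yvector = dltranspconv(X,weights,[1 2 3 4]);

% Check that every element equals the scalar bias.
allHalf = all(extractdata(Yscalar) == 0.5,"all")

% Read one output location across the four channels.
perFilter = squeeze(extractdata(Yvector(1,1,:,1)))'
```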

Name-Value Arguments


Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: Stride=2 sets the stride of each filter to 2.

`DataFormat` — Description of data dimensions
character vector | string scalar

Description of the data dimensions, specified as a character vector or string scalar.

A data format is a string of characters, where each character describes the type of the corresponding data dimension.

The characters are:

"S" — Spatial
"C" — Channel
"B" — Batch
"T" — Time
"U" — Unspecified

For example, consider an array that represents a batch of sequences where the first, second, and third dimensions correspond to channels, observations, and time steps, respectively. You can describe the data as having the format "CBT" (channel, batch, time).

You can specify multiple dimensions labeled "S" or "U". You can use the labels "C", "B", and "T" once each, at most. The software ignores singleton trailing "U" dimensions after the second dimension.

If the input data is not a formatted dlarray object, then you must specify the DataFormat option.

For more information, see Deep Learning Data Formats.

Data Types: char | string

`WeightsFormat` — Description of weights dimensions
character vector | string scalar

Description of weights dimensions, specified as a character vector or string scalar.

A data format is a string of characters, where each character describes the type of the corresponding dimension of the data.

The characters are:

"S" — Spatial
"C" — Channel
"B" — Batch
"T" — Time
"U" — Unspecified

The dimension labels of the format indicate the layout of the data:

The "S" (spatial) dimensions of the weights correspond to the spatial dimensions of the filters.
The "T" (time) dimension of the weights corresponds to the time dimension of the filters.
The "C" (channel) dimension of the weights corresponds to the channels of the filters.
The "U" (unspecified) dimensions of the weights correspond to the channels of the input data.

The default value of WeightsFormat depends on the task:

Task	Default
1-D transposed convolution	`"SCU"` (spatial, channel, unspecified)
1-D grouped transposed convolution	`"SCUU"` (spatial, channel, unspecified, unspecified)
2-D transposed convolution	`"SSCU"` (spatial, spatial, channel, unspecified)
2-D grouped transposed convolution	`"SSCUU"` (spatial, spatial, channel, unspecified, unspecified)
3-D transposed convolution	`"SSSCU"` (spatial, spatial, spatial, channel, unspecified)

The supported combinations of dimension labels depend on the type of convolution. For more information, see the weights argument.

For more information, see Deep Learning Data Formats.
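As a sketch of a nondefault weights layout, suppose the filters are stored with input channels first, then filters, then the two spatial dimensions. That storage order and the sizes are hypothetical. WeightsFormat describes the order to the function.

```matlab
X = dlarray(rand(28,28,3,16),"SSCB");

% Hypothetical storage order: numChannels-by-numFilters-by-height-by-width.
weights = rand(3,8,5,5);
bias = zeros(1,8);

% "U" marks the input channels, "C" the filters, "S" the filter height and width.
Y = dltranspconv(X,weights,bias,WeightsFormat="UCSS");

size(Y)
```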


Data Types: char | string

`Stride` — Step size for traversing input data
`1` (default) | numeric scalar | numeric vector

Step size for traversing the input data, specified as a numeric scalar or numeric vector.

To use the same step size for all convolution dimensions, specify the stride as a scalar. To specify a different value for each convolution dimension, specify the stride as a vector with elements ordered corresponding to the dimension labels in the data format.
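The following sketch compares a scalar stride with a per-dimension stride. The size comments rely on the usual transposed convolution relation for zero cropping and no dilation, outputSize = (inputSize-1)*stride + filterSize. This page does not state that relation, so treat it as an assumption.

```matlab
X = dlarray(rand(28,28,3,16),"SSCB");
weights = rand(3,3,8,3);
bias = zeros(1,8);

% Scalar stride: the same step size along both spatial dimensions.
Y1 = dltranspconv(X,weights,bias,Stride=2);

% Vector stride: 2 along the first and 3 along the second spatial dimension.
Y2 = dltranspconv(X,weights,bias,Stride=[2 3]);

size(Y1)   % spatial size (28-1)*2 + 3 = 57 in both dimensions
size(Y2)   % spatial sizes (28-1)*2 + 3 = 57 and (28-1)*3 + 3 = 84
```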

`DilationFactor` — Filter dilation factor
`1` (default) | numeric scalar | numeric vector

Filter dilation factor, specified as a numeric scalar or numeric vector.

To use the same dilation factor for all convolution dimensions, specify the dilation factor as a scalar. To specify a different value for each convolution dimension, specify the dilation factor as a vector with elements ordered corresponding to the dimension labels in the data format.

Use the dilation factor to increase the receptive field of the filter (the area of the input that the filter can see) on the input data. Using a dilation factor corresponds to an effective filter size of filterSize + (filterSize-1)*(dilationFactor-1).
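As a worked instance of the effective filter size formula, this sketch dilates a 3-by-3 filter by a factor of 2. The sizes are illustrative assumptions, and the expected output size assumes the default stride and cropping.

```matlab
X = dlarray(rand(28,28,3,16),"SSCB");
filterSize = 3;
dilationFactor = 2;
weights = rand(filterSize,filterSize,8,3);
bias = zeros(1,8);

% Effective filter size from the formula above: 3 + (3-1)*(2-1) = 5.
effectiveFilterSize = filterSize + (filterSize-1)*(dilationFactor-1)

Y = dltranspconv(X,weights,bias,DilationFactor=dilationFactor);

% The spatial size is expected to be 28 + effectiveFilterSize - 1 = 32.
size(Y)
```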

`Cropping` — Cropping applied to edges of data
0 (default) | `"same"` | numeric scalar | numeric vector | numeric matrix

Cropping applied to edges of data, specified as one of the following.

"same" — Cropping is set so that the output size is the same as the input size when the stride is 1. More generally, the output size of each spatial dimension is inputSize*stride, where inputSize is the size of the input along the convolution dimension.
Numeric scalar — The same cropping value is applied to both ends of the convolution dimensions.
Numeric vector — A different cropping value is applied along each convolution dimension. Use a vector of size d, where d is the number of convolution dimensions of the input data. The ith element of the vector specifies the cropping applied to the start and the end along the ith convolution dimension.
Numeric matrix — A different cropping value is applied to the start and end of each convolution dimension. Use a matrix of size 2-by-d, where d is the number of convolution dimensions of the input data. The element (1,d) specifies the cropping applied to the start of convolution dimension d. The element (2,d) specifies the cropping applied to the end of convolution dimension d. For example, in 2-D the format is [top, left; bottom, right].
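The cropping options above can be sketched as follows. The 4-by-4 filter and the cropping values are assumptions chosen for illustration. Without cropping and with a stride of 1, the output would be 31-by-31 in the spatial dimensions.

```matlab
X = dlarray(rand(28,28,3,16),"SSCB");
weights = rand(4,4,8,3);
bias = zeros(1,8);

% "same": the output spatial size is inputSize*stride = 56.
Ysame = dltranspconv(X,weights,bias,Stride=2,Cropping="same");

% Scalar: crop 1 element from both ends of each spatial dimension (31 - 2 = 29).
Yscalar = dltranspconv(X,weights,bias,Cropping=1);

% Matrix [top, left; bottom, right]: crop 2 from the top and 1 from the right,
% giving a height of 31 - 2 = 29 and a width of 31 - 1 = 30.
Ymatrix = dltranspconv(X,weights,bias,Cropping=[2 0; 0 1]);

size(Ysame)
size(Yscalar)
size(Ymatrix)
```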

Output Arguments


`Y` — Feature map
`dlarray`

Feature map, returned as a dlarray. The output Y has the same underlying data type as the input X.

If the input data X is a formatted dlarray, then Y has the same format as X. If the input data is not a formatted dlarray, then Y is an unformatted dlarray or numeric array with the same dimension order as the input data.

The size of the "C" (channel) dimension of Y depends on the size of the weights input. The size of the "C" (channel) dimension of output Y is the product of the size of the dimensions numFiltersPerGroup and numGroups in the weights argument. If weights is a formatted dlarray, this product is the same as the product of the size of the "C" (channel) dimension and the second "U" (unspecified) dimension.

Algorithms


Transposed Convolution

The standard convolution operation downsamples the input by applying sliding convolutional filters to the input. By flattening the input and output, you can express the convolution operation as $Y = C X + B$ for the convolution matrix C and bias vector B that can be derived from the layer weights and biases.

Similarly, the transposed convolution operation upsamples the input by applying sliding convolutional filters to the input. To upsample the input instead of downsampling using sliding filters, the layer zero-pads each edge of the input with padding that has the size of the corresponding filter edge size minus 1.

By flattening the input and output, the transposed convolution operation is equivalent to $Y = C^{\top} X + B$, where C and B denote the convolution matrix and bias vector for standard convolution derived from the layer weights and biases, respectively. This operation is equivalent to the backward function of a standard convolution layer.
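The matrix form above can be sketched in 1-D with a small hand-built convolution matrix. The signal, the filter, and the zero bias are assumptions chosen for illustration. The final comparison against dltranspconv is expected to agree up to round-off if the function follows the formula above.

```matlab
x = [1; 2; 3; 4];          % input to the transposed convolution (length M)
w = [1; 0; -1];            % filter (length k)
M = numel(x);
k = numel(w);
N = M + k - 1;             % length of the upsampled output

% Convolution matrix C (M-by-N) of a standard sliding-filter convolution
% that maps a length-N signal to a length-M signal.
C = zeros(M,N);
for i = 1:M
    C(i,i:i+k-1) = w';
end

yMatrix = C' * x;          % transposed convolution as a matrix product, B = 0
yFull = conv(x,w);         % same values: full convolution of x with w

% The same computation through dltranspconv with one channel and one filter.
Y = dltranspconv(dlarray(x,"SCB"),dlarray(w,"SCU"),0);
yDl = extractdata(Y);

maxDifference = max(abs(yDl(:) - yMatrix))
```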

Extended Capabilities


GPU Arrays
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.

The dltranspconv function supports GPU array input with these usage notes and limitations:

When at least one of the following input arguments is a gpuArray or a dlarray with underlying data of type gpuArray, this function runs on the GPU.
- X
- weights
- bias
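A minimal sketch of running on the GPU when one is available is shown here. It assumes Parallel Computing Toolbox is installed and falls back to the CPU otherwise. The sizes are illustrative.

```matlab
X = dlarray(rand(28,28,3,16,"single"),"SSCB");
weights = rand(3,3,8,3,"single");
bias = zeros(1,8,"single");

% Move the input to the GPU only when a supported device is available.
% Because X then has underlying gpuArray data, the operation runs on the GPU.
if canUseGPU
    X = gpuArray(X);
end

Y = dltranspconv(X,weights,bias);
size(Y)
```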

For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).

Version History

Introduced in R2019b
