detectTextCRAFT

Detect texts in images by using the CRAFT deep learning model

Since R2022a

Syntax

bboxes = detectTextCRAFT(I)

bboxes = detectTextCRAFT(I,roi)

bboxes = detectTextCRAFT(___,Name=Value)

Description

bboxes = detectTextCRAFT(I) detects texts in images by using the character region awareness for text detection (CRAFT) deep learning model. The function uses a pretrained CRAFT model, which can detect text in nine languages: Chinese, Japanese, Korean, Italian, English, French, Arabic, German, and Bangla (Indian).

Note

To use the pretrained CRAFT model, you must install the Computer Vision Toolbox™ Model for Text Detection. You can download and install it from Add-On Explorer. For more information about installing add-ons, see Get and Manage Add-Ons. This function also requires Deep Learning Toolbox™.

bboxes = detectTextCRAFT(I,roi) detects texts within a region-of-interest (ROI) in the image.

bboxes = detectTextCRAFT(___,Name=Value) specifies additional options by using name-value pair arguments. You can use the name-value pair arguments to fine-tune the detection results.

Examples

Detect Texts in Images by Using CRAFT Model

Read an input image into the MATLAB workspace.

I = imread("handicapSign.jpg");

Compute the text detection results by using the detectTextCRAFT function. The region and the affinity thresholds are set to default values. The output is a set of bounding boxes that contain the detected text regions.

bboxes = detectTextCRAFT(I);

Draw the output bounding boxes on the image by using the insertShape function.

Iout = insertShape(I,"rectangle",bboxes,LineWidth=3);

Display the text detection results.

figure
imshow(Iout)

Detect Texts in ROI by Using CRAFT

Read an input image into the MATLAB workspace.

visiondatadir = fullfile(toolboxdir('vision'),'visiondata'); 
I = imread(fullfile(visiondatadir,'imageSets','books','pairOfBooks.jpg'));

Specify a region of interest (ROI) within the input image.

roi = [120,80,250,200];

Detect texts within the specified ROI by using the detectTextCRAFT function. The region and affinity thresholds are set to default values. The output is a set of bounding boxes that contain the detected text regions.

bboxes = detectTextCRAFT(I,roi);

Draw the ROI and the output bounding boxes on the input image. Display the text detection results.

I = insertObjectAnnotation(I,"rectangle",roi,"ROI",Color="green");
Iout = insertShape(I,"rectangle",bboxes,LineWidth=3);
figure
imshow(Iout)

Detect Characters by Modifying Affinity Threshold

This example shows how to detect each character in the text regions of an input image by using the CRAFT model. You can achieve this by modifying the affinity threshold. This example also demonstrates the effect of different affinity threshold values on the detection results.

Read an input image into the MATLAB workspace.

visiondatadir = fullfile(toolboxdir('vision'),'visiondata'); 
I = imread(fullfile(visiondatadir,'bookCovers','book27.jpg'));

Specify the affinity threshold values to consider for detecting the text regions in the image.

threshold = [1 0.1 0.01 0.001 0.0004];

Preallocate a 4-D array Iout to store the output image with detection results.

Iout = zeros(size(I,1),size(I,2),size(I,3),length(threshold));

Compute the output for each affinity threshold value specified at the input. The output is a set of bounding boxes that contain the detected text regions. Draw the output bounding boxes on the image by using the insertShape function. The region threshold is set to the default value, 0.4.

for cnt = 1:length(threshold)
    bboxes = detectTextCRAFT(I,LinkThreshold=threshold(cnt));
    Iout(:,:,:,cnt) = insertShape(I,"rectangle",bboxes,LineWidth=3);
end

Display the text detection results obtained for the different affinity threshold values. As the affinity threshold value decreases, characters with lower affinity scores are treated as connected components and are grouped into a single instance. For good localization and detection results, the affinity threshold must be greater than zero.

figure
montage(uint8(Iout),Size=[1 5],BackgroundColor="white");
title(['LinkThreshold = ' num2str(threshold(1)) ' | LinkThreshold = ' num2str(threshold(2)) ' | LinkThreshold = ' num2str(threshold(3)) ...
    ' | LinkThreshold = ' num2str(threshold(4)) ' | LinkThreshold = ' num2str(threshold(5))]);

Input Arguments

`I` — Input image
2-D grayscale image | 2-D color image

Input image, specified as a 2-D grayscale image or 2-D color image.

`roi` — Search rectangular region-of-interest
four-element vector

Rectangular search region of interest in the image, specified as a four-element vector of the form [x y width height]. The vector specifies the upper-left corner and the size of the rectangular region in pixels. The region must be fully contained in the image.

When you specify this value, the detectTextCRAFT function detects texts that are present only within this ROI.
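Because the ROI must lie fully inside the image, it can help to clamp a candidate rectangle before passing it to the function. The following is a minimal sketch of that idea, not part of the function itself. The image size and the candidate ROI are hypothetical values, and the sketch assumes that a box of the form [x y width height] covers pixel columns x through x+width-1 and pixel rows y through y+height-1.

```matlab
% Hypothetical image size and a candidate ROI that spills past the border.
imgSize = [480 640];            % [rows cols], for example from size(I,1:2)
roiRaw  = [600 400 100 120];    % [x y width height]

% Clamp the upper-left corner to the image, then limit the far corner.
x1 = max(1, roiRaw(1));
y1 = max(1, roiRaw(2));
x2 = min(imgSize(2), roiRaw(1) + roiRaw(3) - 1);
y2 = min(imgSize(1), roiRaw(2) + roiRaw(4) - 1);

% Rebuild the ROI in [x y width height] form.
roi = [x1 y1 x2-x1+1 y2-y1+1];

% A candidate that misses the image entirely leaves a non-positive size.
if any(roi(3:4) < 1)
    error("The candidate ROI does not overlap the image.");
end

% With a real image I of this size, the call would then be:
% bboxes = detectTextCRAFT(I,roi);
```

For the values above, the clamped ROI keeps its corner at (600, 400) and shrinks to the part that fits inside the 480-by-640 image.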

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: bboxes = detectTextCRAFT(I,MaxSize=[10,10]) specifies the maximum size of the text region to detect in the input image.

`CharacterThreshold` — Region threshold for characters
0.4 (default) | positive scalar

Region threshold for localizing each character in the image, specified as a positive scalar in the range [0, 1]. To increase the number of detections, lower the region threshold value. However, a lower value can also increase the number of false positives. To reduce the number of false positives, increase the region threshold value.

Data Types: single | double
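One way to choose a region threshold is to count detections over a few candidate values. The sketch below assumes the sample image handicapSign.jpg from the first example is on the path. The threshold values are arbitrary picks for illustration, and the resulting counts depend on the image.

```matlab
% Read the sample image used in the first example on this page.
I = imread("handicapSign.jpg");

% Candidate region thresholds (illustrative values in the range [0, 1]).
thresholds = [0.2 0.4 0.6 0.8];
numDetections = zeros(size(thresholds));

for k = 1:numel(thresholds)
    % Each row of bboxes is one detected text region.
    bboxes = detectTextCRAFT(I,CharacterThreshold=thresholds(k));
    numDetections(k) = size(bboxes,1);
end

% Summarize the number of detections per threshold.
disp(table(thresholds(:),numDetections(:), ...
    VariableNames=["CharacterThreshold","NumDetections"]))
```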

`LinkThreshold` — Link threshold
0.4 (default) | positive scalar

Link threshold for grouping adjacent characters into a word, specified as a positive scalar in the range [0, 1]. You can increase the number of character level detections by increasing the link threshold. To detect each character in the image, set this value to 1. For good localization and detection results, the link threshold must be greater than zero.

Data Types: single | double
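As a brief illustration of the link threshold, the sketch below runs the detector once with the default value and once with LinkThreshold set to 1, which the description above associates with character-level detection. It assumes the sample image handicapSign.jpg is on the path. The variable names are chosen here for illustration.

```matlab
% Read the sample image used in the first example on this page.
I = imread("handicapSign.jpg");

% Default link threshold (0.4): adjacent characters are grouped.
groupedBoxes = detectTextCRAFT(I);

% Link threshold of 1: intended to detect each character separately.
charBoxes = detectTextCRAFT(I,LinkThreshold=1);

fprintf("Default LinkThreshold: %d regions\n",size(groupedBoxes,1));
fprintf("LinkThreshold = 1:     %d regions\n",size(charBoxes,1));
```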

`MinSize` — Size of smallest detectable text region
[6,6] (default) | two-element vector

Size of smallest detectable text region in the image, specified as a two-element vector of form [height width].

`MaxSize` — Size of largest detectable text region
size of input image (default) | two-element vector

Size of largest detectable text region in the image, specified as a two-element vector of form [height width]. By default, this value is set to the height and width of the input image.
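Note that MinSize and MaxSize use [height width] order, while bounding boxes use [x y width height] order. The sketch below shows that difference by filtering a set of made-up boxes manually against size limits. It is only an illustration of the argument ordering, not a description of how the function applies these limits internally.

```matlab
% Synthetic detections in [x y width height] form (hypothetical values).
bboxes = [ 10  20   4   5;    % tiny region
           50  60 120  30;    % word-sized region
          200  40 400 300];   % very large region

minSize = [6 6];       % [height width], same order as MinSize
maxSize = [100 200];   % [height width], same order as MaxSize

% Box columns 3 and 4 hold width and height, so reorder them to
% [height width] before comparing against the limits.
boxHW = bboxes(:,[4 3]);
keep  = all(boxHW >= minSize,2) & all(boxHW <= maxSize,2);

% Keep only the boxes whose height and width fall within both limits.
filtered = bboxes(keep,:);
```

Here only the second box remains, because the first is smaller than minSize and the third is larger than maxSize.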

`ExecutionEnvironment` — Hardware resource
`"auto"` (default) | `"cpu"` | `"gpu"`

Hardware resource for processing images with the CRAFT model, specified as "auto", "gpu", or "cpu".

ExecutionEnvironment	Description
`"auto"`	Use a GPU if available. Otherwise, use the CPU. The use of GPU requires Parallel Computing Toolbox™ and a CUDA® enabled NVIDIA® GPU. For information about the supported compute capabilities, see GPU Computing Requirements (Parallel Computing Toolbox).
`"gpu"`	Use the GPU. If a suitable GPU is not available, the function returns an error message.
`"cpu"`	Use the CPU.

Data Types: char | string
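If you want the hardware choice to be explicit, for example to log which device was used, you can select the environment yourself instead of relying on "auto". The sketch below uses the MATLAB function canUseGPU for this check and assumes the sample image handicapSign.jpg is on the path. The variable name env is chosen here for illustration.

```matlab
% canUseGPU returns true only when a supported GPU is available
% for computation.
if canUseGPU
    env = "gpu";
else
    env = "cpu";
end
fprintf("Running text detection on: %s\n",env);

% Pass the selected environment to the detector.
I = imread("handicapSign.jpg");
bboxes = detectTextCRAFT(I,ExecutionEnvironment=env);
```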

`Acceleration` — Performance optimization
`"auto"` (default) | `"mex"` | `"none"`

Performance optimization, specified as "auto", "mex", or "none".

Acceleration	Description
`"auto"`	Automatically apply a number of optimizations suitable for the input network and hardware resource.
`"mex"`	Compile and execute a MEX function. This option is available when using a GPU only. You must also have a C/C++ compiler installed. For setup instructions, see Set Up Compiler (GPU Coder).
`"none"`	Disable all acceleration.

The default option is "auto". If you use the "auto" option, MATLAB® never generates a MEX function.

Using the "Acceleration" options "auto" and "mex" can offer performance benefits, but at the expense of an increased initial run time. Subsequent calls with compatible parameters are faster. Use performance optimization when you plan to call the function multiple times using new input data.

The "mex" option generates and executes a MEX function based on the network and parameters used in the function call. You can have several MEX functions associated with a single network at one time. Clearing the network variable also clears any MEX functions associated with that network.

The "mex" option is only available when you are using a GPU. Using a GPU requires Parallel Computing Toolbox and a CUDA enabled NVIDIA GPU. For information about the supported compute capabilities, see GPU Computing Requirements (Parallel Computing Toolbox). If Parallel Computing Toolbox or a suitable GPU is not available, then the function returns an error.
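To see the trade-off between initial run time and later calls, you can time two consecutive calls with the same settings. This is a rough sketch that assumes the sample image handicapSign.jpg is on the path. The measured times depend on the hardware and on what has already run in the session, so no particular values are implied.

```matlab
% Read the sample image used in the first example on this page.
I = imread("handicapSign.jpg");

% First call: may include one-time setup and optimization cost.
tic
detectTextCRAFT(I,Acceleration="auto");
tFirst = toc;

% Second call with compatible parameters.
tic
detectTextCRAFT(I,Acceleration="auto");
tSecond = toc;

fprintf("First call: %.2f s, second call: %.2f s\n",tFirst,tSecond);
```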

Output Arguments

`bboxes` — Bounding boxes for detected text regions
M-by-4 matrix

Bounding boxes specifying the detected text regions, returned as an M-by-4 matrix. M is the number of detected text regions. Each row in the matrix is a vector of form [x y width height]. The vector specifies the upper left corner and size of the detected region in pixels.
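A common next step is to crop each detected region, for example to pass it to a recognizer. The sketch below shows how a row of the form [x y width height] maps to row and column indices. It uses a small synthetic image and made-up boxes, and it assumes that the box values are integers and that each box lies inside the image.

```matlab
% Synthetic grayscale image and two hypothetical detections.
I = uint8(magic(8));
bboxes = [2 3 4 2;     % [x y width height]
          5 1 3 3];

crops = cell(size(bboxes,1),1);
for k = 1:size(bboxes,1)
    x = bboxes(k,1);  y = bboxes(k,2);
    w = bboxes(k,3);  h = bboxes(k,4);
    % Rows follow y (height) and columns follow x (width).
    crops{k} = I(y:y+h-1, x:x+w-1, :);
end
```

Each cropped patch has height rows and width columns, so the first crop here is 2-by-4 and the second is 3-by-3.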

Extended Capabilities

C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.

Usage notes and limitations:

The roi argument must be a code generation constant (coder.const()) and a 1-by-4 vector.
Code generation does not support variable-size data for the input argument I.
Only the CharacterThreshold, LinkThreshold, MinSize, and MaxSize name-value arguments are supported.
When you set the build_type argument of coder.config to dll to generate code that does not use any third-party library, you must set the DynamicMemoryAllocationForFixedSizeArrays property of the coder.CodeConfig object to true.
To avoid memory allocation errors during library-free C++ code generation on Windows platforms, you must set the DynamicMemoryAllocationForFixedSizeArrays property of the coder.CodeConfig object to true.
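The limitations above can be sketched as an entry-point function with a constant ROI and a fixed-size input, together with a matching build configuration. The function name detectTextEntry, the ROI values, and the 480-by-640 input size are hypothetical. The build commands are a sketch under the assumption of a library-free C++ target, so check them against your MATLAB Coder release before use.

```matlab
function bboxes = detectTextEntry(I) %#codegen
% Hypothetical entry-point function for code generation.

% The ROI is a 1-by-4 vector wrapped in coder.const so that it is a
% code generation constant.
roi = coder.const([120 80 250 200]);

% Only the supported name-value arguments are used.
bboxes = detectTextCRAFT(I,roi,'CharacterThreshold',0.4, ...
    'LinkThreshold',0.4);
end

% Build commands, run separately at the MATLAB command line:
%   cfg = coder.config("dll");
%   cfg.TargetLang = "C++";
%   cfg.DeepLearningConfig = coder.DeepLearningConfig("none");
%   cfg.DynamicMemoryAllocationForFixedSizeArrays = true;
%   codegen -config cfg detectTextEntry -args {ones(480,640,3,"uint8")}
```

The -args example fixes the input size, which is consistent with the restriction on variable-size data for I.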

GPU Code Generation
Generate CUDA® code for NVIDIA® GPUs using GPU Coder™.

Usage notes and limitations:

The roi argument must be a code generation constant (coder.const()) and a 1-by-4 vector.
Code generation does not support variable-size data for the input argument I.
Only the CharacterThreshold, LinkThreshold, MinSize, and MaxSize name-value arguments are supported.

Version History

Introduced in R2022a

R2024a: Generate CUDA code using GPU Coder

detectTextCRAFT now supports the generation of optimized CUDA code (requires GPU Coder™).

R2024a: Generate C/C++ code using MATLAB Coder

detectTextCRAFT now supports the generation of C/C++ code (requires MATLAB Coder™).
