Train Custom OCR Model
Training an Optical Character Recognition (OCR) model to recognize custom text consists of three steps:
1. Prepare training data
2. Train an OCR model
3. Evaluate OCR training
Prepare Training Data
The Computer Vision Toolbox™ provides deep learning based OCR training and supports transfer learning and fine-tuning of the OCR models shipped with the toolbox. Training with deep learning requires hundreds of training samples for each character in the character set. After collecting training images, you must label, save, and combine the data into a datastore before training an OCR model. Use these steps to prepare the data.
Label Training Images
You can use the Image Labeler app to interactively label image ground truth data. Ground truth for OCR must contain the location of text regions and the actual text within the regions. You can specify the location and size of the text region using a rectangle ROI label. You can specify the actual text within each rectangle ROI by adding a string Attribute to the rectangle ROI label. Use one of these methods to launch the Image Labeler:
MATLAB® Toolstrip: On the Apps tab, under Image Processing and Computer Vision, click the Image Labeler app icon.
MATLAB command prompt: Enter imageLabeler.
The Image Labeler toolstrip provides these buttons to use for labeling OCR data:
Import — Load a collection of images.
Label — Add rectangle bounding box labels.
Attribute — Add a string attribute to the rectangle ROI label that defines the type of content in the bounding box.
Export — Export labels and label definitions as a ground truth object.
For more details about using the Image Labeler app, see Get Started with the Image Labeler.
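As a minimal sketch, you can also open the app from the command line with an image collection already loaded. The folder name below is a hypothetical placeholder, not a path from this page; replace it with the location of your own images.

```matlab
% Hypothetical folder that contains the images to label (assumption).
imageFolder = fullfile(pwd,"ocrTrainingImages");

% Collect the images into an ImageDatastore object.
imds = imageDatastore(imageFolder);

% Open the Image Labeler app with the image collection loaded.
imageLabeler(imds)
```

Passing a datastore rather than clicking Import keeps the image list reproducible if you need to reopen the same labeling session later.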
Create Label Data Using Image Labeler
1. Load an image collection from a folder or an ImageDatastore object into the Image Labeler app.
2. Define a rectangle ROI and name it. For example, Text.
3. Define a string attribute for the label, which defines the type of text string in the ROI, and name it. For example, word.
4. Label the text in the collection of images, or use an automation algorithm to prelabel some of the text automatically. For more details about using an automation algorithm, see Automate Ground Truth Labeling for OCR.
5. Export the labeled data to the workspace or save it to a file. The app exports the labels as a groundTruth object.
Load Training Data From Ground Truth
Use the ocrTrainingData function to load training data from the exported groundTruth object. The ocrTrainingData function returns three datastores for images, bounding boxes, and text. For the purposes of training, combine those datastores using the combine function.
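The loading and combining step might look like the following sketch. It assumes that a groundTruth object named gTruth (a hypothetical variable name) was exported from the Image Labeler app, and that the label and attribute use the example names Text and word from the labeling steps.

```matlab
% Assumes gTruth is the groundTruth object exported from the Image Labeler app.
% "Text" and "word" are the example label and attribute names; replace them
% with the names defined in your own labeling session.
labelName = "Text";
attributeName = "word";

% Extract separate datastores for the images, bounding boxes, and text.
[imds,boxds,txtds] = ocrTrainingData(gTruth,labelName,attributeName);

% Combine the three datastores into a single datastore for training.
cds = combine(imds,boxds,txtds);

% Read one sample to inspect the image, bounding boxes, and text,
% then reset the datastore so that training starts from the first sample.
sample = read(cds);
reset(cds);
```

Reading a single sample before training is an inexpensive way to confirm that the label and attribute names match what the ground truth actually contains.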
Train an OCR Model
Use the trainOCR function to train an OCR model, and configure the training options using the ocrTrainingOptions function. Optionally, you can quantize the trained model using the quantizeOCR function for faster performance, although quantization can decrease the accuracy of the model. Quantization can be helpful if you deploy the OCR model to resource-constrained systems.
For an example that demonstrates how to use these functions, see Train an OCR Model to Recognize Seven-Segment Digits.
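The following sketch shows one possible arrangement of these calls. It assumes a combined training datastore named cdsTrain (a hypothetical variable name). The output folder, model names, and option values are illustrative assumptions rather than recommended settings; check the function reference pages for the options available in your release.

```matlab
% Assumes cdsTrain is a combined datastore of images, bounding boxes, and text.

% Hypothetical output folder and model name (assumptions).
outputDir = fullfile(pwd,"trainedOCRModel");
if ~isfolder(outputDir)
    mkdir(outputDir);
end
modelName = "customModel";

% Illustrative training options; tune these values for your own data.
ocrOptions = ocrTrainingOptions(MaxEpochs=15, ...
    InitialLearnRate=1e-3, ...
    OutputLocation=outputDir);

% Fine-tune the "english" model shipped with the toolbox.
% trainOCR returns the file name of the trained model.
baseModel = "english";
ocrModel = trainOCR(cdsTrain,modelName,baseModel,ocrOptions);

% Optionally quantize the trained model for faster inference,
% possibly at the cost of some accuracy.
quantizedModelName = "customModelQuantized";
quantizedModel = quantizeOCR(ocrModel,quantizedModelName);
```

Keeping both the full and the quantized model lets you compare their accuracy on the same test data before choosing which one to deploy.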
Evaluate OCR Training
Use the metrics generated by the evaluateOCR function to evaluate the quality of the OCR model.
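An evaluation pass could be sketched as follows. It assumes a held-out combined datastore named cdsTest and a trained model file name stored in ocrModel; both variable names are hypothetical.

```matlab
% Assumes cdsTest is a combined datastore of held-out images, bounding boxes,
% and text, and ocrModel is the model file name returned by trainOCR.

% Run the trained model on the test data.
ocrResults = ocr(cdsTest,Model=ocrModel);

% Compare the recognition results against the ground truth text.
metrics = evaluateOCR(ocrResults,cdsTest);

% Display the character error rate over the whole data set.
disp(metrics.DataSetMetrics.CharacterErrorRate)
```

A lower character error rate indicates a better model. Evaluating on data that was not used for training gives a more realistic estimate of how the model behaves on new images.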
See Also
Apps
Functions
ocr | trainOCR | evaluateOCR | quantizeOCR | ocrTrainingData