Text Detection and Recognition

Detect and recognize text using image feature detection and description, deep learning, and OCR

Detecting and recognizing text in images is a common task performed in computer vision applications. For example, you can capture video of a road scene from a moving vehicle, recognize signposts in the captured scene, and alert the driver about the signs.

You can combine detection and recognition into a two-step process: the first step finds regions that contain text, and the second step recognizes the text within those regions.

Input image showing an accessible parking sign, connected to a detector, which outputs an image with predicted bounding boxes overlaid on the sign text, connected to a recognizer that outputs a list of the words recognized on the sign.
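The two-step process can be sketched as follows. This is a minimal illustration, not a definitive workflow: it assumes the sample image `handicapSign.jpg` is available on the MATLAB path and that the support package required by `detectTextCRAFT` is installed. The variable names are illustrative.

```matlab
% Step 1: detect regions that contain text.
% Assumption: handicapSign.jpg is a sample image on the MATLAB path.
I = imread("handicapSign.jpg");
bboxes = detectTextCRAFT(I);      % M-by-4 matrix of [x y width height] boxes

% Step 2: recognize the text inside each detected region.
% Passing M regions of interest returns one ocrText object per region.
results = ocr(I, bboxes);

% Collect the recognized text for each region into a string array.
words = strings(numel(results), 1);
for k = 1:numel(results)
    words(k) = strtrim(string(results(k).Text));
end
disp(words)
```

Keeping detection and recognition separate lets you swap the detector (for example, MSER instead of CRAFT) without changing the recognition step.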

Text detection algorithms use local image features, machine learning, or deep learning to locate or segment text within an image. The examples in Computer Vision Toolbox™ demonstrate how to detect text using blob analysis, the maximally stable extremal regions (MSER) feature detector, and the character region awareness for text detection (CRAFT) deep learning model.
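As a feature-based alternative to a deep learning detector, the sketch below finds candidate text regions with MSER. It assumes the sample image `businessCard.png` is on the MATLAB path. The `RegionAreaRange` and `ThresholdDelta` values are illustrative starting points, not recommended settings, and the detected regions usually need further filtering before recognition.

```matlab
% Assumption: businessCard.png is a sample image on the MATLAB path.
I = imread("businessCard.png");
gray = im2gray(I);                % detectMSERFeatures expects a grayscale image

% Detect MSER regions. The parameter values below are illustrative only.
regions = detectMSERFeatures(gray, ...
    "RegionAreaRange", [200 8000], ...
    "ThresholdDelta", 4);

fprintf("Detected %d candidate regions\n", regions.Count);

% Overlay the pixels of each detected region on the image.
figure
imshow(gray)
hold on
plot(regions, "showPixelList", true, "showEllipses", false)
hold off
```

MSER responds to any stable, uniform-intensity region, so some detections will not be text. Geometric filtering of the regions is a common follow-up step.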

After you detect the text, recognition models based on machine learning or deep learning process the text regions and return the predicted text. The ocr function uses pretrained language models to recognize text in multiple languages. You can also train a custom language model using the trainOCR function. For more information, see Getting Started with OCR.
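The following sketch shows one way to inspect what `ocr` returns. It assumes the sample image `businessCard.png` is on the MATLAB path. The confidence threshold `minConfidence` is a hypothetical value chosen for illustration.

```matlab
% Assumption: businessCard.png is a sample image on the MATLAB path.
I = imread("businessCard.png");
results = ocr(I);                 % returns an ocrText object

% Keep only the words recognized with reasonable confidence.
minConfidence = 0.5;              % illustrative threshold, not a recommended value
keep = results.WordConfidences > minConfidence;
confidentWords = results.Words(keep);
confidentBoxes = results.WordBoundingBoxes(keep, :);

% Annotate the image with the retained words and their bounding boxes.
Iout = insertObjectAnnotation(I, "rectangle", confidentBoxes, confidentWords);
figure
imshow(Iout)
```

Filtering by `WordConfidences` is a simple way to discard spurious recognitions before the text is passed to later processing.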


Image Labeler: Label images for computer vision applications



ocr: Recognize text using optical character recognition
ocrText: Store OCR results
visionSupportPackages: Start Installer to download, install, or uninstall Computer Vision Toolbox data
trainOCR: Train OCR model to recognize text in image (Since R2023a)
evaluateOCR: Evaluate OCR results against ground truth (Since R2023a)
ocrMetrics: Store OCR quality metrics (Since R2023a)
ocrTrainingOptions: Options for training OCR model (Since R2023a)
ocrTrainingData: Create training data for OCR from ground truth (Since R2023a)
quantizeOCR: Quantize OCR model (Since R2023a)
detectTextCRAFT: Detect texts in images by using CRAFT deep learning model (Since R2022a)
detectMSERFeatures: Detect MSER features
vision.BlobAnalysis: Properties of connected regions
extractHOGFeatures: Extract histogram of oriented gradients (HOG) features


Get Started