Text, Barcode, and Fiducial Marker Detection and Recognition

Detect and recognize text (OCR), barcodes, and fiducial markers using AI models

Computer Vision Toolbox™ supports text, barcode, and fiducial marker detection in images and videos using a combination of deep learning models and classical computer vision techniques. These capabilities are essential for applications such as autonomous driving, industrial automation, document analysis, and augmented reality.

Reading text from an image is a two-step process: first, detect the regions in the image that contain text, and then recognize the text within those regions using optical character recognition (OCR). The toolbox offers multiple text detection approaches for locating text in complex scenes, including blob analysis, the maximally stable extremal regions (MSER) feature detector, and the deep learning-based CRAFT model. Alternatively, you can use the Image Labeler and Video Labeler apps to perform interactive and AI-assisted annotation of text regions.
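The detect-then-recognize workflow can be sketched in a few lines. This is a minimal illustration, not the only way to combine these functions. It assumes that the support package required by detectTextCRAFT is installed, and it uses businessCard.png, assumed here to be a sample image available on the MATLAB path, as a stand-in for your own image.

```matlab
% Step 1: detect text regions with the CRAFT deep learning model.
% Assumption: the text detection support package is installed.
I = imread("businessCard.png");   % assumed sample image; substitute your own
bboxes = detectTextCRAFT(I);      % one [x y width height] row per text region

% Step 2: recognize the text inside each detected region.
results = ocr(I, bboxes);         % one ocrText object per region

% Collect the recognized string for each region.
recognized = strings(numel(results), 1);
for k = 1:numel(results)
    recognized(k) = strtrim(string(results(k).Text));
end
disp(recognized)
```

Passing the detected boxes to ocr restricts recognition to those regions, which tends to help on cluttered scenes where running OCR on the whole image picks up non-text structure.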

Once you have identified text regions in an image, you can use OCR to recognize the text using pretrained language models that support multiple languages. For custom applications, you can train your own OCR models using the trainOCR function. For more information, see Getting Started with OCR and Train Custom OCR Model.
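As a rough sketch of working with recognition results, the following example runs ocr on a whole image with the default pretrained model and keeps only the words recognized with high confidence. The image name and the 0.8 threshold are illustrative assumptions, not recommended values.

```matlab
% Recognize text in the whole image using the default pretrained model.
I = imread("businessCard.png");   % assumed sample image; substitute your own
results = ocr(I);                 % a single ocrText object

% Words, WordConfidences, and WordBoundingBoxes are parallel:
% entry k of each describes the same recognized word.
minConfidence = 0.8;              % illustrative threshold; tune for your data
keep = results.WordConfidences > minConfidence;
confidentWords = results.Words(keep);
confidentBoxes = results.WordBoundingBoxes(keep, :);

% Overlay the retained words on the image for a visual check.
annotated = insertObjectAnnotation(I, "rectangle", confidentBoxes, confidentWords);
imshow(annotated)
```

Filtering on confidence is a simple way to discard spurious detections before passing the text to downstream processing.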

For barcode and fiducial marker detection, the toolbox can read and decode 1-D and 2-D barcodes and detect fiducial markers such as AprilTags and ArUco markers. You can also generate ArUco markers programmatically, which is useful for calibration and tracking tasks in robotics and augmented reality (AR) systems.
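The following sketch shows both capabilities under some stated assumptions. The file barcodeQR.jpg is assumed to be a sample image on the MATLAB path. The marker family name "DICT_4X4_50", the marker ID, the sizes, and the argument order of generateArucoMarker (family, ID, size in pixels) reflect my understanding of the R2024a interface, so check the function reference pages before relying on them.

```matlab
% Decode a barcode. readBarcode returns the message, the detected
% format, and the locations of the barcode in the image.
I = imread("barcodeQR.jpg");      % assumed sample image; substitute your own
[msg, detectedFormat, loc] = readBarcode(I);
fprintf("Decoded %s barcode: %s\n", detectedFormat, msg);

% Generate an ArUco marker, then read it back as a round-trip check.
markerFamily = "DICT_4X4_50";     % assumed family name
markerId = 7;                     % illustrative ID within the family
markerSize = 200;                 % illustrative side length in pixels
marker = generateArucoMarker(markerFamily, markerId, markerSize);

% Surround the marker with a light margin so that its dark border
% contrasts with the background, as it would on a printed sheet.
scene = padarray(marker, [50 50], max(marker(:)));

[ids, locs] = readArucoMarker(scene, markerFamily);
disp(ids)
```

A round trip like this is a quick way to confirm that the marker family used for generation matches the one used for detection before printing markers for a calibration or tracking setup.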

Apps

Image Labeler: Label images for computer vision applications
Video Labeler: Label video for computer vision applications

Functions

detectTextCRAFT: Detect text in images by using the CRAFT deep learning model (Since R2022a)
detectMSERFeatures: Detect MSER features
vision.BlobAnalysis: Properties of connected regions
extractHOGFeatures: Extract histogram of oriented gradients (HOG) features
ocr: Recognize text using optical character recognition
ocrText: Store OCR results
visionSupportPackages: Start Installer to download, install, or uninstall Computer Vision Toolbox data
readAprilTag: Detect and estimate pose for AprilTag in image
readArucoMarker: Detect and estimate pose for ArUco marker in image (Since R2024a)
generateArucoMarker: Generate ArUco marker images (Since R2024a)
readBarcode: Detect and decode 1-D or 2-D barcode in image
trainOCR: Train OCR model to recognize text in image (Since R2023a)
evaluateOCR: Evaluate OCR results against ground truth (Since R2023a)
ocrMetrics: Store OCR quality metrics (Since R2023a)
ocrTrainingOptions: Options for training OCR model (Since R2023a)
ocrTrainingData: Create training data for OCR from ground truth (Since R2023a)
quantizeOCR: Quantize OCR model (Since R2023a)
