Incorporating Machine Learning Models into Safety-Critical Systems - MATLAB

    Incorporating Machine Learning Models into Safety-Critical Systems

    Overview

    Neural networks can obtain state-of-the-art performance in various tasks, including image classification, object detection, speech recognition, and machine translation. Due to this impressive performance, there has been a desire to utilize neural networks for applications in industries with safety-critical components, such as aerospace, automotive, and healthcare. However, while these industries have established processes for verifying and validating traditional software, it is often unclear how to verify the reliability of neural networks. This issue is especially pressing in aviation, where machine learning has the potential to revolutionize the industry, yet existing airborne certification standards present major incompatibilities with the technology. These include issues with ML model traceability and explainability and the inadequacy of traditional coverage metrics. The certification of ML-based airborne systems is problematic due to these incompatibilities, and new certification standards intended to address these challenges are not yet released.

    In this session, we’ll introduce a case study for certifying an airborne machine learning system. We’ll build a runway sign classification system that receives images from a forward-facing camera in the aircraft and then detects airport runway signs, aiding the pilot in navigation and situational awareness at the airport. We propose and implement a custom ML certification workflow for machine learning systems based on existing certification standards to tackle the previously mentioned challenges. We will walk you through all the steps in the workflow, from defining the ML requirements, managing the data, training the model, and verifying its performance to the implementation of the system in hardware and validation of the requirements. This case study will provide insights and potential solutions across industries with safety-critical components seeking to integrate neural networks into their operations.

    Highlights

    • Learn about gaps between Machine Learning development and Certification (e.g., for DO-178C, ISO 26262).
    • Learn about a custom ML certification workflow designed to tackle these certification challenges within the framework of existing standards.
    • Explore a case study of an airborne system developed in compliance with the proposed custom ML certification workflow.

    About the Presenter

    Lucas Garcia is a principal product manager for deep learning at MathWorks with more than 15 years of machine learning experience and research in the computer software industry. He works with customer-facing and development teams to define, develop, and launch new capabilities and applications that meet customer needs and market trends in deep learning. Lucas joined MathWorks in 2008 as a customer-facing engineer and has worked with engineers and scientists across industries to help them tackle real-world problems in AI. Before joining MathWorks, he worked as a software developer in finance. Lucas holds a Ph.D. in applied mathematics from the Complutense University of Madrid and Polytechnic University of Madrid.

    Recorded: 26 Mar 2024

    AI is becoming more and more present in our daily lives. And what once seemed like the realm of science fiction is now becoming an everyday convenience, enhancing our experiences in ways that were once just unimaginable. And as AI continues to evolve, its applications are expanding into more critical domains. Domains where the stakes are significantly higher.

    Hello, everyone, and welcome to this session titled Incorporating Machine Learning Models into Safety-Critical Systems. My name is Lucas Garcia, and I'm a Product Manager for Deep Learning at MathWorks. I've been with MathWorks for over 15 years working with engineers and scientists across industries to help them tackle real-world problems using AI. I'm a mathematician by training, and my PhD is focused on how neural networks can be used to solve combinatorial optimization problems. Today, we'll explore the exciting topic of Incorporating Machine Learning Models into Safety-Critical Systems. Let's get started.

    As we all know, AI has become useful and very successful for a wide variety of tasks. And this has led to a desire to incorporate and deploy AI algorithms in safety-critical situations like those that you might find in the domains of automotive, medical, and aerospace. And when deploying in these domains, it is important to explain, verify, and validate the behavior of the model.

    Now naturally, verifying and validating AI-enabled systems comes with a wide set of challenges, including data management, traceability, robustness requirements, and much more. This is just a representative sample of some of the challenges that may arise from having AI components in your system.

    Today we will be focusing on addressing many of these challenges. So here's the agenda for today. We'll start with an introduction to AI certification in airborne systems. We will then focus on the certification process for a low-criticality application with a case study on a runway sign classifier. And finally, we'll focus on what moving to certification of higher criticality levels would entail.

    Let's start with an overview of AI certification in airborne systems. Now technology has advanced by leaps and bounds, reducing cockpit workloads for pilots and making flying safer. Not so long ago, it used to be the case that you could find four people in the cockpit of an aircraft.

    Here's an image of the cockpit of a Boeing 707, which doesn't operate commercially anymore. But back then, apart from the pilot and copilot, it was common to have a flight engineer and also a navigation engineer on board. Now, as technology evolved, many tasks have been taken over by software such that, nowadays, we mainly have a pilot and co-pilot on board commercial aircraft.

    There are numerous benefits that the incorporation of AI could bring in the future, including single-pilot operation, predictive maintenance applications, voice-recognition systems in the cockpit, runway-object detection, visual collision avoidance, autonomous flights, and more.

    But we cannot simply design an AI component and just use it in airborne software. There are rules, or better said, standards to be followed. And the aircraft standard related to software design is given by DO-178C. DO-178C is the software standard that outlines the software engineering processes, activities, and tasks necessary to ensure that airborne software is reliable, is safe, and can perform its intended functions.

    DO-178C defines five software levels that are based on the potential impact of a software error on the aircraft's safety or operation. These are referred to as Design Assurance Levels, or DALs, and I'll speak more to those later. DO-178C is widely accepted as the standard for developing airworthy software and is mandatory for software used in civil aviation applications in many countries. The latest version, DO-178C, was released in December 2011 and includes updates to the previous revision, DO-178B, to reflect the advancements in software technology and development processes.

    Now, the long service history of DO-178, with various refinements based on real-world experience across revisions A, B, and C, has allowed the industry to build a level of trust in the process as defined by the standard.

    So what about when we throw AI into the mix as a software component? We know how AI and machine learning offer the best solution for many applications, including vision, audio, and more, and it is very challenging to build rule-based software with that level of performance. However, it turns out that several DO-178C objectives cannot be directly applied to software with a machine-learning component. And this creates a problem.

    How are we then going to use machine learning in aviation software and certify it against DO-178C? How can we build trust in the verification methods for machine learning? The potential is huge, but the gaps and shortcomings need to be addressed.

    For that matter, industry and regulators are working tirelessly and making significant progress toward machine-learning certification in aerospace. EASA, the European Union Aviation Safety Agency, has published two iterations of its roadmap and worked together with various companies to produce seminal work for design assurance of neural networks and the use of formal methods for learning assurance. The FAA, although it hasn't published as much collateral yet, also stays at the forefront of these developments.

    In that regard, MathWorks is playing an active role in the joint working group from EUROCAE and SAE together with EASA, FAA, and key aviation companies. The goal is to put together a new process standard, ARP6983, for the development and certification approval of aeronautical safety-related products implementing AI. This is work in progress, which has been postponed a few times already and is not expected to be released for at least another year.

    So let's start thinking about how big the gap really is between machine learning and traditional software development. We know that for rule-based software, we have well-established definitions and methods. To show compliance with objectives, we have DO-178C.

    And so we can look at some code, and we will be able to trace it back all the way to our requirements. We can produce coverage metrics and understand or explain its behavior. However, this isn't the case when machine learning is incorporated. In fact, there are 15 out of the 71 DO-178C objectives that are incompatible with machine-learning components. And we know this from the working group efforts on the statement of concerns document that was published in 2021.

    So we'd like to be able to answer what each of these concepts would mean: explainability, traceability, coverage. So essentially, there are known gaps. But as we do not have any new standard yet, we wanted to see how far we can go with the current standards.

    Can we develop a workflow to address certification of machine-learning components using the current standards? And the answer is that for low-criticality applications, we can if certain assumptions about the machine-learning development workflow are applied.

    In fact, I firmly believe that focusing on understanding the boundaries of the current standard will help you and your organization be ahead of the curve whenever the new standards are released. And I firmly believe that this holds true not just for aviation but for other industries as well.

    So can we certify machine learning now? I've already answered this question. So let's focus on how we will do it. This work is based on joint research between the Technical University of Munich, MathWorks, and NASA, mainly carried out by Konstantin Dmitriev, who is a developer at MathWorks and a research associate at TUM.

    And their intent is to take an incremental certification approach leveraging existing standards, starting with a low-criticality machine-learning workflow. This allows us to reduce the number of objectives and enable black-box verification. Here in the table, we can see the different failure categories, the corresponding Design Assurance Level, or DAL, and its DO-178C objectives. We can note that for DAL D, we need to fulfill 26 objectives in order to certify the software component. Note that for the highest criticality level, DAL A, we would need to fulfill 71 objectives.

    As for step two, which I'll cover later in the presentation, we'll again be leveraging the existing standards and an architectural mitigation strategy that I will introduce later. So let's dive deeper into step one, and that is the low-criticality DAL D machine-learning workflow.

    So for DAL D, we can treat the software component as a black box. We might have traditional non-machine learning source code, commercial off-the-shelf object code, machine-learning models and machine-learning source code all put together into integrated object code. But we don't need to know any of this from the outside.

    And so for DAL D, we can just look at this as a black box. We have our high-level software requirements from which we have derived the software architecture, and within this software architecture, some component is this black box. We will also develop tests from our high-level requirements to test the black box.

    In summary, the origin of the source and object code does not matter, and so black-box verification according to DO-178C will satisfy all the objectives. Black-box verification techniques are things like testing model accuracy, for example, something we are already used to as AI practitioners. Here's just an excerpt of the compliance analysis that you have to do according to DO-178C. So you can see, for example, here, that for DAL D, you can omit objectives four and five.

    Another thing that we've been looking at is this W-shaped development process and the concept of learning assurance. The W-shaped development process is this framework put together by EASA and Daedalean to adapt the classical V-shaped cycle to AI applications and that I'll be adopting throughout the presentation.

    As you can see, this process starts with requirements allocation to the machine-learning component moving to data management, all the way down to model training. Then we have a key step regarding the verification of the machine-learning model and learning process to then move to the implementation, integration, and testing under the larger system or within the larger system under design.

    Note that this W-shaped development process is, in fact, concurrent with and can co-exist with the V-shaped cycle we typically see for non-machine learning components. So from the system or subsystem requirements, we will derive or allocate requirements to the AI component and follow the W-shaped diagram, whereas if it's a non-machine learning item, we can follow the traditional V cycle.

    I'd like to show you next how this can be done in practice and pursue certification of a DAL D system with this case study where we'll create a runway sign classifier. So what I'll be showing you, in the next slides, is how MathWorks Tools enable certification of a DAL D AI-enabled system.

    And for the case study that I'll introduce next, I'll be guiding you through the various activities that must be completed in order to certify a system with an AI component. Now, unlike traditional software, the AI component is the solution of an optimization problem and is data driven. And this optimization problem consists of an objective function that we wish to minimize, a complex mathematical model, maybe a machine learning or a deep learning model, and a data set.

    So therefore, to verify the resulting AI model, specific activities are going to be required which are not present in the conventional V&V. The MathWorks ecosystem supports these activities and as we will show, enables a workflow that achieves DO compliance.

    This workflow is encapsulated in this W diagram. And as I mentioned earlier, traditional software and hardware V&V activities follow the purple V-shaped diagram and pertain to non-machine learning related items. For DAL D, the criticality level that we will focus on in this section, verification activities for the machine learning component are things like testing the model's accuracy on a held-out test set. Provided the model attains sufficiently high performance on a held-out test set, we can be somewhat confident in the model's generalization to the task at hand and the correctness of its functionality.

    So let's take a deeper look into this case study. The example we have is a simple visual sign recognition system that will provide decision support to a pilot while taxiing on the runway. Now, pilots need to follow specific signs, and these may be informational or mandatory signs. And we want to develop a system that provides this information to the pilot, either through some audio output or overlaid on the heads-up display.

    The system consists of a front-mounted camera on an aircraft, and this camera relays images. We do some pre-processing and rescaling, and then we pass it on to a deep neural network. And in the context of this example, we're using YOLO networks, and the role of these YOLO networks is to basically locate and classify airport signs on the runway.

    These signs will either be informational signs, which are yellow, or mandatory signs which are red. And the bounding boxes and classification labels will be identified and then passed to the pilot for information and decision-making process. So this whole system is low criticality. It's a DAL D system. It's just providing information to the pilot, and it's not taking control of the aircraft in any way.

    Note that this case study is comprised of a few elements. The complete system, this includes everything from the camera to the heads-up display, the machine-learning component, which is in this case, the deep neural network, and the data. As for the data set, we will be using instances of different airport runways in various locations and operating conditions, such as snow weather conditions or nighttime, with the goal to adequately represent the operational design domain of the problem. The images have been created synthetically using Simulink, Aerospace Blockset, and FlightGear.

    So to support the discussion for this case study, we will make reference to the system, the component, and the data in the following slides. And the example that I'll be showing today is included as part of Deep Learning Toolbox and also the DO Qualification Kit.

    The Deep Learning Toolbox example offers a lighter version of the end-to-end reference workflow example. Whereas, the DO Qualification Kit example addresses specific activities that support certification against DO-178C. So for this example I just introduced, I'd like to take you through all the steps in this workflow starting with the top-left requirements allocated to the AI constituent.

    So we have written a set of requirements in Requirements Toolbox for this problem. Requirements Toolbox allows you to define requirements and allocate them to the AI or machine-learning constituent. We can see here, we have functional requirements and also system operational requirements.

    These are the system requirements that are derived in the system process, ARP4754A, and these will need to be allocated to the component or the system. So we have functional requirements (what does my system have to do?) and operational requirements (in which context does my system have to perform?).

    So we have written a set of requirements that we anticipate you would require for such a system. And as you can see, there are functional requirements covering things like detection latency, detection precision, and how many signs the system should be able to detect and overlay at once. And here we also see the system operational domain requirements: things like at which airports the system is expected to be operational, light conditions, and so on.

    So we found that the system functional requirements naturally lend themselves to requirements on our machine-learning component while our system operational domain requirements naturally lend themselves to requirements on the data. So using the Requirements Editor, we take our system requirements, for example, the weather condition requirement, and we can link this directly onto our data requirements.

    So here we have four weather conditions: fog, snow, rain, and fair. And we can link these requirements to four separate requirements on the data. But in order to really tie these requirements to the data images themselves, we have created a data trace table. And this table is automatically generated from a MATLAB script. It puts together all the data requirement IDs with the corresponding synthetically generated data.

    When generating the synthetic data, metadata is recorded upon construction with the requirements that are met by each of the generated images. So for example, let's say I create an image in San Francisco Airport with fair weather conditions. Those requirement IDs will be tagged in the metadata.

    And then for each data requirement ID, we can search through the corpus of images we have generated and pull in all those images that have the corresponding data requirement ID, and then create a reference to those images through a MATLAB imageDatastore object, as we're doing here. So taking the fair weather condition, for example, we have here 114 images corresponding to the fair weather condition. And they are all held by reference in this image datastore.
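    As a rough illustration of this step, here is a minimal MATLAB sketch. The metadata file, its field names, and the requirement ID are hypothetical stand-ins for whatever your data generation pipeline actually records.

```matlab
% Load the metadata recorded during synthetic data generation.
% (imageMetadata.mat and its field names are hypothetical placeholders.)
meta = load("imageMetadata.mat");    % struct array with fileName, requirementIDs

% Select all images tagged with the fair-weather data requirement.
reqID   = "DataReq-Weather-Fair";    % hypothetical requirement ID
matches = arrayfun(@(m) any(m.requirementIDs == reqID), meta.images);
files   = {meta.images(matches).fileName};

% Hold the matching images by reference in an image datastore.
imdsFair = imageDatastore(files);
fprintf("%d images satisfy %s\n", numel(files), reqID);
```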

    And so this shows how we can link system requirements to data requirements all the way down to the data itself. So to go to the data level, we can simply click the Open Data Store hyperlink, which opens the Image Labeler app. And this means that, basically, we can now review the correctness of that data. We want here to check that the images have been generated correctly or, in more general terms, that the images associated with specific data requirement IDs are the right ones.

    Now that we have pulled in the requirements and linked them to objects in MATLAB, we can also make use of statistical tools in MATLAB to portray the distribution of requirement coverage. So we can see that for the fair weather condition, we had 114 images.

    By looking at this very simple 1-D histogram, we know that there might be a disproportionate number of fair weather condition images when compared to rain, snow, and fog. And so this might be flagged either here or perhaps at a later stage, when we are verifying the accuracy of the trained model, that we may not have enough coverage for rain, snow, and fog.
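    A sketch of what such a coverage histogram might look like in MATLAB, assuming datastores like the one above have been built for each weather-condition requirement:

```matlab
% Count the images linked to each weather-condition requirement and plot
% a simple 1-D coverage histogram.
conditions = ["Fair" "Rain" "Snow" "Fog"];
counts = [numel(imdsFair.Files), numel(imdsRain.Files), ...
          numel(imdsSnow.Files), numel(imdsFog.Files)];

bar(categorical(conditions), counts)
ylabel("Number of images")
title("Data requirement coverage by weather condition")
```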

    Of course, this is a one-dimensional view, and these images are highly correlated with each other. Since we are generating images that have multiple requirements, we could potentially look into a higher-dimensional view to get really a better grasp of coverage. The other thing that this allows us to do is to identify any missing requirement or bias we anticipate we might be seeing in the data set. Note how we have an unfulfilled requirement on the sign rotation images.

    Coming back to the use of the Image Labeler app, you can use it to correct issues with bounding boxes or labels, such as resizing and moving the bounding boxes, deleting boxes, or changing the labels. When doing the synthetic data generation process, we got an incorrect bounding box position, which we can correct manually by just simply dragging that bounding box as we review the data set.

    And then we can also compare the reviewed data with the original data set and see how many boxes have, in fact, moved. Here, we have defined the tolerance as a five-pixel shift of the box.
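    A minimal sketch of that comparison, assuming the original and reviewed labels are available as matching N-by-4 [x y width height] arrays:

```matlab
% Flag reviewed bounding boxes that moved by more than a 5-pixel tolerance
% relative to the original synthetic labels.
tolerance = 5;                                        % pixels
shift     = max(abs(reviewedBoxes - origBoxes), [], 2);
moved     = shift > tolerance;
fprintf("%d of %d boxes moved by more than %d pixels\n", ...
        nnz(moved), numel(moved), tolerance);
```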

    Data augmentation improves network accuracy by randomly transforming the original training data during training. And so by using data augmentation, you can add more variety to the training data without having to actually increase the number of labeled training samples. Here, we have applied two techniques, color-jittering augmentation in the HSV space and random scaling by 50%.
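    A possible implementation of this kind of on-the-fly augmentation in MATLAB, assuming the training samples come from a datastore returning {image, boxes, labels}; the jitter ranges and scale factor shown are illustrative, not the values used in the example:

```matlab
% Apply on-the-fly augmentation to the labeled training datastore.
augmentedTrainingData = transform(trainingData, @augmentSample);

function data = augmentSample(data)
    % Color jitter in HSV space followed by random scaling; the bounding
    % boxes are rescaled so they stay aligned with the image.
    I = data{1}; boxes = data{2}; labels = data{3};
    I = jitterColorHSV(I, "Hue",0.05, "Saturation",0.2, "Brightness",0.2);
    s = 0.5 + rand;                      % random scale factor in [0.5, 1.5]
    I = imresize(I, s);
    boxes = bboxresize(boxes, s);
    data = {I, boxes, labels};
end
```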

    Moving on to the next set of steps, here we want to build our network architecture, train it, and verify it. MathWorks has tools to interactively design your network allowing you to be more productive. You can also interactively track training and performance for different experiments allowing you to select the network that best suits your metrics. And of course, you can accelerate model training leveraging GPUs and the cloud.

    At this point, we need to come up with an architecture to start from and analyze how different variations of the network may perform. Based on the requirements for the machine-learning component, we decided to use the YOLO family for this example. And these models require us to estimate anchor boxes as part of the hyperparameters.

    Anchor boxes are a set of predefined bounding boxes of a certain height and width. These boxes are defined to capture the scale and aspect ratio of specific object classes you want to detect and are typically chosen based on object sizes in your training data set. So that's where we start. We estimate the anchor boxes from our training data set, and once we have that, we can train various YOLO networks with some standard hyperparameters.
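    A minimal sketch of that estimation step, assuming the labeled training data is in a datastore; the number of anchors is a hyperparameter of the chosen YOLO variant:

```matlab
% Estimate anchor boxes from the labeled training data.
numAnchors = 6;                                       % illustrative choice
[anchors, meanIoU] = estimateAnchorBoxes(trainingData, numAnchors);
fprintf("Mean IoU of the estimated anchor boxes: %.2f\n", meanIoU);
```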

    So we train these different YOLO-based object detectors. And here you can see some of the output we are obtaining. We can check the training progress plot, which here is showing the convergence for the YOLO v2 network. And we also report all this information back to a table. For certification, you would like to have a record of this training process, and you can export all this information to, let's say, a PDF, for example.
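    For the YOLO v2 detector, that training step might look something like the following sketch; the backbone, feature layer, and hyperparameter values are illustrative, not the ones used in the example:

```matlab
% Assemble a YOLO v2 network on a pretrained backbone and train it with
% standard hyperparameters; the returned info struct can be archived as
% a certification artifact.
inputSize  = [224 224 3];
numClasses = 2;                                   % informational, mandatory
lgraph = yolov2Layers(inputSize, numClasses, anchors, ...
                      resnet50, "activation_40_relu");

options = trainingOptions("sgdm", ...
    InitialLearnRate=1e-3, ...
    MiniBatchSize=16, ...
    MaxEpochs=20, ...
    Plots="training-progress");

[detectorYOLOv2, info] = trainYOLOv2ObjectDetector(augmentedTrainingData, lgraph, options);
```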

    For YOLO v3, we show how to train the model with our custom training infrastructure. And this allows us to obtain far more control over the training process. So for example, for training object-detector networks, it is preferred to warm up the network weights after initialization. So we set up some random initialization, but then we start with a very small learning rate and gradually warm up those weights.

    And so that is what is being shown here. We see that the learning rate increases over the first 100 iterations, and then we follow the standard training pattern after this. And we find that this improves overall convergence. And of course, we also train a YOLO v4 object detector.
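    A sketch of such a warm-up schedule; the base rate and iteration counts are illustrative, and in the custom training loop this learning rate would be fed to the optimizer update at each iteration:

```matlab
% Ramp the learning rate up over the first 100 iterations, then hold a
% standard rate for the rest of training.
numIterations    = 2000;
baseLearnRate    = 1e-3;
warmupIterations = 100;

learnRate = zeros(1, numIterations);
for iter = 1:numIterations
    if iter <= warmupIterations
        learnRate(iter) = baseLearnRate * iter / warmupIterations;
    else
        learnRate(iter) = baseLearnRate;   % or decay on a piecewise schedule
    end
end
plot(learnRate), xlabel("Iteration"), ylabel("Learning rate")
```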

    We then reach the learning process verification stage. And for the verification, given this is a DAL D component, we only require an overall accuracy for these models. So we define, as part of the system requirements, and then derive for the machine-learning component requirements, that the model should have an average precision per class of at least 95%.

    And these are the precision and recall curves that are quite typical for object-detection models. And we can see that, for instance, here, the YOLO v4 does not meet that requirement. However, YOLO v2 and v3 meet that requirement for all classes. And so we will be using this YOLO v3 for the model implementation going forward and, ultimately, the final system.
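    That verification step might be scripted along these lines, assuming the trained detector and a held-out test datastore returning {image, boxes, labels}:

```matlab
% Evaluate average precision per class on the held-out test set and check
% it against the 95% requirement derived for the ML component.
detectionResults = detect(detectorYOLOv3, testData, Threshold=0.01);
[ap, recall, precision] = evaluateDetectionPrecision(detectionResults, testData);

requirementMet = all(ap >= 0.95);
fprintf("Average precision per class: %s (requirement met: %d)\n", ...
        mat2str(ap, 3), requirementMet);
```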

    We have a quite unique code generation framework that allows models developed in MATLAB or Simulink to be deployed anywhere without having to rewrite the original model. And automatic code generation eliminates coding errors. It's an enormous value driver for any organization adopting it.

    In our example, we can deploy to a GPU or a CPU. And in this context, you can use model compression techniques, such as quantization, allowing you to quantize weights, biases, and activations of layers to a reduced-precision, scaled integer data type.

    We can see that for the YOLO v3 that we selected, the model, in terms of its learnable parameters, is reduced by a factor of four, which is kind of what we expect when we quantize from floating point to int8. And since the model has a lot of convolutional layers, we can do a lot of compression here.

    And when we evaluate the accuracy of the quantized model, we actually find that the average precision and recall values do not really drop much, and it still maintains the 95% requirement that we had. And so then you can generate C, C++, CUDA, or HDL code from this quantized network. In this case, we generate int8 CUDA code for this quantized network, and part of the verification process is to spin up this code generation report and have that as one of our certification artifacts.
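    Put together, the quantization and accuracy re-check might look like this rough sketch with dlquantizer, assuming the trained detector (or its underlying network) is a type supported by the quantization workflow and that calibrationData is a representative datastore:

```matlab
% Quantize the selected detector to int8: collect calibration statistics,
% produce the quantized object, and re-check the accuracy requirement.
quantObj = dlquantizer(detectorYOLOv3, ExecutionEnvironment="GPU");
calibrate(quantObj, calibrationData);
quantizedDetector = quantize(quantObj);

% Re-evaluate the 95% average-precision requirement on the quantized model.
results = detect(quantizedDetector, testData, Threshold=0.01);
apQuant = evaluateDetectionPrecision(results, testData);
assert(all(apQuant >= 0.95), "Quantized model no longer meets the AP requirement")
```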

    Next, I want to integrate my AI model with the larger system under design, for which I'll be using Simulink. So basically in the center, you can see the deep-learning object detector that we trained, and we now have a few different blocks. We have a data-acquisition block and a visualization block that takes the role of the heads-up display. And we're just passing a bunch of images through, running inference, or simulating what the output of the network should be.

    As we move to the final stages of the W diagram, I want to go through the process of requirements verification through testing. So recall we had our requirements. They're right here. And now for each of these requirements, we've written unit tests. And we want to run those unit tests.

    So what I can do is, I can basically right-click to run this test. And by running the test, I can see at a glance if the requirements are met or if they are unfulfilled. Now, after doing this both at the system level and now also at the component level, I can see that there's going to be an unmet requirement. Let me just run this test.

    OK, and if you recall, a few slides back, we didn't have any rotation sign images. So that's what we're seeing here too. Here's the unfulfilled requirement on sign rotations with a failed test.
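    As an illustration, a requirement-backed unit test might look like the hypothetical class below; in Requirements Toolbox each such test is linked to the requirement it verifies so the pass/fail status rolls up automatically. The file names and detector artifact are placeholders, not part of the shipped example.

```matlab
classdef tPrecisionRequirement < matlab.unittest.TestCase
    % Hypothetical test backing the average-precision requirement.
    methods (Test)
        function averagePrecisionPerClassAbove95Percent(testCase)
            S       = load("trainedDetector.mat");        % placeholder artifact
            T       = load("testData.mat");               % placeholder held-out set
            results = detect(S.detector, T.testData, Threshold=0.01);
            ap      = evaluateDetectionPrecision(results, T.testData);
            testCase.verifyGreaterThanOrEqual(ap, 0.95);
        end
    end
end
```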

    I would like to quickly move on to giving a glimpse of what DAL C may look like. We have seen how to perform assurance activities for a DAL D system. DAL D is a low level of criticality, and because of that, we were able to treat the object detector as a black box, as shown in this diagram. And our main assurance activity was to measure the accuracy of the object detector.

    But what if we now want to use the object-detection system or the object detection in a higher criticality system? So for example, if we consider our use case, we might want to use the results from the detection to affect how the plane navigates and taxis along the runway, instead of just giving information to the pilot.

    So for a DAL C system, it is not enough to simply treat the object detector as a black box. We need to gain a better understanding of what's going on inside the object detector. And this includes placing requirements on the machine-learning model itself and gaining understanding of the emerging properties of the neural network. Standards will not be ready until at least 2025. But we can hypothesize some of the techniques that may become relevant then to verify machine-learning models.

    Robustness is one of the main concerns when deploying neural networks in safety-critical situations. The reason is that it's been shown that neural networks can misclassify inputs due to small, imperceptible changes, small perturbations that can change the output of the network. Now, we will not have a trustworthy system if, by changing a single pixel, we get a different output.

    So in our R2022b release, we shipped the Deep Learning Toolbox Verification Library, allowing us to verify and test robustness of deep learning networks. Let's see what this library can help you accomplish in terms of verification.

    Given one of our images in the test set, we can choose a perturbation that defines, let's say, a collection of perturbed images for this specific image. It's important to note that this collection of images is extremely large. And this is just a representative sample. And it's actually not practical to test each perturbed image individually.

    So using formal methods, we can prove, for the entire volume, whether the output of the network changes. If the output of the network doesn't change, the property is verified. If the output does change, then the property is violated. And if we're not able to prove it, then we get an unproven result.
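    A minimal sketch of that check with the library's verifyNetworkRobustness function, assuming a dlnetwork classifier net, a single normalized input image X, and its categorical true label; the perturbation size is illustrative:

```matlab
% Define an L-infinity perturbation volume around the image and formally
% verify whether the predicted class can change anywhere inside it.
epsilon = 0.01;
XLower  = dlarray(X - epsilon, "SSCB");
XUpper  = dlarray(X + epsilon, "SSCB");

result = verifyNetworkRobustness(net, XLower, XUpper, label)
% result is one of: "verified", "violated", or "unproven"
```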

    A trustworthy AI system should produce accurate predictions in a known context, but it should also identify examples unknown to the model and reject them or defer them to a human for safe handling. The Deep Learning Toolbox Verification Library also includes functionality for out-of-distribution detection.

    So what we have on display here is an image with fair weather. And that seems like what we've seen before, which is in distribution. However, I've also created an image that's out of distribution for comparison. As you might notice, there's actually smog on the runway, not just fog.

    To make sense of this, we can explore various methods. And one particularly interesting technique is called HBOS, which stands for Histogram-Based Outlier Score. This approach involves computing a score by building histograms of the principal component features of a certain layer of our model.

    And using this technique, we can determine if data fits within our expected distribution, or if it deviates from it. If you look at this histogram, the original training data is represented in blue. And then in red, you can see the effect of the shift in the distribution when smog is presented in that very same data set.
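    A sketch of that workflow using the library's distribution discriminator, assuming net is a dlnetwork and XTrain and XSmog are formatted dlarray batches of training and smog images:

```matlab
% Fit an HBOS-based out-of-distribution discriminator on the training data,
% then use it to flag smog images that fall outside the training distribution.
discriminator = networkDistributionDiscriminator(net, XTrain, [], "hbos");

tfTrain = isInNetworkDistribution(discriminator, XTrain);   % expected mostly true
tfSmog  = isInNetworkDistribution(discriminator, XSmog);    % expected mostly false
fprintf("In-distribution rate on smog images: %.2f\n", mean(tfSmog));
```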

    The AI model integrated with Simulink can also be integrated with a runtime-monitoring system. This runtime-monitoring system sees the input image and determines whether it is in or out of distribution with respect to the training data. When the image is out of distribution, the lamp turns red. But for the most part, these images appear to be in distribution, and when that happens, the lamp will show green.

    And so what we can do next is use the out-of-distribution detector that we created and test it with image instances of smog. Recall that smog images are not present in the training data set. What we've done is we've toggled on the switch to run the model with smog data.

    And when we run the model again, we're going to see that, for the most part, these images are going to be now out of distribution. And so what this is telling us, is that we shouldn't trust the output that the model is providing in this case. And it's probably safer to give control to a human for safe handling.

    A common question people ask me is, can we tackle DAL C today? Now, we're waiting for the standards to be released, but can this be done today? And it turns out that we can also address DAL C today, and I'll quickly explain how. And that is through an architectural mitigation strategy using dissimilar DAL D components.

    So basically, what we have here, is we have two DAL D components sitting side by side with each other. And these AI models have to be dissimilar. We'll get to what this means in a minute. We have the input to these neural networks, which is considered to be a DAL C component. And the output, also a DAL C component, is basically going to account for the post-processing analysis and safety monitoring that's going to determine the final output of the system.

    And so then the question might be, how do we measure dissimilarity? It turns out that there are many different ways you could have developed this AI model. You could have used different data, different model architectures, different machine-learning frameworks, different software implementations, different hardware, different people, and common-mode analysis.

    So coming back to our case study, of course, moving to DAL C requires going back to the requirements and making modifications to some requirements, adding new ones, and so on. The earlier DAL D end-to-end case study was included in Deep Learning Toolbox and the DO Qualification Kit as part of a reference workflow example consisting of more than 25 scripts. With R2024a, we have updated this reference workflow example to also address DAL C.

    In this case, this test harness simulates through a sample of test images, performing some pre-processing (this will be a DAL C component), and then predicts bounding boxes from the MATLAB YOLO v3 and YOLO v2 detectors. These are DAL D components. And then the bounding box information is passed to the safety-monitor component, which is DAL C, for validation.

    In the context of the safety-monitoring component, the way it works is that if the Intersection over Union, or IoU, of the best detections of the two object detectors is above a threshold, then the safety monitor accepts the detection with the highest score. The detection box and class label are overlaid on the image and shown in the image viewer window. The scope tracks the IoU from the safety monitor. As we can see from the results, in this case, both DAL D components seem to have agreed on all examples, and the IoU is above the threshold for all image instances.
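    A sketch of that safety-monitor logic for a single image I, assuming two trained detectors and an illustrative IoU threshold; handling of empty detections is omitted for brevity:

```matlab
% Accept a detection only when the best boxes from the two dissimilar
% DAL D detectors overlap by more than the IoU threshold.
iouThreshold = 0.7;                                   % illustrative value

[boxesA, scoresA, labelsA] = detect(detectorYOLOv3, I);
[boxesB, scoresB, labelsB] = detect(detectorYOLOv2, I);

[~, iA] = max(scoresA);
[~, iB] = max(scoresB);
iou = bboxOverlapRatio(boxesA(iA,:), boxesB(iB,:));

if iou > iouThreshold
    % The detectors agree: keep the detection with the higher score.
    if scoresA(iA) >= scoresB(iB)
        acceptedBox = boxesA(iA,:); acceptedLabel = labelsA(iA);
    else
        acceptedBox = boxesB(iB,:); acceptedLabel = labelsB(iB);
    end
else
    % The detectors disagree: reject the frame and flag it to the monitor.
    acceptedBox = []; acceptedLabel = [];
end
```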

    I'll just quickly wrap up. There is a need to incorporate AI in safety-critical situations or applications. And AI standards are still in progress. However, we've shown an end-to-end reference workflow case study and how certification can be achieved today for low-criticality applications. Some methods are still in progress. Others have already been established. But we need to wait to see how these get adopted and to what extent these methods become part of future standards.

    Nevertheless, we have seen how to address DAL C today, through architectural mitigation strategies. And I strongly encourage you to look into this end-to-end reference workflow example in Deep Learning Toolbox and the DO Qualification Kit that can be used as a template for other projects also beyond aerospace.

    And with that, thank you all for your attention.
