Fuzzy Logic Controller Tuning | Fuzzy Logic, Part 4
From the series: Fuzzy Logic
Brian Douglas
Cover the basics of data-driven approaches to tuning fuzzy logic controllers and fuzzy inference systems.
See how to tune fuzzy inference parameters to find optimal solutions. Learn how optimization algorithms, like genetic algorithms and pattern search, can efficiently tune the parameters.
Follow along with an example about tuning a fuzzy inference system using data that controls an artificial pancreas.
Published: 4 Oct 2021
In the last video, we tuned a fuzzy inference system using experience and our knowledge of the system. But what if you don't have that experience? Or if the number of tuning parameters is too large for a person to manually configure? In those cases, we can turn to data for automatic tuning.
And in this video, we're going to cover a few basics of data-driven approaches. And then, we'll go through an example that tunes a fuzzy inference system that controls an artificial pancreas. This should be interesting, so I hope you stick around for it.
I'm Brian, and welcome to a MATLAB Tech Talk. Tuning is an optimization problem. And this is true whether a person is manually tuning the system, or if an algorithm is tuning the system. In both cases, we are trying to find a set of parameters that will produce the best, or most optimal, result.
And what is most optimal? Well, it depends on what the designer deems important. In optimization problems, a cost function is created and some automatic process tries to minimize the output of that function. And, by choosing a different cost function, we can change what optimal means.
For example, imagine a control system that is trying to track a reference signal. We'll say that we want the system state to step up to 10, at time 1 second. And we have two candidate controllers. The first produces a step response that looks like this. And the second produces this different step response. Which of these two controllers is better or more optimal?
To answer this question, we need a cost function. One possible function is the root-mean-square of the difference between the reference signal and the output signal. Here we're just trying to minimize the absolute difference between the two. And with this cost function, the first controller is the more optimal, since its output is a closer match to the desired reference.
However, with a different cost function, the second system may be more optimal. Maybe we don't want the output to overshoot at all, and so we heavily penalize any time the output goes above 10. In this case, we'll find that the second result is more optimal, since it never exceeds that value. So hopefully you can see how the optimal result is driven by the cost function itself, which, again, is just a comparison between an ideal response and the actual response.
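To make this concrete, here's a small Python sketch of the two cost functions just described. The sampled responses and the penalty weight are made-up illustrative numbers, not data from the video; the point is only that the ranking of the two controllers flips when the cost function changes.

```python
import numpy as np

# Hypothetical sampled step responses to a reference of 10 (illustrative numbers only).
reference = np.full(6, 10.0)
response_1 = np.array([0.0, 9.0, 11.0, 10.5, 10.1, 10.0])  # fast, but overshoots
response_2 = np.array([0.0, 6.0, 8.5, 9.5, 9.9, 10.0])     # slower, never exceeds 10

def rms_cost(ref, out):
    """Root-mean-square tracking error between reference and output."""
    return np.sqrt(np.mean((ref - out) ** 2))

def overshoot_cost(ref, out, weight=100.0):
    """RMS error plus a heavy penalty whenever the output exceeds the reference."""
    return rms_cost(ref, out) + weight * np.sum(np.maximum(out - ref, 0.0))

# Under plain RMS the faster response_1 wins; with the overshoot
# penalty, response_2 wins, because it never goes above 10.
```

So the same two candidates rank differently depending on which cost function the designer chooses.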
How do we get an ideal response that we can use in our cost function for developing a fuzzy system? Well, we may be designing a fuzzy controller just like in the previous example, in which case the reference signal is our ideal response. Or we may have a function that we're trying to mimic with a fuzzy system.
For example, we may have an AI model, like a trained reinforcement learning policy or some other trained neural network. And we want to make sense of it by representing it as a fuzzy system. In this case, given some input signal, the ideal response is the output of the AI model. Or in other cases, the ideal response comes from the output of an existing physical system that is operated manually. And you want to create a fuzzy system that behaves similarly to the human operator.
Now, however you get your ideal response, you can use that data to automatically tune a fuzzy system. For example, you can send the input data into your candidate fuzzy system, and then calculate the cost of that system by comparing the evaluated output with the ideal output. Then you can make a change to the inference system, generate a new evaluated output, and assess the cost again. And if the cost of the new system is lower than the old one, then this is a more optimal configuration.
And through many iterations of this, an optimization algorithm can tweak the parameters of your inference system until the cost drops below some target value. At that point, you know that the learned fuzzy system matches the Input/Output characteristics of whatever created your training data.
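That iterate-and-keep-the-better-candidate loop can be sketched in just a few lines. This is a Python illustration, not the MATLAB `tunefis` machinery the example uses later; `tune`, `change_one_rule`, and the hidden rule base are all made up for demonstration, with the cost being simply the number of rules that disagree with a known target.

```python
import random

def tune(initial_params, mutate, cost, iterations=5000, target=0):
    """Accept-if-better loop: propose a tweaked candidate, keep it only
    if its cost is lower, and stop once the cost reaches the target."""
    best = initial_params
    best_cost = cost(best)
    for _ in range(iterations):
        candidate = mutate(best)
        candidate_cost = cost(candidate)
        if candidate_cost < best_cost:
            best, best_cost = candidate, candidate_cost
        if best_cost <= target:
            break
    return best, best_cost

# Toy problem: recover a hidden 9-entry rule base (5 possible outputs per rule).
random.seed(0)
hidden_rules = [2, 4, 0, 1, 3, 2, 0, 4, 1]
mismatches = lambda rules: sum(a != b for a, b in zip(rules, hidden_rules))

def change_one_rule(rules):
    """Mutation: change one randomly chosen rule to a random output."""
    tweaked = list(rules)
    tweaked[random.randrange(9)] = random.randrange(5)
    return tweaked

best, best_cost = tune([0] * 9, change_one_rule, mismatches)
```

Real tuners are much smarter about how they propose the next candidate, which is exactly what the optimization algorithms below address.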
All right, let's keep going. At this point, there are two questions that I want to answer. One, which parameters are we actually tuning in a fuzzy inference system? And two, how do we efficiently change them to lower the overall cost?
Generally speaking, we could change all of the parameters in the inference system. That would be everything, like the number of input membership functions, and their shape, and their parameters. And we could do the same for the output membership functions. And we could tweak the rules and their outputs. And we could even change the defuzzification method, and all sorts of things.
However, the more parameters we try to change at the same time, the harder the problem becomes, because the number of possible permutations grows exponentially. So it's better to use some of our human knowledge and experience about the problem to fix most of the parameters, and then just tweak a small subset to optimize the cost function.
For example, you might consider initializing the inference system with uniformly distributed triangular membership functions. And then tune the system by only adjusting the rules of the inference system.
And to understand what that looks like, let's assume we have this inference system that has two inputs, each with three membership functions, and it produces one output with five membership functions. In this case, if we set up a standard rule base, that is, one where there is a rule for each possible input combination ANDed together, then we would have nine rules for this system. Input 1 could be Low, Medium, or High, and Input 2 could be Low, Medium, or High. So our inference system would need to know what to output for each possible combination of these inputs, or nine rules total. This is the rule base that we want to populate with the optimal set of outputs.
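As a quick Python illustration of where the nine rules come from, the antecedents of a standard rule base are just the Cartesian product of the two inputs' membership-function labels:

```python
from itertools import product

levels = ["Low", "Medium", "High"]
# One rule antecedent per (Input 1, Input 2) combination, ANDed together:
antecedents = list(product(levels, levels))
# ("Low", "Low"), ("Low", "Medium"), ..., ("High", "High") -- nine in total.
```

Each of these nine antecedents then needs to be assigned one of the five output membership functions.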
For example, this is one possible set of rules where each output is one of the five output membership functions of Very Low, Low, Medium, High, and Very High. And now we could assess how optimal this configuration is by giving it a set of training inputs, generating an evaluated output, and comparing that output to the training output using a cost function.
And then like we said before-- in the next iteration, we could try a different rule base, like these, and calculate the cost. And if the cost of the new rule base is lower than the old rule base, then this is a more optimal configuration. And that's essentially all we're doing with tuning. We're trying new parameters and seeing if they're better than the old parameters.
So with that in mind, why don't we just loop through every possible combination and find the global optimal parameter set? Well, let's do some math. There are nine rules, and each could have one of five possible outcomes. Therefore the number of possible permutations is 5 raised to the power of 9-- or about 2 million. And that's a lot of iterations.
It is possible to run that many, but keep in mind that this is a very simple example. And just adding a third input jumps this number up to about 7 times 10 to the 18. So in most cases, brute force isn't going to cut it.
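The counting itself is simple enough to check in a couple of lines of Python:

```python
n_outputs = 5           # Very Low, Low, Medium, High, Very High
rules_2in = 3 * 3       # standard rule base for two inputs with 3 levels each
rules_3in = 3 * 3 * 3   # adding a third 3-level input

permutations_2in = n_outputs ** rules_2in   # 1,953,125: "about 2 million"
permutations_3in = n_outputs ** rules_3in   # about 7.45e18
```

So one extra input multiplies the search space by a factor of 5 to the 18th, which is why brute force stops being an option almost immediately.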
And this is where optimization algorithms come into play. Instead of naively trying every combination, we could look for the optimal solution by cleverly sampling just a small subset of the entire solution space. And one such algorithm is a Genetic Algorithm.
This works by randomly sampling a number of points across the entire solution space and calculating their cost. Then, you discard the highest cost solutions and allow the lower cost solutions to survive and multiply. And then in the next iteration, you spawn a new batch of points that are variations of the surviving points. Get the cost of each of those, discard the high cost options and keep the low cost options, and respawn. And you do this generation after generation, until the solution converges on an overall minimum cost.
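Here's a heavily simplified Python sketch of that sample-discard-respawn cycle, applied to the toy rule-base problem. Real genetic algorithms, including MATLAB's, also recombine surviving parents via crossover; this version uses mutation only, and `genetic_tune` and all of its parameters are illustrative.

```python
import random

def genetic_tune(cost, n_rules=9, n_outputs=5, pop_size=40,
                 generations=200, seed=0):
    """Minimal mutation-only genetic algorithm over rule bases encoded
    as integer vectors (one entry per rule, each one of n_outputs)."""
    rng = random.Random(seed)
    # Randomly sample initial candidates across the entire solution space.
    population = [[rng.randrange(n_outputs) for _ in range(n_rules)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Rank by cost; the low-cost half survives, the rest is discarded.
        population.sort(key=cost)
        survivors = population[: pop_size // 2]
        if cost(survivors[0]) == 0:
            break
        # Respawn: each survivor spawns one mutated variation of itself.
        children = []
        for parent in survivors:
            child = list(parent)
            child[rng.randrange(n_rules)] = rng.randrange(n_outputs)
            children.append(child)
        population = survivors + children
    return min(population, key=cost)
```

Because the best survivor is never discarded, the cost can only go down generation after generation until it converges.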
Another type of optimization algorithm is a Pattern Search. Pattern search works by starting at an initial parameter set, and then stepping each parameter one at a time in either direction from the initial set, and then calculating the cost for each. And if there's a point that minimizes the cost function, that becomes the new current point, and the search continues again-- stepping the parameters in each direction.
But if none of the new points have a lower cost than the current point, then the size of the pattern shrinks and the process starts again. And this continues, moving the pattern and refining its size until the cost becomes lower than some threshold, or a maximum number of iterations has occurred.
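Pattern search is also easy to sketch. The version below is one simple variant for continuous parameters: it polls a step in each direction along every axis, moves to the best improving point, and halves the pattern size when nothing improves. The function name and all of its defaults are illustrative, not MATLAB's `patternsearch` API.

```python
def pattern_search(cost, x0, step=1.0, min_step=1e-6, max_iter=10000):
    """Coordinate pattern search: poll +/- step along each axis, move
    if any poll point improves, otherwise shrink the pattern."""
    x = list(x0)
    fx = cost(x)
    for _ in range(max_iter):
        best_x, best_f = x, fx
        for i in range(len(x)):
            for d in (step, -step):
                trial = list(x)
                trial[i] += d
                f = cost(trial)
                if f < best_f:
                    best_x, best_f = trial, f
        if best_f < fx:
            x, fx = best_x, best_f   # an improving point: move the pattern
        else:
            step *= 0.5              # no improvement: refine the pattern size
            if step < min_step:
                break
    return x, fx
```

On a simple bowl-shaped cost like `(x - 3)^2 + (y + 1)^2`, this walks straight to the minimum and then shrinks the pattern until the step size hits its floor.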
So these are two of many approaches that can search for the optimal solution more efficiently than brute force. However, even with these optimization algorithms, it's still beneficial to reduce the solution space as much as possible.
Searching through 2 million possible configurations is almost always going to be preferred over searching through 7 times 10 to the 18 possible configurations. Larger solution spaces not only take more time to search, they can also require more training data. And so, if you can find a way to use your knowledge to structure the fuzzy system in a way that reduces the solution space, this can be beneficial.
And one way to do this is to implement a fuzzy inference system as a tree of smaller interconnected inference objects, rather than as a single monolithic one. In a tree structure, the outputs of the lower-level fuzzy systems are used as inputs to the higher-level fuzzy systems.
For example, we could design a fuzzy system that has three inputs and one output as a single system. Or we could design a smaller fuzzy system that has only two inputs and one output, and then feed that output, plus another input, into a second fuzzy system that also has two inputs and one output.
And there are at least two benefits to creating a tree structure like this. The first is that the number of possible configurations of this system is much, much smaller. Assuming each input has three membership functions and the output has five membership functions, we already know that there are 7 times 10 to the 18 possible rule bases for the single three-input fuzzy system. That's 5 raised to the power of 3 times 3 times 3.
However, each of these smaller systems only has about 2 million combinations. That's 5 raised to the power of 3 times 3. And since there are two of them, the total number of combinations is only about 2 million times 2 million, or 4 times 10 to the 12. Now, that's still a huge number, but much smaller than the single system.
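The tree-versus-monolith comparison checks out numerically:

```python
n_outputs = 5
single_system = n_outputs ** (3 * 3 * 3)   # one 3-input system: 5^27, about 7.45e18
tree_system = (n_outputs ** (3 * 3)) ** 2  # two 2-input systems: (5^9)^2, about 3.8e12
reduction = single_system // tree_system   # factor of 5^9, nearly 2 million
```

So splitting one three-input system into a tree of two two-input systems shrinks the rule-base search space by a factor of about 2 million.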
And the second benefit of a tree structure is that it can be easier to interpret. And that's a little harder to explain without an example. So now let's jump over to MATLAB and walk through the artificial pancreas example, and hopefully this will make sense as well as everything else we just covered.
Now to begin, I want to warn you that I'm not going to go step by step through this exercise; I just want to highlight a few things. However, the write-up for it is really good, and I've left a link in the description to it and to a video that describes it in more detail if you want to check the whole thing out. So let's get going.
Simply put, the controller we want to develop can monitor blood glucose levels in a diabetic patient, and then determine the proper amount of insulin dosage that maintains the desired levels. When the patient eats, their blood glucose level spikes, which needs to be offset with a larger insulin dosage. A fuzzy logic controller is implemented to determine these insulin dosage levels.
And this kind of controller makes sense because, as you'll see, with a fuzzy inference system we'll still have insight into how it's making decisions. That will help us understand and modify the controller, even though, initially, it will be tuned using data generated from a model of a diabetic patient.
So let's scroll down a bit. Here, two fuzzy systems are created to form a fuzzy tree. The first has two inputs: the measured blood glucose level and the rate at which it's changing over time. From these two inputs, a precalculated insulin dosage is determined. Then this precalculated dose and the glucose acceleration are inputs into the second fuzzy system, which calculates the actual insulin dosage.
Now, I said that this fuzzy tree is easier to interpret, and here's why I think so. For one, there are only 18 rules instead of the 27 that a single system would have, so there are fewer rules to interpret. But also, I think it makes a lot of sense to think of this system in two parts.
The first part calculates a dosage based only on the current levels and their rate. Restricting the system to just two inputs makes coming up with a solution easy to understand, assuming that, unlike me, you actually have medical experience working with diabetic patients. But let me give you an interpretation anyway.
I could imagine that if levels are high and the rate is positive, you would want a large insulin dose to stop that increase and start heading back down. And if the levels are high but the rate is negative, so it's already heading down, maybe hold off on more insulin. So we can see that with two inputs, it can be a very simple, intuitive system.
Now, the second fuzzy system just modifies that precalculated dose. So we can think of it this way. If the rate is accelerating in the positive direction, then maybe slightly increase the precalculated dose to account for that. If it's accelerating in the negative direction, then slightly decrease the dosage. And if it's not accelerating, then leave the dosage unchanged. So in this way, the tree structure allowed us to break this problem up and think about it in two stages, which in a lot of cases can make the problem more interpretable.
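The wiring of the tree is just function composition: the first stage's output becomes one of the second stage's inputs. Here's a Python sketch of that structure, where `fis1` and `fis2` are placeholders standing in for the two evaluated fuzzy systems (any callable taking two inputs works for illustration).

```python
def fuzzy_tree(glucose, glucose_rate, glucose_accel, fis1, fis2):
    """Two-stage fuzzy tree: fis1 computes a precalculated dose from the
    glucose level and its rate; fis2 adjusts that dose for acceleration."""
    precalculated_dose = fis1(glucose, glucose_rate)
    insulin_dose = fis2(precalculated_dose, glucose_accel)
    return insulin_dose
```

Swapping in real evaluated inference systems for `fis1` and `fis2` gives exactly the structure described above.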
Moving down a bit, you can see that we're going to set up this process to only learn the controller rules. For now, we'll leave the membership functions at their default values of uniformly distributed triangular functions. And we'll use a genetic algorithm to search through the 4 times 10 to the 12 possible rule combinations and assess the result with a cost function.
Here the cost function is just looking at the root-mean-square of the error between the output of the model and the reference glucose level, which is the level we want to maintain. Plus, we're heavily penalizing any time the glucose level drops below a specified minimum value. This is apparently really bad for diabetic patients, and so we want to make sure that this controller doesn't cause those situations.
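The actual cost function lives in the MATLAB example, but the idea can be sketched in Python. Everything here is illustrative: the glucose values, the safe floor, and the `penalty_weight` are made-up numbers chosen only to show the shape of the penalty.

```python
import numpy as np

def glucose_cost(glucose, reference, min_safe, penalty_weight=1000.0):
    """Root-mean-square tracking error against the reference level, plus
    a heavy penalty for any sample that dips below the safe minimum."""
    glucose = np.asarray(glucose, dtype=float)
    rms = np.sqrt(np.mean((glucose - reference) ** 2))
    below = np.maximum(min_safe - glucose, 0.0)  # how far under the floor
    return rms + penalty_weight * np.sum(below)
```

A trajectory that hovers near the reference scores a small cost, while one that dips below the safe minimum gets dominated by the penalty term, steering the optimizer away from hypoglycemic behavior.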
With all of that set, the tunefis function kicks off the learning process and starts to modify the rule base using a genetic algorithm. It's cycling through hundreds of configurations and trying to minimize cost. And after it runs, we're left with this optimal rule base, which we can interpret like this.
If the blood glucose level is low and the rate is negative, then the precalculated dosage should be very high, which actually seems opposite of what we want. But let's keep going. Now, a Very High precalculated dosage from the first system is equivalent to a High input dosage for the second. And let's say, in this example, acceleration is 0. In this case, the output dosage is modified to Very Low.
So we got to Very Low in the end, which is expected for low glucose levels. But we did so in a really weird way, since the first fuzzy system did the wrong thing, and then the second one had to correct for it. So maybe we need to spend some more time training. But here is what makes fuzzy systems powerful.
We understood that this result was weird because fuzzy systems are interpretable. And so, if you have some knowledge of the system, you can go in and fine-tune the result using your experience. That is done here, where a few of the rules are changed to be more in line with our intuition.
Now you can see that Low and Negative produce a precalculated dosage of Very Low, which is maintained throughout the second system for zero acceleration. And this whole thing makes much more sense. It turns out that, in this case, making these changes actually lowered the overall cost, so the updated system is more optimal.
Now, it's not guaranteed that manually changing the rules will always lower the cost. But something you can do is take an iterative approach, where you tune the system with data, then modify the result with your knowledge to make the system more intuitive, and then use that as a starting point to refine the system again with data to find a local optimal point. Maybe this time the method you use is a local minimizing algorithm like pattern search instead of the more global genetic approach.
There are a lot of different approaches that you can take to designing and tuning a fuzzy system. In fact, the second part of this example walks through refining the system by keeping the updated rules fixed and instead tuning just the membership functions.
It's all really interesting and powerful. And I think the best way to learn how it all works is to just start with an example like this and try it out. Make some changes, rerun it, and see how that impacts the overall result.
This is where I'm going to leave this video for now. Thanks for sticking around to the end. And if you don't want to miss any future Tech Talk videos, don't forget to subscribe to this channel. And if you want to check out my channel, Control System Lectures, I cover more control theory topics there as well. Thanks for watching, and I'll see you next time.