Finding an identity-matrix-like square matrix given two rectangular matrices
Hi everyone. This is my first question on this platform.
I want to know how to obtain an identity-matrix-like square matrix.
For example, in AX = B, where A is a rectangular [m x n] matrix, X is a square [n x n] matrix, and n > m, we can simply multiply A and X to get the [m x n] matrix B.
In my case, A and B are known, and they are quite similar. Now I want to find X, which I expect to be close to the identity matrix.
I tried several methods, including the Moore-Penrose pseudo-inverse, but all failed to give my expected answer. For example, some diagonal values are near 0 in the pseudo-inverse case; I think this is due to the SVD. The regular inverse cannot be used because of rank deficiency of the square matrix (I think). Solving the linear equations with gradient descent also failed to yield a square matrix near the identity matrix.
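Here is a minimal sketch of what I mean (the sizes are just an example):
m = 4; n = 6;
A = rand(m, n);        % known rectangular matrix
B = A;                 % B is known and "quite similar" to A (here identical)
X0 = pinv(A)*B;        % Moore-Penrose solution
diag(X0)               % the diagonal is generally not close to all ones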
Please let me know if you have experience with this kind of problem.
2 comments
Vineet Kuruvilla
on 26 Feb 2022
Edited: Jan
on 26 Feb 2022
I don't have an answer, but Cleve Moler's latest blog post is on this topic: https://blogs.mathworks.com/cleve/2022/02/22/what-is-aa/
Answers (2)
John D'Errico
on 26 Feb 2022
Edited: John D'Errico
on 26 Feb 2022
Since you have A, which is m x n with n > m, the problem is not that A is singular, or even rank deficient; it is that the problem is underdetermined. So there would be infinitely many solutions, all equally good.
First, I'll make up an example problem with reasonably small A.
m = 4; n = 6;
A = rand(m,n)
A has full rank with probability 1, but the rank is no more than 4, since it has only 4 rows.
rank(A)
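(As an aside, and my own illustration rather than part of the original answer: the null space of A is exactly why there are infinitely many equally good solutions.)
N = null(A);       % n x (n-m) orthonormal basis for the null space of A
norm(A*N)          % essentially 0: adding N*C to any solution X leaves A*X unchanged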
Next, I'll create B. I'll add in some tiny perturbations, just to make the problem interesting.
B = A*eye(n) + randn(m,n)/100
Now, as you have seen, recovery of X using the pseudo-inverse is not trivial. Even though the perturbations I added to get B were pretty small, see that the diagonal elements of X0 can be quite far from 1.
X0 = pinv(A)*B
diag(X0)
So pretty much crapola for results there. But you should expect that. Instead, you need to think about reformulating the problem as how to solve:
A*(eye(n) + deltaX) = B
So think of deltaX as a matrix of perturbations of the identity matrix. We want to find the smallest possible perturbations to the identity matrix that still give a viable solution. But that means we can expand the above expression, then move A onto the right-hand side. What we really need to solve is the problem
A*deltaX = B - A
Do you see that? Here we are explicitly looking for the smallest possible perturbations to an identity matrix. And that makes the problem now directly solvable by pinv. (In fact, this is a perfect use of PINV, a tool often used for the wrong reasons. I recall Cleve Moler mentions that in his latest blog post.)
deltaX = pinv(A)*(B-A)
As you can see, the perturbation matrix is indeed uniformly tiny; in fact, it is as small as it could possibly be as a solution to the problem posed. Now we can recover Xhat as the sum of the identity matrix plus deltaX.
Xhat = eye(n) + deltaX
This is quite good now. You can view this solution as one where we know the true solution is close to an identity matrix, and we use that knowledge as essentially prior information. In effect, this verges on a Bayesian solution, which also has close links to the style of Tikhonov regularization Bjorn suggested in his answer. The difference is that the standard regularized solution tries to bias the results towards zero; while that is appropriate for some problems, it subtly misses the mark here because it uses the wrong target. The solution I proposed uses a different prior, where I implicitly bias the results towards a minimal perturbation of the identity matrix.
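To see the two priors side by side (my own sketch, reusing the example above; X_ident just restates Xhat):
X_zero  = pinv(A)*B;                 % minimum-norm X: implicitly biased toward zero
X_ident = eye(n) + pinv(A)*(B - A);  % minimum-norm (X - I): biased toward the identity
norm(X_zero - eye(n), 'fro')         % noticeably larger
norm(X_ident - eye(n), 'fro')        % tiny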
Finally, I might also point out that this solution would still work even if A were truly rank deficient.
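For example, a quick check of that claim (my sketch):
Ard = A; Ard(2,:) = Ard(1,:);          % force rank deficiency: duplicate a row
Brd = Ard*eye(n) + randn(m,n)/100;     % perturbed B for the rank-deficient A
Xrd = eye(n) + pinv(Ard)*(Brd - Ard);  % pinv still returns the minimal perturbation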
2 comments
Bjorn Gustavsson
on 26 Feb 2022
Edited: Bjorn Gustavsson
on 26 Feb 2022
This type of solution John suggests is very close to the Backus-Gilbert method, for external reference-reading purposes. That in turn should be reasonably similar to the first- and second-order Tikhonov regularizations, where the solution is biased toward small local deviations/spreading by minimizing:
$$\|AX - B\|^2 + \alpha\|\nabla X\|^2$$
for first-order Tikhonov, or:
$$\|AX - B\|^2 + \alpha\|\nabla^2 X\|^2$$
for second-order Tikhonov. Here the differential operations on X are to be taken as differential operations on some discretized X representing a scalar function. All three of these suggestions (John's "B-G inspired" one and the 1st- and 2nd-order T-R) rely on X having some "reasonably simple" geometric structure or closeness between neighboring elements (I think?).
John D'Errico
on 26 Feb 2022
Bjorn Gustavsson
on 26 Feb 2022
When you have a (set of) mixed-determined linear inverse problem(s), you first have to come to terms with the fact that you cannot resolve everything in X. Then you can proceed to use zeroth-order Tikhonov regularization (or higher-order, or some combination of orders). It takes a bit of linear-algebra reading, but once you've taken these steps you'll solve this type of problem rather straightforwardly. I find the regtools package very useful for working with this; its documentation is hopefully sufficient for you.
In short: when you have an underdetermined linear inverse problem (n > m), you only have enough information to determine at most m parameters of your unknown (per column of X). If some of the singular values of the A-matrix are really small, you might lose even more information, due to "measurement" noise (either true measurement noise or simply digitization noise) or numerical noise from singular values smaller than eps(1)*max(S(:)). To obtain a meaningful/stable solution, these problems have to be dealt with. A standard way is Tikhonov regularization (instead of straightforward use of the Moore-Penrose inverse), where the singular values are filtered such that:
$$Z^{-1}_{ii} = \frac{\lambda_i}{\lambda_i^2 + \alpha}$$
where λ is the singular-value array (diag(S)) and α is a regularization parameter. For singular values much larger than sqrt(alpha), the diagonal elements of inv(Z) are very close to the diagonal elements of inv(S), while for singular values smaller than sqrt(alpha) they approach zero, which damps their contribution to the solution. The MATLAB code for a minimalistic 0th-order Tikhonov solution would be:
[U,S,V] = svd(A,0);                        % economy-size SVD (full size here, since m < n)
alpha = 1;                                 % regularization parameter
[m,n] = size(A);
lambda = diag(S(1:m,1:m));                 % singular values of A
invZ = diag(lambda./(lambda.^2 + alpha));  % filtered inverse singular values
X_T0_alpha = V(:,1:m)*invZ*U.'*B;          % 0th-order Tikhonov solution
When solving inverse problems you should always look at the singular values, to see how many are large enough to give retrievable information. If it was an identity matrix that was multiplied with A to produce B, this should give you a filtered model-resolution matrix for 0th-order Tikhonov with alpha = 1.
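A sketch of that diagnostic (my addition):
s = svd(A);                            % singular-value spectrum of A
semilogy(s, 'o-'), grid on             % look for a gap or a noise floor
xlabel('index i'), ylabel('\sigma_i')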
HTH
0 comments