Using fminunc with 'HessianMultiplyFcn' option

Jonathan Pillow on 5 Jul 2024
Moved: Bruno Luong on 5 Jul 2024
I'm trying to solve a large-scale optimization problem using fminunc, where the number of variables is too large to store the Hessian explicitly.
I have carefully read the MATLAB documentation on 'HessianMultiplyFcn', which lets the user pass in a function that computes the Hessian times a vector Y. It says that in this case, the loss function should return the loss, the gradient, and a struct called Hinfo.
However, MATLAB returns an error when using this function:
Error using fminunc (line 410)
FMINUNC requires all values returned by functions to be of data type double.
Note that I have set the algorithm to 'trust-region' and 'HessianFcn' to [] in optimoptions, so it should know that the third argument returned by my function is a struct.
Help! Has anyone encountered this before? Or can anyone post an example snippet in which they successfully used fminunc with the HessianMultiplyFcn option?
I'm copying the description of HessianMultiplyFcn from the fminunc documentation in case it is helpful. (From:
----------------
HessianMultiplyFcn
Hessian multiply function, specified as a function handle. For large-scale structured problems, this function computes the Hessian matrix product H*Y without actually forming H. The function is of the form
W = hmfun(Hinfo,Y)
where Hinfo contains the matrix used to compute H*Y.
The first argument is the same as the third argument returned by the objective function fun, for example
[f,g,Hinfo] = fun(x)
Y is a matrix that has the same number of rows as there are dimensions in the problem. The matrix W = H*Y, although H is not formed explicitly. fminunc uses Hinfo to compute the preconditioner.

Accepted Answer

Jonathan Pillow on 5 Jul 2024
Moved: Bruno Luong on 5 Jul 2024
Ok, I think I solved the problem.
My mistake was thinking that Hinfo should be a struct. Once I made it an array (into which I simply placed the items my HessMult function would need for computing the Hessian), the error went away.
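For anyone looking for a snippet, here is a minimal sketch of that setup. It is a toy problem, not my actual one: it assumes a quadratic loss whose Hessian has the form H = D + v*v' (sparse diagonal plus rank one), and the names quadObj, D, v, b, and hmfun are all made up for illustration. The point is only that the third output of the objective is a numeric (double) array, and that the multiply function uses it to form H*Y.

```matlab
% Toy problem (hypothetical): minimize 0.5*x'*H*x - b'*x with H = D + v*v'.
% H is dense, but D is sparse, so H*Y can be computed without forming H.
n = 1000;
D = spdiags(linspace(1, 2, n)', 0, n, n);   % sparse part of the Hessian
v = ones(n, 1) / sqrt(n);                   % dense rank-one part
b = ones(n, 1);

% Hessian multiply function: Hinfo is the third output of the objective.
% Y may have several columns, so everything is written as matrix products.
hmfun = @(Hinfo, Y) Hinfo*Y + v*(v'*Y);

opts = optimoptions('fminunc', ...
    'Algorithm', 'trust-region', ...
    'SpecifyObjectiveGradient', true, ...
    'HessianFcn', [], ...
    'HessianMultiplyFcn', hmfun, ...
    'Display', 'off');

x0 = zeros(n, 1);
[x, fval, exitflag] = fminunc(@(x) quadObj(x, D, v, b), x0, opts);

function [f, g, Hinfo] = quadObj(x, D, v, b)
% Returns the loss, the gradient, and Hinfo as a double array (not a struct).
Hx = D*x + v*(v'*x);
f = 0.5*(x'*Hx) - b'*x;
if nargout > 1
    g = Hx - b;
end
if nargout > 2
    Hinfo = D;   % sparse double matrix handed to hmfun by fminunc
end
end
```

The extra quantities that the multiply function needs (here v) are captured by the anonymous function rather than packed into Hinfo, which keeps Hinfo a plain matrix. If you prefer, quadObj can live in its own file instead of at the end of the script.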
The only other wrinkle I noticed is that the HessMult function must be able to accept multiple vectors at once, i.e. Y can be a matrix with several columns. But the performance boost is impressive now that it is working (>10x speedup for a roughly 3K x 3K Hessian).
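One way to check the multiple-vector requirement before handing the function to fminunc is to compare it against the explicit Hessian on a small problem. The sketch below assumes a hypothetical Hessian of the form H = D + v*v'; the names hmfun, D, and v are illustrative only.

```matlab
% Small test problem where the explicit Hessian is cheap to build.
n = 6;
D = spdiags((1:n)', 0, n, n);     % sparse diagonal part, used as Hinfo
v = (1:n)' / n;                   % dense rank-one part

% Candidate Hessian multiply function, written with matrix products
% so that it works when Y has more than one column.
hmfun = @(Hinfo, Y) Hinfo*Y + v*(v'*Y);

% Explicit Hessian, for checking only.
H = full(D) + v*v';

% Y with three columns, i.e. three vectors at once.
Y = [ones(n, 1), (1:n)', (n:-1:1)'];
W = hmfun(D, Y);

% Relative error between the implicit and explicit products.
relErr = norm(W - H*Y, 'fro') / norm(H*Y, 'fro');
fprintf('size(W) = [%d %d], relative error = %.2e\n', size(W, 1), size(W, 2), relErr);
```

W should have the same size as Y, and the relative error should be at round-off level. A multiply function that silently assumes Y is a single column (for example by using dot or element-wise products) will typically fail this check.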

Version

R2024a
