Multivariate Polynomial Regression

version 1.4.0.0 (39.7 KB) by Ahmet Cecen
Performs polynomial regression on multidimensional data.

61 Downloads

Updated 03 Dec 2020

Performs Multivariate Polynomial Regression on multidimensional data. The fits are limited to standard polynomial bases with minor modification options. Feel free to implement a term reduction heuristic.
The functionality is explained in hopefully sufficient detail within the .m file. Feel free to post a comment or inquiry.
No longer requires ANY additional toolboxes!
Head over to http://ahmetcecen.github.io/MultiPolyRegress-MatlabCentral/ or the GitHub page on the right for a full illustrated tutorial. You can also publish Example.m for the same purpose.
Author: Ahmet Cecen, MINED @ Gatech

Cite As

Ahmet Cecen (2021). Multivariate Polynomial Regression (https://github.com/ahmetcecen/MultiPolyRegress-MatlabCentral), GitHub. Retrieved .

Comments and Ratings (62)

Omar Alahmad

Great work. Very useful. Thanks a lot.

aggg

Jürgen

very useful, if not already there - one would have to implement it with sweat and tears

Eduardo Díaz

This is great! Thanks a million!

Ahmet Cecen

I am unfortunately unable to develop this function any further due to my current obligations.

Sebastian Laechele

Dear Ahmet,
is there a way to apply weights to the data points in order to take the accuracy / relevance of each data point into account?
The MATLAB built-in function fit allows this when the optional parameter "Weights" is added.
Unfortunately the fit function only allows up to 2 dimensions for the predictors which is why I would love to use your implementation.
But I lack the mathematical understanding to implement the weights into your code myself. Any help would be greatly appreciated!
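
For readers with the same question, here is a minimal sketch of weighted least squares on a polynomial design matrix. It is not part of MultiPolyRegress and does not use its code; the data, the weights, and the variable names (`A`, `w`, `sw`, `coef`) are hypothetical. The idea is to scale each row of the design matrix and the response by the square root of its weight and then solve the ordinary least-squares problem.

```matlab
% Hypothetical data: two predictors, one response, one weight per data point.
X = [0 0; 1 0; 0 1; 1 1; 2 1; 1 2];
y = 1 + 2*X(:,1) - 3*X(:,2);        % exact linear response, for illustration
w = [1; 1; 4; 4; 0.25; 0.25];       % larger weight = more trusted point

% Design matrix for a degree-1 polynomial basis: [1, x1, x2].
A = [ones(size(X,1),1), X];

% Weighted least squares: scale rows by sqrt(w), then solve as usual.
sw = sqrt(w);
coef = (diag(sw) * A) \ (sw .* y);
```

Because the response here is noise-free, `coef` recovers the generating coefficients whatever the weights are; with noisy data, points with larger weights pull the fit harder.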

Ben Krämer

Kaiwen Yang

This method is so elegant. Otherwise, I would have to run my data through those optimizers, which might not be this good.

Ben Krämer

Ahmet Cecen

No, there are no data pre-processing or cleaning steps implemented in the function. The data that you feed in has to be final. Depending on your application I would either fill the NaN's via interpolation, a function based on expected physics, or using this same function to estimate the column with sporadic NaNs from other columns in the input data; OR eliminate any rows with NaN by using A(sum(isnan(A),2)>0,:) = [];.
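
A minimal sketch of the row-elimination option, assuming the predictors and the response are stored in two separate arrays (the names `X`, `Y`, `keep` and the data are hypothetical). Rows have to be dropped from both arrays together so that they stay aligned.

```matlab
% Hypothetical predictors (5 points, 2 variables) and response with scattered NaNs.
X = [1 2; NaN 3; 4 5; 6 NaN; 7 8];
Y = [10; 20; NaN; 40; 50];

% Keep only rows where neither the predictors nor the response contain a NaN.
keep = ~any(isnan([X, Y]), 2);
Xclean = X(keep, :);
Yclean = Y(keep);
```

`Xclean` and `Yclean` can then be passed to the fitting function in place of the raw data.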

James Eaton

Hello, is there a way to 'omitnan' using this function? Thanks!

jofin george

Hi, Thanks for sharing this robust algorithm. May I know if this is a machine learning based algorithm?

Rohit Bhagwat

Thank you very much, and also thank you for writing this code. It was very helpful to me.

Ahmet Cecen

Yes. And yes, you would have to rename them in that case in the new software.

Rohit Bhagwat

Hi, Thank you for your reply.
I am sorry, I should have mentioned this in my previous message. I want to use MATLAB to find this polynomial curve fit with 6 independent variables, but then use it in different software (one of which is Excel). That is why I was asking: can I use it as a normal polynomial formula, starting from the 0.*x6 term through to the end, and would it act like a curve-fit formula?

Ahmet Cecen

You don't have to rename them. Just do FUN(YourData1, YourWeirdname2 ... ) etc. The only requirement is that the order you put your variables in MultiPolyRegress during fitting, has to be the same as the order you call this new function FUN.

Rohit Bhagwat

Thanks for your prompt reply,
Oh, now I understand the first term: it's just assigning x1 to x6 as variables. So if I manually name my data vectors x1 to x6 and use the polynomial, it would work, right?

Ahmet Cecen

https://www.mathworks.com/help/matlab/matlab_prog/anonymous-functions.html

Check this out to understand what that "first term", as you put it, means. Basically, if you called the variable you just printed FUN, you can just do FUN(x1,x2,x3,x4,x5,x6) to evaluate the function.

As mentioned in the description, there is no regularization or term reduction heuristic in the code. This means that you can end up with zero or near-zero coefficients if you have ill-conditioned or poorly correlated variables.
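
A short sketch of how such a printed expression is evaluated. The two-variable polynomial and its coefficients below are made up for illustration and only mimic the style of the printed output; `@(x1,x2)` names the inputs and is not a term of the polynomial.

```matlab
% Hypothetical expression in the same style as the printed output.
FUN = @(x1,x2)+0.*x2+-5.5e-05.*x2.^2+1.5.*x1+0.25.*x1.*x2;

% Evaluate at one point; arguments go in the same order used during fitting.
y0 = FUN(2, 4);

% Evaluate on whole column vectors at once; the element-wise operators
% (.* and .^) apply the polynomial to each row.
a = [1; 2; 3];
b = [10; 20; 30];
yv = FUN(a, b);
```

A term such as `0.*x2` always contributes zero and can be dropped when the formula is copied elsewhere. Outside MATLAB, `.*` and `.^` correspond to ordinary multiplication and exponentiation.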

Rohit Bhagwat

I have a doubt,

Hi,

I am getting something like this,
@(x1,x2,x3,x4,x5,x6)+0.*x6+-5.5294e-05.*x6.^2+1.0186e-07.*x6.^3+0.*x5+-0.00029564.*x5.*x6+3.4008e-07.*x5.*x6.^2+-2.1999e-10.*x5.*x6.^3+-0.018347.*x5.^2+1.9021e-06.*x5.^2.*x6+-7.7091e ........................+-8.0344e-08.*x4.^4+-9.6951e-07.*x5.^4+-1.504e-10.*x6.^4

If I want to use it as a polynomial, what should I do with the first term {@(x1,x2,x3,x4,x5,x6)}? And the second term is 0.*x6; does that mean 0*x6? Won't that always be 0?

Vincent R

ohad ben horin

Ahmet Cecen

There is no direct way to cite this work. It has only been tangentially mentioned alongside my primary research. Here are a few options:

- Cite the thesis that necessitated the initial writing and continuous update of this code for 8 years: https://smartech.gatech.edu/bitstream/handle/1853/58723/CECEN-DISSERTATION-2017.pdf

- Cite the first work that refers to this code by name specifically and briefly explains it: https://link.springer.com/article/10.1186/2193-9772-3-8

- Don't cite. Link to this URL and refer to the code in your methods explanation. Say something along the lines of " uses MultiPolyRegress written by Ahmet Cecen in MATLAB Central."

Saksham Consul

Can you please tell how to cite this work. Thank you!

Ahmet Cecen

Neither. This code doesn't currently have any uncertainty quantification on the fit parameters themselves. In your very simple case you can refer to the link below to find the uncertainty of the slope:

https://terpconnect.umd.edu/~toh/models/ErrorPropagation.pdf
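
For a straight-line fit y = a*x + b, the textbook ordinary least-squares standard errors can be sketched as follows. This is not output of MultiPolyRegress; the data and variable names are hypothetical, and the formula assumes independent noise of equal variance in y and no error in x.

```matlab
% Hypothetical measurements.
x = [1; 2; 3; 4; 5; 6];
y = [6.2; 5.5; 4.8; 4.0; 3.4; 2.6];
n = numel(x);

% Least-squares fit of y = b + a*x.
A = [ones(n,1), x];
coef = A \ y;                        % coef(1) = intercept b, coef(2) = slope a

% Residual variance with n-2 degrees of freedom (two fitted parameters).
r = y - A*coef;
s2 = sum(r.^2) / (n - 2);

% Covariance of the coefficients and the resulting standard errors.
covCoef = s2 * ((A' * A) \ eye(2));
seIntercept = sqrt(covCoef(1,1));
seSlope = sqrt(covCoef(2,2));
```

The slope would then be reported as `coef(2)` +/- `seSlope`. The MAESTD and CVMAESTD fields describe the spread of the absolute residuals, not the uncertainty of a coefficient.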

laurent jalabert

Dear Ahmet,
thank you so much for making this nice function. I tried a simple linear fit on experimental data.
I got the slope a and the constant b of the fit y=ax +b
Now, I need to use the slope a= -0.71744 and the error (standard deviation) of a.
According to the results below, what should I consider as standard deviation on the slope ?
MAESTD = 0.0035 or CVMAESTD= 0.0044 ?
I need to write that the slope like " a +/- std "

reg =

struct with fields:

FitParameters: '-----------------'
PowerMatrix: [2×1 double]
Scores: [14×2 double]
PolynomialExpression: @(x1)+6.9303.*1+-0.71744.*x1
Coefficients: [2×1 double]
Legend: [2×2 char]
yhat: [14×1 double]
Residuals: [14×1 double]
GoodnessOfFit: '-----------------'
RSquare: 0.9999
MAE: 0.0044
MAESTD: 0.0035
Normalization: '1-to-1 (Default)'
LOOCVGoodnessOfFit: '-----------------'
CVRSquare: 0.9998
CVMAE: 0.0052
CVMAESTD: 0.0044
CVNormalization: '1-to-1 (Default)'

Behnam Amiri

Ahmet Cecen

The leave-one-out cross-validation calculation is done indirectly via the Sherman-Morrison-Woodbury formula, which involves division by a number that can be very close to zero if there is overwhelming over-fitting. You are correct that the number should not be smaller than 0, but I didn't guard against this edge-case instability because it provides a comical measure of just how much you are over-fitting.

Long story short, don't use a fit with a CVRSquare "too different" (magnitude left to your imagination) than the regular RSquare.
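
The indirect calculation can be illustrated with the standard leave-one-out identity for linear least squares: the held-out residual of point i equals the ordinary residual divided by 1 - h(i), where h(i) is the i-th diagonal entry of the hat matrix. The sketch below uses hypothetical data and a quadratic basis, not the function's actual code, and checks the shortcut against explicitly refitting without each point.

```matlab
% Hypothetical data and a quadratic design matrix [1, x, x^2].
x = (1:8)';
y = [1.1; 1.9; 3.2; 3.8; 5.1; 6.2; 6.8; 8.1];
n = numel(x);
A = [ones(n,1), x, x.^2];

% Ordinary residuals and leverages from an economy-size QR factorization.
[Q, ~] = qr(A, 0);
r = y - Q*(Q'*y);
h = sum(Q.*Q, 2);                    % diagonal of the hat matrix Q*Q'

% Leave-one-out residuals without refitting.
rCV = r ./ (1 - h);

% Brute-force check: refit n times, each time leaving one point out.
rBrute = zeros(n,1);
for i = 1:n
    idx = true(n,1);
    idx(i) = false;
    c = A(idx,:) \ y(idx);
    rBrute(i) = y(i) - A(i,:)*c;
end
```

When a model over-fits, some `h(i)` approach 1, the denominator `1 - h(i)` approaches zero, and the leave-one-out residuals blow up.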

Nikos Skiadaressis

Ahmet, thank you!
It is a great tool.
Just one question:
When I'm trying to use the sample in the example to calculate a model of higher order than 3 the CVRSquare is:

4th: -2.55
5th: -9323.04
6th: -19991970873.98
7th: -4025574168920490.50
8th: -Inf

Shouldn't RSquare be positive and under 1?

Ahmet Cecen

Yep makes sense. I encourage you to submit this change as a pull request in GitHub. Otherwise I will fix it when I get a chance.

Nico Burgelman

Changing lines 186-187 from:
H=QQ*QQ';
rCV=r./(1-diag(H));
to
dH=sum(QQ.*QQ,2);
rCV=r./(1-dH);

Speeds things up (and saves memory)
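
A quick check that the two forms agree, using a hypothetical matrix in place of the function's `QQ`. The diagonal of `QQ*QQ'` is the row-wise sum of squares of `QQ`, so the n-by-n product never has to be formed.

```matlab
% Hypothetical stand-in for QQ: orthonormal columns from an economy-size QR.
M = magic(6);
[QQ, ~] = qr(M(:,1:3), 0);

% Original form: builds the full n-by-n matrix, then takes its diagonal.
dH1 = diag(QQ*QQ');

% Proposed form: row-wise sum of squares, only an n-by-1 vector.
dH2 = sum(QQ.*QQ, 2);

maxDiff = max(abs(dH1 - dH2));
```

For n data points the first form needs memory proportional to n^2, the second proportional to n.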

Shukran Sahaar

Great tool!

Nanbo Li

Yodish

this is a brilliant function

Oleg Boiko

Biplab Satpati

vina

I really want this regression method to try my data. But this terrible web page always fails. Can someone send it to me please? My email is 1034223185@qq.com

vina

Ahmet Cecen

I don't think I have explained this code exhaustively in publications. You can e-mail me for explanations of any particular section, my contact info is easy to find online. Otherwise search for the following concepts:
- Polynomial Basis
- Multivariate Regression
- Leave One Out Cross Validation
- Sherman-Morrison Formula
- QR factorization (and regression)
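
As a rough illustration of how the first and last concepts fit together, here is a sketch that builds a small multivariate polynomial basis from a table of exponents and solves the regression through a QR factorization. The exponent table `P`, the data, and all names are hypothetical and are not taken from the function's source.

```matlab
% Hypothetical data: 7 points, 2 variables, response generated from a known polynomial.
X = [1 2; 2 1; 3 4; 4 3; 5 5; 6 2; 2 6];
y = 2 + 3*X(:,1) - X(:,2) + 0.5*X(:,1).*X(:,2);

% Hypothetical exponent table: one row per term, one column per variable.
% The rows here stand for the terms 1, x1, x2 and x1*x2.
P = [0 0; 1 0; 0 1; 1 1];

% Build the design matrix: column t is the product of X(:,v).^P(t,v) over v.
n = size(X,1);
m = size(P,1);
A = ones(n, m);
for t = 1:m
    for v = 1:size(X,2)
        A(:,t) = A(:,t) .* X(:,v).^P(t,v);
    end
end

% Least-squares solution via economy-size QR: A = Q*R, so R*coef = Q'*y.
[Q, R] = qr(A, 0);
coef = R \ (Q' * y);
```

With this noise-free response, `coef` reproduces the generating coefficients 2, 3, -1 and 0.5 up to rounding.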

Carlos Ferreira

@Ahmet Cecen Can you give me the papers where the methodology is explained?

Muhammad Ansab Ali

OSCAR VITERI

Ahmet Cecen

An oversight. Will fix it when I get a chance.

To suppress output that may not be required, please add a semicolon to the expression in line 173.
eval(['PolyExp = ',variablesexp,Poly,';']);

Habib Yajam

Fast and easy to use. MATLAB lacks such a function in its original releases.

Habib Yajam

Fast and easy to use. MATLAB lacks such a function in its stock releases.

Andra St. Quintin

Easy to use.

Ahmet Cecen

If you send me an e-mail I can reply back to it with the zip file. My contact info is everywhere just Google my name, or go to my account.

Ahmet Cecen

I was able to download it just this second. I'll send it anyways if you have contact information on your account.

Aurélien Durel

The file is no longer available.
Can someone send it to me please ?

easumj

Excellent code, I have been looking for multivariate polynomial regression tools for quite some time.

Silpakorn D

Xinyi Gong

Ahmet Cecen

If you send me the data and parameters to replicate your situation (it's very easy to find my contact information online, including on my profile here), I can look into it. Otherwise it is very hard for me to search for a random bug.

Rita

Hi, thanks for the function. I have tried your function with my seven independent variables and one dependent variable, and R-squared is 0.19, which is not high. How can I get a higher R-squared? I also got an error when I used 'range'. Any suggestion would be appreciated.

Karel Macek

Sagar

Hi, I tried to use the function but I have a lot of NaNs in my data. It looks like it cannot handle data with NaNs. Could you please update to include NaNs?

Ahmet Cecen

multidimensional

Morgan

Yuksel Yabansu

Mahdi Roozbahani

Ahmet Cecen

Added examples upon request.

MATLAB Release Compatibility
Created with R2014b
Compatible with any release
Platform Compatibility
Windows macOS Linux
