
Interpolating Multivariate time series

Andrew on 29 Apr 2011
Hi all,
I'm trying to test a multivariate time series dataset that has 2536 instances and 73 attributes, with missing values (represented by ?) in some rows. I looked for ways to interpolate the time series, but every example I can find handles only 2-3 attributes.
Can someone help me interpolate this dataset? The dataset is in .data format.
Andrew

Answers (3)

Andrew on 29 Apr 2011
To be clear, the dataset looks something like this:
1/1/1998,0.8,1.8,2.4,2.1,10330,-55,0,0.
1/2/1998,2.8,3.2,3.3,2.7,10275,-55,0,0.
. . .
1/5/1998,2.6,2.1,1.6,1.4,?,?,?,0.58,0.
. . .
1/22/1998,2.8,3.6,?,?,4.6,10090,-40,0,0.
  4 comments
Andrew on 29 Apr 2011
@Oleg
Not really... all the rows have the same number of columns, with 73 attributes.
This is the dataset I'm talking about:
http://archive.ics.uci.edu/ml/machine-learning-databases/ozone/onehr.data
It has 75 columns in total: 1 date + 73 attributes + 1 result column that says whether it's an ozone day or not.
Andrew on 29 Apr 2011
@andrei
I'm not sure how to use TriScatteredInterp. Would you mind helping with code that does the interpolation and saves the filled-in values back to the .data file? I need to use that data to test the algorithm.
Thanks



Richard Willey on 29 Apr 2011
Handling missing data is a very complicated topic.
There are a number of different approaches that you can use, including listwise deletion, substitution models, multiple imputation, and so on. Each approach has its own advantages and disadvantages.
For example, an approach based on substitution (regression substitution, interpolation, what have you) will give you a complete data set to work with; however, this new data set is going to be biased. (As a simple example, suppose that you use a regression substitution model to estimate plausible values for your missing data points. Later on, you fit a regression model to your [complete] data set and report an R^2...)
Alternatively, an approach based on listwise deletion won't [necessarily] run into the same problems with bias; however, you will have issues with loss of statistical power.
I took a quick look at the data set in question. Two observations.
1. You are missing large blocks of data. This is going to cause some real problems for interpolation-based techniques.
2. Your data doesn't appear to be Missing Completely At Random or even Missing At Random.
Personally, I would start with listwise deletion...
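A minimal sketch of listwise deletion in MATLAB, assuming the data has already been read into a numeric matrix with NaN in place of each missing value. The matrix X and the variable names are illustrative and are not taken from the ozone file:

```matlab
% Toy matrix standing in for the numeric attributes (assumed data);
% NaN marks a missing value.
X = [0.8 1.8 2.4;
     2.8 NaN 3.3;
     2.6 2.1 NaN;
     2.8 3.6 4.6];

% Listwise deletion: keep only the rows that have no missing entries.
completeRows = ~any(isnan(X), 2);
Xcomplete = X(completeRows, :);

% Report how many rows survive; the dropped rows are the cost in
% statistical power mentioned above.
fprintf('Kept %d of %d rows\n', size(Xcomplete, 1), size(X, 1));
```

Checking the kept-row count first is a cheap way to see whether deletion leaves enough data to work with.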

Andrew on 30 Apr 2011
I guess I can't delete the rows with missing values.
How do we interpolate them with interp1? Can I use it to interpolate the above dataset?
I read in the MathWorks documentation that yi = interp1(Y,xi) assumes that x = 1:N, where N is the length of Y for vector Y, or size(Y,1) for matrix Y,
and that yi = interp1(x,Y,xi,method) interpolates using alternative methods.
But then, how does it know which dataset to use? When I load the dataset using "load onehr.data", it reports an unknown value '?'...
Can someone help me?
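
One possible way to approach this, as a sketch rather than a definitive answer: load cannot parse the date column or the ? markers, but textscan can read the date as text and turn ? into NaN through its TreatAsEmpty option. Each column can then be filled with interp1 using only its known points. The sample text, the column count nNumeric, and the use of the row index as the time axis are assumptions made for illustration:

```matlab
% Small inline sample standing in for onehr.data (assumed data, 3 numeric columns).
sample = sprintf(['1/1/1998,0.8,10330,0\n' ...
                  '1/2/1998,?,10275,0\n' ...
                  '1/3/1998,2.6,?,0\n' ...
                  '1/4/1998,2.8,10090,0']);

nNumeric = 3;                              % number of numeric columns (assumption)
fmt = ['%s' repmat('%f', 1, nNumeric)];    % one text column, then the numeric ones

% Read the date as text; '?' in a numeric field becomes NaN.
C = textscan(sample, fmt, 'Delimiter', ',', ...
             'TreatAsEmpty', '?', 'CollectOutput', true);
dates = C{1};                              % cell array of date strings
X     = C{2};                              % numeric matrix with NaN for '?'

% Use the row index as the time axis (assumes evenly spaced rows).
t = (1:size(X, 1))';
Xfilled = X;
for j = 1:size(X, 2)
    known = ~isnan(X(:, j));
    if nnz(known) >= 2
        % Linear interpolation from the known points only. Gaps at the
        % very start or end of a column stay NaN (no extrapolation).
        Xfilled(:, j) = interp1(t(known), X(known, j), t, 'linear');
    end
end
```

For the real file, the same textscan call would take a file identifier from fopen instead of the sample text, with nNumeric set to the file's actual number of numeric columns. Long runs of missing values are filled with a straight line between the nearest known points, so the earlier caveats about bias and large missing blocks still apply.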
