Increasing Dimensionality of data

Nirmal on 24 Feb 2011
Here is my question; I am not sure whether it can be done at all.
I want to test the relationship between a property X and the dimensionality of the matrix. In doing so, I would like to preserve the original properties of the data as closely as possible. I thought of the following two ways.
1. If I take the IRIS data, it has four attributes. What I would like to do is increase the number of attributes to maybe 6 or 12 and so forth, but still keep the characteristics of the original data. I am not sure how to do it.
2. Another thing that might work would be to generate something like three Gaussian (normal) data sets, each with a different dimension. Would those data sets be comparable with one another, given that they simply have different dimensions?
My question is not how to add extra data in MATLAB, but how to add data while still preserving the properties (if that makes sense).
I would appreciate any help.
Thank you for looking.
2 comments
Paulo Silva on 24 Feb 2011
What's the original matrix size?
Nirmal on 24 Feb 2011
The one I am working on right now is IRIS, which is 150*4.


Accepted Answer

Walter Roberson on 24 Feb 2011
Features are independent of the dimensionality of the data. The width of the petal of an Iris is not dependent upon how many other size measurements you took or which of them you included.
There may be correlations between features. For example, you are not going to find a very short Iris that has very long petals. These correlations do not, however, depend upon how many other measurements you included.
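A minimal sketch of this point, assuming synthetic data rather than the real Iris measurements: the names h and p and all the numbers are hypothetical stand-ins. It computes the correlation between two features once with 4 columns and once with 12, and the value should agree to within rounding.

```matlab
rng(0);                         % fixed seed so the sketch is repeatable
n = 150;
h = 50 + 10*randn(n,1);         % hypothetical "plant height"
p = 0.1*h + 0.5*randn(n,1);     % hypothetical "petal length", correlated with h
X4  = [h p randn(n,2)];         % 4 features
X12 = [X4 randn(n,8)];          % the same 4 features plus 8 unrelated ones
R4  = corrcoef(X4);
R12 = corrcoef(X12);
% The correlation between columns 1 and 2 does not depend on the other columns
fprintf('difference in corr(h,p): %g\n', abs(R4(1,2) - R12(1,2)));
```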
Be careful also to note that the scale of each feature is independent. For example it might be most natural to measure the size of the pollen in microns but the height of the plant in centimeters. Thus, a large value in one feature might have less significance than a very modest value in another feature. Therefore the scale of values for any newly introduced feature is not relevant: it is the distribution of values that matters.
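To illustrate, here is a small sketch assuming a made-up height feature. The same measurements are expressed in centimeters and in microns (1 cm = 10^4 microns); after subtracting the mean and dividing by the standard deviation, the two versions should coincide up to rounding, which is one way to see that only the distribution matters.

```matlab
rng(1);                                  % fixed seed, for repeatability only
height_cm = 40 + 5*randn(150,1);         % hypothetical heights in centimeters
height_um = height_cm * 1e4;             % the same heights in microns
% Standardize each version: zero mean, unit standard deviation
z_cm = (height_cm - mean(height_cm)) / std(height_cm);
z_um = (height_um - mean(height_um)) / std(height_um);
fprintf('largest difference after standardizing: %g\n', max(abs(z_cm - z_um)));
```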
Introducing new artificial features that are independent of the existing features is not going to help data classification. Done wrong, you can end up making your classification decisions based entirely upon the new artificial feature. Done right, your classification procedure will notice that the new feature contributes no information and will effectively classify as if it were not there.
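The "done wrong" case can be sketched as follows, under assumptions of my own: two synthetic classes, a hand-written leave-one-out 1-nearest-neighbor rule on raw Euclidean distance, and an artificial noise column on a much larger scale. None of this comes from the Iris data. With the noise column included, the distances are dominated by it, so I would expect the accuracy to fall toward chance.

```matlab
rng(2);                                   % fixed seed, for repeatability only
n = 100;
labels = [ones(n,1); 2*ones(n,1)];
x = [randn(n,1); randn(n,1) + 4];         % informative feature: class means 0 and 4
noise = 1000*randn(2*n,1);                % uninformative feature on a huge scale
sets = {x, [x noise]};                    % without and with the artificial feature
acc = zeros(1,2);
for s = 1:2
    X = sets{s};
    m = size(X,1);
    correct = 0;
    for i = 1:m
        d = sum((X - repmat(X(i,:), m, 1)).^2, 2);  % squared distances to sample i
        d(i) = Inf;                                 % leave sample i itself out
        [~, j] = min(d);                            % nearest other sample
        correct = correct + (labels(j) == labels(i));
    end
    acc(s) = correct / m;
end
fprintf('accuracy without noise feature: %.2f, with: %.2f\n', acc(1), acc(2));
```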
Therefore, if you introduce new features, they must be dependent upon the existing features in some way (or upon information from features for which you have the data but which you did not include).
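One possible way to build such dependent features, offered only as an assumption and not as a recommended method, is to append linear combinations of the existing columns. In this sketch X is a random stand-in for a 150-by-4 data set and W is a hypothetical weight matrix. The result has 12 columns, but its rank stays at 4, which shows that no new information was introduced.

```matlab
rng(3);                     % fixed seed, for repeatability only
X = randn(150,4);           % stand-in for the original 150-by-4 data
W = randn(4,8);             % hypothetical mixing weights
Xbig = [X, X*W];            % 150-by-12: 8 new columns derived from the old ones
fprintf('columns: %d, rank: %d\n', size(Xbig,2), rank(Xbig));
```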
When we introduce new features in our classifications, it is always for dimension reduction. For example, in a Magnetic Resonance Spectrum (MRS), we might replace hundreds of spectrum data points (each of which would otherwise be a feature) that are mostly overwhelmed by the water signal, substituting something like the mean and standard deviation of the points.
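A rough sketch of that kind of substitution, assuming random numbers in place of real spectra (the sizes are hypothetical): each row of S is one sample with 300 points, and the two derived features replace all 300 of them.

```matlab
rng(4);                             % fixed seed, for repeatability only
S = randn(20, 300);                 % hypothetical: 20 spectra, 300 points each
F = [mean(S,2), std(S,0,2)];        % per-spectrum mean and standard deviation
fprintf('features per sample: %d before, %d after\n', size(S,2), size(F,2));
```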
Anyhow, if by "property X" you are referring to individual features, then the thesis that it is related to dimensionality is not true. If, though, you are referring to something like confidence intervals, then you can work that out from the formulae involved, or you can do it experimentally by adding columns that convey no information at all because they are constant for all samples.
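The experimental route could look like the sketch below. Since "property X" is not specified, propX is a hypothetical placeholder (the mean distance of the samples from their centroid); substitute whatever quantity is actually being studied. The constant columns carry no information, so for this placeholder the value should not change as columns are added.

```matlab
rng(5);                                   % fixed seed, for repeatability only
X = randn(150,4);                         % stand-in for the original data
% Hypothetical "property X": mean Euclidean distance from the centroid
propX = @(M) mean(sqrt(sum((M - repmat(mean(M,1), size(M,1), 1)).^2, 2)));
extra = [0 2 8];                          % number of constant columns to append
vals = zeros(size(extra));
for k = 1:numel(extra)
    Xk = [X, ones(150, extra(k))];        % 4, 6, and 12 columns in total
    vals(k) = propX(Xk);
end
disp(vals)                                % one value per dimensionality
```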

More Answers (1)

Paulo Silva on 24 Feb 2011
Here's one example; you can adapt it to your needs:
a=[1 2 3 4
5 6 7 8]' %a is 4 by 2 after the transpose
b=[a [9 10 11 12]'] %b is a with one more column
c=[a;[9 10]] %c is a with one more row
In your case size(a)=[150 4] and you want to add 2 more columns, for example:
a=randn(150,4); %Create a 150 by 4 array of normally distributed random values
b=(1:150)'; %Create a column vector with the numbers from 1 to 150
c=2*b; %Create another column vector with the even numbers from 2 to 300
d=[a b c]; %Add two more columns to a: column 5 is b and column 6 is c
4 comments
Nirmal on 24 Feb 2011
Sorry if I wasn't clear. My question is not how to add extra data in MATLAB, but how to add data while still preserving the properties (if that makes sense). Suppose I have a matrix
a=[0.3,0.4;0.1,0.6]. If I want to add one extra column, how do I determine what I really have to add? Should I add [0.5;0.3] or should I add [100;300]?
Paulo Silva on 24 Feb 2011
Good question, maybe the experts can help you (I believe that some of them got magic balls but don't tell anyone).
