Borrar filtros
Borrar filtros

Why SVM is not giving expected result

1 visualización (últimos 30 días)
Diver
Diver el 26 de Oct. de 2015
Editada: Ilya el 27 de Oct. de 2015
I have training data composed from only one feature.
The feature have around 113K observation.
  • 8K only of those observation have positive class.
  • 105K of those observation have negative class.
  • The 8K observation composed of a number below 1 (90%), and 10% above 1
  • The 105K observation composed of a number above 1 (80%), and 20% below 1Hence, almost, any X value below than 1 show be predicted as positive class, and any X value above 1 should be predicted as negative class.
I used the following fitcsvm call:
svmStruct = fitcsvm(X,Y,'Standardize',true, 'Prior','uniform','KernelFunction','linear','KernelScale','auto','Verbose',1,'IterationLimit',1000000);
the fitcsvm give message at the end saying SVM optimization did not converge to the required tolerance., ... but why ... most of first class X values are below 1 and visa versa ... so it should be easy to find classification boundary. and when I run:
[label,score,cost]= predict(svmStruct, X) ;
it gives wrong prediction.
Below a portion of my X values is listed:
0.9911
0.9836
0.9341
0.9751
0.9880
0.9977
0.9853
0.9861
1.0143
1.0086
0.9594
0.9787
0.9927
0.9839
1.0024
0.9931
0.9930
1.0275
The image below shows a gscatter diagram. Notice, the positive values are only around 8K, while negative around 105K. Since there is only 1 feature, I created an X with values from 1 to length of Y.
I also attached "features.txt" which contains the features column and "Y.txt" which contains the two groups.

Respuesta aceptada

Ilya
Ilya el 27 de Oct. de 2015
Editada: Ilya el 27 de Oct. de 2015
This is a difficult problem for SVM. SVM performs best when two classes are separable or have a modest overlap. This is not the case here. To make things even harder for SVM, less than 7000 points out of your 110k are unique.
Why not use a classifier such as decision tree or linear discriminant?

Más respuestas (0)

Categorías

Más información sobre Statistics and Machine Learning Toolbox en Help Center y File Exchange.

Etiquetas

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by