Discrepancies of classify() scores (with googlenet) for the same image
Using googlenet to classify images from the CIFAR-100 dataset, I found discrepancies in the scores (probabilities for class prediction) depending on how the image is submitted to the classify() function.
To narrow this down, I examined a single image, "macropus_giganteus_s_000029.jpg", previously extracted from the CIFAR-100 "train.mat".
The classify() function was then used
A. directly on the image (after resizing with imresize()), and alternatively
B. on the same image as contained in an "augmented" datastore which included only this image, and where only resizing was applied (but not rotation, mirroring, etc.; so no real augmentation).
Below are the script code and the results for the top five scores obtained.
The scores as such are replicable, so there seems to be no random element involved "behind the scenes".
Resizing directly with imresize() or indirectly by "augmentation" should not make a difference either. (Or should it?)
Note that the point here is not that both classifications arrive at different class predictions (both wrong in this case; googlenet does have a class "wallaby" which was correctly applied to other kangaroo images).
The question is rather why the scores differ numerically in the first place. It does not look like a mere rounding issue, as other tests, with sometimes even larger discrepancies, have also shown.
It also occurred to me that the CIFAR images may be special in that they are small (32x32x3) so that resizing here means enlarging them to the size required by googlenet (224x224x3). But for other, larger original images which really had to be reduced in size, the results were similar.
As a case "C" (not included in the code below), I resized the image to 224x224x3 outside of MATLAB, then put it into a non-augmented image datastore and submitted that imds to classify(). This resulted in yet another set of scores, different from both A and B.
Should the scores obtained by classify() not be the same in A and B above (apart from rounding differences)?
Does the function work differently depending on the type of its argument (image vs. augmented vs. non-augmented datastore)?
Or has this perhaps to do with different resize methods (esp. when the lossy JPEG format is involved)?
All pointers welcome.
Cheers.
Code:
%% googlenet classification of image macropus_giganteus_s_000029.jpg
% as extracted from CIFAR train.mat: data(42758,:)
%% INPUT
rng(37) ;
net = googlenet ;
%% create image datastore
imdsTest = imageDatastore('./DATA', ... % subfolder './DATA/kangaroo' with this 1 image only
'IncludeSubfolders',true, ...
'LabelSource','foldernames');
%% "augmentation" (resizing only)
net.Layers(1) ; % no visible effect as written; drop the semicolon to display the input layer
inputSize = net.Layers(1).InputSize ;
augimdsTest = augmentedImageDatastore(inputSize(1:2),imdsTest);
% MiniBatchSize: 128; NumObservations: 1 ; DataAugmentation: 'none'
% OutputSize: [224 224]; OutputSizeMode: 'resize'
%% Classify image
% A. classify from directly loaded image
img = imread('./DATA/kangaroo/macropus_giganteus_s_000029.jpg') ;
img = imresize(img,[inputSize(1),inputSize(2)]) ;
[imgYPred,imgProbs] = classify(net,img);
[imgTop5prob, imgTop5idx] = maxk(imgProbs,5) ;
% Result A:
% pred.class: "wood rabbit"
% ambiguity: 0.545
% highest 5 scores:
% 0.1131 "wood rabbit"
% 0.0617 "fox squirrel"
% 0.0463 "electric ray"
% 0.0459 "platypus"
% 0.0452 "hare"
%% B. classify from "augmented" datastore (resizing only)
[augYPred,augProbs] = classify(net,augimdsTest);
[augTop5prob, augTop5idx] = maxk(augProbs,5) ;
% Result B:
% pred.class: "milk can"
% ambiguity: 0.088
% highest 5 scores:
% 0.6078 "milk can"
% 0.0533 "dugong"
% 0.0465 "platypus"
% 0.0150 "fox squirrel"
% 0.0114 "lion"
Answers (1)
Philip Brown
on 4 Aug 2023
In answer to your question "has this perhaps to do with different resize methods?", augmentedImageDatastore uses bilinear interpolation to resize, while imresize by default uses bicubic. As per the doc:
Note
augmentedImageDatastore uses the bilinear interpolation method of imresize with antialiasing. Bilinear interpolation enables fast image processing while avoiding distortions such as caused by nearest-neighbor interpolation. In contrast, by default imresize uses bicubic interpolation with antialiasing to produce a high-quality resized image at the cost of longer processing time.
I don't know if this explains the discrepancy you're seeing.
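One way to check would be to repeat case A with bilinear interpolation and compare it against the image that the augmented datastore actually delivers. The following is a minimal, untested sketch that assumes the same folder layout as in the question; the variable names (imgBilinear, imgFromAug, maxPixelDiff, maxScoreDiff) are illustrative only, and it assumes that read() on the augmented datastore returns a table whose first variable holds the resized image in a cell.

```matlab
% Sketch: does the interpolation method account for the A/B gap?
net = googlenet;
inputSize = net.Layers(1).InputSize;   % [224 224 3] for googlenet

% A'. as case A in the question, but forcing bilinear interpolation
img = imread('./DATA/kangaroo/macropus_giganteus_s_000029.jpg');
imgBilinear = imresize(img, inputSize(1:2), 'bilinear');
[~, probsBilinear] = classify(net, imgBilinear);

% B'. extract the image exactly as the augmented datastore resizes it
imds = imageDatastore('./DATA', ...
    'IncludeSubfolders', true, ...
    'LabelSource', 'foldernames');
augimds = augmentedImageDatastore(inputSize(1:2), imds);
batch = read(augimds);        % assumed: table, first variable = images
imgFromAug = batch{1,1}{1};   % first (and only) image of the mini-batch
[~, probsAug] = classify(net, imgFromAug);

% Compare pixels and scores
maxPixelDiff = max(abs(double(imgBilinear(:)) - double(imgFromAug(:))))
maxScoreDiff = max(abs(probsBilinear(:) - probsAug(:)))
```

If maxPixelDiff comes out as 0 and maxScoreDiff is negligible, the resize method explains the difference between A and B. If the pixels match but the scores still differ, the cause has to lie elsewhere in how classify() handles the two input types.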