Plotting negative values in boxplot
I am trying to make a box plot of data in 6 different categories. Some of my data points are negative, and when I call the boxplot function it cuts off the y-axis at 0, so I cannot get a good visual of the negative values. I am using MATLAB R2019a. Any insight on this would be appreciated.
flow_rates_w = NUM_w(1:183,24:29)
boxplot(flow_rates_w)

10 Comments
dpb
on 3 Aug 2022
I don't see any such effect here; show us the exact code used and attach a sample dataset that causes the problem there.
You can always change ylim manually --
ylm=ylim;
ylim([-ylm(2) ylm(2)])
will set the lower limit to the negative of the current upper limit; the limits can be set to whatever range is needed...
boxplot handles negative values.
data = randi(11,100,5)-6;   % random integers from -5 to 5
boxplot(data)
yline(0)                    % reference line at zero
Perhaps your negative values are not large enough to appear. For example, these values all lie between roughly 0 and 10, except for 3 that are -0.005.
data2 = randg(.25,100,5);
data2([80,220,350]) = -0.005;
min(data2,[],'all')
figure
boxplot(data2)
Investigate the frequency and magnitude of negative values in your data to get a sense of what the plot should look like.
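One way to carry out that check is to count the negative entries per column and compare their magnitude with each column's range. A minimal sketch on synthetic data (the matrix `data` and the variable names are illustrative, not the poster's dataset):

```matlab
% Synthetic example: 100-by-5 matrix, mostly positive, with three tiny negatives
data = 10 * rand(100, 5);                 % uniform values between 0 and 10
data([80, 220, 350]) = -0.005;            % linear indices falling in columns 1, 3 and 4

nNeg     = sum(data < 0, 1);              % number of negative values per column
colMin   = min(data, [], 1);              % smallest value per column
colRange = max(data, [], 1) - colMin;     % range per column
negShare = max(-colMin, 0) ./ colRange;   % fraction of each column's range lying below zero

% One line per column: index, count of negatives, minimum, share of range below zero
fprintf('col %d: %d negative, min %.4g, share below zero %.2e\n', ...
    [1:size(data, 2); nNeg; colMin; negShare]);
```

If the share below zero is a tiny fraction of the range, as it is here, the negative part of the whisker will be too short to see at full scale.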
dpb
on 3 Aug 2022
Good point, @Adam Danz. @Marguerite Lorenzo, note that the lower axis limit isn't exactly zero; it is something below 0, but not as far down as -0.5E5, or the tick mark would have been drawn there. It does appear to be fairly close to zero, though.
One exception: negative numbers can't be shown on a log axis. I already checked that nothing has been added to the boxplot function to deal with such a case; it acts the same as any other axes in that regard. There's a FEX submission, I believe, that reflects an axis around 0, with a transform on the values to avoid the discontinuity at 0. It's not mathematically correct, since the decade around the labelled "0" location covers everything from the actual decade plotted down on the same range as a single decade, but it can be a useful visualization tool for very widely dispersed data that is both positive and negative. But boxplot can't make use of that trick... the negative data will simply not be shown if you try to set the axes YScale to 'log'.
Marguerite Lorenzo
on 4 Aug 2022
I'm not quite sure what you're expecting to see. Your data has a large range and very small negative values, so it's expected that the range, indicated by the whiskers, will end at or very close to 0 -- so close that you can't see it. Viewers who know how to read box plots and who see the range of the y-axis should understand that there are limits to how precise the visualization can be. The part of the whisker below zero is sub-pixel in height and cannot be shown.
[NUM_w,TXT_w,RAW_w] = xlsread('data_boxplot.xlsx');
flow_rates_w = NUM_w(1:183,:);
range(flow_rates_w)   % range (max - min) of each column
min(flow_rates_w)     % minimum of each column
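To put a number on "sub-pixel", the on-screen height of the part of the axis below zero can be estimated from the axis limits and the height of the axes in pixels. The numbers below are made-up placeholders, not values from the attached spreadsheet; in a live figure the pixel height could be read with getpixelposition(gca).

```matlab
% Hypothetical numbers, for illustration only
yLimits    = [-50 2.5e6];    % assumed y-axis limits [lower upper]
axHeightPx = 400;            % assumed axes height in pixels
% (in a live figure: pos = getpixelposition(gca); axHeightPx = pos(4);)

unitsPerPixel = diff(yLimits) / axHeightPx;       % data units covered by one pixel
negHeightPx   = abs(yLimits(1)) / unitsPerPixel;  % pixels spanned by the range below zero

fprintf('One pixel covers %.4g data units\n', unitsPerPixel);
fprintf('The range below zero spans %.4g pixels\n', negHeightPx);
```

With these assumed values the range below zero covers far less than one pixel, which is why nothing is drawn under the zero line at full scale.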
Are the negative values important? Are you worried about viewers thinking that the min values are 0?
What about histograms?
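figure
tiledlayout(6,1,'TileSpacing','compact','Padding','compact')
% 19 equal-width bins from 0 to the overall maximum ...
positiveBins = linspace(0, max(flow_rates_w,[],'all'),20);
% ... plus one bin of the same width below zero for the negative values
bins = [positiveBins(1)-positiveBins(2), positiveBins];
for i = 1:6
    nexttile
    histogram(flow_rates_w(:,i),bins)   % one histogram per category
end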
I would suggest setting the lower ylim to -0.5E5; there will then be a tick mark and label that make it clear the axis really isn't terminated at 0. I think that in similar cases the default range should perhaps end on even tick values, leaving the user to tighten the range if they wish??? Or just label the bottom axis value (although there's no tick there, so that would be added text)???
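A small sketch of that idea: extend the lower limit down to the next multiple of the tick spacing so that a labelled tick appears below zero. The data minimum and tick spacing here are placeholders; in a live axes the spacing could be taken from the values returned by yticks.

```matlab
% Placeholder values, for illustration only
dataMin  = -1200;            % assumed smallest data value
tickStep = 0.5e5;            % assumed spacing between y ticks
% (in a live axes: t = yticks; tickStep = t(2) - t(1);)

% Round the minimum down to the next multiple of the tick spacing
newLower = floor(dataMin / tickStep) * tickStep;

fprintf('Suggested lower y limit: %g\n', newLower);
% With an existing boxplot the limit could then be applied as:
%   yl = ylim;  ylim([newLower yl(2)])
```

The cost is some empty space below the boxes, in exchange for an unambiguous negative tick label.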
"The whisker length is sub-pixel in height and cannot be shown."
I wonder if it would be better to "cheat" in the other direction and show that pixel even if it is somewhat exaggerated???
Marguerite Lorenzo
on 4 Aug 2022
dpb
on 4 Aug 2022
" it appears that for categories 2 and 6 the 25th percentiles fall in the negative range which and it would be nice to be able to see those values.."
Zoom the y axis --
ymn=min(flow_rates_w,[],'all');
yup=1E5;
ylim([ymn yup])
adjust as desired.
You could use tiledlayout and present the full-scale plot on one and the detail on a second -- having a builtin inset function would be handy for such things...
Answers (1)
You could add a second axes that zooms into the small, negative values.
[NUM_w,TXT_w,RAW_w] = xlsread("data_boxplot");
flow_rates_w = NUM_w(1:183,:);
fig = figure();
tcl = tiledlayout(fig,4,1);
ax1 = nexttile(tcl,[3,1]);
boxplot(ax1,flow_rates_w);
ax2 = nexttile(tcl);
boxplot(ax2,flow_rates_w); % or copyobj
% zoom in to negative values
gmin = min(flow_rates_w,[],'all') * 1.1;
ylim(ax2,abs(gmin).*[-1,1])
yline(ax2,0,':','Color',[.6 .6 .6])   % zero reference line on the zoomed axes
linkaxes([ax1,ax2],'x')
2 Comments
Marguerite Lorenzo
on 4 Aug 2022
dpb
on 4 Aug 2022
Only if you use the aforementioned "trick" of plotting abs(x) and then manually relabelling -- the logarithm of a negative value simply isn't a real number, so negative values have no place on a log axis; you can't avoid that problem.
There is the FEX submission <sym_log> that shows how to do it for ordinary data; doing the same for a boxplot would take writing something similar to draw the various pieces for the negative data... one could possibly manage to extract the necessary pieces from the original boxplot, although much of the content is hidden, I think. I've not done any poking at the internals to see.
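For reference, the kind of transform a symmetric-log display relies on can be sketched in a few lines. This is not the FEX submission's code; the form sign(x).*log10(1 + abs(x)/C) and the scale constant C are assumptions chosen for illustration. The transformed matrix could be passed to boxplot with the ticks relabelled in original units, but the whisker and outlier rules would then be applied in transformed units, so the result is a visualization aid, not a true log-scaled box plot.

```matlab
% Signed log transform (illustrative; C sets the width of the near-linear zone around 0)
C       = 1;                                         % assumed scale constant
symlog  = @(x) sign(x) .* log10(1 + abs(x) ./ C);    % forward transform
isymlog = @(y) sign(y) .* C .* (10.^abs(y) - 1);     % inverse transform

x = [-1000 -10 -0.5 0 0.5 10 1000];                  % sample values of both signs
y = symlog(x);                                       % finite for every input, order preserved

% Tick positions in transformed units, labelled with the original values
tickVals   = [-1000 -100 -10 0 10 100 1000];
tickPos    = symlog(tickVals);
tickLabels = arrayfun(@(v) sprintf('%g', v), tickVals, 'UniformOutput', false);

% With data in a matrix D one could then draw (boxplot needs the Statistics Toolbox):
%   boxplot(symlog(D)); set(gca, 'YTick', tickPos, 'YTickLabel', tickLabels)
```

Because the transform is odd and strictly increasing, negative and positive values keep their order and zero stays at zero, which is what avoids the discontinuity of a plain log axis.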