Plot mean and standard deviations along with data on a bell curve

Sunshine on 21 May 2020
Commented: Image Analyst on 19 Jun 2020
I have columns of survey data, each with approximately 120 rows. The values range from 1 through 5, and some entries are NaN. I can calculate the mean and standard deviation of each column, but I also want to plot the mean and standard deviations together with the actual data on a bell curve. I found the code below, which at least plots data, but I am not sure how to change it to represent my data correctly. For instance, I don't think I need the randn function, given that I already have data. In short, I want to plot my data, the mean, and the standard deviations (out to plus and minus 3 sigma) for every column on a bell curve, similar to what this code produces.
x = .03*randn(10000,1)+.34;
[N,X] = hist(x,100);
hfig = figure;
bar(X,N)
hold on;
y = [0 1.2*max(N)];
center = mean(x);
std1 = std(x);
%center plot
plot([center center],y,'r-.')
%1 std
plot([center center]+std1,y,'g-.')
plot([center center]-std1,y,'g-.')
%2 std
plot([center center]+2*std1,y,'k-.')
plot([center center]-2*std1,y,'k-.')
10 Comments
Sunshine on 19 Jun 2020
Thanks a bunch! This really worked well for me. I was also able to modify the code to include additional standard deviations.
I am curious. I found code and modified it to what you see below. My goal is to calculate the percentages of the data for column p1. So for instance, I want to determine how much of the data in a particular column are 5s, 4s, 3s, etc.
Code
numberOfBins = max(personality_cols.p1(:));
countsPercentage = 100 * hist(personality_cols.p1(:), numberOfBins) / numel(personality_cols.p1)
Answer
countsPercentage =
9.0909 4.1322 13.2231 33.0579 24.7934
The countsPercentage does not equal 100. countsPercentage's total = 84.3. Is the remaining percentage NaN values? How do I get the percentage of the NaN values to know that the other percentages are correct? How do I exclude NaN values from the countsPercentage values so that 100 percent is only looking at data (5s, 4s, 3s, etc)? How do I know which values are being represented by which set of data (for example, how do I know 9.0909 are 1s or are they 5s from the p1 column data)?
Image Analyst on 19 Jun 2020
You can use the isnan() function along with sum() to compute the number of NaNs in a vector.
numNans = sum(isnan(yourVector));
percentNans = 100 * numNans / numel(yourVector);
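To exclude NaN from the percentages and see which percentage belongs to which answer, one option is to drop the NaN rows first and count with explicit bin edges. The posted numbers are consistent with hist() skipping NaN while numel() still counts those rows, which would explain a total below 100 (for example, 9.0909 is 11/121, and the five counts add up to 102 of 121 rows). The sketch below is only an illustration under assumptions: the vector p1 is made-up sample data standing in for personality_cols.p1, and the answers are assumed to be the integers 1 through 5.

```matlab
% Hypothetical stand-in for personality_cols.p1: answers 1..5, NaN = missing
p1 = [1 5 4 NaN 3 4 4 5 NaN 2 4 5 3 4 1]';

valid  = p1(~isnan(p1));                 % keep only real answers
values = 1:5;                            % possible answers, in ascending order
counts = histcounts(valid, 0.5:1:5.5);   % counts(k) is the number of k's

percentOfValid = 100 * counts / numel(valid);    % sums to 100, NaN excluded
percentNans    = 100 * sum(isnan(p1)) / numel(p1);

% Print one labeled line per answer value, then the NaN share
for k = values
    fprintf('%d: %6.2f%%\n', k, percentOfValid(k));
end
fprintf('NaN: %.2f%% of all rows\n', percentNans);
```

Because the bin edges sit at 0.5, 1.5, ..., 5.5, the first percentage always belongs to the 1s and the last to the 5s. With hist(x, 5) on data spanning 1 to 5 the bins are also in ascending order, so in the output above 9.0909 would be the share of 1s.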


Answers (1)

Image Analyst on 22 May 2020
Then if it's not normally distributed data, why do you want to fit a bell curve to it?
Did you try fitdist()?
load hospital
x = hospital.Weight;
pd = fitdist(x,'Normal')
x_values = 50:1:250;
y = pdf(pd,x_values);
plot(x_values,y,'LineWidth',2)
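For the survey data in the question, a similar plot can be put together by combining a histogram of the answers, a normal curve, and vertical lines at the mean and at plus and minus 1, 2, and 3 standard deviations. This is a rough sketch under assumptions, not a definitive solution: the vector x is made-up sample data standing in for one survey column, and the normal density is written out by hand from the sample mean and standard deviation instead of calling fitdist(), so that no toolbox is needed.

```matlab
% Hypothetical survey column (answers 1..5 with NaN); replace with a real column
x = [5 4 4 3 NaN 5 2 4 1 NaN 3 4 5 4 3]';
x = x(~isnan(x));                        % drop missing answers

mu    = mean(x);
sigma = std(x);

% Histogram of the actual answers, one bar per value 1..5
counts = histcounts(x, 0.5:1:5.5);
figure;
bar(1:5, counts);
hold on;

% Normal density from the sample mean and std, scaled to the count axis
% (bin width is 1, so multiplying by numel(x) matches the bar heights)
t    = linspace(mu - 3*sigma, mu + 3*sigma, 200);
bell = numel(x) * exp(-(t - mu).^2 / (2*sigma^2)) / (sigma*sqrt(2*pi));
plot(t, bell, 'b-', 'LineWidth', 2);

% Vertical lines at the mean and at +/- 1, 2, 3 standard deviations
yl = [0 1.2*max(counts)];
plot([mu mu], yl, 'r-.');
lineColors = 'gkm';                      % one color per sigma level
for k = 1:3
    plot([mu mu] + k*sigma, yl, [lineColors(k) '-.']);
    plot([mu mu] - k*sigma, yl, [lineColors(k) '-.']);
end
hold off;
```

To cover every column, the same steps could be wrapped in a loop over the columns, with one figure or subplot per column. With only five possible answers the bars will not necessarily follow the curve, which is the point of the question above about whether the data is normally distributed.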
