Multiply two probability plots (CDF/PDF)

Question

0 votes

MATLAB_ask.mat

I have two probability plots, one generated as a CDF and one as a PDF. The exact mathematics is not important for my purpose; I only want to extract the qualitative idea.

This is the code I used:

figure()
ax1 = subplot(1,1,1);
cdfplot(temp);
% plot(DE,y)
ax1.XDir = 'reverse';
set(gca, 'YScale', 'log')

figure();
pd_HOLT = fitdist(total_HOLT,'Normal');
DE_HOLT = bingroups_HOLT;
y_HOLT = pdf(pd_HOLT,DE_HOLT);
ax1 = subplot(1,1,1);
plot(DE_HOLT,y_HOLT)
ax1.XDir = 'reverse';
set(gca, 'YScale', 'log')

The x-axis is the same. How can I multiply these plots to convey a (qualitative) idea? Thanks.

4 comments (2 older comments not shown)

Deepayan Bhadra on 20 Sep 2024

Hi @Malay Agarwal, I have uploaded the data (note: changed A.var7 -> temp)

Deepayan Bhadra on 20 Sep 2024

Hi @Jeff Miller: Since I am trying to do a point-wise multiplication, the idea I am trying to convey is the (combined) decreasing trend as we proceed towards -100.


Answer 1

Umar on 20 Sep 2024

0 votes

Hi @Deepayan Bhadra,

You mentioned, “How can I multiply these plots to convey a (qualitative) idea?”

Please see my response to your comments below.

My suggestion for combining these two plots while keeping them readable is to overlay them in a single figure. This lets you see the cumulative probabilities and the density of occurrences at the same time. After analyzing your code, here is an example code snippet to help you out.

% Generate some generic data for demonstration
data = randn(1000, 1); % Normal distributed data

% Create CDF plot
figure();
ax1 = subplot(1,1,1);
cdfplot(data);
hold on;

% Fit a normal distribution to the data
pd = fitdist(data,'Normal');

% Generate x values for PDF
x_values = linspace(min(data), max(data), 100);
y_values = pdf(pd, x_values);

% Plot PDF on the same axes
plot(x_values, y_values, 'r-', 'LineWidth', 2);

% Reverse x-axis and set log scale for y-axis
ax1.XDir = 'reverse';
set(gca, 'YScale', 'log');

% Add legends and labels
legend('CDF', 'PDF');
xlabel('Value');
ylabel('Probability');
title('Combined CDF and PDF Plot');
hold off;

In the example code above, randn(1000, 1) creates a sample dataset that follows a normal distribution, and cdfplot(data) plots its empirical cumulative distribution function. The code then fits a normal distribution to the data and evaluates its PDF over a range of x-values. The hold on command overlays the PDF on the CDF in the same axes; for more information on this command, please refer to

https://www.mathworks.com/help/matlab/ref/hold.html#

The x-axis is reversed and a logarithmic scale is applied to the y-axis for better visibility of both curves. When you view both plots together, you can see how the probability density (PDF) at each point contributes to the cumulative probability (CDF). This dual representation helps you understand both the local behavior (PDF) and the global behavior (CDF) of your data distribution. Feel free to adjust line styles, colors, and markers for better visual distinction between the CDF and the PDF.
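The link between the two curves can be checked numerically. The following is only a minimal sketch, assuming a standard normal distribution (mu and sigma are illustrative values, not taken from the data in this thread): it integrates the PDF with cumtrapz and compares the result with the closed-form normal CDF written with erf. It uses base MATLAB only.

```matlab
% Sketch: the CDF is the running integral of the PDF.
% mu and sigma are assumed values chosen for illustration.
mu = 0;
sigma = 1;
x = linspace(mu - 6*sigma, mu + 6*sigma, 2001);

% Normal PDF written out explicitly (no toolbox needed)
pdf_vals = exp(-0.5*((x - mu)/sigma).^2) / (sigma*sqrt(2*pi));

% Cumulative trapezoidal integration of the PDF approximates the CDF
cdf_numeric = cumtrapz(x, pdf_vals);

% Closed-form normal CDF for comparison
cdf_exact = 0.5*(1 + erf((x - mu)/(sigma*sqrt(2))));

max_err = max(abs(cdf_numeric - cdf_exact));
fprintf('Largest gap between integrated PDF and exact CDF: %.2e\n', max_err);
```

The two curves agree closely, which is why the CDF flattens wherever the PDF is close to zero.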

Please see attached.

If you have any further questions, please let me know.

11 comments (9 older comments not shown)

Umar on 21 Sep 2024

Hi @Deepayan Bhadra,

You mentioned, “I want one plot that conveys the diminishing trend. I don't even need to preserve them as probability plots.”

Please see my response to your comments below. Thanks for clarifying your plot requirements. To combine the CDF and PDF into a single plot that conveys a diminishing trend, first scale both curves to a maximum of 1 so that they can be combined meaningfully. Then multiply the scaled curves point by point and plot the product as a single curve. I modified the code above, again using generic data; please let me know if this resolves the issue.

% Generate some generic data for demonstration
data = randn(1000, 1); % Normally distributed data

% Create CDF
[values_cdf, x_cdf] = ecdf(data);
cdf_standardized = values_cdf / max(values_cdf); % Standardize CDF

% Fit a normal distribution to the data for PDF
pd = fitdist(data, 'Normal');
x_pdf = linspace(min(data), max(data), 100);
pdf_values = pdf(pd, x_pdf);
pdf_standardized = pdf_values / max(pdf_values); % Standardize PDF

% Multiply standardized CDF and PDF
combined_curve = cdf_standardized .* interp1(x_pdf, pdf_standardized, x_cdf, 'linear', 'extrap');

% Plotting
figure();
plot(x_cdf, combined_curve, 'b-', 'LineWidth', 2);
hold on;
plot(x_cdf, cdf_standardized, 'r--', 'LineWidth', 1.5); % Original CDF for reference
plot(x_pdf, pdf_standardized, 'g--', 'LineWidth', 1.5); % Original PDF for reference
xlabel('Value');
ylabel('Combined Value');
title('Combined Diminishing Trend from CDF and PDF');
legend('Combined Curve', 'Standardized CDF', 'Standardized PDF');
set(gca, 'YScale', 'log'); % Log scale for better visibility
hold off;

Please see attached.

In the code above, the logarithmic scale helps in visualizing trends, especially when dealing with probabilities or densities that span several orders of magnitude. Standardizing both curves makes sure that they are comparable and can be meaningfully multiplied, and the resulting combined curve shows how the likelihood of occurrence diminishes as you move away from the peak density. If you still have any further questions or need additional adjustments, please let me know!
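As a variation on the interpolation step, both curves can be evaluated directly on one shared grid before multiplying. This is only a sketch under stated assumptions: the variable samples is hypothetical stand-in data (the .mat file from this thread is not loaded), the same samples feed both curves, the grid from -100 to 0 just mimics the axis range mentioned in the question, and the normal PDF is written out by hand so that no toolbox is needed. Implicit expansion requires R2016b or later.

```matlab
% Sketch: point-wise product of a scaled empirical CDF and a scaled PDF
% on a common grid. "samples" is hypothetical stand-in data.
rng(0);                                % reproducible stand-in data
samples = -60 + 15*randn(500, 1);      % column vector of observations

x = linspace(-100, 0, 201);            % shared grid (row vector)

% Empirical CDF on the grid: fraction of samples <= each grid point
% (column vs. row comparison uses implicit expansion)
F = mean(samples <= x, 1);

% Normal PDF with the sample mean and standard deviation
mu = mean(samples);
sigma = std(samples);
f = exp(-0.5*((x - mu)/sigma).^2) / (sigma*sqrt(2*pi));

% Scale each curve to a maximum of 1, then multiply point by point
product = (F / max(F)) .* (f / max(f));

figure();
semilogy(x, product, 'b-', 'LineWidth', 2);
set(gca, 'XDir', 'reverse');           % same axis direction as the question
xlabel('Value');
ylabel('Scaled CDF x scaled PDF');
title('Point-wise product on a shared grid');
```

Because both curves live on the same grid, no interpolation or extrapolation is involved. Grid points where the product is exactly zero are simply not drawn on the log axis.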

Umar on 21 Sep 2024

I believe @Deepayan Bhadra's initial intention was to qualitatively interpret the meaning of the multiplication of the CDF and the PDF, or to derive the interpretation from the diminishing trend.

Your comments are duly noted and I do respect your opinion. Please let me briefly explain my recent example code snippet. The process begins by generating normally distributed data and calculating both the CDF and the PDF. Each function is then standardized so that the two can be meaningfully combined. The key step is multiplying the standardized CDF and PDF, which gives a visual representation of their interaction. The resulting combined curve is plotted alongside the standardized CDF and PDF for reference. This approach not only highlights the diminishing trend but also provides a qualitative interpretation of how the two curves interact.

The logarithmic scale makes it easier to observe the diminishing nature of the combined curve. This method meets the OP’s (@Deepayan Bhadra's) requirements by creating a single plot that conveys the desired trend without preserving the original probability characteristics. Also, I would suggest that @Deepayan Bhadra point out which part of the code is not achieving the goal, so I can provide a workaround or share technical tips. The whole purpose of this MathWorks community is to share knowledge and help OPs achieve their goals, which not only helps them understand the concept but also motivates them to learn more.

Deepayan Bhadra on 25 Sep 2024

Edited: Deepayan Bhadra on 25 Sep 2024

@Umar: A follow up question: If you look at the vectors total_HOLT and bingroups_HOLT in the data, how can I plot a simple discrete probability?

>> a = [70,23,3,0,0,0,0,0,0,0];

>> b = [-90,-80,-70,-60,-50,-40,-30,-20,-10,0];

For example, I want to show that if we have a 'b' value somewhere between (-90,-70), then we have a high probability, dictated by the corresponding values in 'a', but a negligible probability for the other values in 'b'.

Umar on 26 Sep 2024

Edited: Umar on 27 Sep 2024

Hi @Deepayan Bhadra,

After reviewing your comments a second time, I apologize for misunderstanding them. I have edited my comments in the post above. Your goal is to visualize how the values in vector a correspond to the ranges of values in vector b, specifically showing high probabilities within a certain range of b. Also, you want to represent discrete probabilities derived from two vectors, where:

Vector a contains probability values.

Vector b defines corresponding ranges.

In your given example, you are interested in showing that a value in vector b between -90 and -70 corresponds to a high probability from vector a. You already defined the data by representing them in two vectors:

   a = [70, 23, 3, 0, 0, 0, 0, 0, 0, 0];

   b = [-90, -80, -70, -60, -50, -40, -30, -20, -10, 0];

It will be helpful to normalize your probabilities so that they sum to 1 if you are treating them as probabilities. In this case:

     total_probability = sum(a);
     normalized_a = a / total_probability; % Normalize 'a'

Then use a bar plot to represent the probabilities clearly. Here’s how you can implement this in MATLAB:

   % Create the bar plot for probabilities
   figure();
   bar(b, normalized_a); % Use 'b' for x-axis and normalized probabilities for y-axis
   xlabel('Value (b)');
   ylabel('Probability');
   title('Discrete Probability Distribution');

   % Highlight the range of interest
   hold on;
   xline(-90,'r--','Start Range','LabelHorizontalAlignment','left');
   xline(-70,'g--','End Range','LabelHorizontalAlignment','right');

   % Customize y-axis limits if necessary
   ylim([0 max(normalized_a) * 1.1]);

   % Add grid for better visibility
   grid on;

   hold off;

As you can see in the example code snippet, the bar function creates a bar plot in which each bar represents the probability associated with one value in vector b. The xline function adds vertical lines at -90 and -70 to mark the range of interest, and normalization makes sure that the bar heights sum to 1, as probabilities should. Finally, the grid makes the values easier to read off. The plot shows that the values in vector a corresponding to b values between -90 and -70 are much higher than those outside this range, which is the point you want to convey.
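To put a number on the range of interest and make it stand out in the plot, the bars inside the range can be summed and colored differently. This is a minimal sketch using the vectors a and b from this thread; the names in_range and p_in_range and the chosen color are illustrative, both endpoints are assumed to be inclusive, and per-bar coloring through FaceColor = 'flat' and CData assumes R2017b or later.

```matlab
% Sketch: probability mass inside a range of b, with those bars highlighted.
a = [70, 23, 3, 0, 0, 0, 0, 0, 0, 0];
b = [-90, -80, -70, -60, -50, -40, -30, -20, -10, 0];

p = a / sum(a);                        % normalize counts to probabilities

% Logical mask for the range of interest (endpoints assumed inclusive)
in_range = (b >= -90) & (b <= -70);
p_in_range = sum(p(in_range));
fprintf('P(-90 <= b <= -70) = %.3f\n', p_in_range);

figure();
bh = bar(b, p);
bh.FaceColor = 'flat';                 % allow one color per bar
bh.CData(in_range, :) = repmat([0.85 0.33 0.10], nnz(in_range), 1);
xlabel('Value (b)');
ylabel('Probability');
title('Discrete Probability Distribution');
grid on;
```

For these vectors, all 96 counts sit at b = -90, -80 and -70, so the probability inside the range is 1 and every other bar has zero height.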

Hope this answers your question. If you have further questions or need additional modifications, feel free to ask!
