Borrar filtros
Borrar filtros

Cannot use 'histogram' to compute entropy

2 visualizaciones (últimos 30 días)
z8080
z8080 el 9 de Sept. de 2021
Comentada: Walter Roberson el 10 de Sept. de 2021
I'd like to compute the entropy of various vectors. I was going to use something like:
X = randn(1,100);
h1 = histogram(X, 'Normalization', 'Probability');
probabilities = h1.Values;
entropy = -sum(probabilities .* log2(probabilities ))
The second command however gives the error:
Undefined function 'c:\Program Files\MATLAB\R2019b\toolbox\matlab\specgraph\histogram.m' for input arguments of type 'double'.
But surely that's exactly what the standard Matlab function 'histogram' expects?! Doing a
which histogram
indeed returns
C:\Program Files\MATLAB\R2019b\toolbox\matlab\specgraph\histogram.m
which is the newest file (by modified date) from several of that name that (sadly) exist in my Matlab folder. I believe this should be the standard Matlab function 'histogram'.
If on the other hand in the above example I use 'hist' instead of 'histogram', I get the scalar value for entropy that I expect. However, I know 'hist' is not recommended, not least because with it one cannot specify the normalization type.
So, my question is: is using 'hist' for computing probabilities ok, or should I try something else to be able to use 'histogram' instead?
  13 comentarios
z8080
z8080 el 10 de Sept. de 2021
Editada: z8080 el 10 de Sept. de 2021
Thanks a lot for this excellent answer and derivation. to answer my own question then, I guess that it is acceptable to manually remove all bins with a count of 0, to enable the computation of entropy based on the non-0 bins. This is in fact what you had answered me from the very beginning :)
Thanks again!
Walter Roberson
Walter Roberson el 10 de Sept. de 2021
Depending on your knowledge of the distribution, it might make sense to take ask for the counts, and take max(1,counts) to substitute a nominal hit for each bin, and then calculate probability from that, as adjusted_counts ./ sum(adjusted_counts) .
The fewer samples you have, the more that distorts the probabilities; the more samples you have, the less likely you are to need it.
But I do recommend figuring out the number of bits yourself somehow or else you are going to continue to be at the mercy of its undocumented method of selecting the number of bins.

Iniciar sesión para comentar.

Respuestas (0)

Categorías

Más información sobre Histograms en Help Center y File Exchange.

Productos


Versión

R2019b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by