ⓘ Hellinger distance
In probability and statistics, the Hellinger distance is used to quantify the similarity between two probability distributions. It is a type of f divergence. The Hellinger distance is defined in terms of the Hellinger integral, which was introduced by Ernst Hellinger in 1909.
1.1. Definition Measure theory
To define the Hellinger distance in terms of measure theory, let P and Q denote two probability measures that are absolutely continuous with respect to a third probability measure λ. The square of the Hellinger distance between P and Q is defined as the quantity
H 2 P, Q = 1 2 ∫ d P d λ − d Q d λ 2 d λ. {\displaystyle H^{2}P,Q={\frac {1}{2}}\displaystyle \int \left{\sqrt {\frac {dP}{d\lambda }}}{\sqrt {\frac {dQ}{d\lambda }}}\right^{2}d\lambda.}Here, dP / dλ and dQ / d λ are the Radon–Nikodym derivatives of P and Q respectively. This definition does not depend on λ, so the Hellinger distance between P and Q does not change if λ is replaced with a different probability measure with respect to which both P and Q are absolutely continuous. For compactness, the above formula is often written as
H 2 P, Q = 1 2 ∫ d P − d Q 2. {\displaystyle H^{2}P,Q={\frac {1}{2}}\int \left{\sqrt {dP}}{\sqrt {dQ}}\right^{2}.}1.2. Definition Probability theory using Lebesgue measure
To define the Hellinger distance in terms of elementary probability theory, we take λ to be Lebesgue measure, so that dP / dλ and dQ / d λ are simply probability density functions. If we denote the densities as f and g, respectively, the squared Hellinger distance can be expressed as a standard calculus integral
H 2 f, g = 1 2 ∫ f x − g x) 2 d x = 1 − ∫ f x g x d x, {\displaystyle H^{2}f,g={\frac {1}{2}}\int \left{\sqrt {fx}}{\sqrt {gx}}\right)^{2}\,dx=1\int {\sqrt {fxgx}}\,dx,}where the second form can be obtained by expanding the square and using the fact that the integral of a probability density over its domain equals 1.
The Hellinger distance H P, Q satisfies the property derivable from the Cauchy–Schwarz inequality
0 ≤ H P, Q ≤ 1. {\displaystyle 0\leq HP,Q\leq 1.}1.3. Definition Discrete distributions
For two discrete probability distributions P = p 1, …, p k {\displaystyle P=p_{1},\ldots,p_{k}} and Q = q 1, …, q k {\displaystyle Q=q_{1},\ldots,q_{k}}, their Hellinger distance is defined as
H P, Q = 1 2 ∑ i = 1 k p i − q i 2, {\displaystyle HP,Q={\frac {1}{\sqrt {2}}}\;{\sqrt {\sum _{i=1}^{k}{\sqrt {p_{i}}}{\sqrt {q_{i}}}^{2}}},}which is directly related to the Euclidean norm of the difference of the square root vectors, i.e.
H P, Q = 1 2 ‖ P − Q ‖ 2. {\displaystyle HP,Q={\frac {1}{\sqrt {2}}}\;{\bigl \}{\sqrt {P}}{\sqrt {Q}}{\bigr \}_{2}.}Also, 1 − H 2 P, Q = ∑ i = 1 k p i q i. {\displaystyle 1H^{2}P,Q=\sum _{i=1}^{k}{\sqrt {p_{i}q_{i}}}.}
2. Connection with the statistical distance
The Hellinger distance H P, Q {\displaystyle HP,Q} and the total variation distance or statistical distance δ P, Q {\displaystyle \delta P,Q} are related as follows:
H 2 P, Q ≤ δ P, Q ≤ 2 H P, Q. {\displaystyle H^{2}P,Q\leq \delta P,Q\leq {\sqrt {2}}HP,Q\.}These inequalities follow immediately from the inequalities between the 1norm and the 2norm.
3. Properties
The Hellinger distance forms a bounded metric on the space of probability distributions over a given probability space.
The maximum distance 1 is achieved when P assigns probability zero to every set to which Q assigns a positive probability, and vice versa.
Sometimes the factor 1 / 2 {\displaystyle 1/2} in front of the integral is omitted, in which case the Hellinger distance ranges from zero to the square root of two.
The Hellinger distance is related to the Bhattacharyya coefficient B C P, Q {\displaystyle BCP,Q} as it can be defined as
H P, Q = 1 − B C P, Q. {\displaystyle HP,Q={\sqrt {1BCP,Q}}.}Hellinger distances are used in the theory of sequential and asymptotic statistics.
The squared Hellinger distance between two normal distributions P ∼ N μ 1, σ 1 2 {\displaystyle \scriptstyle P\,\sim \,{\mathcal {N}}\mu _{1},\sigma _{1}^{2}} and Q ∼ N μ 2, σ 2 {\displaystyle \scriptstyle Q\,\sim \,{\mathcal {N}}\mu _{2},\sigma _{2}^{2}} is:
H 2 P, Q = 1 − 2 σ 1 σ 2 σ 1 2 + σ 2 e − 1 4 μ 1 − μ 2 σ 1 2 + σ 2 2. {\displaystyle H^{2}P,Q=1{\sqrt {\frac {2\sigma _{1}\sigma _{2}}{\sigma _{1}^{2}+\sigma _{2}^{2}}}}\,e^{{\frac {1}{4}}{\frac {\mu _{1}\mu _{2}^{2}}{\sigma _{1}^{2}+\sigma _{2}^{2}}}}.}The squared Hellinger distance between two multivariate normal distributions P ∼ N μ 1, ∑ 1 {\displaystyle \scriptstyle P\,\sim \,{\mathcal {N}}\mu _{1},\sum _{1}} and Q ∼ N μ 2, ∑ 2 {\displaystyle \scriptstyle Q\,\sim \,{\mathcal {N}}\mu _{2},\sum _{2}} is:
H 2 P, Q = 1 − det ∑ 1 / 4 det ∑ 2 1 / 4 det ∑ 1 + ∑ 2 1 / 2 exp { − 1 8 μ 1 − μ 2 T ∑ 1 + ∑ 2 − 1 μ 1 − μ 2 } {\displaystyle H^{2}P,Q=1{\frac {\det\sum _{1}^{1/4}\det\sum _{2}^{1/4}}{\det \left{\frac {\sum _{1}+\sum _{2}}{2}}\right^{1/2}}}\exp \left\{{\frac {1}{8}}\mu _{1}\mu _{2}^{T}\left{\frac {\sum _{1}+\sum _{2}}{2}}\right^{1}\mu _{1}\mu _{2}\right\}}The squared Hellinger distance between two exponential distributions P ∼ E x p α {\displaystyle \scriptstyle P\,\sim \,{\rm
where B {\displaystyle B} is the Beta function.
 measures of distance include the Hellinger distance histogram intersection, Chi  squared statistic, quadratic form distance match distance Kolmogorov Smirnov
 the distance of one probability distribution to the other on a statistical manifold. The divergence is a weaker notion than that of the distance in
 the original Broadway production of Jesus Christ Superstar at the Mark Hellinger Theatre. Jesus Christ Superstar sold in excess of 12 million albums. Future
 Film producer Mark Hellinger bought the rights to the title from Weegee. In 1948, Weegee s aesthetic formed the foundation for Hellinger s film The Naked
 Heavy  tailed distribution Heckman correction Hedonic regression Hellin s law Hellinger distance Helmert Wolf blocking Herdan s law Herfindahl index Heston model Heteroscedasticity
 definition, previously given. Cramer Rao bound Fisher information Hellinger distance Information geometry Amari, Shun  ichi Nagaoka, Horishi 2000 Chentsov s
 example, the Kullback Leibler divergence, Bregman divergence, and the Hellinger distance With indefinitely large samples, limiting results like the central
 Coupling 2: BRG Hellinger distance 2: R Kullback Leibler divergence 2: DCR Levy metric 2: R Total variation Total variation distance in probability
 Strouse. The Broadway production opened on August 21, 1986 at the Mark Hellinger Theatre with little advance sale and to mostly indifferent reviews, and
 at the Mark Hellinger Theatre on December 18 and ran for 329 performances Dear World Broadway production opened at the Mark Hellinger Theatre on February
 valley of Holzhausen brook Saarbach nearby Heldburg in the Kreck. In Hellinger Tal running the river Helling and flow south of the Heldburger Land at
Bhatia–Davis inequality 
Cokurtosis 
Combinant 
Comonotonicity 
Concomitant (statistics) 
Conditional probability distribution 
Convolution of probability distributions 
Coskewness 
Infinite divisibility (probability) 
Joint probability distribution 
Khmaladze transformation 
Marginal distribution 
Memorylessness 
Mills ratio 
Neutral vector 
Normalizing constant 
Normally distributed and uncorrelated .. 
Pairwise independence 
Popovicius inequality on variances 
Probability integral transform 
Relationships among probability distr .. 
Shape of a probability distribution 
Smoothness (probability theory) 
Stability (probability) 
Tail dependence 
Truncated distribution 
Truncation (statistics) 
Zero bias transform 
Film 
Television show 
Game 
Sport 
Science 
Hobby 
Travel 
Technology 
Brand 
Outer space 
Cinematography 
Photography 
Music 
Literature 
Theatre 
History 
Transport 
Visual arts 
Recreation 
Politics 
Religion 
Nature 
Fashion 
Subculture 
Animation 
Award 
Interest 
Users also searched:
...
