In statistics, the empirical distribution function of a sample is the cumulative distribution function that places probability mass 1/n at each of the n observations in the sample.
Let X1, …, Xn be iid real random variables with the cdf F(x). The empirical distribution function F̂n(x) is a step function defined by
$\hat F_n(x) = \frac{1}{n} \sum_{i=1}^{n} I(X_i \le x),$
where I(A) is the indicator of event A.
For fixed x, I(Xi ≤ x) is a Bernoulli random variable with parameter p = F(x), hence nF̂n(x) is a binomial random variable with mean nF(x) and variance nF(x)(1 − F(x)).
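The definition translates almost directly into code. The following is a minimal sketch in Python using only the standard library; the function name `ecdf` and the toy sample are illustrative choices, not part of the article. It counts the observations that are at most x and divides by n.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return the function x -> (1/n) * #{i : X_i <= x} for the given sample."""
    xs = sorted(sample)
    n = len(xs)

    def F_hat(x):
        # bisect_right returns the number of sorted observations that are <= x
        return bisect_right(xs, x) / n

    return F_hat


# Toy sample (hypothetical data, for illustration only)
F_hat = ecdf([3.0, 1.0, 2.0, 2.0])
print(F_hat(0.5), F_hat(2.0), F_hat(3.0))  # prints: 0.0 0.75 1.0
```

Sorting once costs O(n log n), after which each evaluation is a binary search. The repeated observation 2.0 produces a jump of size 2/n at that point, as the definition requires.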
Asymptotic properties
- By the strong law of large numbers,
- $\hat F_n(x) \xrightarrow{\text{a.s.}} F(x)$
for fixed x (a.s. denotes almost sure convergence).
- Thus F̂n(x) is a strongly consistent estimator of the cumulative distribution function F(x). It is also unbiased, because nF̂n(x) has mean nF(x) and hence E[F̂n(x)] = F(x).
- By the central limit theorem,
- $\sqrt{n}\,\bigl(\hat F_n(x) - F(x)\bigr)$ converges in distribution to a normal distribution N(0, F(x)(1 − F(x))) for fixed x.
- The Berry–Esseen theorem bounds the rate of this convergence.
- By the Glivenko–Cantelli theorem, F̂n(x) → F(x) uniformly over x, that is,
- $\sup_{x \in \mathbb{R}} \bigl|\hat F_n(x) - F(x)\bigr| \xrightarrow{\text{a.s.}} 0.$
- The Dvoretzky–Kiefer–Wolfowitz inequality provides the rate of this convergence.
- Kolmogorov showed that
- $\sqrt{n}\,\sup_{x \in \mathbb{R}} \bigl|\hat F_n(x) - F(x)\bigr|$
- converges in distribution to the Kolmogorov distribution, provided that F(x) is continuous.
- The Kolmogorov–Smirnov test for goodness-of-fit is based on this fact.
- By Donsker's theorem, $\sqrt{n}\,\bigl(\hat F_n(x) - F(x)\bigr)$, as a process indexed by x, converges in law in the Skorokhod space $D[-\infty, +\infty]$ to a Gaussian process B(F(x)), where B(t) is the Brownian bridge.
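The uniform convergence results in the list above can be checked numerically. The sketch below is an illustration under stated assumptions, not a definitive implementation: it takes F to be the Uniform(0, 1) cdf, F(x) = x, so that the supremum of |F̂n(x) − F(x)| can be computed exactly from the order statistics. It compares that supremum with the radius obtained from the Dvoretzky–Kiefer–Wolfowitz inequality in its usual form, P(sup_x |F̂n(x) − F(x)| > ε) ≤ 2 exp(−2nε²). The function names, the seed, the sample sizes, and the level α = 0.05 are all arbitrary illustrative choices.

```python
import math
import random


def sup_distance_uniform(sample):
    """Exact sup over x of |F_hat_n(x) - x| for a sample assumed to be Uniform(0, 1)."""
    xs = sorted(sample)
    n = len(xs)
    # The supremum is attained at a jump of the step function: just before the
    # i-th order statistic F_hat_n equals (i - 1)/n, and at it F_hat_n equals i/n.
    return max(
        max(i / n - x, x - (i - 1) / n)
        for i, x in enumerate(xs, start=1)
    )


def dkw_radius(n, alpha):
    """Solve 2 * exp(-2 * n * eps**2) = alpha for eps."""
    return math.sqrt(math.log(2 / alpha) / (2 * n))


rng = random.Random(0)  # fixed seed, chosen arbitrarily, for reproducibility
for n in (100, 1_000, 10_000):
    d = sup_distance_uniform([rng.random() for _ in range(n)])
    # Columns: n, sup distance, sqrt(n) * sup distance, DKW radius at alpha = 0.05
    print(n, round(d, 4), round(math.sqrt(n) * d, 3), round(dkw_radius(n, 0.05), 4))
```

In a typical run the second column shrinks roughly like 1/√n, in line with the Glivenko–Cantelli theorem, while the third column stays of order one, which is the scaling under which Kolmogorov's limit applies. With probability at least 1 − α the second column does not exceed the fourth.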
See also
- Càdlàg functions
- Empirical probability
- Empirical process