| Dictionary: normal distribution |
n.
A theoretical frequency distribution for a set of variable data, usually represented by a bell-shaped curve symmetrical about the mean. Also called Gaussian distribution.
| Statistics Dictionary: normal distribution |
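The distribution of a random variable X for which the probability density function f is given by

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right), \quad -\infty < x < \infty,$$

where μ is the mean and σ is the standard deviation of the distribution.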
| Investment Dictionary: Normal Distribution |
A probability distribution that plots all of its values in a symmetrical fashion, with most of the results situated around the distribution's mean. Values are equally likely to fall above or below the mean. Values cluster close to the mean and then tail off symmetrically away from it.
Also known as a "Gaussian distribution" or "bell curve".
Investopedia Says:
The normal distribution is the most common type of distribution, and is often found in stock market analysis. Given enough observations in a sample, it is often reasonable to assume that returns follow a normally distributed pattern, but this assumption can be disproved.
As with any distribution, the distribution's mean, skewness and kurtosis coefficients should be calculated in order to determine the type of distribution you may be dealing with.
| Accounting Dictionary: Normal Distribution |
Probability distribution. It has the following important characteristics: (1) the curve has a single peak; (2) it is bell-shaped; (3) the mean (average) lies at the center of the distribution, and the distribution is symmetrical around the mean; (4) the two tails of the distribution extend indefinitely and never touch the horizontal axis; (5) the shape of the distribution is determined by its mean (µ) and standard deviation (σ).

As with any continuous probability function, the area under the curve must equal 1, and the area between two values of X (say, a and b) represents the probability that X lies between a and b, as illustrated in Figure 1. Further, since the normal is a symmetric distribution, it has the nice property that a known percentage of all possible values of X lie within ± a certain number of standard deviations of the mean, as illustrated by Figure 2. For example, 68.27% of the values of any normally distributed variable lie within the interval (µ − 1σ, µ + 1σ).

| Percent | 99.73% | 99% | 95.45% | 95% | 90% | 80% | 68.27% |
|---|---|---|---|---|---|---|---|
| No. of ± σ's | 3.00 | 2.58 | 2.00 | 1.96 | 1.645 | 1.28 | 1.00 |
The probability density of the normal distribution as given above is difficult to work with in determining areas under the curve, and each set of X values generates another curve whenever the means and standard deviations differ. To avoid this, all normal curves can be translated to a new axis, a Z-axis, with the translation defined as

$$Z = \frac{X - \mu}{\sigma}$$
The resulting values, called Z-values, are the values of a new variable called the standard normal variate, Z. The translation process is depicted in Figure 3.

The new variable Z is normally distributed with a mean of zero and a standard deviation of 1. Tables of areas under this standard normal distribution have been compiled and widely published so that areas under any normal distribution can be found by translating the X values to Z values and then using the tables for the standardized normal. For example, assume the total book value of an inventory is normally distributed with µ = $8000 and σ = $1000. What percent of the population lies between $6000 and $10,000? To answer, first translate these two X-values to Z-values using the Z formula:
Z1 = ($6000 − $8000)/$1000 = −2
Z2 = ($10,000 − $8000)/$1000 = +2
Referring to Figure 2, note that 95.45% of the population lies between these two values. Interpreted as a probability, the statement can be made that total book value will lie between $6000 and $10,000, with a probability of .9545.
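The table lookup above can also be reproduced numerically. The following is a minimal sketch, assuming only Python's standard library; the helper names `normal_cdf` and `prob_between` are illustrative and not part of the original entry.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X ~ N(mu, sigma^2), computed from the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def prob_between(a, b, mu, sigma):
    """P(a <= X <= b) as the difference of two cdf values."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

# Inventory example from the text: mu = $8000, sigma = $1000.
p = prob_between(6000, 10000, mu=8000, sigma=1000)
print(round(p, 4))  # 0.9545
```

The result agrees with the 95.45% entry for ±2 standard deviations in the table.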
| Dental Dictionary: normal distribution |
A curve representing the frequency with which the values of a variable are obtained or observed when the number of observations is infinite and variation is subject only to chance factors. The curve is a symmetrical, bell-shaped curve with the highest frequency occurring in the middle and gradually tapering toward the extremes. In a normal distribution, 68.3% of all scores cluster around the mean within 1 standard deviation, 95.4% within 2 standard deviations, and 99.7% within 3 standard deviations. Also called normal curve, Gauss’ curve.
| Encyclopedia of Public Health: Normal Distributions |
In studies of public health, information is frequently collected for variables that can be measured on a continuous scale in nature. Examples of such variables include age, weight, and blood pressure. The shape of the distribution associated with these variables is useful to describe the frequency of values across different ranges. More specifically, distributions allow for the probability of obtaining a specific value of a variable to be calculated, while providing estimates of the average, and range, of possible values. The normal distribution is the most widely used distribution to describe continuous variables. It is also frequently referred to as the Gaussian distribution, after the well-known German mathematician Karl Friedrich Gauss (1777–1855).
Normal distributions are a family of distributions characterized by the same general shape. These distributions are symmetrical, with the measured values of the variable more concentrated in the middle than in the tails. They are frequently referred to as "bell-shaped." The area under the curve of a normal distribution represents the sum of the probabilities of obtaining every possible value for a variable. In other words, the total area under a normal curve is equal to one. The shape of the normal distribution is specified mathematically in terms of only two parameters: the mean (µ), and the standard deviation (σ). The standard deviation specifies the amount of dispersion around the mean, whereas the mean is the average value across sampled values of the variable. It is a characteristic of the normal distribution that about 95 percent of the possible values for a variable lie within ±2 standard deviations of the mean. This is illustrated in Figure 1.
Several biological variables are normally distributed (e.g., blood pressure, serum cholesterol, height, and weight). The normal curve can be used to estimate probabilities associated with these variables. For example, in a population where the birth weight of infants is normally distributed with a mean of 7.2 pounds and a standard deviation of 2.1 pounds, one might wish to find the probability a randomly chosen infant will have a birth weight of less than 3 pounds. Such information might help in planning for future obstetric services.
Since the normal distribution can have an infinite number of possible values for its mean and standard deviation, it is impossible to calculate the area for each and every curve. Instead, probabilities are calculated for a single curve where the mean is zero and the standard deviation is one. This curve is referred to as a standard normal distribution (Z). A random variable (X) that is normally distributed with mean (µ) and standard deviation (σ) can be easily transformed to the standard normal distribution by the formula Z = (X−µ)/σ.
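As a worked illustration of this transformation, the birth-weight question posed above can be answered as follows. The cdf value for Z = −2 is the standard tabulated figure, rounded to four decimal places; the entry itself does not carry the calculation through.

$$
\begin{aligned}
Z &= \frac{X - \mu}{\sigma} = \frac{3 - 7.2}{2.1} = -2, \\
P(X < 3) &= P(Z < -2) \approx 0.0228 .
\end{aligned}
$$

Under these assumptions, roughly 2.3 percent of infants would be expected to weigh less than 3 pounds at birth.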
The normal distribution is important to statistical work because most hypothesis tests that are used assume that the random variable being considered has an underlying normal distribution. Fortunately, these tests work very well even if the distribution of the variable is only approximately normal. Examples of such tests include those based on the t, F, or chi-square statistics. If the variable is not normal, alternative nonparametric tests should be considered; however, such tests are inconvenient because they typically are less powerful and flexible in terms of types of conclusions that can be drawn. Alternatively, mathematical theory (e.g., the central limit theorem) has proven that normal distribution–based hypothesis testing can be performed if a large enough number of samples are taken. This latter option is based on an important principle that is largely responsible for the popularity of tests based on the normal function—that if the size of the samples is large enough, the shape of the sampling distribution approaches normal shape even if the distribution of the variable in question is not normal.
(SEE ALSO: Chi-Square Test; Sampling; Statistics for Public Health)
— PAUL J. VILLENEUVE
| Geography Dictionary: normal distribution |
The line graph showing the expected frequency of occurrences in each class of any set of data for a given variable. The normal distribution is shown as a bell-shaped curve which is symmetrical about the mean. The laws of probability state that between +1σ and −1σ 68.27% of the items in the data set will be found, between +2σ and −2σ 95.45% of all the items in the data set will be found, and between +3σ and −3σ 99.73% of all the items in the data set will be found. In other words, a value more than 3 standard deviations from the mean is only to be expected about once in every 370 observations. So, if in a sample data set of 50 items, one value lies more than 3 standard deviations from the mean, the data may be suspect and should be checked.
| Political Dictionary: normal distribution |
The normal distribution is a mathematical model of the distribution of a random variate which is continuous, unimodal, and symmetrical, and in which frequencies fall away rapidly with increasing distance from the mean. The characteristics of the model are precisely known and, with reasonably large numbers, the sampling distributions of many statistics approximate to it regardless of any bias among the populations from which they are drawn. These properties allow the normal distribution to be used as the basis for estimating the magnitude of sampling errors, for example with political opinion polls. There is serious confusion with the normal meaning of ‘normal’, which is not meant here.
— Stan Taylor
| Sports Science and Medicine: normal distribution |
In statistics, a continuous distribution of a random variable with its mean, median, and mode equal. The normal distribution is depicted graphically by a symmetrical, bell-shaped curve.
| Science Dictionary: normal distribution curve |
In statistics, the theoretical curve that shows how often an experiment will produce a particular result. The curve is symmetrical and bell shaped, showing that trials will usually give a result near the average, but will occasionally deviate by large amounts. The width of the “bell” indicates how much confidence one can have in the result of an experiment — the narrower the bell, the higher the confidence. This curve is also called the Gaussian curve, after the nineteenth-century German mathematician Karl Friedrich Gauss. (See statistical significance.)
| Wikipedia: Normal distribution |
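Figure: Probability density function. The red line is the standard normal distribution.

Figure: Cumulative distribution function. Colors match the figure above.

| Property | Value |
|---|---|
| Parameters | μ location (real); σ² > 0 squared scale (real) |
| Support | $x \in \mathbb{R}$ |
| Probability density function (pdf) | $\frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$ |
| Cumulative distribution function (cdf) | $\frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\right]$ |
| Mean | μ |
| Median | μ |
| Mode | μ |
| Variance | σ² |
| Skewness | 0 |
| Excess kurtosis | 0 |
| Entropy | $\ln\left(\sigma\sqrt{2\pi e}\right)$ |
| Moment-generating function (mgf) | $\exp\left(\mu t + \frac{\sigma^2 t^2}{2}\right)$ |
| Characteristic function | $\exp\left(i\mu t - \frac{\sigma^2 t^2}{2}\right)$ |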
In probability theory and statistics, the normal distribution or Gaussian distribution is a continuous probability distribution that describes data that clusters around a mean or average. The graph of the associated probability density function is bell-shaped, with a peak at the mean, and is known as the Gaussian function or bell curve.
The normal distribution can be used to describe, at least approximately, any variable that tends to cluster around the mean. For example, the heights of adult males in the United States are roughly normally distributed, with a mean of about 70 inches. Most men have a height close to the mean, though a small number of outliers have a height significantly above or below the mean. A histogram of male heights will appear similar to a bell curve, with the correspondence becoming closer if more data is used.
For theoretical reasons (such as the central limit theorem), any variable that is the sum of a large number of independent factors is likely to be normally distributed. For this reason, the normal distribution is used throughout statistics, natural science, and social science[1] as a simple model for complex phenomena. For example, the observational error in an experiment is usually assumed to follow a normal distribution, and the propagation of uncertainty is computed using this assumption.
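The probability density function for a normal distribution is given by the formula

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),$$

where μ is the mean, σ is the standard deviation (a measure of the “width” of the bell), and exp denotes the exponential function. For a mean of 0 and a standard deviation of 1, this formula simplifies to

$$f(x) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right),$$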
which is known as the standard normal distribution. When properly scaled and translated, the corresponding cumulative distribution function is known as the error function.
The Gaussian distribution is named for Carl Friedrich Gauss, who used it to analyze astronomical data,[2] and defined the formula for its probability density function.
The normal distribution was first introduced by Abraham de Moivre in an article in the year 1733,[3] which was reprinted in the second edition of his The Doctrine of Chances, 1738 in the context of approximating certain binomial distributions for large n. His result was extended by Laplace in his book Analytical Theory of Probabilities (1812), and is now called the theorem of de Moivre-Laplace.
Laplace used the normal distribution in the analysis of errors of experiments. The important method of least squares was introduced by Legendre in 1805. Gauss, who claimed to have used the method since 1794, justified it rigorously in 1809 by assuming a normal distribution of the errors. The fact the distribution is sometimes called Gaussian is an example of Stigler's Law.
The name "bell curve" goes back to Esprit Jouffret who first used the term "bell surface" in 1872 for a bivariate normal with independent components. The name "normal distribution" was coined independently by Charles Sanders Peirce, Francis Galton and Wilhelm Lexis around 1875. Despite this terminology, other probability distributions may be more appropriate in some contexts; see the discussion of occurrence, below.
There are various ways to characterize a probability distribution. The most visual is the probability density function (PDF). Equivalent ways are the cumulative distribution function, the moments, the cumulants, the characteristic function, the moment-generating function, the cumulant-generating function, and Maxwell's theorem. See probability distribution for a discussion.
To indicate that a real-valued random variable X is normally distributed with mean μ and variance σ² ≥ 0, we write

$$X \sim N(\mu, \sigma^2).$$

While it is certainly useful for certain limit theorems (e.g. asymptotic normality of estimators) and for the theory of Gaussian processes to consider the probability distribution concentrated at μ (see Dirac measure) as a distribution with mean μ and variance σ² = 0, this degenerate case is often excluded from consideration because no density with respect to the Lebesgue measure exists.
The normal distribution may also be parameterized using a precision parameter τ, defined as the reciprocal of σ². This parameterization has an advantage in numerical applications where σ² is very close to zero and is more convenient to work with in analysis as τ is a natural parameter of the normal distribution.
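The continuous probability density function of the normal distribution is the Gaussian function

$$\varphi_{\mu,\sigma^2}(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) = \frac{1}{\sigma}\,\varphi\left(\frac{x-\mu}{\sigma}\right), \quad x \in \mathbb{R},$$

where σ > 0 is the standard deviation, the real parameter μ is the expected value, and

$$\varphi(x) = \varphi_{0,1}(x) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right), \quad x \in \mathbb{R},$$

is the density function of the "standard" normal distribution: i.e., the normal distribution with μ = 0 and σ = 1. The integral of $\varphi_{\mu,\sigma^2}$ over the real line is equal to one as shown in the Gaussian integral article.
As a Gaussian function with the denominator in the exponent equal to 2, the standard normal density function φ is an eigenfunction of the Fourier transform.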
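The density and its unit integral can be checked numerically. The sketch below assumes nothing beyond Python's standard library; the function names and the choice of a trapezoidal rule over ±10 standard deviations are illustrative assumptions, not part of the article.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x (sigma must be positive)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def integrate_pdf(mu, sigma, half_width=10.0, steps=20000):
    """Trapezoidal estimate of the integral over mu +/- half_width * sigma."""
    a = mu - half_width * sigma
    b = mu + half_width * sigma
    h = (b - a) / steps
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(a + i * h, mu, sigma)
    return total * h

print(normal_pdf(0.0))          # peak height of the standard normal density
print(integrate_pdf(3.0, 2.0))  # should be very close to 1
```

The same helper also illustrates the scaling relation: `normal_pdf(x, mu, sigma)` equals `normal_pdf((x - mu) / sigma) / sigma`.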
The probability density function has notable properties including:
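The cumulative distribution function (cdf) of a probability distribution, evaluated at a number (lower-case) x, is the probability of the event that a random variable (capital) X with that distribution is less than or equal to x. The cumulative distribution function of the normal distribution is expressed in terms of the density function as follows:

$$\Phi_{\mu,\sigma^2}(x) = \int_{-\infty}^{x} \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(u-\mu)^2}{2\sigma^2}\right) du, \quad x \in \mathbb{R}.$$

The standard normal cdf is just the general cdf evaluated with μ = 0 and σ = 1:

$$\Phi(x) = \Phi_{0,1}(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left(-\frac{u^2}{2}\right) du, \quad x \in \mathbb{R}.$$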
The standard normal cdf can be expressed in terms of a special function called the error function, as
$$\Phi(x) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x}{\sqrt{2}}\right)\right], \quad x \in \mathbb{R},$$
and the cdf itself can hence be expressed as
$$\Phi_{\mu,\sigma^2}(x) = \Phi\left(\frac{x-\mu}{\sigma}\right) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\right], \quad x \in \mathbb{R}.$$
The complement of the standard normal cdf, 1 − Φ(x), is often denoted Q(x), and is sometimes referred to simply as the Q-function, especially in engineering texts.[4][5] This represents the tail probability of the Gaussian distribution. Other definitions of the Q-function, all of which are simple transformations of Φ, are also used occasionally.[6]
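A short sketch of Φ and the Q-function in terms of the error function follows, using Python's `math.erf` and `math.erfc`. Computing Q through the complementary error function rather than as 1 − Φ(x) is a design choice made here, not something the article prescribes; it avoids losing the tail probability to floating-point cancellation.

```python
import math

def phi_cdf(x):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def q_function(x):
    """Tail probability Q(x) = 1 - Phi(x), via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

print(phi_cdf(1.0) + q_function(1.0))  # the two parts sum to 1 (up to rounding)
print(1.0 - phi_cdf(10.0))             # cancellation: the tail is lost entirely
print(q_function(10.0))                # tiny but nonzero tail probability
```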
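The inverse standard normal cumulative distribution function, or quantile function, can be expressed in terms of the inverse error function:

$$\Phi^{-1}(p) = \sqrt{2}\,\operatorname{erf}^{-1}(2p - 1), \quad p \in (0,1),$$

and the inverse cumulative distribution function can hence be expressed as

$$\Phi_{\mu,\sigma^2}^{-1}(p) = \mu + \sigma\,\Phi^{-1}(p) = \mu + \sigma\sqrt{2}\,\operatorname{erf}^{-1}(2p - 1), \quad p \in (0,1).$$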
This quantile function is sometimes called the probit function. There is no elementary primitive for the probit function. This is not to say merely that none is known, but rather that the non-existence of such an elementary primitive has been proven. Several accurate methods exist for approximating the quantile function for the normal distribution; see quantile function for a discussion and references.
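In practice the quantile function is evaluated by a numerical approximation. A minimal sketch, assuming Python 3.8 or later, where the standard library's `statistics.NormalDist` provides `cdf` and `inv_cdf`; the variable names are illustrative.

```python
from statistics import NormalDist

std = NormalDist()              # standard normal: mu = 0, sigma = 1
z = std.inv_cdf(0.975)          # probit of 0.975, close to 1.96
print(z)

# Round trip: the cdf applied to the quantile recovers the probability.
print(std.cdf(z))

# General case: the quantile of N(mu, sigma^2) is mu + sigma * probit(p).
mu, sigma = 100.0, 15.0
q_general = NormalDist(mu, sigma).inv_cdf(0.975)
print(q_general, mu + sigma * z)
```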
The values Φ(x) may be approximated very accurately by a variety of methods, such as numerical integration, Taylor series, asymptotic series and continued fractions. See Q-function for one method of approximation valid for large x.
The moment generating function is defined as the expected value of exp(tX). For a normal distribution, the moment generating function is
$$
\begin{aligned}
M_X(t) &= \mathrm{E}\left[\exp(tX)\right] \\
&= \int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) \exp(tx)\, dx \\
&= \exp\left(\mu t + \frac{\sigma^2 t^2}{2}\right)
\end{aligned}
$$
as can be seen by completing the square in the exponent.
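The completing-the-square step can be written out as follows. This is a sketch of the standard argument, with the shifted mean μ + σ²t as the only quantity not named in the text:

$$
\begin{aligned}
tx - \frac{(x-\mu)^2}{2\sigma^2}
  &= -\frac{\bigl(x - (\mu + \sigma^2 t)\bigr)^2}{2\sigma^2} + \mu t + \frac{\sigma^2 t^2}{2}, \\
M_X(t)
  &= \exp\left(\mu t + \frac{\sigma^2 t^2}{2}\right)
     \int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}}
     \exp\left(-\frac{\bigl(x - (\mu + \sigma^2 t)\bigr)^2}{2\sigma^2}\right) dx
   = \exp\left(\mu t + \frac{\sigma^2 t^2}{2}\right).
\end{aligned}
$$

The remaining integral equals one because its integrand is a normal density with mean μ + σ²t and variance σ².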
The cumulant generating function is the logarithm of the moment generating function: g(t) = μt + σ²t²/2. Since this is a quadratic polynomial in t, only the first two cumulants are nonzero.
The characteristic function is defined as the expected value of exp(itX), where i is the imaginary unit. So the characteristic function is obtained by replacing t with it in the moment-generating function.
For a normal distribution, the characteristic function is [7]
$$
\begin{aligned}
\chi_X(t;\mu,\sigma) &= M_X(it) = \mathrm{E}\left[\exp(itX)\right] \\
&= \int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) \exp(itx)\, dx \\
&= \exp\left(i\mu t - \frac{\sigma^2 t^2}{2}\right).
\end{aligned}
$$
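Some properties of the normal distribution:

1. If $X \sim N(\mu, \sigma^2)$ and a and b are real numbers, then $aX + b \sim N(a\mu + b, (a\sigma)^2)$ (see expected value and variance).
2. If $X \sim N(\mu_X, \sigma_X^2)$ and $Y \sim N(\mu_Y, \sigma_Y^2)$ are independent normal random variables, then:
   - Their sum is normally distributed with $X + Y \sim N(\mu_X + \mu_Y, \sigma_X^2 + \sigma_Y^2)$ (proof). Thus the normal distribution is infinitely divisible. Interestingly, the converse holds: if two independent random variables have a normally distributed sum, then they must be normal themselves; this is known as Cramér's theorem.
   - Their difference is normally distributed with $X - Y \sim N(\mu_X - \mu_Y, \sigma_X^2 + \sigma_Y^2)$.
3. If $X \sim N(0, \sigma_X^2)$ and $Y \sim N(0, \sigma_Y^2)$ are independent normal random variables, then:
   - Their product XY has density $p(z) = \frac{1}{\pi\sigma_X\sigma_Y} K_0\left(\frac{|z|}{\sigma_X\sigma_Y}\right)$, where K0 is a modified Bessel function of the second kind.
   - Their ratio follows a Cauchy distribution, $X/Y \sim \operatorname{Cauchy}(0, \sigma_X/\sigma_Y)$. Thus the Cauchy distribution is a special kind of ratio distribution.
4. If $X_1, \dots, X_n$ are independent standard normal variables, then $X_1^2 + \cdots + X_n^2$ has a chi-square distribution with n degrees of freedom.
5. If $X_1, \dots, X_n$ are independent standard normal variables, then the sample mean $\overline{X} = (X_1 + \cdots + X_n)/n$ and sample variance $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \overline{X})^2$ are independent. This can be proven using Basu's theorem or Cochran's theorem. This property characterizes normal distributions (and helps to explain why the F-test is non-robust with respect to non-normality). See also Student's t-distribution, which uses a ratio derived from these.

As a consequence of Property 1, it is possible to relate all normal random variables to the standard normal.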
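If X ~ N(μ, σ²), then

$$Z = \frac{X - \mu}{\sigma}$$

is a standard normal random variable: Z ~ N(0,1). An important consequence is that the cdf of a general normal distribution is therefore

$$\Pr(X \le x) = \Phi\left(\frac{x-\mu}{\sigma}\right) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\right].$$

Conversely, if Z has a standard normal distribution, Z ~ N(0,1), then X = σZ + μ is a normal random variable with mean μ and variance σ².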
The standard normal distribution has been tabulated (usually in the form of value of the cumulative distribution function Φ), and the other normal distributions are the simple transformations, as described above, of the standard one. Therefore, one can use tabulated values of the cdf of the standard normal distribution to find values of the cdf of a general normal distribution.
The first few moments of the normal distribution are:
| Number | Raw moment | Central moment | Cumulant |
|---|---|---|---|
| 0 | 1 | 1 | |
| 1 | μ | 0 | μ |
| 2 | μ² + σ² | σ² | σ² |
| 3 | μ³ + 3μσ² | 0 | 0 |
| 4 | μ⁴ + 6μ²σ² + 3σ⁴ | 3σ⁴ | 0 |
| 5 | μ⁵ + 10μ³σ² + 15μσ⁴ | 0 | 0 |
| 6 | μ⁶ + 15μ⁴σ² + 45μ²σ⁴ + 15σ⁶ | 15σ⁶ | 0 |
| 7 | μ⁷ + 21μ⁵σ² + 105μ³σ⁴ + 105μσ⁶ | 0 | 0 |
| 8 | μ⁸ + 28μ⁶σ² + 210μ⁴σ⁴ + 420μ²σ⁶ + 105σ⁸ | 105σ⁸ | 0 |
All cumulants of the normal distribution beyond the second are zero.
Higher central moments (of order 2k) are given by the formula
$$E\left[(X-\mu)^{2k}\right] = \frac{(2k)!}{2^k\, k!}\, \sigma^{2k}.$$
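The even central moments in the table can be reproduced from this formula. A minimal sketch in Python follows; the function name `central_moment` is an illustrative assumption, and odd-order central moments are returned as zero by symmetry.

```python
import math

def central_moment(order, sigma=1.0):
    """Central moment E[(X - mu)^order] of a normal distribution."""
    if order % 2 == 1:
        return 0.0  # odd central moments vanish by symmetry
    k = order // 2
    coeff = math.factorial(2 * k) // (2**k * math.factorial(k))
    return coeff * sigma ** (2 * k)

for order in (2, 4, 6, 8):
    print(order, central_moment(order))  # coefficients 1, 3, 15, 105
```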
The general p-th raw moment (p not necessarily an integer) can be expressed as
$$\operatorname{E}\left[\left|N(\mu, \sigma^2)\right|^p\right] = \left(2\sigma^2\right)^{p/2}\, \frac{\Gamma\left(\frac{1+p}{2}\right)}{\sqrt{\pi}}\; {}_1F_1\left(-\frac{p}{2}, \frac{1}{2}, -\frac{\mu^2}{2\sigma^2}\right),$$

$$\operatorname{E}\left[N(\mu, \sigma^2)^p\right] = \left(-2\sigma^2\right)^{p/2}\, U\left(-\frac{p}{2}, \frac{1}{2}, -\frac{\mu^2}{2\sigma^2}\right),$$

where ${}_1F_1$ and U are confluent hypergeometric functions (the function's second branch cut can be chosen by multiplying with $(-1)^p$).
Under certain conditions (such as being independent and identically-distributed with finite variance), the sum of a large number of random variables is approximately normally distributed; this is the central limit theorem.
The practical importance of the central limit theorem is that the normal cumulative distribution function can be used as an approximation to some other cumulative distribution functions, for example:
Whether these approximations are sufficiently accurate depends on the purpose for which they are needed, and the rate of convergence to the normal distribution. It is typically the case that such approximations are less accurate in the tails of the distribution. A general upper bound of the approximation error of the cumulative distribution function is given by the Berry–Esséen theorem.
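As one illustration of such an approximation, the sketch below compares an exact binomial cdf with its normal approximation. The binomial case, the parameter values, and the half-unit continuity correction are choices made for this example rather than taken from the text; `math.comb` requires Python 3.8 or later.

```python
import math

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

n, p, k = 100, 0.5, 55
exact = binomial_cdf(k, n, p)
# Match the binomial mean and variance; add 0.5 as a continuity correction.
approx = normal_cdf(k + 0.5, n * p, math.sqrt(n * p * (1 - p)))
print(exact, approx, abs(exact - approx))
```

Repeating the comparison for k far from the mean shows the loss of relative accuracy in the tails mentioned above.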
The normal distributions are infinitely divisible probability distributions: Given a mean μ, a variance σ² ≥ 0, and a natural number n, the sum X1 + . . . + Xn of n independent random variables

$$X_1, X_2, \dots, X_n \sim N\left(\frac{\mu}{n}, \frac{\sigma^2}{n}\right)$$

has this specified normal distribution (to verify this, use characteristic functions or convolution and mathematical induction).
The normal distributions are strictly stable probability distributions.
About 68% of values drawn from a normal distribution are within one standard deviation σ > 0 away from the mean μ; about 95% of the values are within two standard deviations and about 99.7% lie within three standard deviations. This is known as the "68-95-99.7 rule" or the "empirical rule" or the "3-sigma rule."
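To be more precise, the area under the bell curve between μ − nσ and μ + nσ in terms of the cumulative normal distribution function is given by

$$\Phi_{\mu,\sigma^2}(\mu + n\sigma) - \Phi_{\mu,\sigma^2}(\mu - n\sigma) = \Phi(n) - \Phi(-n) = 2\Phi(n) - 1 = \operatorname{erf}\left(\frac{n}{\sqrt{2}}\right),$$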
where erf is the error function. To 12 decimal places, the values for the 1-, 2-, up to 6-sigma points are:
| n | erf(n/√2) |
|---|---|
| 1 | 0.682689492137 |
| 2 | 0.954499736104 |
| 3 | 0.997300203937 |
| 4 | 0.999936657516 |
| 5 | 0.999999426697 |
| 6 | 0.999999998027 |
The next table gives the reverse relation of sigma multiples corresponding to a few often used values for the area under the bell curve. These values are useful to determine (asymptotic) confidence intervals of the specified levels based on normally distributed (or asymptotically normal) estimators:
| erf(n/√2) | n |
|---|---|
| 0.80 | 1.28155 |
| 0.90 | 1.64485 |
| 0.95 | 1.95996 |
| 0.98 | 2.32635 |
| 0.99 | 2.57583 |
| 0.995 | 2.80703 |
| 0.998 | 3.09023 |
| 0.999 | 3.29052 |
| 0.9999 | 3.8906 |
| 0.99999 | 4.4172 |
where the value on the left of the table is the proportion of values that will fall within a given interval and n is a multiple of the standard deviation that specifies the width of the interval.
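Both tables can be regenerated with a few lines of code. This is a sketch assuming Python 3.8 or later for `statistics.NormalDist`; the helper names `coverage` and `sigma_multiple` are illustrative.

```python
import math
from statistics import NormalDist

def coverage(n):
    """Probability that a normal value lies within n standard deviations of the mean."""
    return math.erf(n / math.sqrt(2.0))

def sigma_multiple(p):
    """Multiple n of sigma such that the central interval has probability p."""
    # A central area p leaves (1 - p) / 2 in each tail.
    return NormalDist().inv_cdf((1.0 + p) / 2.0)

for n in range(1, 7):
    print(n, f"{coverage(n):.12f}")

for p in (0.80, 0.90, 0.95, 0.99):
    print(p, f"{sigma_multiple(p):.5f}")
```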
The normal distribution forms a two-parameter exponential family with natural parameters μ and 1/σ², and natural statistics X and X². The canonical form has parameters $\mu/\sigma^2$ and $1/\sigma^2$ and sufficient statistics $\sum x$ and $-\frac{1}{2}\sum x^2$.
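Consider the complex Gaussian random variable,

$$Z = X + iY,$$

where X and Y are real and independent Gaussian variables with equal variances $\sigma_r^2$. The pdf of the joint variables is then

$$\frac{1}{2\pi\sigma_r^2} \exp\left(-\frac{x^2 + y^2}{2\sigma_r^2}\right).$$

Because $\sigma_Z = \sqrt{2}\,\sigma_r$, the resulting pdf for the complex Gaussian variable Z is

$$\frac{1}{\pi\sigma_Z^2} \exp\left(-\frac{|z|^2}{\sigma_Z^2}\right).$$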
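- $R \sim \operatorname{Rayleigh}(\sigma)$ is a Rayleigh distribution if $R = \sqrt{X^2 + Y^2}$ where $X \sim N(0, \sigma^2)$ and $Y \sim N(0, \sigma^2)$ are two independent normal random variables.
- $Y \sim \chi_\nu^2$ is a chi-square distribution with ν degrees of freedom if $Y = \sum_{k=1}^{\nu} X_k^2$ where $X_k \sim N(0, 1)$ for $k = 1, \dots, \nu$ and the $X_k$ are independent.
- $Y \sim \operatorname{Cauchy}(0, 1)$ is a Cauchy distribution if $Y = X_1 / X_2$ where $X_1 \sim N(0, 1)$ and $X_2 \sim N(0, 1)$ are two independent normal random variables.
- $Y$ has a log-normal distribution if $Y = e^X$ and $X \sim N(\mu, \sigma^2)$.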
- If $X \sim N(\mu, \sigma^2)$, then truncating X below at A and above at B will lead to a random variable with mean

$$E(X) = \mu + \sigma\,\frac{\varphi(\alpha) - \varphi(\beta)}{\Phi(\beta) - \Phi(\alpha)},$$

where $\alpha = (A - \mu)/\sigma$, $\beta = (B - \mu)/\sigma$, Φ is the standard normal cdf, and φ is the probability density function of a standard normal random variable.

Many scores are derived from the normal distribution, including percentile ranks ("percentiles" or "quantiles"), normal curve equivalents, stanines, z-scores, and T-scores. Additionally, a number of behavioral statistical procedures are based on the assumption that scores are normally distributed; for example, t-tests and ANOVAs (see below). Bell curve grading assigns relative grades based on a normal distribution of scores.
Normality tests check a given set of data for similarity to the normal distribution. The null hypothesis is that the data set is similar to the normal distribution, therefore a sufficiently small P-value indicates non-normal data.
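For a normal distribution with mean μ and variance σ², the sample mean $\overline{X}$ of n independent observations is itself normally distributed:

$$\overline{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right).$$

As the number of samples grows, the standard error of the sample mean decays as

$$\frac{\sigma}{\sqrt{n}},$$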
so if one wishes to decrease the standard error by a factor of 10, one must increase the number of samples by a factor of 100. This fact is widely used in determining sample sizes for opinion polls and number of trials in Monte Carlo simulation.
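The square-root law can be made concrete with a short calculation. A minimal sketch; the function names and the example values are hypothetical and serve only to illustrate the relationship.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean of n independent observations."""
    return sigma / math.sqrt(n)

def samples_needed(sigma, target_se):
    """Smallest n whose standard error does not exceed target_se."""
    return math.ceil((sigma / target_se) ** 2)

# A hundredfold increase in sample size shrinks the standard error tenfold.
print(standard_error(1.0, 25) / standard_error(1.0, 2500))

# Hypothetical planning question: sigma = 15, desired standard error 0.5.
print(samples_needed(15.0, 0.5))
```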
The sample distribution of the mean depends on the standard deviation σ; it is not an ancillary statistic, and thus to estimate the error of the sample mean, one must estimate the standard deviation.
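The sample standard deviation, defined as:

$$s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \overline{x})^2},$$

is a common estimator for the population standard deviation σ. For a normal population, $(n-1)s^2/\sigma^2$ has a chi-square distribution with n − 1 degrees of freedom, by Cochran's theorem.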
For the normal distribution, one can compute a correction factor, which depends on n, to arrive at an unbiased estimator of the standard deviation. This is denoted by c4, and the corrected (unbiased) estimator is s / c4. For n = 2, $c_4 \approx 0.798$, while for n = 10, $c_4 \approx 0.973$, so this correction is rarely used outside of high-precision estimation of small samples.
The standard error of the uncorrected (biased) sample standard deviation s is[8][9]

$$\sigma\sqrt{1 - c_4^2},$$

thus it also decays as $1/\sqrt{n}$.
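The correction factor c4 has a closed form in terms of the gamma function, c4(n) = √(2/(n − 1)) · Γ(n/2) / Γ((n − 1)/2). The sketch below evaluates it with Python's `math.lgamma`; working with log-gamma values is a choice made here so that large n does not overflow, and the function name `c4` simply mirrors the notation in the text.

```python
import math

def c4(n):
    """Bias correction factor for the sample standard deviation, E[s] = c4 * sigma."""
    if n < 2:
        raise ValueError("c4 is defined for sample sizes n >= 2")
    log_ratio = math.lgamma(n / 2.0) - math.lgamma((n - 1) / 2.0)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)

for n in (2, 5, 10, 100):
    print(n, round(c4(n), 4))
```

The factor approaches 1 as n grows, which is why the correction matters mainly for small samples.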
The maximum likelihood estimator of the population mean μ from a sample is an unbiased estimator of the mean. The maximum likelihood estimator of the variance is unbiased if we assume the population mean is known a priori, but in practice that does not happen. However, if we are faced with a sample and have no knowledge of the mean or the variance of the population from which it is drawn, as assumed in the maximum likelihood derivation below, then the maximum likelihood estimator of the variance is biased. An unbiased estimator of the variance σ² is:

$$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \overline{X})^2.$$

This "sample variance" follows a Gamma distribution (in the shape-scale parameterization) if all Xi are independent and identically distributed normal variables:

$$S^2 \sim \operatorname{Gamma}\left(\frac{n-1}{2}, \frac{2\sigma^2}{n-1}\right),$$

with mean $E(S^2) = \sigma^2$ and variance $\operatorname{Var}(S^2) = 2\sigma^4/(n-1)$.
The maximum likelihood estimate of the standard deviation is the square root of the maximum likelihood estimate of the variance. However, neither this nor the square root of the sample variance provides an unbiased estimate for standard deviation: see unbiased estimation of standard deviation for formulae particular to the normal distribution.
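Suppose

$$X_1, \dots, X_n$$

are independent and each is normally distributed with expectation μ and variance σ² > 0. In the language of statisticians, the observed values of these n random variables make up a "sample of size n from a normally distributed population." It is desired to estimate the "population mean" μ and the "population standard deviation" σ, based on the observed values of this sample. The continuous joint probability density function of these n independent random variables is

$$f(x_1, \dots, x_n; \mu, \sigma) = \frac{1}{(\sigma\sqrt{2\pi})^n} \exp\left(-\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{2\sigma^2}\right).$$

As a function of μ and σ, the likelihood function based on the observations X1, ..., Xn is

$$L(\mu, \sigma) = \frac{C}{\sigma^n} \exp\left(-\frac{\sum_{i=1}^{n} (X_i - \mu)^2}{2\sigma^2}\right), \quad \mu \in \mathbb{R},\ \sigma > 0,$$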
with some constant C > 0 (which in general would be even allowed to depend on X1, ..., Xn, but will vanish anyway when partial derivatives of the log-likelihood function with respect to the parameters are computed, see below).
In the method of maximum likelihood, the values of μ and σ that maximize the likelihood function are taken as estimates of the population parameters μ and σ.
Usually in maximizing a function of two variables, one might consider partial derivatives. But here we will exploit the fact that the value of μ that maximizes the likelihood function with σ fixed does not depend on σ. Therefore, we can find that value of μ, then substitute it for μ in the likelihood function, and finally find the value of σ that maximizes the resulting expression.
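It is evident that the likelihood function is a decreasing function of the sum

$$\sum_{i=1}^{n} (X_i - \mu)^2.$$

So we want the value of μ that minimizes this sum. Let

$$\overline{X} = \frac{X_1 + \cdots + X_n}{n}$$

be the "sample mean" based on the n observations. Observe that

$$\sum_{i=1}^{n} (X_i - \mu)^2 = \sum_{i=1}^{n} \left((X_i - \overline{X}) + (\overline{X} - \mu)\right)^2 = \sum_{i=1}^{n} (X_i - \overline{X})^2 + 2(\overline{X} - \mu)\underbrace{\sum_{i=1}^{n} (X_i - \overline{X})}_{=\,0} + n(\overline{X} - \mu)^2 = \sum_{i=1}^{n} (X_i - \overline{X})^2 + n(\overline{X} - \mu)^2.$$

Only the last term depends on μ and it is minimized by

$$\hat{\mu} = \overline{X}.$$

That is the maximum-likelihood estimate of μ based on the n observations $X_1, \dots, X_n$. When we substitute that estimate for μ into the likelihood function, we get

$$L(\overline{X}, \sigma) = \frac{C}{\sigma^{n}} \exp\left(-\frac{\sum_{i=1}^{n} (X_i - \overline{X})^2}{2\sigma^2}\right), \quad \sigma > 0.$$

It is conventional to denote the "log-likelihood function", i.e., the logarithm of the likelihood function, by a lower-case ℓ, and we have

$$\ell(\overline{X}, \sigma) = \log C - n \log \sigma - \frac{\sum_{i=1}^{n} (X_i - \overline{X})^2}{2\sigma^2}$$

and then

$$\frac{\partial}{\partial \sigma} \ell(\overline{X}, \sigma) = -\frac{n}{\sigma} + \frac{\sum_{i=1}^{n} (X_i - \overline{X})^2}{\sigma^3} = -\frac{n}{\sigma^3}\left(\sigma^2 - \frac{1}{n}\sum_{i=1}^{n} (X_i - \overline{X})^2\right), \quad \sigma > 0.$$

This derivative is positive, zero, or negative according as σ² is between 0 and

$$\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \overline{X})^2,$$

or equal to that quantity, or greater than that quantity. (If there is just one observation, meaning that n = 1, or if $X_1 = \dots = X_n$, which only happens with probability zero, then $\hat{\sigma}^2 = 0$ by this formula, reflecting the fact that in these cases the likelihood function is unbounded as σ decreases to zero.)
Consequently this average of squares of residuals is the maximum-likelihood estimate of σ², and its square root is the maximum-likelihood estimate of σ based on the n observations. This estimator $\hat{\sigma}^2$ is biased, but has a smaller mean squared error than the usual unbiased estimator, which is n/(n − 1) times this estimator.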
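The result of the derivation can be checked numerically. The sketch below evaluates the normal log-likelihood (with the constant written out explicitly) and confirms, for one made-up sample, that the sample mean and the square root of the average squared residual give a larger log-likelihood than nearby parameter values. The function names and the data are assumptions introduced for illustration.

```python
import math


def normal_log_likelihood(sample, mu, sigma):
    """Log-likelihood of i.i.d. N(mu, sigma^2) observations."""
    n = len(sample)
    ss = sum((x - mu) ** 2 for x in sample)
    return -n * math.log(sigma * math.sqrt(2.0 * math.pi)) - ss / (2.0 * sigma ** 2)


def ml_estimates(sample):
    """Return the maximum-likelihood estimates (mu_hat, sigma_hat)."""
    n = len(sample)
    mu_hat = sum(sample) / n
    # Average (not n - 1 weighted) squared residual, as in the derivation.
    sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in sample) / n)
    return mu_hat, sigma_hat


# Hypothetical sample used only for this check.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu_hat, sigma_hat = ml_estimates(data)
best = normal_log_likelihood(data, mu_hat, sigma_hat)

# Perturbing either parameter should lower the log-likelihood.
for d_mu, d_sigma in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]:
    assert normal_log_likelihood(data, mu_hat + d_mu, sigma_hat + d_sigma) < best
```

A spot check of this kind does not replace the argument above; it only illustrates that the closed-form estimates sit at a maximum for this sample.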
The derivation of the maximum-likelihood estimator of the covariance matrix of a multivariate normal distribution is subtle. It involves the spectral theorem and the reason it can be better to view a scalar as the trace of a 1×1 matrix than as a mere scalar. See estimation of covariance matrices.
Approximately normal distributions occur in many situations, as explained by the central limit theorem. When there is reason to suspect the presence of a large number of small effects acting additively and independently, it is reasonable to assume that observations will be normal. There are statistical methods to empirically test that assumption, for example the Kolmogorov-Smirnov test.
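To make the idea of an empirical normality check concrete, here is a minimal sketch of the Kolmogorov-Smirnov statistic: the largest gap between the empirical distribution function of a sample and a fully specified normal cdf. It uses only the Python standard library. The function names are illustrative assumptions, and the sketch computes the statistic only, not a p-value. When μ and σ are estimated from the same data, the usual Kolmogorov-Smirnov critical values no longer apply directly.

```python
import math
import random


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative distribution function of N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic(sample, mu, sigma):
    """Largest distance between the empirical cdf and the N(mu, sigma^2) cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = normal_cdf(x, mu, sigma)
        # The empirical cdf jumps from i/n to (i + 1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


rng = random.Random(7)  # fixed seed so the example is repeatable
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(2000)]
uniform_sample = [rng.random() for _ in range(2000)]

print("normal sample :", ks_statistic(normal_sample, 0.0, 1.0))
print("uniform sample:", ks_statistic(uniform_sample, 0.0, 1.0))
```

The statistic should be small for the normal sample and large for the uniform one, since a uniform sample on (0, 1) has almost no mass below zero while the standard normal cdf is already 0.5 there.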
Effects can also act as multiplicative (rather than additive) modifications. In that case, the assumption of normality is not justified, and it is the logarithm of the variable of interest that is normally distributed. The distribution of the directly observed variable is then called log-normal.
Finally, if there is a single external influence which has a large effect on the variable under consideration, the assumption of normality is not justified either. This is true even if, when the external variable is held constant, the resulting marginal distributions are indeed normal. The full distribution will be a superposition of normal variables, which is not in general normal. This is related to the theory of errors (see below).
To summarize, here is a list of situations where approximate normality is sometimes assumed. For a fuller discussion, see below.
Of relevance to biology and economics is the fact that complex systems tend to display power laws rather than normality.
Light intensity from a single source varies with time, as thermal fluctuations can be observed if the light is analyzed at sufficiently high time resolution. Quantum mechanics interprets measurements of light intensity as photon counting, where the natural assumption is to use the Poisson distribution. When light intensity is integrated over times much longer than the coherence time, the Poisson-to-normal approximation is appropriate.
Normality is the central assumption of the mathematical theory of errors. Similarly, in statistical model-fitting, an indicator of goodness of fit is that the residuals (as the errors are called in that setting) be independent and normally distributed. The assumption is that any deviation from normality needs to be explained. In that sense, both in model-fitting and in the theory of errors, normality is the only observation that need not be explained, being expected. However, if the original data are not normally distributed (for instance if they follow a Cauchy distribution), then the residuals will also not be normally distributed. This fact is usually ignored in practice.
Repeated measurements of the same quantity are expected to yield results which are clustered around a particular value. If all major sources of errors have been taken into account, it is assumed that the remaining error must be the result of a large number of very small additive effects, and hence normal. Deviations from normality are interpreted as indications of systematic errors which have not been taken into account. Whether this assumption is valid is debatable.
A famous and oft-quoted remark attributed to Gabriel Lippmann says: "Everyone believes in the [normal] law of errors: the mathematicians, because they think it is an experimental fact; and the experimenters, because they suppose it is a theorem of mathematics." [10]
The sizes of full-grown animals are approximately lognormal. The evidence and an explanation based on models of growth were first published in the 1932 book Problems of Relative Growth by Julian Huxley.
Differences in size due to sexual dimorphism, or other polymorphisms like the worker/soldier/queen division in social insects, further make the distribution of sizes deviate from lognormality.
The assumption that linear size of biological specimens is normal (rather than lognormal) leads to a non-normal distribution of weight (since weight or volume is roughly proportional to the 2nd or 3rd power of length, and Gaussian distributions are only preserved by linear transformations), and conversely assuming that weight is normal leads to non-normal lengths. This is a problem, because there is no a priori reason why one of length, or body mass, and not the other, should be normally distributed. Lognormal distributions, on the other hand, are preserved by powers so the "problem" goes away if lognormality is assumed.
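The closure of the lognormal family under powers can be written out in one line. As an illustrative assumption (the symbols $L$, $Z$, $m$, $s$, $c$ and $k$ are not from the source), write a length as $L = e^{Z}$ with $Z$ normal:

$$Z \sim N(m, s^2), \quad L = e^{Z} \quad\Longrightarrow\quad c\,L^{k} = e^{\ln c + kZ}, \quad \ln c + kZ \sim N\!\left(\ln c + km,\ k^2 s^2\right) \quad (c > 0,\ k \neq 0).$$

So a weight roughly proportional to $L^3$ is again lognormal, whereas the cube of a normally distributed length is not normal.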
On the other hand, there are some biological measures where normality is assumed, such as blood pressure of adult humans. This is supposed to be normally distributed, but only after separating males and females into different populations (each of which is normally distributed).
As early as 1900, Louis Bachelier proposed representing price changes of stocks using the normal distribution. This approach has since been modified slightly. Because of the multiplicative nature of compounding of returns, financial indicators such as stock values and commodity prices exhibit "multiplicative behavior". As such, their periodic changes (e.g., yearly changes) are not normal but rather lognormal; that is, logarithmic returns, as opposed to the values themselves, are normally distributed. This is still the most commonly used hypothesis in finance, in particular in option pricing in the Black–Scholes model.
However, in reality financial variables exhibit heavy tails, and thus the assumption of normality understates the probability of extreme events such as stock market crashes. Corrections to this model have been suggested by mathematicians such as Benoît Mandelbrot, who observed that the changes in logarithm over short periods (such as a day) are approximated well by distributions that do not have a finite variance, and therefore the central limit theorem does not apply. Rather, the sum of many such changes gives log-Lévy distributions.
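The reason logarithmic returns are the natural quantity under multiplicative compounding is that they add up across periods. A minimal sketch, with a hypothetical price series and an illustrative function name:

```python
import math


def log_returns(prices):
    """Return ln(p[t+1] / p[t]) for each consecutive pair of positive prices."""
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be positive")
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


# Hypothetical prices: +10%, -10%, +10% in successive periods.
prices = [100.0, 110.0, 99.0, 108.9]
returns = log_returns(prices)

# The per-period log returns sum to the log return over the whole span.
total = math.log(prices[-1] / prices[0])
print(sum(returns), total)
```

Because the per-period log returns are summed, the central limit theorem can apply to them when they have finite variance, which is exactly the assumption the heavy-tail objection calls into question.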
In standardized testing, results can be scaled to have a normal distribution; for example, the SAT's traditional range of 200–800 is based on a normal distribution with a mean of 500 and a standard deviation of 100. As the entire population is known, this normalization can be done, and allows the use of the Z test in standardized testing.
Sometimes, the difficulty and number of questions on an IQ test are selected in order to yield normally distributed results. Alternatively, the raw test scores are converted to IQ values by fitting them to the normal distribution. In either case, it is the deliberate result of test construction or score interpretation that leads to IQ scores being normally distributed for the majority of the population. However, the question of whether intelligence itself is normally distributed is more involved, because intelligence is a latent variable, and therefore its distribution cannot be observed directly.
The probability density function of the normal distribution is closely related to the (homogeneous and isotropic) diffusion equation and therefore also to the heat equation. This partial differential equation describes the time evolution of a mass-density function under diffusion. In particular, the probability density function

$$\varphi_{0,t}(x) = \frac{1}{\sqrt{2\pi t}} \exp\left(-\frac{x^2}{2t}\right)$$

for the normal distribution with expected value 0 and variance t satisfies the diffusion equation:

$$\frac{\partial}{\partial t} \varphi_{0,t}(x) = \frac{1}{2} \frac{\partial^2}{\partial x^2} \varphi_{0,t}(x).$$
If the mass-density at time t = 0 is given by a Dirac delta, which essentially means that all mass is initially concentrated in a single point, then the mass-density function at time t will have the form of the normal probability density function with variance linearly growing with t. This connection is no coincidence: diffusion is due to Brownian motion which is mathematically described by a Wiener process, and such a process at time t will also result in a normal distribution with variance linearly growing with t.
More generally, if the initial mass-density is given by a function φ(x), then the mass-density at time t will be given by the convolution of φ and a normal probability density function.
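The relationship between the normal density with variance t and the diffusion equation can be checked numerically. The sketch below assumes the equation in the form ∂φ/∂t = ½ ∂²φ/∂x² and approximates both sides by central finite differences. The function names and the step size `h` are illustrative choices.

```python
import math


def phi(x, t):
    """Density of the normal distribution with mean 0 and variance t."""
    return math.exp(-x * x / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)


def diffusion_residual(x, t, h=1e-3):
    """Finite-difference estimate of d(phi)/dt - 0.5 * d2(phi)/dx2 at (x, t)."""
    d_dt = (phi(x, t + h) - phi(x, t - h)) / (2.0 * h)
    d2_dx2 = (phi(x + h, t) - 2.0 * phi(x, t) + phi(x - h, t)) / (h * h)
    return d_dt - 0.5 * d2_dx2


# The residual should be close to zero wherever it is evaluated (t > h).
for x, t in [(0.0, 1.0), (0.7, 1.3), (-1.5, 2.0)]:
    print(x, t, diffusion_residual(x, t))
```

The printed residuals are not exactly zero because of the finite-difference approximation, but they should be several orders of magnitude smaller than the density itself.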
The normal distribution arises in many areas of statistics. For example, for a random variable with finite variance, the sampling distribution of the sample mean is approximately normal, even if the distribution of the population from which the sample is taken is not normal. However, for distributions with infinite or undefined variance, such as the Cauchy distribution, the sampling distribution of the sample mean need not be approximately normal.
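A small simulation can illustrate the contrast drawn above between finite-variance and infinite-variance populations. The sketch compares the spread of sample means of 50 exponential draws (finite variance) with that of 50 standard Cauchy draws. The sample size, number of replications, seed and helper names are all arbitrary choices made for illustration.

```python
import math
import random


def exponential(rng):
    """One draw from the exponential distribution with mean 1."""
    return rng.expovariate(1.0)


def cauchy(rng):
    """One draw from the standard Cauchy distribution (inverse-cdf method)."""
    return math.tan(math.pi * (rng.random() - 0.5))


def sample_means(draw, n, reps, rng):
    """Means of `reps` independent samples of size `n`."""
    return [sum(draw(rng) for _ in range(n)) / n for _ in range(reps)]


def interquartile_range(values):
    """Rough interquartile range based on order statistics."""
    xs = sorted(values)
    return xs[(3 * len(xs)) // 4] - xs[len(xs) // 4]


rng = random.Random(12345)  # fixed seed for repeatability
exp_means = sample_means(exponential, 50, 4000, rng)
cauchy_means = sample_means(cauchy, 50, 4000, rng)

# For a normal distribution the interquartile range is about 1.349 sigma,
# and the mean of 50 unit exponentials has sigma = 1 / sqrt(50).
print("exponential:", interquartile_range(exp_means), 1.349 / math.sqrt(50))
# The mean of n standard Cauchy draws is again standard Cauchy (IQR = 2).
print("Cauchy     :", interquartile_range(cauchy_means))
```

The spread of the exponential means shrinks like 1/√n and matches the normal approximation, while the spread of the Cauchy means does not shrink at all.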
In addition, the normal distribution maximizes information entropy among all distributions with known mean and variance, which makes it the natural choice of underlying distribution for data summarized in terms of sample mean and variance. The normal distribution is the most widely used family of distributions in statistics and many statistical tests are based on the assumption of normality.
For computer simulations, it is often useful to generate values that have a normal distribution. There are several methods; the most basic is to invert the standard normal cdf. More efficient methods are also known, one such method being the Box-Muller transform. An even faster algorithm is the ziggurat algorithm. These are discussed below. A simple approach that is easy to program is to sum 12 uniform (0,1) deviates and subtract 6 (half of 12). This is quite usable in many applications. The sum of these 12 values has an Irwin–Hall distribution; because each uniform deviate has variance 1/12, choosing 12 terms gives the sum a variance of exactly one. The resulting random deviates are limited to the range (−6, 6) and have a density which is a 12-section eleventh-order polynomial approximation to the normal distribution.[11]
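The sum-of-twelve-uniforms approach described above takes only a few lines. This is a minimal sketch using Python's standard `random` module; the function name `approx_standard_normal` is an illustrative assumption.

```python
import random


def approx_standard_normal(rng=random):
    """Approximate N(0, 1) deviate: sum of 12 uniform(0, 1) draws minus 6."""
    return sum(rng.random() for _ in range(12)) - 6.0


rng = random.Random(2024)  # fixed seed so the example is repeatable
draws = [approx_standard_normal(rng) for _ in range(20000)]

mean = sum(draws) / len(draws)
variance = sum((x - mean) ** 2 for x in draws) / len(draws)
print(mean, variance, min(draws), max(draws))
```

The sample mean and variance should be close to 0 and 1, and no draw can fall outside the interval from −6 to 6, which is the main limitation of this method in the tails.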
The Box-Muller method says that, if you have two independent random numbers U and V uniformly distributed on (0, 1] (e.g. the output from a random number generator), then two independent standard normally distributed random variables are X and Y, where:

$$X = \sqrt{-2 \ln U}\, \cos(2\pi V),$$

$$Y = \sqrt{-2 \ln U}\, \sin(2\pi V).$$
This formulation arises because the chi-square distribution with two degrees of freedom (see property 4 above) is an easily generated exponential random variable (which corresponds to the quantity −2 ln U in these equations). Thus an angle is chosen uniformly around the circle via the random variable V, a squared radius is drawn from an exponential distribution, and the resulting polar coordinates are then transformed to (normally distributed) x and y coordinates.
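A minimal sketch of the Box-Muller transform in its standard form, X = √(−2 ln U) cos(2πV) and Y = √(−2 ln U) sin(2πV), is given below. The function names are illustrative. Since Python's `random()` returns values in [0, 1), the sketch uses 1 − `random()` for U so that the logarithm is never taken at zero.

```python
import math
import random


def box_muller(u, v):
    """Map u in (0, 1] and v in [0, 1) to two independent N(0, 1) values."""
    radius = math.sqrt(-2.0 * math.log(u))  # -2 ln U is the squared radius
    angle = 2.0 * math.pi * v               # uniform angle around the circle
    return radius * math.cos(angle), radius * math.sin(angle)


def standard_normal_pair(rng=random):
    """Draw the two uniforms and apply the transform."""
    u = 1.0 - rng.random()  # shifts [0, 1) to (0, 1], avoiding log(0)
    v = rng.random()
    return box_muller(u, v)


rng = random.Random(42)  # fixed seed so the example is repeatable
x, y = standard_normal_pair(rng)
print(x, y)
```

Each call consumes two uniform numbers and yields two normal deviates, so the cost per deviate is one uniform draw plus a share of one logarithm, one square root and two trigonometric evaluations.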
George Marsaglia developed the Ziggurat algorithm, which is faster than the Box-Muller transform and still exact. In about 97% of all cases it uses only two random numbers (one random integer and one random uniform), one multiplication and an if-test. Only in the remaining 3% of cases, where the combination of those two falls outside the "core of the ziggurat", does a kind of rejection sampling using logarithms, exponentials and more uniform random numbers have to be employed.
There is also some investigation into the connection between the fast Hadamard transform and the normal distribution, since the transform employs just addition and subtraction and, by the central limit theorem, random numbers from almost any distribution will be transformed into the normal distribution. In this regard a series of Hadamard transforms can be combined with random permutations to turn arbitrary data sets into normally distributed data.
The normal distribution function is widely used in scientific and statistical computing. Therefore, it has been implemented in various ways.
The GNU Scientific Library calculates values of the standard normal cdf using piecewise approximations by rational functions. Another approximation method uses third-degree polynomials on intervals.[12] The article on the bc programming language gives an example of how to compute the cdf in GNU bc.
For a more detailed discussion of how to calculate the normal distribution, see Knuth's The Art of Computer Programming, section 3.4.1C.
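As a simple alternative to the rational and polynomial approximations mentioned above (and not the method used by any of the libraries named), the normal cdf can be expressed through the complementary error function, which Python's standard library provides as `math.erfc`. The function names below are illustrative.

```python
import math


def standard_normal_cdf(x):
    """Phi(x) = 0.5 * erfc(-x / sqrt(2)); erfc keeps accuracy in the lower tail."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def normal_cdf(x, mu, sigma):
    """Cdf of N(mu, sigma^2), obtained by standardizing x."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return standard_normal_cdf((x - mu) / sigma)


print(standard_normal_cdf(0.0))
print(standard_normal_cdf(1.96))
print(normal_cdf(60.0, 50.0, 10.0))  # same as Phi(1)
```

Using `erfc` rather than `1 + erf` avoids the loss of precision that occurs when two nearly equal numbers are subtracted for large negative arguments.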
This entry is from Wikipedia, the leading user-contributed encyclopedia. It may not have been reviewed by professional editors (see full disclaimer)
Copyrights:
Dictionary. The American Heritage® Dictionary of the English Language, Fourth Edition. Copyright © 2007, 2000 by Houghton Mifflin Company. Updated in 2007. Published by Houghton Mifflin Company. All rights reserved.
Statistics Dictionary. A Dictionary of Statistics. Second edition revised. Copyright © Oxford University Press, 2008. All rights reserved.
Investment Dictionary. Copyright © 2000, Investopedia.com - Owned and Operated by Investopedia Inc. All rights reserved.
Accounting Dictionary. Dictionary of Accounting Terms. Copyright © 2005 by Barron's Educational Series, Inc. All rights reserved.
Dental Dictionary. Mosby's Dental Dictionary. Copyright © 2004 by Elsevier, Inc. All rights reserved.
Encyclopedia of Public Health. Encyclopedia of Public Health. Copyright © 2002 by The Gale Group, Inc. All rights reserved.
Geography Dictionary. A Dictionary of Geography. Copyright © Susan Mayhew 1992, 1997, 2004. All rights reserved.
Political Dictionary. The Concise Oxford Dictionary of Politics. Copyright © 1996, 2003 by Oxford University Press. All rights reserved.
Britannica Concise Encyclopedia. Britannica Concise Encyclopedia. © 2006 Encyclopædia Britannica, Inc. All rights reserved.
Sports Science and Medicine. The Oxford Dictionary of Sports Science & Medicine. Copyright © Michael Kent 1998, 2006, 2007. All rights reserved.
Science Dictionary. The New Dictionary of Cultural Literacy, Third Edition. Edited by E.D. Hirsch, Jr., Joseph F. Kett, and James Trefil. Copyright © 2002 by Houghton Mifflin Company. Published by Houghton Mifflin. All rights reserved.
Wikipedia. This article is licensed under the GNU Free Documentation License. It uses material from the Wikipedia article "Normal distribution".