Answers.com

binomial distribution

 
Dictionary: binomial distribution

n.

The frequency distribution of the probability of a specified number of successes in an arbitrary number of repeated independent Bernoulli trials. Also called Bernoulli distribution.


Statistics Dictionary: binomial distribution

The distribution associated with the random variable, X, defined as the number of 'successes' in n independent trials each having the same probability, p, of success. The random variable X is said to be a binomial variable and to have a binomial distribution with parameters n and p. This is written as X ~ B(n, p). The mean of this distribution is np and the variance is np(1−p). The probability function is given by

\Pr(X = r) = {n\choose r} p^r (1-p)^{n-r}, \quad r = 0, 1, \dots, n.




The distribution takes its name from the fact that successive probabilities are the terms in the expansion in ascending powers of p, by the binomial theorem, of (q+p)^n, where q=1−p. The first published derivation of the distribution was by Jacob Bernoulli in 1713.

As an example, suppose that a computer generates fifteen random integers between 0 and 9 inclusive. The number of these integers that are odd has a B(15, 0.5) distribution. The number that are non-zero has a B(15, 0.9) distribution, and the number that are greater than 7 has a B(15, 0.2) distribution. The diagram shows the graphs of the probability functions for these distributions.
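As a rough check of the example above, the following Python sketch evaluates the probability function for the three distributions. The helper name binom_pmf is our own, and math.comb assumes Python 3.8 or later.

```python
from math import comb


def binom_pmf(r, n, p):
    """Probability of exactly r successes for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


n = 15
# Success probabilities for "odd", "non-zero" and "greater than 7"
# when each integer 0..9 is equally likely.
cases = {"odd": 0.5, "non-zero": 0.9, "greater than 7": 0.2}

for label, p in cases.items():
    probs = [binom_pmf(r, n, p) for r in range(n + 1)]
    mean = sum(r * pr for r, pr in enumerate(probs))
    print(f"{label}: p = {p}, total probability = {sum(probs):.6f}, mean = {mean:.3f}")
```

Each list of probabilities should sum to 1 (up to rounding), and the computed mean should agree with np.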

If we note that P(X=0)=q^n, successive probabilities can be calculated using the recurrence relation

\Pr(X = r+1) = \frac{(n-r)\,p}{(r+1)\,q} \Pr(X = r).

If (n+1)p is not an integer the graph is unimodal, with mode at the (integer) value of r such that (n+1)p−1 < r < (n+1)p. If (n+1)p is an integer, then (n+1)p−1 and (n+1)p are both modal values (as in the B(15, 0.5) case illustrated).
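The recurrence idea can be sketched in Python as follows. The sketch assumes the standard form of the recurrence, P(X = r+1) = ((n−r)p / ((r+1)q)) P(X = r), starting from P(X = 0) = q^n, and requires 0 < p < 1; the function names are illustrative only.

```python
from math import isclose


def binom_pmf_by_recurrence(n, p):
    """Return [P(X=0), ..., P(X=n)] for X ~ B(n, p), assuming 0 < p < 1."""
    q = 1 - p
    probs = [q**n]  # P(X = 0) = q^n
    for r in range(n):
        # P(X = r+1) = (n - r) p / ((r + 1) q) * P(X = r)
        probs.append(probs[-1] * (n - r) * p / ((r + 1) * q))
    return probs


def modal_values(probs, rel_tol=1e-12):
    """Indices whose probability equals the maximum (up to rounding)."""
    top = max(probs)
    return [r for r, pr in enumerate(probs) if isclose(pr, top, rel_tol=rel_tol)]


for p in (0.5, 0.9, 0.2):
    print(p, modal_values(binom_pmf_by_recurrence(15, p)))
```

For B(15, 0.5), where (n+1)p = 8 is an integer, two modal values are reported; for the other two cases there is a single mode.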

A binomial random variable with parameters n and p may be regarded as the sum of n independent observations of a Bernoulli variable with parameter p. The sum of two independent binomial variables with parameters n_1, p and n_2, p, respectively, is also a binomial variable, with parameters (n_1+n_2), p.

For large values of np and nq the normal approximation to the binomial distribution may be used:

\Pr(X \le r) \approx \Phi(z), \quad \text{where } z = \frac{r + \tfrac{1}{2} - np}{\sqrt{npq}}

and Φ is the cumulative distribution function for a standard normal variable. The '½' is a continuity correction. The result, that a binomial distribution with p = ½ may be approximated by a normal distribution, underlies the derivation of the normal distribution by de Moivre in 1733 and is sometimes referred to as the de Moivre–Laplace theorem. For large values of n and small values of p the Poisson approximation to the binomial distribution may be used:

\Pr(X = r) \approx \frac{e^{-np} (np)^r}{r!}.

The word 'binomial' was used in its mathematical sense in a 1557 text entitled The Whetstone of Witte by Robert Recorde. The 'binomial distribution' was so named by Yule in 1911. A tabulation of the distribution for small n is given in Appendix IV.



Binomial distribution. The distribution is skewed to the right if p<0.5 and to the left if p>0.5. It is symmetric if p=0.5.



Investment Dictionary: Binomial Distribution

A probability distribution that summarizes the likelihood that a variable will take one of two possible values under a given set of parameters or assumptions. The underlying assumptions of the binomial distribution are that each trial has exactly one of two outcomes, that each trial has the same probability of success and that the trials are independent of one another.

Investopedia Says:
A binomial distribution summarizes the number of successes in a fixed number of trials, or observations, when each trial has the same probability of attaining one particular value.

For example, flipping a coin would create a binomial distribution. This is because each trial can only take one of two values (heads or tails), each success has the same probability (i.e. the probability of flipping a head is 0.50) and the results of one trial will not influence the results of another.



Encyclopedia of Public Health: Binomial Distribution

A binomial distribution can be used to describe the number of times an event will occur in a group of patients, a series of clinical trials, or any other sequence of observations. This event is a binary variable: It either occurs or it doesn't. For example, when patients are treated with a new drug they are either cured or not; when a coin is flipped, the result is either a head or tail. The binary outcome associated with each event is typically referred to as either a "success" or a "failure." In general, a binomial distribution is used to characterize the number of successes over a series of observations (or trials), where each observation is referred to as a "Bernoulli trial."

In a series of n Bernoulli trials, the binomial distribution can be used to calculate the probability of obtaining k successful outcomes. If the variable X represents the total number of successes in n trials, it can only take on a value from 0 to n. The probability of obtaining k successes in n trials is calculated as follows:

\Pr(X = k) = \frac{n!}{k!(n-k)!} p^k (1-p)^{n-k}

where 0 ≤ p ≤ 1 is the probability of success, and n! = 1 × 2 × 3 × ⋯ × (n−2) × (n−1) × n.

The above formula assumes that the experiment consists of n identical trials that are independent from one another, and that there are only two possible outcomes for each trial (success or failure). The probability of success (p) is also assumed to be the same in each of the trials.

To further illustrate the application of the above formula, if a drug was developed that cured 30 percent of all patients, and it was administered to ten patients, the probability that exactly four patients would be cured is:

\Pr(X = 4) = \frac{10!}{4!\,6!} (0.3)^4 (0.7)^6 = 210 \times 0.0081 \times 0.117649 \approx 0.200.

Like other distributions, the binomial distribution can be described in terms of a mean and the spread, or variance, of values. The mean value of a binomial random variable X (i.e., the average number of successes in n trials) can be obtained by multiplying the number of trials by p (np). In the above example, the average number of persons cured in any group of 10 patients would thus be 3. The variance of a binomial distribution is np × (1−p). The variance is largest for p = 0.5, while it decreases as p approaches 0 or 1. Intuitively, this makes sense, since when p is very close to 0 or 1 nearly all the outcomes take on the same value. Returning to the example, for a drug that cured every patient, p would equal one, while for a drug that cured no one, p would equal zero; in either case the outcome is certain and the variance is zero. In contrast, if the drug was effective in curing only half of the population (p = 0.5) it would be more difficult to predict the outcome in any particular patient, and in this case the variability is relatively large.
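The drug example can be reproduced with a few lines of Python. This is only an illustrative sketch; the variable and function names are our own.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 10, 0.3  # ten patients, 30 percent cure rate

prob_exactly_four = binom_pmf(4, n, p)
mean_cured = n * p
variance_cured = n * p * (1 - p)

print(f"P(X = 4) = {prob_exactly_four:.4f}")
print(f"mean = {mean_cured:.1f}, variance = {variance_cured:.2f}")

# The variance np(1 - p) for the same n and several cure rates.
variance_by_p = {rate: n * rate * (1 - rate) for rate in (0.1, 0.3, 0.5, 0.7, 0.9)}
for rate, var in variance_by_p.items():
    print(f"p = {rate}: variance = {var:.2f}")
```

Among the cure rates listed, the variance is greatest at p = 0.5, in line with the discussion above.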

In studies of public health, the binomial distribution is used when a researcher is interested in the occurrence of an event rather than in its magnitude. For instance, smoking cessation interventions may choose to focus on whether a smoker quit smoking altogether, rather than evaluate daily reductions in the number of cigarettes smoked. The binomial distribution plays an important role in statistics, as it is likely the most frequently used distribution to describe discrete data.

(SEE ALSO: Statistics for Public Health)

Bibliography

Pagano, M., and Gauvreau, K. (2000). Principles of Biostatistics, 2nd edition. Pacific Grove, CA: Duxbury Press.

Rosner, B. (2000). Fundamentals of Biostatistics, 5th edition. Pacific Grove, CA: Duxbury Press.

— PAUL J. VILLENEUVE



Geography Dictionary: binomial distribution

A theoretical frequency distribution which is used in sampling to test whether the characteristics of a random sample are representative of the whole: the population. For example, if it is known that half the population is male, then the probability of sampling one male at random is 0.5, of sampling two consecutively at random is 0.5^2, i.e. 0.25, of sampling three consecutively at random is 0.125 (0.5^3), and so on. If the findings of the sample match this probability, then it is representative. In large samples, the binomial form has the same pattern as a normal distribution.

Political Dictionary: binomial distribution

The probability distribution for the frequency of a particular event given that the event has the same probability of occurring in each of several independent trials. For example, the number of heads observed in several coin tosses follows a binomial distribution.

— Stephen Fisher

Wikipedia: Binomial distribution
Also see: Negative binomial distribution.

In probability theory and statistics, the binomial distribution is the discrete probability distribution of the number of successes in a sequence of n independent yes/no experiments, each of which yields success with probability p. Such a success/failure experiment is also called a Bernoulli experiment or Bernoulli trial. In fact, when n = 1, the binomial distribution is a Bernoulli distribution. The binomial distribution is the basis for the popular binomial test of statistical significance. A binomial distribution should not be confused with a bimodal distribution.

It is frequently used to model the number of successes in a sample of size n drawn from a population of size N. When the sample is drawn without replacement, the draws are not independent, so the resulting distribution is a hypergeometric distribution, not a binomial one. However, for N much larger than n, the binomial distribution is a good approximation, and widely used.
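The quality of this approximation can be illustrated numerically. The sketch below compares the hypergeometric probabilities with the binomial ones for a small and a large population having the same proportion of successes; the population sizes and the helper names are assumptions chosen for illustration.

```python
from math import comb


def binom_pmf(k, n, p):
    """Binomial probability of k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def hypergeom_pmf(k, population, successes, n):
    """Probability of k successes when drawing n items without replacement."""
    return comb(successes, k) * comb(population - successes, n - k) / comb(population, n)


def max_gap(population, successes, n):
    """Largest pointwise difference between the two probability functions."""
    p = successes / population
    return max(
        abs(hypergeom_pmf(k, population, successes, n) - binom_pmf(k, n, p))
        for k in range(n + 1)
    )


gap_small_population = max_gap(50, 15, 10)      # N = 50, 30% successes
gap_large_population = max_gap(5000, 1500, 10)  # N = 5000, 30% successes
print(gap_small_population, gap_large_population)
```

With the sample size held at n = 10, the gap shrinks markedly as N grows.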

Binomial
[Figures: probability mass function and cumulative distribution function for the binomial distribution; colors match between the two plots]
Parameters: n \geq 0, number of trials (integer); 0 \leq p \leq 1, success probability (real)
Support: k \in \{0, \dots, n\}
Probability mass function (pmf): {n\choose k} p^k (1-p)^{n-k}
Cumulative distribution function (cdf): I_{1-p}(n - \lfloor k \rfloor, 1 + \lfloor k \rfloor)
Mean: np
Median: one of \{\lfloor np \rfloor, \lceil np \rceil\} [1]
Mode: \lfloor (n+1)p \rfloor
Variance: np(1-p)
Skewness: \frac{1-2p}{\sqrt{np(1-p)}}
Excess kurtosis: \frac{1-6p(1-p)}{np(1-p)}
Entropy: \frac{1}{2} \log_2 \left( 2 \pi n e p (1-p) \right) + O \left( \frac{1}{n} \right)
Moment-generating function (mgf): (1-p + pe^t)^n
Characteristic function: (1-p + pe^{it})^n


Examples

An elementary example is this: Roll a standard die ten times and count the number of sixes. The distribution of this random number is a binomial distribution with n = 10 and p = 1/6.

As another example, flip a coin three times and count the number of heads. The distribution of this random number is a binomial distribution with n = 3 and p = 1/2.
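The die-rolling example can be simulated directly. The following Python sketch repeats the experiment many times with a seeded generator (the seed and the number of repetitions are arbitrary choices) and compares the average count of sixes with np = 10/6.

```python
import random


def count_successes(n, p, rng):
    """Run n independent trials and count how many succeed."""
    return sum(rng.random() < p for _ in range(n))


rng = random.Random(12345)  # fixed seed so the run is repeatable
repetitions = 20000

# Each sample is the number of sixes in ten rolls of a fair die.
samples = [count_successes(10, 1 / 6, rng) for _ in range(repetitions)]
sample_mean = sum(samples) / repetitions

print(f"average number of sixes: {sample_mean:.3f} (np = {10 / 6:.3f})")
```

The simulated average should be close to, but not exactly equal to, the theoretical mean.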

Specification

Probability mass function

In general, if the random variable K follows the binomial distribution with parameters n and p, we write K ~ B(n, p). The probability of getting exactly k successes in n trials is given by the probability mass function:

 \Pr(K = k) = f(k;n,p)
 \Pr(K = k) = {n\choose k}p^k(1-p)^{n-k}

for k = 0, 1, 2, ..., n and where

{n\choose k}=\frac{n!}{k!(n-k)!}

is the binomial coefficient (hence the name of the distribution) "n choose k", also denoted C(n, k), _nC_k, or ^nC_k. The formula can be understood as follows: we want k successes (p^k) and n − k failures ((1 − p)^{n − k}). However, the k successes can occur anywhere among the n trials, and there are C(n, k) different ways of distributing k successes in a sequence of n trials.

In creating reference tables for binomial distribution probability, usually the table is filled in up to n/2 values. This is because for k > n/2, the probability can be calculated by its complement as

f(k;n,p)=f(n-k;n,1-p).\,\!

So, one must look to a different k and a different p (the binomial is not symmetrical in general). However, its behavior is not arbitrary. There is always an integer m that satisfies

(n+1)p-1 < m \leq (n+1)p.\,

As a function of k, the expression f(k; n, p) is monotone increasing for k < m and monotone decreasing for k > m, with the exception of the case where (n + 1)p is an integer. In this case, there are two maximum values, at m = (n + 1)p and at m − 1. m is known as the most probable (most likely) outcome of Bernoulli trials. Note that the probability of it occurring can be fairly small.
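Both the reflection identity f(k; n, p) = f(n − k; n, 1 − p) and the rule for the most probable outcome can be checked by brute force. The sketch below uses exact rational arithmetic (fractions.Fraction) so that ties between two maxima are detected reliably; the function names are our own.

```python
from fractions import Fraction
from math import comb, floor


def binom_pmf(k, n, p):
    """f(k; n, p); exact when p is a Fraction."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def most_probable_outcomes(n, p):
    """Most probable outcome(s) from the rule (n+1)p - 1 < m <= (n+1)p."""
    scaled = (n + 1) * p
    m = floor(scaled)
    if scaled == m and 1 <= m <= n:
        return [m - 1, m]  # (n + 1)p is an integer: two maxima
    return [min(m, n)]


def argmax_outcomes(n, p):
    """Outcome(s) with the largest probability, found by direct search."""
    probs = [binom_pmf(k, n, p) for k in range(n + 1)]
    top = max(probs)
    return [k for k, pk in enumerate(probs) if pk == top]


def reflection_holds(n, p):
    """Check f(k; n, p) == f(n - k; n, 1 - p) for every k."""
    return all(binom_pmf(k, n, p) == binom_pmf(n - k, n, 1 - p) for k in range(n + 1))


for n, p in [(10, Fraction(1, 6)), (3, Fraction(1, 2)), (15, Fraction(1, 2))]:
    print(n, p, most_probable_outcomes(n, p), argmax_outcomes(n, p), reflection_holds(n, p))
```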

Cumulative distribution function

The cumulative distribution function can be expressed as:

F(x;n,p) = \Pr(X \le x) = \sum_{i=0}^{\lfloor x \rfloor} {n\choose i}p^i(1-p)^{n-i}.

where \lfloor x \rfloor is the "floor" under x, i.e. the greatest integer less than or equal to x.
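A direct translation of this sum into Python might look as follows. It is a minimal sketch: the function name binom_cdf is our own, and values of x below 0 or above n are handled explicitly.

```python
from math import comb, floor


def binom_cdf(x, n, p):
    """F(x; n, p) = Pr(X <= x) for real x, summing the pmf up to floor(x)."""
    if x < 0:
        return 0.0
    upper = min(floor(x), n)  # no probability mass lies above n
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(upper + 1))


# The cdf is a step function: it only changes at integer values of x.
print(binom_cdf(2, 3, 0.5), binom_cdf(2.7, 3, 0.5))
```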

It can also be represented in terms of the regularized incomplete beta function, as follows:


\begin{align}
F(k;n,p) & = \Pr(X \le k) = I_{1-p}(n-k, k+1) \\
& = (n-k) {n \choose k} \int_0^{1-p} t^{n-k-1} (1-t)^k \, dt.
\end{align}

For k ≤ np, upper bounds for the lower tail of the distribution function can be derived. In particular, Hoeffding's inequality yields the bound

 F(k;n,p) \leq \exp\left(-2 \frac{(np-k)^2}{n}\right), \!

and Chernoff's inequality can be used to derive the bound

 F(k;n,p) \leq \exp\left(-\frac{1}{2\,p} \frac{(np-k)^2}{n}\right). \!

Moreover, these bounds are reasonably tight when p = 1/2, since the following expression holds for all k ≥ 3n/8[2]

 F(k;n,1/2) \geq \frac{1}{15} \exp\left(- \frac{16 (n/2 - k)^2}{n}\right). \!
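To get a feel for these bounds, the following Python sketch evaluates them alongside the exact lower-tail probability for n = 100 and p = 1/2. The choice of n and the function names are illustrative assumptions.

```python
from math import comb, exp


def binom_cdf(k, n, p):
    """Exact F(k; n, p) = Pr(X <= k) for integer k."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def hoeffding_upper(k, n, p):
    """Hoeffding bound exp(-2 (np - k)^2 / n), valid for k <= np."""
    return exp(-2 * (n * p - k) ** 2 / n)


def chernoff_upper(k, n, p):
    """Chernoff bound exp(-(np - k)^2 / (2 p n)), valid for k <= np."""
    return exp(-((n * p - k) ** 2) / (2 * p * n))


def half_lower(k, n):
    """Lower bound (1/15) exp(-16 (n/2 - k)^2 / n) for p = 1/2."""
    return exp(-16 * (n / 2 - k) ** 2 / n) / 15


n = 100
for k in (38, 42, 46, 50):
    exact = binom_cdf(k, n, 0.5)
    print(k, half_lower(k, n), exact, hoeffding_upper(k, n, 0.5), chernoff_upper(k, n, 0.5))
```

For p = 1/2 the factor 1/(2p) equals 1, so the Chernoff bound as stated is looser than the Hoeffding bound in this case.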

Mean, variance, and mode

If X ~ B(n, p) (that is, X is a binomially distributed random variable), then the expected value of X is

\operatorname{E}(X)=np\,\!

and the variance is

\operatorname{Var}(X)=np(1-p).\,\!

This fact is easily proven as follows. Suppose first that we have exactly one Bernoulli trial. We have two possible outcomes, 1 and 0, with the first having probability p and the second having probability 1 − p; the mean for this trial is given by μ = p. Using the definition of variance, we have

\sigma^2= \left(1 - p\right)^2p + (0-p)^2(1 - p) = p(1-p).

Now suppose that we want the variance for n such trials (i.e. for the general binomial distribution). Since the trials are independent, we may add the variances for each trial, giving

\sigma^2_n = \sum_{k=1}^n \sigma^2 = np(1 - p). \quad

The mode of X is the greatest integer less than or equal to (n + 1)p; if m = (n + 1)p is an integer, then m − 1 and m are both modes.
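The argument above (variance of one Bernoulli trial, multiplied by n) can be compared with the mean and variance computed directly from the probability mass function. The sketch below is illustrative; the function names are our own.

```python
from math import comb


def bernoulli_variance(p):
    """Variance of a single trial with outcomes 1 (prob. p) and 0 (prob. 1 - p)."""
    mu = p
    return (1 - mu) ** 2 * p + (0 - mu) ** 2 * (1 - p)


def binom_mean_and_variance(n, p):
    """Mean and variance of B(n, p) computed from the pmf, not from the formulas."""
    probs = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum(k * pk for k, pk in enumerate(probs))
    variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(probs))
    return mean, variance


n, p = 10, 0.3
mean, variance = binom_mean_and_variance(n, p)
print(mean, n * p)
print(variance, n * bernoulli_variance(p))
```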

Covariance between two binomials

If two binomially distributed random variables X and Y are observed together, estimating their covariance can be useful. Using the definition of covariance, we have for one such trial

\operatorname{Cov}(X, Y) = \operatorname{E}(X \cdot Y) - \mu_X \mu_Y

The first term is non-zero only when both X and Y are one, and μ_X and μ_Y are equal to the two probabilities. Defining p_B as the probability of both happening at the same time, this gives

\operatorname{Cov}(X, Y) = p_B - p_X p_Y,

and for n such trials again due to independence

\operatorname{Cov}(X, Y)_n = n ( p_B - p_X p_Y ).

If X and Y are the same variable, this reduces to the variance formula given above.
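The covariance formula can be checked by simulation. In the sketch below, each trial produces a pair of 0/1 outcomes with joint success probability p_B; the probabilities, seed, and number of repetitions are arbitrary illustrative choices, and the way the pair is sampled from a single uniform draw is our own construction.

```python
import random


def paired_trial(p_x, p_y, p_b, rng):
    """One trial giving (x, y) with P(x=1)=p_x, P(y=1)=p_y, P(x=1 and y=1)=p_b."""
    u = rng.random()
    if u < p_b:
        return 1, 1
    if u < p_x:
        return 1, 0
    if u < p_x + p_y - p_b:
        return 0, 1
    return 0, 0


def simulate_counts(n, p_x, p_y, p_b, rng):
    """Total successes (X, Y) over n independent paired trials."""
    x_total = y_total = 0
    for _ in range(n):
        x, y = paired_trial(p_x, p_y, p_b, rng)
        x_total += x
        y_total += y
    return x_total, y_total


def sample_covariance(pairs):
    """Unbiased sample covariance of a list of (x, y) pairs."""
    m = len(pairs)
    mean_x = sum(x for x, _ in pairs) / m
    mean_y = sum(y for _, y in pairs) / m
    return sum((x - mean_x) * (y - mean_y) for x, y in pairs) / (m - 1)


n, p_x, p_y, p_b = 20, 0.5, 0.4, 0.3
rng = random.Random(2024)
pairs = [simulate_counts(n, p_x, p_y, p_b, rng) for _ in range(20000)]

estimated = sample_covariance(pairs)
theoretical = n * (p_b - p_x * p_y)
print(estimated, theoretical)
```

The estimate fluctuates from run to run with a different seed, so only approximate agreement with n(p_B − p_X p_Y) should be expected.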

Algebraic derivations of mean and variance

We derive these quantities from first principles. Certain particular sums occur in these two derivations. We rearrange the sums and terms so that sums solely over complete binomial probability mass functions (pmf) arise, which are always unity

 \sum_{k=0}^n \operatorname{Pr}(X=k) = \sum_{k=0}^n {n\choose k}p^k(1-p)^{n-k} = 1.

We apply the definition of the expected value of a discrete random variable to the binomial distribution

\operatorname{E}(X) = \sum_k x_k \cdot \operatorname{Pr}(x_k) = \sum_{k=0}^n k \cdot \operatorname{Pr}(X=k)

= \sum_{k=0}^n k \cdot {n\choose k}p^k(1-p)^{n-k}.

The first term of the series (with index k = 0) has value 0 since the first factor, k, is zero. It may thus be discarded, i.e. we can change the lower limit to: k = 1

\operatorname{E}(X) = \sum_{k=1}^n k \cdot \frac{n!}{k!(n-k)!} p^k(1-p)^{n-k}

=  \sum_{k=1}^n k \cdot \frac{n\cdot(n-1)!}{k\cdot(k-1)!(n-k)!} \cdot p \cdot p^{k-1}(1-p)^{n-k}.

We've pulled factors of n and k out of the factorials, and one power of p has been split off. We are preparing to redefine the indices.

\operatorname{E}(X) = np \cdot \sum_{k=1}^n \frac{(n-1)!}{(k-1)!(n-k)!} p^{k-1}(1-p)^{n-k}

We rename m = n − 1 and s = k − 1. The value of the sum is not changed by this, but it now becomes readily recognizable

\operatorname{E}(X) = np \cdot \sum_{s=0}^m \frac{(m)!}{s!(m-s)!} p^s(1-p)^{m-s}

= np \cdot \sum_{s=0}^m {m\choose s} p^s(1-p)^{m-s}.

The ensuing sum is a sum over a complete binomial pmf (of one order lower than the initial sum, as it happens). Thus

\operatorname{E}(X) = np \cdot 1 = np.

[3]

Variance

It can be shown that the variance is equal to (see: Computational formula for the variance):

\operatorname{Var}(X) = \operatorname{E}(X^2) - (\operatorname{E}(X))^2.

In using this formula we see that we now also need the expected value of X^2:

\operatorname{E}(X^2) = \sum_{k=0}^n k^2 \cdot \operatorname{Pr}(X=k)

= \sum_{k=0}^n k^2 \cdot {n\choose k}p^k(1-p)^{n-k}.

We can use our experience gained above in deriving the mean. We know how to process one factor of k. This gets us as far as

\operatorname{E}(X^2) = np \cdot \sum_{s=0}^m k \cdot {m\choose s} p^s(1-p)^{m-s}
= np \cdot \sum_{s=0}^m (s+1) \cdot {m\choose s} p^s(1-p)^{m-s}

(again, with m = n − 1 and s = k − 1). We split the sum into two separate sums and we recognize each one

\operatorname{E}(X^2) = np \cdot \bigg( \sum_{s=0}^m s \cdot {m\choose s} p^s(1-p)^{m-s} + \sum_{s=0}^m 1 \cdot {m\choose s} p^s(1-p)^{m-s} \bigg).

The first sum is identical in form to the one we calculated in the Mean (above). It sums to mp. The second sum is unity.

\operatorname{E}(X^2) = np \cdot ( mp + 1) = np((n-1)p + 1) = np(np - p + 1).

Using this result in the expression for the variance, along with the Mean (E(X) = np), we get

\operatorname{Var}(X) = \operatorname{E}(X^2) - (\operatorname{E}(X))^2 = np(np - p + 1) - (np)^2 = np(1-p).

Using falling factorials to find E(X^2)

We have

\operatorname{E}(X^2) = \sum_{k=0}^n k^2 \cdot \operatorname{Pr}(X=k)
= \sum_{k=0}^n k^2 \cdot {n\choose k}p^k(1-p)^{n-k}.

But

k^2= k(k - 1) + k.\,

So


\begin{align}
\operatorname{E}(X^2) & = \sum_{k=0}^n (k(k - 1)+ k) \cdot {n\choose k}p^k(1-p)^{n-k} \\
& = \sum_{k=0}^n k ( k - 1 ) {n\choose k}p^k(1-p)^{n-k} + \sum_{k=0}^n k {n\choose k}p^k(1-p)^{n-k} \\
& = \sum_{k=2}^n k ( k - 1 ) {n\choose k}p^k(1-p)^{n-k} + \sum_{k=1}^n k {n\choose k}p^k(1-p)^{n-k} \\
& = \sum_{k=2}^n n ( n - 1 ) {n -2\choose k - 2}p^k(1-p)^{n-k} + \sum_{k=1}^n n {n - 1 \choose k - 1} p^k (1-p)^{n-k} \\
& = \sum_{k=0}^{n-2} n ( n - 1 ) {n -2\choose k}p^{k+2}(1-p)^{(n-2)-k} + \sum_{k=0}^{n-1} n {n - 1 \choose k} p^{k+1} (1-p)^{(n-1)-k} \\
& = n(n-1)p^2 \underbrace{\sum_{k=0}^{n-2} {n - 2 \choose k} p^k (1 - p)^{(n-2)-k}}_{= 1} + np \underbrace{ \sum_{k=0}^{n-1} {n - 1 \choose k} p^k (1-p)^{(n-1)-k}}_{=1} \\
& = n(n-1)p^2  + np \\
& = n^2p^2 - np^2 + np.
\end{align}

Thus

\operatorname{Var}(X) = \operatorname{E}(X^2) - (\operatorname{E}(X))^2
= (n^2p^2 - np^2 + np) - n^2p^2 = np(1 - p).
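The intermediate results of this derivation can be confirmed numerically. The following sketch computes E[X(X − 1)], E[X^2] and the variance from the probability mass function for one arbitrary choice of n and p; the helper expectation is our own.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def expectation(g, n, p):
    """E[g(X)] for X ~ B(n, p), by summing over the support."""
    return sum(g(k) * binom_pmf(k, n, p) for k in range(n + 1))


n, p = 12, 0.25

second_factorial_moment = expectation(lambda k: k * (k - 1), n, p)  # E[X(X - 1)]
second_moment = expectation(lambda k: k * k, n, p)                  # E[X^2]
mean = expectation(lambda k: k, n, p)                               # E[X]
variance = second_moment - mean**2

print(second_factorial_moment, n * (n - 1) * p**2)
print(second_moment, n * (n - 1) * p**2 + n * p)
print(variance, n * p * (1 - p))
```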

Relationship to other distributions

Sums of binomials

If X ~ B(n, p) and Y ~ B(m, p) are independent binomial variables, then X + Y is again a binomial variable; its distribution is

X+Y \sim B(n+m, p).\,
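Since the distribution of a sum of independent variables is the convolution of their distributions, this result can be checked by convolving two binomial probability mass functions. The sketch below does so for one illustrative choice of n, m and p.

```python
from math import comb


def binom_pmf_list(n, p):
    """[P(X=0), ..., P(X=n)] for X ~ B(n, p)."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def convolve(a, b):
    """Distribution of the sum of two independent variables with pmfs a and b."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


n, m, p = 4, 6, 0.3
sum_pmf = convolve(binom_pmf_list(n, p), binom_pmf_list(m, p))
direct_pmf = binom_pmf_list(n + m, p)

largest_difference = max(abs(s - d) for s, d in zip(sum_pmf, direct_pmf))
print(largest_difference)
```

Note that the identity relies on both variables sharing the same p; convolving B(n, p) with B(m, p') for p' ≠ p does not in general give a binomial distribution.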

Bernoulli distribution

The Bernoulli distribution is a special case of the binomial distribution, where n = 1. Symbolically, X ~ B(1, p) has the same meaning as X ~ Bern(p).

Normal approximation

Binomial PDF and normal approximation for n = 6 and p = 0.5

If n is large enough, then the skew of the distribution is not too great. In this case, if a suitable continuity correction is used, then an excellent approximation to B(n, p) is given by the normal distribution

 \operatorname{N}(np, np(1-p)).\,\!

The approximation generally improves as n increases and is better when p is not near to 0 or 1.[4] Various rules of thumb may be used to decide whether n is large enough, and p is far enough from the extremes of zero or unity:

  • One rule is that both np and n(1 − p) must be greater than 5. However, the specific number varies from source to source, and depends on how good an approximation one wants; some sources give 10.
  • Another rule of thumb is that for n > 5 the normal approximation is adequate if[4]
\left|\frac{1/\sqrt{n}}{\sqrt{(1-p)/p}-\sqrt{p/(1-p)}}\right|<0.3
  • Another commonly used rule holds that the above normal approximation is appropriate only if everything within 3 standard deviations of its mean is within the range of possible values, that is if
\mu \pm 3 \sigma = np \pm 3 \sqrt{np(1-p)} \in [0,n]. \,

The following is an example of applying a continuity correction: Suppose one wishes to calculate Pr(X ≤ 8) for a binomial random variable X. If Y has a distribution given by the normal approximation, then Pr(X ≤ 8) is approximated by Pr(Y ≤ 8.5). The addition of 0.5 is the continuity correction; the uncorrected normal approximation gives considerably less accurate results.
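The effect of the continuity correction can be seen in a small computation. The sketch below takes n = 20 and p = 0.5 (values chosen here for illustration; the text does not specify them) and compares Pr(X ≤ 8) with the corrected and uncorrected normal approximations, using math.erf to evaluate the standard normal cumulative distribution function.

```python
from math import comb, erf, sqrt


def binom_cdf(k, n, p):
    """Exact Pr(X <= k) for X ~ B(n, p) and integer k."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def std_normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def normal_approx_cdf(k, n, p, continuity=True):
    """Approximate Pr(X <= k) using N(np, np(1 - p))."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    x = k + 0.5 if continuity else k  # the added 0.5 is the continuity correction
    return std_normal_cdf((x - mu) / sigma)


n, p, k = 20, 0.5, 8
exact = binom_cdf(k, n, p)
corrected = normal_approx_cdf(k, n, p, continuity=True)
uncorrected = normal_approx_cdf(k, n, p, continuity=False)
print(f"exact {exact:.4f}, corrected {corrected:.4f}, uncorrected {uncorrected:.4f}")
```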

This approximation is a huge time-saver when calculating by hand (exact calculations with large n are very onerous); historically, it was the first use of the normal distribution, derived by Abraham de Moivre in 1733 and included in the 1738 edition of his book The Doctrine of Chances. Nowadays, it can be seen as a consequence of the central limit theorem, since B(n, p) is a sum of n independent, identically distributed Bernoulli variables with parameter p.

For example, suppose you randomly sample n people out of a large population and ask them whether they agree with a certain statement. The proportion of people who agree will of course depend on the sample. If you sampled groups of n people repeatedly and truly randomly, the proportions would follow an approximate normal distribution with mean equal to the true proportion p of agreement in the population and with standard deviation σ = (p(1 − p)/n)^{1/2}. Large sample sizes n are good because the standard deviation, as a proportion of the expected value, gets smaller, which allows a more precise estimate of the unknown parameter p.

Poisson approximation

The binomial distribution converges towards the Poisson distribution as the number of trials goes to infinity while the product np remains fixed. Therefore the Poisson distribution with parameter λ = np can be used as an approximation to the binomial distribution B(n, p) if n is sufficiently large and p is sufficiently small. According to two rules of thumb, this approximation is good if n ≥ 20 and p ≤ 0.05, or if n ≥ 100 and np ≤ 10.[5]
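A quick numerical comparison illustrates how the approximation improves as n grows with np held fixed. The sketch below uses λ = 3 with two illustrative pairs (n, p); the function names are our own.

```python
from math import comb, exp, factorial


def binom_pmf(k, n, p):
    """Probability of exactly k successes for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """Poisson probability of k events with expected value lam."""
    return exp(-lam) * lam**k / factorial(k)


def max_gap(n, p):
    """Largest pointwise difference between B(n, p) and Poisson(np)."""
    lam = n * p
    return max(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(n + 1))


gap_few_trials = max_gap(20, 0.15)    # np = 3, p fairly large
gap_many_trials = max_gap(100, 0.03)  # np = 3, p small
print(gap_few_trials, gap_many_trials)
```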

Limits of binomial distributions

  • As n approaches ∞ and p approaches 0 while np remains fixed at λ > 0 or at least np approaches λ > 0, then the Binomial(n, p) distribution approaches the Poisson distribution with expected value λ.
  • As n approaches ∞ while p remains fixed, the distribution of
{X-np \over \sqrt{np(1-p)\ }}
approaches the normal distribution with expected value 0 and variance 1 (this is just a specific case of the Central Limit Theorem).


References

  1. ^ Hamza, K. (1995). "The smallest uniform upper bound on the distance between the mean and the median of the binomial and Poisson distributions". Statistics & Probability Letters 23: 21–25. doi:10.1016/0167-7152(94)00090-U.
  2. ^ Matousek, J, Vondrak, J: The Probabilistic Method (lecture notes).
  3. ^ Morse, Philip (1969). Thermal Physics. New York: W. A. Benjamin. ISBN 0805372024. 
  4. ^ a b Box, Hunter and Hunter. Statistics for experimenters. Wiley. p. 53. 
  5. ^ NIST/SEMATECH, '6.3.3.1. Counts Control Charts', e-Handbook of Statistical Methods, <http://www.itl.nist.gov/div898/handbook/pmc/section3/pmc331.htm> [accessed 25 October 2006]


Copyrights:

Dictionary. The American Heritage® Dictionary of the English Language, Fourth Edition Copyright © 2007, 2000 by Houghton Mifflin Company. Updated in 2009. Published by Houghton Mifflin Company. All rights reserved.
Statistics Dictionary. A Dictionary of Statistics. Second edition revised. Copyright © Oxford University Press, 2008. All rights reserved.
Investment Dictionary. Copyright ©2000, Investopedia.com - Owned and Operated by Investopedia Inc. All rights reserved.
Encyclopedia of Public Health. Copyright © 2002 by The Gale Group, Inc. All rights reserved.
Geography Dictionary. A Dictionary of Geography. Copyright © Susan Mayhew 1992, 1997, 2004. All rights reserved.
Political Dictionary. The Concise Oxford Dictionary of Politics. Copyright © 1996, 2003 by Oxford University Press. All rights reserved.
Wikipedia. This article is licensed under the Creative Commons Attribution/Share-Alike License. It uses material from the Wikipedia article "Binomial distribution".