In probability theory, the probability-generating function of a discrete random variable is a power series representation (the generating function) of the probability mass function of the random variable. Probability-generating functions are often employed for their succinct description of the sequence of probabilities Pr(X = i), and to make available the well-developed theory of power series with non-negative coefficients.
Definition
If X is a discrete random variable taking values in the non-negative integers {0, 1, ...}, then the probability-generating function of X is defined as

$$G(z) = \operatorname{E}\left(z^X\right) = \sum_{x=0}^{\infty} p_X(x)\, z^x,$$

where $p_X$ is the probability mass function of X. The notation $G_X$ is sometimes used instead of G to emphasize the dependence on X.
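As a small illustration of the definition, the following Python sketch evaluates G(z) for a finitely supported distribution. The helper name `pgf` and the choice of a fair die as the example are assumptions made here for illustration, not part of the definition.

```python
from fractions import Fraction

def pgf(pmf):
    """Return the function G(z) = sum_k pmf[k] * z**k.

    `pmf` is a list whose entry k holds Pr(X = k); the support is
    assumed to be finite so that G is a polynomial.
    """
    def G(z):
        total = 0
        for p in reversed(pmf):  # Horner evaluation of the polynomial
            total = total * z + p
        return total
    return G

# Hypothetical example: a fair die, so Pr(X = k) = 1/6 for k = 1..6.
die = [Fraction(0)] + [Fraction(1, 6)] * 6
G_die = pgf(die)

value_at_one = G_die(1)                 # the probabilities sum to one
value_at_half = G_die(Fraction(1, 2))   # (1/6) * (1/2 + 1/4 + ... + 1/64)
```

Exact rational arithmetic is used so that identities such as G(1) = 1 hold without rounding error.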
Properties
Power series
Probability-generating functions obey all the rules of power series with non-negative coefficients. In particular, $G(1^-) = 1$, where $G(1^-) = \lim_{z \to 1^-} G(z)$ denotes the limit from below, since the probabilities must sum to one. So the radius of convergence of any probability-generating function must be at least 1, by Abel's theorem for power series with non-negative coefficients.
Probabilities and expectations
The following properties allow the derivation of various basic quantities related to X:
1. The probability mass function of X is recovered by taking derivatives of G:

$$p_X(k) = \Pr(X = k) = \frac{G^{(k)}(0)}{k!}.$$

2. It follows from Property 1 that if two random variables X and Y satisfy $G_X = G_Y$, then $p_X = p_Y$. That is, if X and Y have identical probability-generating functions, then they are identically distributed.
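3. The normalization of the probability mass function can be expressed in terms of the generating function by

$$\operatorname{E}(1) = G(1^-) = \sum_{i=0}^{\infty} p_X(i) = 1.$$

The expectation of X is given by

$$\operatorname{E}(X) = G'(1^-).$$

More generally, the kth factorial moment, E(X(X − 1) ... (X − k + 1)), of X is given by

$$\operatorname{E}\left(\frac{X!}{(X-k)!}\right) = G^{(k)}(1^-), \quad k \ge 0.$$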
So the variance of X is given by

$$\operatorname{Var}(X) = G''(1^-) + G'(1^-) - \left[G'(1^-)\right]^2.$$

4. $G_X(e^t) = M_X(t)$, where X is a random variable, $G_X$ is its probability-generating function and $M_X$ is its moment-generating function.
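The properties above can be checked on a concrete polynomial PGF. The sketch below is a minimal illustration, assuming a binomial distribution with hypothetical parameters n = 4 and p = 1/3; the helper names `derivative`, `evaluate` and `kth_derivative_at_zero` are introduced here only for this purpose.

```python
from fractions import Fraction
from math import comb, factorial

def derivative(coeffs):
    """Coefficient list of G'(z), given the coefficient list of G(z)."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, z):
    """Evaluate the polynomial with the given coefficients at z."""
    return sum(c * z**k for k, c in enumerate(coeffs))

def kth_derivative_at_zero(coeffs, k):
    """Return G^(k)(0); this is zero once k exceeds the degree."""
    for _ in range(k):
        coeffs = derivative(coeffs)
    return coeffs[0] if coeffs else 0

# Assumed example: binomial pmf with n = 4 trials and success probability 1/3.
n, p = 4, Fraction(1, 3)
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

d1 = derivative(pmf)
d2 = derivative(d1)
mean = evaluate(d1, 1)                        # E(X) = G'(1)
variance = evaluate(d2, 1) + mean - mean**2   # Var(X) = G''(1) + G'(1) - G'(1)^2

# Property 1: p(k) = G^(k)(0) / k!
recovered = [kth_derivative_at_zero(pmf, k) / factorial(k) for k in range(n + 1)]
```

For a polynomial PGF the one-sided limits at 1 are ordinary evaluations, which is why `evaluate(..., 1)` suffices here.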
Functions of independent random variables
Probability-generating functions are particularly useful for dealing with functions of independent random variables. For example:
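- If $X_1, X_2, \ldots, X_n$ is a sequence of independent (and not necessarily identically distributed) random variables, and

  $$S_n = \sum_{i=1}^n a_i X_i,$$

  where the $a_i$ are constants, then the probability-generating function is given by

  $$G_{S_n}(z) = \operatorname{E}\left(z^{S_n}\right) = \operatorname{E}\left(z^{\sum_{i=1}^n a_i X_i}\right) = G_{X_1}(z^{a_1})\, G_{X_2}(z^{a_2}) \cdots G_{X_n}(z^{a_n}).$$

  For example, if

  $$S_n = \sum_{i=1}^n X_i,$$

  then the probability-generating function, $G_{S_n}(z)$, is given by

  $$G_{S_n}(z) = G_{X_1}(z)\, G_{X_2}(z) \cdots G_{X_n}(z).$$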
- It also follows that the probability-generating function of the difference of two independent random variables $S = X_1 - X_2$ is

  $$G_S(z) = G_{X_1}(z)\, G_{X_2}(1/z).$$
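- Suppose that N is also an independent, discrete random variable taking values in the non-negative integers, with probability-generating function $G_N$. If the $X_1, X_2, \ldots, X_N$ are independent and identically distributed with common probability-generating function $G_X$, then the random sum $S_N = \sum_{i=1}^N X_i$ satisfies

  $$G_{S_N}(z) = G_N(G_X(z)).$$

  This can be seen, using the law of total expectation, as follows:

  $$G_{S_N}(z) = \operatorname{E}\left(z^{S_N}\right) = \operatorname{E}\left(\operatorname{E}\left(z^{\sum_{i=1}^N X_i} \,\Big|\, N\right)\right) = \operatorname{E}\left((G_X(z))^N\right) = G_N(G_X(z)).$$

  This last fact is useful in the study of Galton–Watson processes.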
- Suppose again that N is also an independent, discrete random variable taking values in the non-negative integers, with probability-generating function $G_N$. If the $X_1, X_2, \ldots, X_N$ are independent, but not identically distributed random variables, where $G_{X_i}$ denotes the probability-generating function of $X_i$, then

  $$G_{S_N}(z) = \sum_{i \ge 0} \Pr(N = i) \prod_{k=1}^{i} G_{X_k}(z),$$

  where the empty product for i = 0 equals 1.
  For identically distributed $X_i$ this simplifies to the identity stated before. The general case is sometimes useful to obtain a decomposition of $S_N$ by means of generating functions.
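The product rule for sums and the composition rule for random sums can be tried out on coefficient lists. This is a minimal sketch, assuming finitely supported distributions; the helpers `poly_mul` and `compose`, and the particular choice of a fair coin and of N uniform on {0, 1, 2}, are illustrative assumptions.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Coefficients of the product of two polynomials (a convolution)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def compose(outer, inner):
    """Coefficients of outer(inner(z)), computed by Horner's scheme."""
    result = [Fraction(0)]
    for c in reversed(outer):
        result = poly_mul(result, inner)
        result[0] += c
    while len(result) > 1 and result[-1] == 0:  # drop trailing zeros
        result.pop()
    return result

coin = [Fraction(1, 2), Fraction(1, 2)]   # PGF coefficients of a Bernoulli(1/2) variable
two_coins = poly_mul(coin, coin)          # PGF of X1 + X2 for independent X1, X2

N = [Fraction(1, 3)] * 3                  # assumed: N uniform on {0, 1, 2}
random_sum = compose(N, coin)             # PGF of X1 + ... + XN, i.e. G_N(G_X(z))
```

Here `two_coins` holds the distribution of the number of heads in two flips, and `random_sum` the distribution of the number of heads when the number of flips is itself random.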
Examples
- The probability-generating function of a binomial random variable, the number of successes in n trials, with probability p of success in each trial, is

  $$G(z) = \left[(1-p) + pz\right]^n.$$

  Note that this is the n-fold product of the probability-generating function of a Bernoulli random variable with parameter p.
- The probability-generating function of a negative binomial random variable, the number of failures occurring before the rth success with probability of success in each trial p, is

  $$G(z) = \left(\frac{p}{1 - (1-p)z}\right)^r, \quad |z| < \frac{1}{1-p}.$$

  Note that this is the r-fold product of the probability-generating function of a geometric random variable.
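A quick numerical cross-check of closed forms like these is to compare them with a direct (truncated) summation of the defining series. The sketch below assumes the standard closed forms $[(1-p) + pz]^n$ for the binomial case and $(p / (1 - (1-p)z))^r$ for the negative binomial case; the function names and the truncation at 200 terms are choices made here for illustration.

```python
from math import comb, isclose

def binomial_pgf(z, n, p):
    return ((1 - p) + p * z) ** n

def binomial_pgf_series(z, n, p):
    # Direct sum of Pr(X = k) z^k with Pr(X = k) = C(n, k) p^k (1 - p)^(n - k)
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) * z**k for k in range(n + 1))

def negbin_pgf(z, r, p):
    return (p / (1 - (1 - p) * z)) ** r

def negbin_pgf_series(z, r, p, terms=200):
    # Pr(X = k) = C(k + r - 1, k) p^r (1 - p)^k: k failures before the r-th success.
    # The infinite series is truncated, so this is only an approximation.
    return sum(comb(k + r - 1, k) * p**r * (1 - p)**k * z**k for k in range(terms))

binomial_matches = isclose(binomial_pgf(0.3, 5, 0.25), binomial_pgf_series(0.3, 5, 0.25))
negbin_matches = isclose(negbin_pgf(0.5, 3, 0.4), negbin_pgf_series(0.5, 3, 0.4))
```

The truncation is harmless for |z| well inside the radius of convergence 1/(1 − p), since the neglected tail decays geometrically.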
Example calculation: two simple univariate probability generating functions
This example illustrates basic computations with PGFs. We analyse the expected gain of the player in two probabilistic games. The first game works like this: the player rolls a die n times, obtaining a sequence of values such as 126346656665. For each maximal run of consecutive sixes of length k, the player wins $6^{k+2}$ Euros. We ask about the expected amount won after n rolls of the die. For example, the sequence given wins $6^3 + 6^4 + 6^5 = 9288$ Euros. In the second game, the player starts with a capital of zero Euros and starts flipping coins. If the coin comes up heads, the capital increases by one Euro; if it comes up tails, the capital is halved. We ask about the expected capital after n coin flips. We will use one to denote heads and zero to denote tails. For example, the sequence 110110 gives 3/2.
Analysis of the first game
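The random variable that we will use is $G_n$, the number of sixes at the end of the sequence. Let

$$p_n(u) = \sum_{k=0}^{n} p_{n,k}\, u^k$$

be the PGF where the term $p_{n,k}$ represents the probability that there is a run of exactly k sixes at the end of the total sequence of length n.
We immediately have $p_0(u) = 1$ and, for $n \ge 0$,

$$p_{n+1}(u) = \frac{1}{6}\, u\, p_n(u) + \frac{5}{6}\, p_n(1),$$

since a six (probability 1/6) extends the final run by one, while any other value (probability 5/6) resets it to zero.
Given the fact that $p_n(u)$ is a PGF, one has $p_n(1) = 1$ and one finds

$$p_{n+1}(u) = \frac{1}{6}\, u\, p_n(u) + \frac{5}{6}.$$

This lets us calculate the closed form of $p_n(u)$, although we will not need it in the calculation that follows. It is:

$$p_n(u) = \left(\frac{u}{6}\right)^n + \frac{5}{6} \sum_{j=0}^{n-1} \left(\frac{u}{6}\right)^j = \left(\frac{u}{6}\right)^n + \frac{5}{6-u}\left(1 - \left(\frac{u}{6}\right)^n\right) \quad (u \ne 6).$$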
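Using the definition of the expected gain, one has

$$E_n = \sum_{m=0}^{n-1} \frac{5}{6}\, 6^2 \left(p_m(6) - p_m(0)\right) + 6^2 \left(p_n(6) - p_n(0)\right).$$

This formula represents two cases. First, $\frac{5}{6}\, 6^2 (p_m(6) - p_m(0))$ represents the expected gain in going from roll m to the next. A run of k sixes ending at roll m pays $6^{k+2}$, so on average one receives a quantity of $\sum_{k \ge 1} p_{m,k}\, 6^{k+2} = 6^2 (p_m(6) - p_m(0))$ Euros with a probability of 5/6 (the next roll ends the run of sixes). Second, if the last roll was a 6, then one receives, on average, the sum $6^2 (p_n(6) - p_n(0))$ with probability one. To find $E_n$, we only need to sum these contributions.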
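The recurrence for $p_n(u)$ implies immediately that $p_n(0) = 5/6$ when $n \ge 1$. What's more, $p_{n+1}(6) = p_n(6) + 5/6$, which gives $p_n(6) = 1 + \frac{5}{6} n$.
Using these values, and noting that the m = 0 term vanishes because $p_0(6) = p_0(0) = 1$, one has for $n \ge 1$

$$E_n = 30 \sum_{m=1}^{n-1} \frac{5m + 1}{6} + 36 \cdot \frac{5n + 1}{6}.$$

We only need some additional simplifications to conclude.

$$E_n = \frac{25}{2} n^2 + \frac{45}{2} n + 1.$$

The first few values are 36, 96, 181, 291, 426.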
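The quoted values can be confirmed by brute force for small n, enumerating all $6^n$ roll sequences. The sketch below does this; the function names are hypothetical, and `closed_form` encodes the quadratic $(25n^2 + 45n + 2)/2$ obtained by carrying out the summation described above, which should be read as a derived assumption rather than a quoted result.

```python
from fractions import Fraction
from itertools import product

def payout(rolls):
    """Total winnings: each maximal run of k sixes pays 6**(k + 2)."""
    total, run = 0, 0
    for r in rolls:
        if r == 6:
            run += 1
        else:
            if run:
                total += 6 ** (run + 2)
            run = 0
    if run:  # a run that reaches the end of the sequence also pays
        total += 6 ** (run + 2)
    return total

def expected_gain(n):
    """Exact expectation, averaging over all 6**n equally likely sequences."""
    outcomes = product(range(1, 7), repeat=n)
    return Fraction(sum(payout(o) for o in outcomes), 6 ** n)

def closed_form(n):
    return Fraction(25 * n * n + 45 * n + 2, 2)

example = payout((1, 2, 6, 3, 4, 6, 6, 5, 6, 6, 6, 5))  # the sequence from the text
first_values = [expected_gain(n) for n in range(1, 6)]
```

Enumeration grows as $6^n$, so this is only practical as a sanity check for the first few values.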
Analysis of the second game
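Let

$$p_n(u) = \sum_k p_{n,k}\, u^k$$

be the PGF where the term $p_{n,k}$ represents the probability of holding a capital of k Euros after n coin flips (the sum runs over the finitely many attainable values of k, which need not be integers). Thus one has $p_0(u) = 1$ and

$$p_{n+1}(u) = \frac{1}{2}\, u\, p_n(u) + \frac{1}{2}\, p_n\left(\sqrt{u}\right),$$

since heads turns $u^k$ into $u^{k+1}$ and tails turns $u^k$ into $u^{k/2}$.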
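We don't need an explicit form of $p_n(u)$, given that the value we are looking for is

$$E_n = \left. \frac{d}{du}\, p_n(u) \right|_{u=1} = p_n'(1).$$

We therefore continue with the derivative of the recurrence:

$$p_{n+1}'(u) = \frac{1}{2}\, p_n(u) + \frac{1}{2}\, u\, p_n'(u) + \frac{1}{4\sqrt{u}}\, p_n'\left(\sqrt{u}\right).$$

From the definition of $p_n(u)$ one finds $p_n(1) = 1$, which gives

$$p_{n+1}'(1) = \frac{1}{2} + \frac{3}{4}\, p_n'(1).$$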
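We conclude that the recurrence for $E_n$ is

$$E_{n+1} = \frac{1}{2} + \frac{3}{4}\, E_n, \quad E_0 = 0,$$

with solution $E_n = 2 - 2\,(3/4)^n$.
The first few values are 0, 1/2, 7/8, 37/32, 175/128.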
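The expectation in the second game can be checked by enumerating all $2^n$ flip sequences. The sketch below compares that average with the recurrence $E_{n+1} = 1/2 + (3/4) E_n$, $E_0 = 0$, which follows from the rules of the game (heads adds one, tails halves), and with its solution $2 - 2 (3/4)^n$; all function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def final_capital(flips):
    """Capital after a sequence of flips: 1 = heads (+1 Euro), 0 = tails (halve)."""
    capital = Fraction(0)
    for flip in flips:
        capital = capital + 1 if flip == 1 else capital / 2
    return capital

def expected_capital(n):
    """Exact average over all 2**n equally likely flip sequences."""
    outcomes = [final_capital(f) for f in product((0, 1), repeat=n)]
    return sum(outcomes) / len(outcomes)

def expected_by_recurrence(n):
    e = Fraction(0)
    for _ in range(n):
        e = Fraction(1, 2) + Fraction(3, 4) * e
    return e

example = final_capital((1, 1, 0, 1, 1, 0))          # the sequence 110110 from the text
first_values = [expected_capital(n) for n in range(5)]
```

Both routes use exact fractions, so agreement is an exact equality rather than a floating-point approximation.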
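It is important to note that we might have calculated the recurrence $E_{n+1} = \frac{1}{2} + \frac{3}{4} E_n$ directly, starting from

$$E_{n+1} = \frac{1}{2}\,(E_n + 1) + \frac{1}{2} \cdot \frac{E_n}{2}.$$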
Where the PGF pays off is the variance, because it provides all factorial moments by repeated differentiation. This calculation is simple and quite similar to the calculation of the expectation.
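We wish to find $p_n''(1)$, and indeed the cases $p_n'(1)$ and $p_n''(1)$ are alike.
We calculate the derivative of the last recurrence from above (the one for $p_{n+1}'(u)$) and find

$$p_{n+1}''(u) = p_n'(u) + \frac{1}{2}\, u\, p_n''(u) - \frac{1}{8 u^{3/2}}\, p_n'\left(\sqrt{u}\right) + \frac{1}{8u}\, p_n''\left(\sqrt{u}\right),$$

which immediately yields

$$p_{n+1}''(1) = \frac{7}{8}\, p_n'(1) + \frac{5}{8}\, p_n''(1).$$

This recurrence is easy to solve, be it manually or with a CAS. With MAPLE, one finally finds

$$p_n''(1) = \frac{14}{3} - 14 \left(\frac{3}{4}\right)^n + \frac{28}{3} \left(\frac{5}{8}\right)^n,$$

and hence for the variance $p_n''(1) + p_n'(1) - p_n'(1)^2$ the value

$$\frac{8}{3} - 8 \left(\frac{3}{4}\right)^n + \frac{28}{3} \left(\frac{5}{8}\right)^n - 4 \left(\frac{9}{16}\right)^n.$$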
It is an open problem to prove that the number of different winnings (possible values for the capital) after n coin flips is $f_{n+3} - 1$, with $f_n$ the Fibonacci numbers.
These are the first few sets of possible values:
- {0}, {0, 1}, {0, 1, 2, 1/2}, {0, 1, 2, 3, 1/2, 1/4, 3/2}, {0, 1, 2, 3, 4, 1/2, 3/2, 5/2, 1/4, 3/4, 5/4, 1/8}

The sequence of their sizes is 1, 2, 4, 7, 12, which is indeed $f_{n+3} - 1$ for n = 0, ..., 4.
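Both the variance and the count of attainable capitals can be explored by propagating the exact distribution of the capital flip by flip. The following sketch does so; the function names are hypothetical, and `variance_closed_form` encodes the expression $8/3 - 8(3/4)^n + (28/3)(5/8)^n - 4(9/16)^n$ obtained by solving the moment recurrences, to be read as a derived assumption. The Fibonacci numbers are taken with $f_1 = f_2 = 1$.

```python
from fractions import Fraction

def distribution(n):
    """Map each attainable capital after n flips to its exact probability."""
    dist = {Fraction(0): Fraction(1)}
    for _ in range(n):
        new = {}
        for capital, prob in dist.items():
            # heads: capital + 1, tails: capital / 2, each with probability 1/2
            for nxt in (capital + 1, capital / 2):
                new[nxt] = new.get(nxt, Fraction(0)) + prob / 2
        dist = new
    return dist

def variance(dist):
    mean = sum(c * p for c, p in dist.items())
    return sum(p * (c - mean) ** 2 for c, p in dist.items())

def variance_closed_form(n):
    return (Fraction(8, 3) - 8 * Fraction(3, 4) ** n
            + Fraction(28, 3) * Fraction(5, 8) ** n - 4 * Fraction(9, 16) ** n)

def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

sizes = [len(distribution(n)) for n in range(5)]   # number of distinct capitals
```

Such a computation only confirms the Fibonacci pattern for the values of n that are actually enumerated; it says nothing about a general proof.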
There is a large amount of additional material at les-mathematiques.net, where these two problems originated (consult the external links for more information).
Example calculation: use of bivariate generating functions
The following example illustrates a very common technique in the manipulation of PGFs: the use of bivariate super generating functions to compute the ordinary generating function (OGF) of the PGFs of a sequence of random variables.
Suppose you sample a system that can assume two states, X and Y, X with probability p and Y with probability 1 − p, e.g. a coin being flipped, obtaining the sequence of samples

$$S_1, S_2, S_3, \ldots, S_n,$$

where the system was sampled n times and has no memory.
Define the random variable $M_n$ to be the number of changes from one sample to the next in a sequence of n samples, i.e. how often $S_m$ was different from $S_{m-1}$. For example, the sequence

XXYYX

has two changes, as does

YXXXY.
We want to calculate the PGF of Mn, which we will do by using bivariate generating functions.
We introduce the bivariate GF G(z,u) given by
$$G(z, u) = \sum_{n\ge 1} E\left[u^{M_n}\right] z^n,$$
i.e. G(z,u) is the ordinary generating function of the PGFs of the Mn. This step is completely general and indeed the core of the method.
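Now let $x_{n,k}$ be the probability of having k changes in a sequence of n samples, where the last sample was an X. Similarly, let $y_{n,k}$ be the probability of having k changes in a sequence of n samples, where the last sample was a Y, and put

$$X(z, u) = \sum_{n\ge 1} \sum_{k\ge 0} x_{n,k}\, u^k z^n \quad\text{and}\quad Y(z, u) = \sum_{n\ge 1} \sum_{k\ge 0} y_{n,k}\, u^k z^n,$$

so that

$$G(z, u) = X(z, u) + Y(z, u).$$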
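Now we clearly have

$$x_{n,0} = p^n \quad\text{and}\quad y_{n,0} = (1-p)^n$$

because having zero changes means getting a sequence of all Xs or all Ys.
For $k \ge 1$ and $n \ge 2$ we find

$$x_{n,k} = p\, y_{n-1,k-1} + p\, x_{n-1,k} \quad\text{and}\quad y_{n,k} = (1-p)\, x_{n-1,k-1} + (1-p)\, y_{n-1,k}$$

(with $x_{1,k} = y_{1,k} = 0$ for $k \ge 1$), because e.g. to have k changes in a sequence of length n that ends in X, we either append an X to a sequence having k − 1 changes and ending in Y, or append an X to a sequence having k changes and ending in X.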
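Summing these equations over n and k and writing X for X(z, u) and Y for Y(z, u), we obtain

$$X = \frac{pz}{1-pz} + puz\,Y + pz\left(X - \frac{pz}{1-pz}\right) = pz + pz\,X + puz\,Y$$

and

$$Y = (1-p)z + (1-p)z\,Y + (1-p)uz\,X.$$

The solution of this system is

$$X = \frac{pz\,\bigl(1 + (1-p)(u-1)z\bigr)}{1 - z + p(1-p)(1-u^2)z^2}$$

and

$$Y = \frac{(1-p)z\,\bigl(1 + p(u-1)z\bigr)}{1 - z + p(1-p)(1-u^2)z^2}.$$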
We may now use the general identity

$$\sum_{n\ge 1} E\left[M_n (M_n-1) \cdots (M_n-r)\right] z^n = \left( \left(\frac{d}{du}\right)^{r+1} G(z, u) \right)_{u=1}$$

to calculate the factorial moments of $M_n$. E.g. the OGF of the expectations is given by

$$\sum_{n\ge 1} E[M_n]\, z^n = \left( \frac{d}{du} (X + Y) \right)_{u=1} = \frac{2p(1-p)\,z^2}{(1-z)^2},$$

from which we find (extracting coefficients) that

$$E[M_n] = 2\,p(1-p)\,(n-1).$$
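The result for $E[M_n]$ can be checked without solving for the bivariate generating function, by iterating the recurrences for the sequences ending in X and in Y on coefficient lists in u. The sketch below is one way to do that, with a brute-force enumeration as a second check; the function names and the test value p = 1/3 are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

def pgf_changes(n, p):
    """Coefficients (in u) of E[u^{M_n}].

    Uses x_n(u) = p (x_{n-1}(u) + u y_{n-1}(u)) and the mirrored
    relation for y_n(u), starting from x_1 = p, y_1 = 1 - p.
    """
    q = 1 - p
    x, y = [p], [q]                    # n = 1: a single sample, zero changes
    for _ in range(n - 1):
        ux, uy = [0] + x, [0] + y      # multiplying by u shifts the coefficients
        x, y = ([p * (a + b) for a, b in zip(x + [0], uy)],
                [q * (a + b) for a, b in zip(y + [0], ux)])
    return [a + b for a, b in zip(x, y)]

def mean_changes_brute_force(n, p):
    """E[M_n] by summing over all 2**n sequences of X and Y."""
    total = Fraction(0)
    for seq in product("XY", repeat=n):
        prob = Fraction(1)
        for s in seq:
            prob *= p if s == "X" else 1 - p
        total += prob * sum(a != b for a, b in zip(seq, seq[1:]))
    return total

p = Fraction(1, 3)
coeffs = pgf_changes(5, p)
mean = sum(k * c for k, c in enumerate(coeffs))   # derivative of the PGF at u = 1
```

Because the PGF of $M_n$ is a polynomial in u, its derivative at u = 1 is simply the weighted sum of the coefficients.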
An extensive discussion of this problem, as well as solutions by other methods, may be found on Les-Mathematiques.net (external links).
Example calculation: bivariate generating functions and differential equations
Consider the following balls and urns problem: suppose we have an urn containing n distinguishable balls, i.e. bearing labels from 1 to n. We pick one of the balls at random and remove it from the urn. We also remove all balls whose labels are larger than the one we picked from the urn. E.g. if we picked ball number one, the urn is emptied after one operation. We repeat until the urn is empty. E.g. for an urn containing ten balls, the sequence of picks 6-3-1 would empty the urn in three operations. We introduce the random variable Xn, which gives the number of picks needed to empty the urn. Our goal is to compute all of its moments, and we will do so using exactly the same bivariate generating function as in the previous example, namely the OGF of the PGFs:
$$P(z, u) = \sum_{n\ge 1} E\left[u^{X_n}\right] z^n.$$
We let $p_{n,k}$ be the probability of emptying an urn containing n balls with k operations, so that

$$P(z, u) = \sum_{n\ge 1} \sum_{k=1}^{n} p_{n,k}\, u^k z^n.$$

We find that

$$p_{n,1} = \frac{1}{n} \quad\text{and}\quad p_{n,k} = \frac{1}{n} \sum_{r=1}^{n-(k-1)} p_{n-r,\,k-1} \quad (k \ge 2),$$

because to empty the urn with one operation, we must pick the ball labelled 1. The remaining probabilities are computed recursively, e.g. we pick the ball with the largest label with probability 1/n, leaving n − 1 balls (this is r = 1). We pick the ball with the next-to-largest label with probability 1/n, leaving n − 2 balls (this is r = 2), etc. The upper bound for r is n − (k − 1), because we must have $n - r \ge k - 1$ (we cannot e.g. empty an urn containing six balls using seven operations).
Next we set $p_{n,k} = 0$ for n < k, so that we may replace the recursion by

$$n\, p_{n,k} = \sum_{r=1}^{n-1} p_{n-r,\,k-1}, \quad k \ge 2.$$
Using the coefficient-extraction operator for formal power series, we thus have

$$[z^{n-1} u^k] \frac{d}{dz} P(z, u) = [z^{n-1} u^{k-1}] \frac{1}{1-z} P(z, u) = [z^{n-1} u^k] \frac{u}{1-z} P(z, u), \quad k\ge 2.$$

We note furthermore that

$$[z^{n-1} u] \frac{d}{dz} P(z, u) = [z^{n-1}] \frac{d}{dz} \sum_{n\ge 1} \frac{1}{n} z^n = [z^{n-1}] \frac{d}{dz} \log \frac{1}{1-z} = [z^{n-1}] \frac{1}{1-z} = 1$$

and that

$$[z^{n-1} u] \frac{u}{1-z} P(z, u) = 0,$$

where the first equation results from our "boundary condition" that the probability of emptying an urn with one operation is 1/n.
Summing over $n \ge 1$ and $k \ge 1$ (there are two contributions to both sides of the equation, one for $k \ge 2$ and another one for k = 1), we obtain

$$\frac{d}{dz} P(z, u) = \frac{u}{1-z} + \frac{u}{1-z} P(z, u),$$

e.g. through

$$\sum_{n\ge 1} z^{n-1} u\, [z^{n-1} u] \frac{d}{dz} P(z, u) = \sum_{n\ge 1} z^{n-1} u = \frac{u}{1-z}.$$
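The solution to the differential equation is

$$P(z, u) = -1 + C(u) \exp\left(u \log \frac{1}{1-z}\right),$$

with C(u) a formal power series in u. We note

$$[z^0] P(z, u) = 0 = -1 + C(u),$$

which follows from the formal series

$$\exp\left(u \log \frac{1}{1-z}\right) = \sum_{m\ge 0} \frac{u^m}{m!} \left(\log \frac{1}{1-z}\right)^m,$$

whose constant term in z is 1. Hence C(u) is constant and equal to one, and we finally have

$$P(z, u) = -1 + \left(\frac{1}{1-z}\right)^u,$$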
which is, incidentally, the generating function of the Stirling numbers of the first kind.
The moments now follow from the factorial-moment identity given in the previous example. E.g. for the expectation, we find

$$\sum_{n\ge 1} E[X_n]\, z^n = \left. \frac{d}{du} P(z, u) \right|_{u=1} = \left. \left( \frac{1}{1-z} \right)^u \log \frac{1}{1-z} \right|_{u=1} = \frac{1}{1-z} \log \frac{1}{1-z},$$

which gives

$$E[X_n] = [z^n] \frac{1}{1-z} \log \frac{1}{1-z} = H_n,$$

the nth harmonic number, and we need about $\log n$ operations to empty the urn.
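The urn probabilities can also be tabulated directly from the recursion $p_{n,1} = 1/n$, $n\,p_{n,k} = \sum_{m=1}^{n-1} p_{m,k-1}$ and compared with the two consequences drawn above: the mean equals the harmonic number $H_n$, and $n!\,p_{n,k}$ equals the coefficient of $u^k$ in $u(u+1)\cdots(u+n-1)$, an unsigned Stirling number of the first kind. The following is a sketch with hypothetical function names.

```python
from fractions import Fraction
from math import factorial

def urn_probabilities(n_max):
    """p[n][k] = probability of emptying an urn of n balls in exactly k picks."""
    p = [[Fraction(0)] * (n_max + 1) for _ in range(n_max + 1)]
    for n in range(1, n_max + 1):
        p[n][1] = Fraction(1, n)   # the ball labelled 1 must be picked first
        for k in range(2, n + 1):
            p[n][k] = sum(p[m][k - 1] for m in range(1, n)) / n
    return p

def harmonic(n):
    return sum(Fraction(1, j) for j in range(1, n + 1))

def rising_factorial(n):
    """Coefficients of u (u + 1) ... (u + n - 1) as a polynomial in u."""
    poly = [1]
    for j in range(n):
        # multiply by (u + j): new[i] = j * poly[i] + poly[i - 1]
        poly = [j * a + b for a, b in zip(poly + [0], [0] + poly)]
    return poly

p = urn_probabilities(8)
expected_picks = [sum(k * p[n][k] for k in range(n + 1)) for n in range(1, 9)]
```

The table is built with exact fractions, so the comparison with $H_n$ and with the rising factorial is an exact equality for each n that is computed.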
An extensive discussion of this problem, as well as solutions by other methods, may be found on Les-Mathematiques.net (external links).
Related concepts
The probability-generating function is occasionally called the z-transform of the probability mass function. It is an example of a generating function of a sequence (see formal power series).
Other generating functions of random variables include the moment-generating function and the characteristic function.
External links
- Riedel, Marko, et al. Espérance, in French.
- Riedel, Marko, et al. Variables aléatoires, in French.
- Riedel, Marko, et al. Des dés, in French.
- Antonio González, José H. Nieto, Marko Riedel. Ganancia casi constante, in Spanish.
- Antonio González, José H. Nieto, Marko Riedel. Ganancia exponencial, in Spanish.