
Multivariate Polya distribution

 

The multivariate Pólya distribution, also called the Dirichlet compound multinomial distribution, is a compound probability distribution in which a probability vector p is first drawn from a Dirichlet distribution with parameter vector α, and a set of discrete samples x is then drawn from the multinomial distribution with probability vector p. The compounding corresponds to a Pólya urn scheme. In document classification, for example, the distribution is used to model word counts for different document types.
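The two-stage construction described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not a reference implementation: the function name `sample_dcm` and its parameters are hypothetical, and the Dirichlet draw is obtained by normalizing independent Gamma(α_k, 1) variates, which is a standard way to sample from a Dirichlet distribution.

```python
import random

def sample_dcm(alpha, n, rng=random):
    """Draw one count vector: p ~ Dirichlet(alpha), then n categorical draws from p.

    `sample_dcm`, `alpha`, `n` and `rng` are illustrative names, not from the article.
    """
    # Dirichlet draw: normalize independent Gamma(alpha_k, 1) variates.
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    p = [g / total for g in gammas]

    # Multinomial draw: n categorical samples with probabilities p, tallied per outcome.
    counts = [0] * len(alpha)
    for k in rng.choices(range(len(alpha)), weights=p, k=n):
        counts[k] += 1
    return counts
```

Calling `sample_dcm([0.5, 1.0, 2.0], 20)` returns a list of three non-negative counts that sum to 20. Small values of α tend to concentrate the counts on a few outcomes, which is the burstiness that makes the distribution attractive for word counts.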

The probability of a vector of counts x given the parameter vector α is obtained by integrating out the parameters p of the multinomial distribution:

\textrm{P}(\mathbf{x}\mid\mathbf{\alpha})=\int_{\mathbf{p}}\textrm{P}(\mathbf{x}\mid \mathbf{p})\textrm{P}(\mathbf{p}\mid\mathbf{\alpha})\textrm{d}\mathbf{p}

which results in the following explicit formula:

\textrm{P}(\mathbf{x}\mid\mathbf{\alpha})=\frac{n!}{\prod_{k}\left(n_{k}!\right)}\frac{\Gamma\left(\sum_{k}\alpha_{k}\right)}{\Gamma\left(n+\sum_{k}\alpha_{k}\right)}\prod_{k}\frac{\Gamma(n_{k}+\alpha_{k})}{\Gamma(\alpha_{k})}

where Γ is the gamma function, n_k is the number of times outcome k occurs in x, and

n=\sum_{k}n_{k}

is the total number of trials.
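The explicit formula is easiest to evaluate on the log scale, because the gamma functions overflow quickly for moderate counts. The following is a minimal sketch under that assumption; the function name `dcm_log_pmf` is hypothetical, and `math.lgamma` (the log-gamma function from the Python standard library) stands in for log Γ.

```python
from math import lgamma, exp

def dcm_log_pmf(counts, alpha):
    """Log of P(x | alpha), where counts[k] = n_k and alpha[k] = alpha_k.

    Illustrative helper; the name and signature are not from the article.
    """
    n = sum(counts)   # total number of trials
    a0 = sum(alpha)   # sum of the Dirichlet parameters

    # log of n! / prod_k n_k!   (using Gamma(m + 1) = m!)
    log_coef = lgamma(n + 1) - sum(lgamma(c + 1) for c in counts)
    # log of Gamma(sum_k alpha_k) / Gamma(n + sum_k alpha_k)
    log_norm = lgamma(a0) - lgamma(n + a0)
    # log of prod_k Gamma(n_k + alpha_k) / Gamma(alpha_k)
    log_prod = sum(lgamma(c + a) - lgamma(a) for c, a in zip(counts, alpha))

    return log_coef + log_norm + log_prod
```

As a sanity check, with α = (1, 1, 1) the Dirichlet distribution is uniform, so each of the 10 possible count vectors for n = 3 is equally likely, and `exp(dcm_log_pmf([2, 1, 0], [1.0, 1.0, 1.0]))` is approximately 0.1.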

The two-dimensional version of the multivariate Pólya distribution, with only two possible outcomes, is known as the beta-binomial distribution.
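The two-outcome special case can be checked numerically. The sketch below evaluates the explicit formula with counts (k, n − k) and parameters (a, b), and compares it with the usual beta-binomial probability C(n, k) B(k + a, n − k + b) / B(a, b), where B is the beta function. All function names here are hypothetical and chosen only for illustration.

```python
from math import lgamma, exp

def dcm_pmf(counts, alpha):
    """Multivariate Polya pmf from the explicit formula, computed via log-gamma."""
    n, a0 = sum(counts), sum(alpha)
    log_p = lgamma(n + 1) - sum(lgamma(c + 1) for c in counts)
    log_p += lgamma(a0) - lgamma(n + a0)
    log_p += sum(lgamma(c + a) - lgamma(a) for c, a in zip(counts, alpha))
    return exp(log_p)

def beta_binomial_pmf(k, n, a, b):
    """Beta-binomial pmf: C(n, k) * B(k + a, n - k + b) / B(a, b)."""
    def log_beta(u, v):
        return lgamma(u) + lgamma(v) - lgamma(u + v)
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))

def max_abs_gap(n, a, b):
    """Largest difference between the two pmfs over k = 0..n."""
    return max(abs(dcm_pmf([k, n - k], [a, b]) - beta_binomial_pmf(k, n, a, b))
               for k in range(n + 1))
```

For any valid inputs, for example `max_abs_gap(5, 0.7, 2.3)`, the gap should be at the level of floating-point rounding error, since the two expressions are algebraically identical.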

The multivariate Pólya distribution is used in automated document classification and clustering, genetics, economics, combat modeling, and quantitative marketing.



 
 

 

Copyrights:

Wikipedia. This article is licensed under the Creative Commons Attribution/Share-Alike License. It uses material from the Wikipedia article "Multivariate Polya distribution".