Zipf–Mandelbrot law

Zipf–Mandelbrot
parameters: N \in \{1,2,3,\ldots\} (integer)
q \in [0,\infty) (real)
s>0\, (real)
support: k \in \{1,2,\ldots,N\}
pmf: \frac{1/(k+q)^s}{H_{N,q,s}}
cdf: \frac{H_{k,q,s}}{H_{N,q,s}}
mean: \frac{H_{N,q,s-1}}{H_{N,q,s}}-q
mode: 1\,


In probability theory and statistics, the Zipf–Mandelbrot law is a discrete probability distribution. Also known as the Pareto-Zipf law, it is a power-law distribution on ranked data. It is named after the linguist George Kingsley Zipf, who suggested a simpler distribution called Zipf's law, and the mathematician Benoît Mandelbrot, who subsequently generalized it.

The probability mass function is given by:

f(k;N,q,s)=\frac{1/(k+q)^s}{H_{N,q,s}}

where H_{N,q,s} is given by:

H_{N,q,s}=\sum_{i=1}^N \frac{1}{(i+q)^s}

which may be thought of as a generalization of a harmonic number. In the limit as N approaches infinity (with s > 1, so that the sum converges), this becomes the Hurwitz zeta function ζ(s, q+1), using the convention ζ(s,a) = \sum_{n=0}^\infty 1/(n+a)^s. For finite N and q = 0 the Zipf–Mandelbrot law becomes Zipf's law. For infinite N and q = 0 it becomes a zeta distribution.
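
As a rough illustration of the definitions above, the following Python sketch evaluates the normalizing sum, the probability mass function, the cumulative distribution function and the mean for finite N. The function names and the parameter values in the demo are assumptions made for this example, not part of the article. The mean formula follows from writing k = (k+q) - q inside the sum \sum_k k/(k+q)^s, which gives H_{N,q,s-1} - q H_{N,q,s} before dividing by H_{N,q,s}.

```python
from itertools import accumulate


def generalized_harmonic(n, q, s):
    """H_{n,q,s} = sum_{i=1}^{n} 1 / (i + q)^s."""
    return sum((i + q) ** -s for i in range(1, n + 1))


def zipf_mandelbrot_pmf(N, q, s):
    """Return the list [f(1), ..., f(N)] of probabilities."""
    norm = generalized_harmonic(N, q, s)
    return [(k + q) ** -s / norm for k in range(1, N + 1)]


def zipf_mandelbrot_cdf(N, q, s):
    """Return the running sums of the pmf, i.e. H_{k,q,s} / H_{N,q,s}."""
    return list(accumulate(zipf_mandelbrot_pmf(N, q, s)))


def zipf_mandelbrot_mean(N, q, s):
    """Closed form for the mean: H_{N,q,s-1} / H_{N,q,s} - q."""
    return generalized_harmonic(N, q, s - 1) / generalized_harmonic(N, q, s) - q


if __name__ == "__main__":
    # Illustrative parameter values only; they do not come from the article.
    N, q, s = 1000, 2.7, 1.1
    pmf = zipf_mandelbrot_pmf(N, q, s)
    print("total probability:", sum(pmf))
    print("mean (direct sum):", sum(k * p for k, p in enumerate(pmf, start=1)))
    print("mean (closed form):", zipf_mandelbrot_mean(N, q, s))
```

Setting q = 0 in these functions gives Zipf's law for the same N and s. The direct summation used here is adequate for moderate N; for very large N a vectorized or asymptotic evaluation would likely be preferable.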

Applications

In a large text corpus, the frequencies of words ranked from most to least common generally follow a power-law distribution, known as Zipf's law.

If one plots the frequency rank of the words in a large text corpus against their number of occurrences (or their relative frequencies), one obtains a power-law distribution with exponent close to one (but see Gelbukh & Sidorov, 2001).
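
The rank-frequency relationship described above can be checked informally by fitting a straight line to log(frequency) against log(rank). The sketch below does this with an ordinary least-squares fit; both helper names are hypothetical, and the least-squares fit is a simple stand-in chosen for illustration, not a method taken from the article or from the cited work. Note that a plain log-log fit ignores the shift parameter q.

```python
import math
from collections import Counter


def rank_frequencies(tokens):
    """Occurrence counts sorted from most to least frequent (rank 1 first)."""
    return sorted(Counter(tokens).values(), reverse=True)


def loglog_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank).

    Needs at least two frequencies; a pure power law f ~ rank^(-s)
    gives a slope of -s.
    """
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(freq) for freq in frequencies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


if __name__ == "__main__":
    # Toy input, far too small to say anything about real corpora.
    text = "the cat and the dog and the bird"
    freqs = rank_frequencies(text.split())
    print(freqs)  # [3, 2, 1, 1, 1]
    print("estimated exponent:", -loglog_slope(freqs))
```

On a real corpus the tokens would come from a proper tokenizer, and the estimate is sensitive to how the low-frequency tail is treated, so the result should be read as a rough diagnostic only.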

In ecological field studies, the relative abundance distribution (i.e. the graph of the number of species observed as a function of their abundance) is often found to conform to a Zipf–Mandelbrot law.[1]

Notes

References

  • Mandelbrot, Benoît (1965). "Information Theory and Psycholinguistics". In B. B. Wolman and E. Nagel (eds.). Scientific Psychology. Basic Books. Reprinted as
    • Mandelbrot, Benoît (1968) [1965]. "Information Theory and Psycholinguistics". In R. C. Oldfield and J. C. Marshall (eds.). Language. Penguin Books.
  • Zipf, George Kingsley (1932). Selected Studies of the Principle of Relative Frequency in Language. Cambridge, MA: Harvard University Press.

Copyrights:

Wikipedia. This article is licensed under the Creative Commons Attribution/Share-Alike License. It uses material from the Wikipedia article "Zipf–Mandelbrot law".