Bayesian probability is an interpretation of the probability calculus which holds that the concept of probability can be defined as the degree to which a person (or community) believes that a proposition is true. Bayesian theory also suggests that Bayes'
theorem can be used as a rule to infer or update the degree of belief in light of new information.
History
Thomas Bayes. (The correct identification of this portrait has been
questioned.)
Bayesian theory and Bayesian probability are named after Thomas Bayes (1702–1761), who
proved a special case of what is now called Bayes' theorem. The term Bayesian,
however, came into use only around 1950, and it is not clear that Bayes would have endorsed the
narrow, specifically subjectivist interpretation of probability that is now associated with his name. Laplace proved a more general version of Bayes' theorem and used it to solve problems in celestial
mechanics, medical statistics and, by some accounts, even jurisprudence. Laplace, however,
did not consider this general theorem to be important for the conceptual definition of probability. He instead adhered to the
classical definition of probability.
The subjective theory of probability which interprets 'probability' as 'subjective degree of belief in a proposition' was
proposed independently and at about the same time by Bruno de Finetti in Italy in
Fondamenti Logici del Ragionamento Probabilistico (1930) and Frank Ramsey in
Cambridge in The Foundations of Mathematics (1931).[1] It was devised to solve the problems of the classical definition of probability and replace it. L. J. Savage expanded the idea in The Foundations of Statistics (1954).
Formal attempts have been made to define and apply the intuitive notion of a "degree of belief". One interpretation is based
on betting: a degree of belief is reflected in the odds and stakes that the subject is willing
to bet on the proposition at hand. However, there may be problems with trying to use betting to measure the strength of someone's
belief in a universal scientific law such as Newton's law of inertia or his law of universal gravitation. [2]
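The betting interpretation can be illustrated with a small calculation. The following is a minimal sketch in Python, not a formal treatment: the function names are hypothetical, and it assumes that odds of "a to b against" a proposition are considered fair by the subject, so that the implied degree of belief is b / (a + b). It also shows the idea behind the Dutch book argument: if the betting quotients over a set of mutually exclusive and exhaustive outcomes do not sum to 1, an opponent can arrange bets that guarantee the subject a loss.

```python
from fractions import Fraction


def belief_from_odds(against, in_favour):
    """Degree of belief implied by fair odds of `against` to `in_favour` against a proposition.

    Assumption: the subject regards the odds as fair, so risking `in_favour`
    to win `against` has zero expected gain.
    """
    return Fraction(in_favour, against + in_favour)


def sure_loss(quotients, stake=1):
    """Guaranteed loss when betting quotients over exclusive, exhaustive outcomes do not sum to 1.

    If the quotients sum to more than 1, the opponent sells the subject a bet on
    every outcome; if they sum to less than 1, the opponent buys them instead.
    Exactly one bet pays out `stake`, so the loss is |sum - 1| * stake.
    """
    return abs(sum(quotients) - 1) * stake


# Odds of 3 to 1 against correspond to a degree of belief of 1/4.
quarter = belief_from_odds(3, 1)

# Coherent beliefs in a proposition and its negation: no sure loss.
coherent = sure_loss([Fraction(1, 4), Fraction(3, 4)])

# Incoherent beliefs (3/5 in both a proposition and its negation): sure loss of 1/5 per unit stake.
incoherent = sure_loss([Fraction(3, 5), Fraction(3, 5)])

print(quarter, coherent, incoherent)
```

Exact fractions are used so that the coherent case gives a loss of exactly zero rather than a small floating-point residue.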
On the Bayesian interpretation, the theorems of probability relate to the rationality of partial belief in the way that the
theorems of logic are traditionally seen to relate to the rationality of full belief.
The Bayesian approach has been explored by Harold Jeffreys, Richard T. Cox, Edwin Jaynes and I. J. Good. Other well-known proponents of Bayesian probability have included John Maynard Keynes and B.O. Koopman, and
many philosophers of the 20th century.
Recently, it has been shown that Bayes' Rule and the Principle of Maximum
Entropy (MaxEnt) are completely compatible and can be seen as special cases of the Method of Maximum (relative) Entropy
(ME). This method reproduces every aspect of orthodox Bayesian inference methods. In addition this new method opens the door to
tackling problems that could not be addressed by either the MaxEnt or orthodox Bayesian methods individually.[3]
Varieties
The terms subjective probability, personal probability, epistemic probability and logical
probability describe some of the schools of thought which are customarily called "Bayesian". These overlap but there are
differences of emphasis. Some of the people mentioned here would not call themselves Bayesians.
Subjective Bayesian probability interprets 'probability' as 'the degree of belief (or strength of belief) an
individual has in the truth of a proposition', and is in that respect subjective. Some people who call themselves Bayesians do
not accept this subjectivity. The chief exponents of this objectivist school were Edwin Thompson Jaynes and
Harold Jeffreys. Perhaps the main objectivist Bayesian now living is James Berger of
Duke University. Jose Bernardo and others accept some degree of subjectivity but believe a need exists for "reference priors" in many practical situations.
Advocates of logical (or objective epistemic) probability, such as Harold
Jeffreys, Rudolf Carnap, Richard Threlkeld
Cox and E.T. Jaynes, hope to codify techniques whereby any two persons having the same information relevant to the truth
of an uncertain proposition would calculate the same probability. Such probabilities are not relative to the person but to the
epistemic situation, and thus lie somewhere between subjective and objective. The methods proposed are not without controversy.
Critics challenge the claim that there are grounds for preferring one degree of belief over another in the absence of information
about the facts to which those beliefs refer. However, these criticisms are usually resolved once the question one is trying to
ask is clear. It has now been shown that the Principle of Maximum Entropy and
Bayes' Rule are completely compatible and can be seen as special cases of the Method of Maximum (relative) Entropy (ME).
The Controversy between Bayesian and Frequentist Probability
Bayesian probability, sometimes called credence (i.e. degree of belief), contrasts with frequency probability, in which probability is derived from observed frequencies in defined
distributions or proportions in populations.
The theory of statistics and probability using frequency probability was
developed by R.A. Fisher, Egon Pearson and
Jerzy Neyman during the first half of the 20th century. A. N. Kolmogorov also used frequency probability to lay the mathematical foundation of probability in
measure theory via the Lebesgue integral in Foundations of the Theory of
Probability (1933). Savage, Koopman, Abraham Wald and others have developed Bayesian
probability since 1950.
The difference between Bayesian and Frequentist interpretations of probability has important consequences in statistical
practice. For example, when comparing two hypotheses using the same data, the theory of hypothesis tests, which is based on the frequency interpretation of probability, allows
the rejection or non-rejection of one model/hypothesis (the 'null' hypothesis) based on
the probability of mistakenly inferring that the data support the other model/hypothesis more. The probability of making such a
mistake, called a Type I error, requires the consideration of hypothetical
data sets derived from the same data source that are more extreme than the data actually observed. This approach allows the
inference that 'either the two hypotheses are different or the observed data are a misleading set'. In contrast, Bayesian methods
condition on the data actually observed, and are therefore able to assign posterior probabilities to any number of hypotheses
directly. The requirement to assign probabilities to the parameters of models representing each hypothesis is the cost of this
more direct approach.
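The contrast can be made concrete with a coin-tossing example. The sketch below is illustrative only: the data (15 heads in 20 tosses), the two simple hypotheses (θ = 0.5 and θ = 0.75) and the equal prior probabilities are all assumptions chosen for the example, and the function names are hypothetical. The frequentist quantity sums over hypothetical outcomes at least as extreme as the one observed, whereas the Bayesian quantity conditions only on the observed outcome but requires a prior for each hypothesis.

```python
from math import comb


def binomial_pmf(k, n, theta):
    """Probability of exactly k successes in n independent trials with success probability theta."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


def one_sided_p_value(k, n, theta_null):
    """Probability, under the null hypothesis, of k or more successes in n trials."""
    return sum(binomial_pmf(i, n, theta_null) for i in range(k, n + 1))


def posterior_probabilities(k, n, hypotheses):
    """Posterior probability of each simple hypothesis given k successes in n trials.

    `hypotheses` maps a name to a (theta, prior probability) pair.
    """
    joint = {
        name: prior * binomial_pmf(k, n, theta)
        for name, (theta, prior) in hypotheses.items()
    }
    total = sum(joint.values())
    return {name: value / total for name, value in joint.items()}


heads, tosses = 15, 20  # assumed data for illustration

# Frequentist: how probable is a result at least this extreme if the coin is fair?
p_value = one_sided_p_value(heads, tosses, theta_null=0.5)

# Bayesian: how probable is each hypothesis given exactly this result?
posterior = posterior_probabilities(
    heads, tosses, {"fair": (0.5, 0.5), "biased": (0.75, 0.5)}
)

print(f"one-sided p-value under the fair-coin hypothesis: {p_value:.4f}")
for name, probability in posterior.items():
    print(f"posterior probability of '{name}': {probability:.4f}")
```

The two numbers answer different questions, so they are not expected to agree; changing the assumed priors changes the posterior but leaves the p-value untouched.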
Although there is no reason why different interpretations (senses) of a word cannot be used in different contexts, there is a
history of antagonism between Bayesians and frequentists, with the latter often rejecting the Bayesian interpretation as
ill-grounded. The groups have also disagreed about which of the two senses reflects what is commonly meant by the term
'probable'. More importantly, the groups have agreed that Bayesian and Frequentist analyses answer genuinely different questions,
but disagreed about which class of question it is more important to answer in scientific and engineering contexts.
Applications
Since the 1950s, Bayesian theory and Bayesian probability have been widely applied through Cox's theorem, Jaynes' principle of maximum entropy
and the Dutch book argument. In many applications, Bayesian methods are more general and
appear to give better results than frequency probability. Bayes factors have also been applied with Occam's Razor. See
Bayesian inference and Bayes' theorem for
mathematical applications.
Some regard the scientific method as an application of Bayesian probabilistic
inference, because they claim that Bayes' theorem is explicitly or implicitly used to update the strength of prior scientific beliefs
in the truth of hypotheses in the light of new information from observation or
experiment. This is said to be done by using Bayes' theorem to calculate a posterior
probability from that evidence, and is justified by the Principle of Conditionalisation, P'(h) = P(h | e), where P'(h) is the
posterior probability of the hypothesis h in the light of the evidence e. This principle is, however, denied by some.[4] Adjusting original beliefs could mean (coming closer to)
accepting or rejecting the original hypotheses.
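Updating by conditionalisation can be shown with a short numerical example. The following is a minimal sketch: the prior of 0.2 and the likelihoods of 0.9 and 0.3 are invented for illustration, and the function name is hypothetical. Each call applies Bayes' theorem once, and the resulting posterior P(h | e) is then used as the prior for the next piece of evidence.

```python
def posterior(prior_h, likelihood_e_given_h, likelihood_e_given_not_h):
    """Return P(h | e) from P(h), P(e | h) and P(e | not h) using Bayes' theorem."""
    # Total probability of the evidence, summed over h and its negation.
    evidence = (
        likelihood_e_given_h * prior_h
        + likelihood_e_given_not_h * (1 - prior_h)
    )
    return likelihood_e_given_h * prior_h / evidence


# Assumed starting belief in the hypothesis, and assumed likelihoods of
# observing a supporting result if h is true or false.
belief = 0.2
history = [belief]

# Three independent supporting observations, each conditionalised on in turn.
for _ in range(3):
    belief = posterior(belief, 0.9, 0.3)
    history.append(belief)

print([round(b, 4) for b in history])
```

Because each observation is three times as probable under h as under its negation, every update multiplies the odds in favour of h by three.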
Bayesian techniques have recently been applied to filter spam e-mail. A Bayesian spam
filter uses a reference set of e-mails to define what is originally believed to be spam. After the reference has been defined,
the filter uses the characteristics in the reference to classify new messages as either spam or legitimate e-mail. New e-mail
messages act as new information: if the user identifies mistakes in the classification of spam and legitimate e-mail,
this new information updates the original reference set of e-mails, in the hope that future classifications are
more accurate. See Bayesian inference and Bayesian filtering.
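The behaviour described above can be sketched as a small naive Bayes classifier. This is an illustrative sketch only, not the design of any particular spam filter: the class name, the whitespace tokenisation, the add-one smoothing and the toy training messages are all assumptions made for the example. It treats words as conditionally independent given the class, which is the "naive" simplification.

```python
import math
from collections import Counter


class NaiveBayesSpamFilter:
    """Toy word-count spam filter (hypothetical name, illustrative only)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Add one labelled message ('spam' or 'ham') to the reference set."""
        self.message_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def spam_probability(self, text):
        """Posterior probability that `text` is spam, given the reference set."""
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            # Class prior, with add-one smoothing so an empty class is not impossible.
            log_p = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                count = self.word_counts[label][word]
                # Add-one smoothing; the extra +1 reserves mass for unseen words.
                log_p += math.log((count + 1) / (total_words + len(vocabulary) + 1))
            log_scores[label] = log_p
        # Normalise in a numerically stable way.
        largest = max(log_scores.values())
        weights = {label: math.exp(s - largest) for label, s in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])


spam_filter = NaiveBayesSpamFilter()
spam_filter.train("cheap pills buy now", "spam")
spam_filter.train("win money now", "spam")
spam_filter.train("meeting agenda for monday", "ham")
spam_filter.train("lunch on monday", "ham")

p_offer = spam_filter.spam_probability("buy cheap money")
p_meeting = spam_filter.spam_probability("monday meeting")

# The user reports a misclassification; the reference set is updated.
spam_filter.train("monday meeting", "spam")
p_meeting_after = spam_filter.spam_probability("monday meeting")

print(p_offer, p_meeting, p_meeting_after)
```

The final call shows the updating step: after the user's correction is added to the reference set, the same message receives a higher spam probability than before.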
Probabilities of probabilities
One criticism levelled at the Bayesian probability interpretation has been that a single probability assignment cannot convey
how well grounded the belief is—i.e., how much evidence one has. Consider the
following situations:
- You have a box with white and black balls, but no knowledge as to the quantities
- You have a box from which you have drawn N balls, half black and the rest white
- You have a box and you know that there are the same number of white and black balls
The Bayesian probability of the next ball drawn being black is 0.5 in all three cases. Keynes called this the problem of the "weight of evidence". One approach is to
reflect difference in evidential support by assigning probabilities to these probabilities (so-called metaprobabilities)
in the following manner:
- 1. You have a box with white and black balls, but no knowledge as to the quantities
- Letting θ = p represent the statement that the probability of the next ball being black
is p, a Bayesian might assign a uniform Beta prior distribution:
$P(\theta) = \mathrm{Beta}(\alpha_B = 1, \alpha_W = 1) = \frac{\Gamma(\alpha_B + \alpha_W)}{\Gamma(\alpha_B)\,\Gamma(\alpha_W)}\,\theta^{\alpha_B - 1}(1 - \theta)^{\alpha_W - 1} = 1$ for all $\theta \in [0,1]$.
- Assuming that the ball drawing is modelled as a binomial sampling distribution, the posterior distribution, P(θ | m,n), after drawing m additional black balls and n white balls is still
a Beta distribution, with parameters αB = 1 + m, αW = 1 + n. An intuitive interpretation of the parameters of a Beta distribution is
that of imagined counts for the two events. For more information, see Beta
distribution.
- 2. You have a box from which you have drawn N balls, half black and the rest white
- Letting θ = p represent the statement that the probability of the next ball being black
is p, a Bayesian might assign a Beta prior distribution, $\mathrm{Beta}(N/2 + 1, N/2 + 1)$. The posterior mean estimate of
θ is $\frac{N/2 + 1}{N + 2} = \frac{1}{2}$, precisely Laplace's rule of succession. For N > 0 the maximum a posteriori estimate (MAP estimate), $\frac{N/2}{N}$, is also 1/2.
- 3. You have a box and you know that there are the same number of white and black balls
- In this case a Bayesian would define the prior probability $P(\theta) = \delta(\theta - \tfrac{1}{2})$, a point mass at θ = 1/2.
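The difference between the cases can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names and the choice of N = 10 are assumptions made for illustration. It applies the Beta update rule stated above (add the observed counts to the parameters) and compares the mean and variance of the resulting distributions.

```python
from fractions import Fraction


def beta_posterior(alpha_b, alpha_w, black, white):
    """Parameters of the Beta posterior after observing `black` and `white` draws."""
    return alpha_b + black, alpha_w + white


def beta_mean(alpha_b, alpha_w):
    """Mean of a Beta(alpha_b, alpha_w) distribution."""
    return Fraction(alpha_b, alpha_b + alpha_w)


def beta_variance(alpha_b, alpha_w):
    """Variance of a Beta(alpha_b, alpha_w) distribution."""
    total = alpha_b + alpha_w
    return Fraction(alpha_b * alpha_w, total**2 * (total + 1))


# Case 1: uniform prior Beta(1, 1), nothing drawn yet.
case_1 = (1, 1)

# Case 2: N = 10 balls drawn (an assumed value), half black and half white.
case_2 = beta_posterior(1, 1, black=5, white=5)

for label, (a, b) in [("no draws", case_1), ("N = 10 draws", case_2)]:
    print(label, "mean:", beta_mean(a, b), "variance:", beta_variance(a, b))
```

Both distributions have mean 1/2, so the probability assigned to the next ball being black is the same, but the variance shrinks as evidence accumulates. In the third case, where the composition of the box is known, all of the probability sits at θ = 1/2 and the spread is zero.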
Other Bayesians have argued that probabilities need not be precise numbers.
Because there is no room for metaprobabilities on the frequency interpretation, frequentists have had to find different ways
of representing difference of evidential support. Cedric Smith and Arthur Dempster each developed a theory of upper and
lower probabilities. Glenn Shafer developed Dempster's theory further, and
it is now known as Dempster-Shafer theory.
Footnotes
- [1] See pp. 50-51, Gillies 2000: "The subjective theory of probability was
discovered independently and at about the same time by Frank Ramsey in Cambridge and Bruno de Finetti in Italy." See Gillies'
discussion for its explanation of how the wrong impression came about that Ramsey proposed it first.
- [2] See e.g. Gillies 2000, p. 55: "My own view is that betting does give a
reasonable measure of the strength of a belief in many cases, but not in all. In particular, betting cannot be used to measure
the strength of someone's belief in a universal scientific law or theory."
- [3] See Giffin and Caticha 2007, "Updating Probabilities with Data and
Moments" (http://arxiv.org/abs/0708.1593)
- [4] See Updating Belief, Chapter 6 of Howson & Urbach 1993, pp. 99-114,
and its references to the discussions of Bayesian Conditionalisation of Hacking 1967, Kyburg, Skyrms 1987 and Jeffrey 1965
etc.
External links and references
- tutorial on Bayesian
probabilities
- On-line textbook:
Information Theory, Inference, and Learning Algorithms, by David MacKay, has many chapters on Bayesian methods, including
introductory examples; arguments in favour of Bayesian methods (in the style of Edwin
Jaynes); state-of-the-art Monte Carlo methods, message-passing methods, and variational methods;
and examples illustrating the intimate connections between Bayesian inference and data
compression.
- A nice on-line introductory
tutorial to Bayesian probability from Queen Mary University of London
- An Intuitive Explanation of Bayesian
Reasoning A very gentle introduction by Eliezer Yudkowsky
- Giffin, A. and Caticha, A. 2007 Updating
Probabilities with Data and Moments
- Gillies, D. 2000 Philosophical Theories of Probability Routledge
- Hacking, I. 1965 The Logic of Statistical Inference CUP
- Hacking, I. 1967 'Slightly More Realistic Personal Probability' Philosophy of Science vol34
- Hacking, I. 2006 The Emergence of Probability: A Philosophical Study of Early Ideas about Probability, Induction and
Statistical Inference Cambridge University Press
- Jaynes, E.T. (2003) Probability Theory : The Logic of Science Cambridge University Press.
- Jaynes, E.T. (1998) Probability Theory : The Logic of Science.
- Jeffrey, R.C. 1983 The Logic of Decision University of Chicago Press
- Jeffrey, R.C. 2004 Subjective Probability: The Real Thing, Cambridge University Press
- Kyburg, H.E. 1974 The Logical Foundations of Statistical Inference Reidel
- Kyburg, H.E. 1983 Epistemology and Inference University of Minnesota Press
- Kyburg, H.E. 1987 'Bayesian versus non-Bayesian Evidential Updating' Artificial Intelligence 31
- Kyburg & Smokler (eds) 1980 Studies in Subjective Probability Robert E. Krieger
- Lakatos, I. 1968 'Changes in the Problem of Inductive Logic' published as Chapter 8 of Philosophical Papers Volume 2
Cambridge University Press 1978
- Bretthorst, G. Larry, 1988, Bayesian
Spectrum Analysis and Parameter Estimation in Lecture Notes in Statistics, 48, Springer-Verlag, New York, New York;
- http://www-groups.dcs.st-andrews.ac.uk/history/Mathematicians/Ramsey.html
- David Howie: Interpreting Probability, Controversies and Developments in the Early Twentieth Century, Cambridge
University Press, 2002, ISBN 0-521-81251-8
- Colin Howson and Peter Urbach: Scientific Reasoning: The Bayesian Approach, Open Court Publishing, 2nd edition, 1993,
ISBN 0-8126-9235-7, focuses on the philosophical underpinnings of Bayesian and frequentist statistics. Argues for the subjective
interpretation of probability.
- Luc Bovens and Stephan Hartmann: Bayesian Epistemology. Oxford: Oxford University Press 2003. Extends the Bayesian
program to more complex decision scenarios (e.g. dependent and partially reliable witnesses and measurement instruments) using
Bayesian Network models. The book also proves an impossibility theorem for coherence orderings over information sets and offers a
measure that induces a partial coherence ordering.
- Jeff Miller "Earliest Known Uses of Some
of the Words of Mathematics (B)"
- James Franklin The
Science of Conjecture: Evidence and Probability Before Pascal, history from a Bayesian point of view.
- Paul Graham "Bayesian spam
filtering"
- Howard Raiffa Decision Analysis: Introductory Lectures on Choices under Uncertainty. McGraw Hill, College Custom
Series. (1997) ISBN 0-07-052579-X
- Devender Sivia, Data Analysis: A Bayesian Tutorial. Oxford: Clarendon Press (1996), pp. 7-8. ISBN 0-19-851889-7
- Skyrms, B. 1987 'Dynamic Coherence and Probability Kinematics' Philosophy of Science vol 54
- Henk Tijms: Understanding Probability, Cambridge University Press, 2004
- Is the portrait of Thomas Bayes authentic? Who Is this gentleman? When and where was he born? The IMS Bulletin, Vol. 17
(1988), No. 3, pp. 276-278
- Ask the experts on Bayes's Theorem, from Scientific American
This entry is from Wikipedia, the leading user-contributed encyclopedia. It may not have been reviewed by professional editors.