Ray Solomonoff (born 1926, Cleveland, Ohio) is the founder of the branch of artificial intelligence based on machine learning, prediction and probability. He circulated the first report on machine learning in 1956[1].
He is the inventor of algorithmic probability[2], with Kolmogorov complexity as a side product. He first described these results at a Conference at Caltech in 1960,[3] and in a report, Feb. 1960, "A Preliminary Report on a General Theory of Inductive Inference."[4] He clarified these ideas more fully in his 1964 publications, "A Formal Theory of Inductive Inference," Part I[5] and Part II.[6]
Although he is best known for algorithmic probability and his general theory of inductive inference, he made other important early discoveries, many directed toward his goal in artificial intelligence: to develop a machine that could solve hard problems using probabilistic methods.
Work
He wrote three papers, two with Rapoport, in 1950-52,[7] that are regarded as the earliest statistical analysis of networks.
He was one of the 10 attendees at the 1956 Dartmouth Summer Research Conference on Artificial Intelligence, the seminal event for artificial intelligence as a field. He wrote and circulated a report among the attendees: "An Inductive Inference Machine"[8]. It viewed machine learning as probabilistic, with an emphasis on the importance of training sequences, and on the use of parts of previous solutions to problems in constructing trial solutions for new problems. He published a version of his findings in 1957[9]. These were the first papers to be written on Machine Learning.
In the late 1950s, he invented probabilistic languages and their associated grammars.[10] A probabilistic language assigns a probability value to every possible string.
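To make "assigns a probability value to every possible string" concrete, here is a minimal sketch of a toy probabilistic language. It is an illustration only, not Solomonoff's construction: the alphabet, the names EMIT, STOP and string_probability, and the numeric values are all assumptions chosen for the example.

```python
# Hypothetical toy language over the alphabet {a, b}: at each step the
# generator emits 'a' or 'b', or stops. The probabilities are illustrative.
EMIT = {"a": 0.3, "b": 0.3}
STOP = 0.4  # the EMIT values plus STOP sum to 1


def string_probability(s):
    """Probability that the toy generator produces exactly the string s."""
    p = STOP
    for ch in s:
        p *= EMIT.get(ch, 0.0)  # symbols outside the alphabet get probability 0
    return p
```

Under these assumed values, the strings of length k together receive probability 0.4 × 0.6^k, so the probabilities of all strings sum to 1.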
Generalizing the concept of probabilistic grammars led him to his breakthrough discovery in 1960 of Algorithmic Probability.
Prior to the 1960s, the usual method of calculating probability was based on frequency: taking the ratio of favorable results to the total number of trials. In his 1960 publication, and, more completely, in his 1964 publications, Solomonoff seriously revised this definition of probability. He called this new form of probability "Algorithmic Probability."
What was later called Kolmogorov Complexity was a side product of his General Theory. He described this idea in 1960: "Consider a very long sequence of symbols ...We shall consider such a sequence of symbols to be 'simple' and have a high a priori probability, if there exists a very brief description of this sequence - using, of course, some sort of stipulated description method. More exactly, if we use only the symbols 0 and 1 to express our description, we will assign the probability 2^-N to a sequence of symbols if its shortest possible binary description contains N digits."[11]
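As a small worked illustration of the rule in the quotation, write x for the sequence and N(x) for the number of digits in its shortest binary description (this notation is introduced here for convenience and is not taken from the 1960 report):

```latex
P(x) = 2^{-N(x)}, \qquad \text{e.g. } N(x) = 10 \;\Rightarrow\; P(x) = 2^{-10} = \tfrac{1}{1024}.
```

Each additional digit in the shortest description halves the probability assigned to the sequence.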
Five years later, in 1965, the Russian mathematician Kolmogorov independently presented a similar idea. When he became aware of Solomonoff's work, he acknowledged Solomonoff's priority, and for several years, Solomonoff's work was better known in the Soviet Union than in the Western World. The general consensus in the scientific community, however, was to associate this type of complexity with Kolmogorov, who was more concerned with randomness of a sequence. Algorithmic Probability became associated with Solomonoff, who was focused on prediction - the extrapolation of a sequence.
Later in the same 1960 publication Solomonoff describes his improvement on the single-shortest-code theory. This is Algorithmic Probability. He states: "It would seem that if there are several different methods of describing a sequence, each of these methods should be given some weight in determining the probability of that sequence."[12] He then shows how this idea can be used to generate the universal a priori probability distribution and how it enables the use of Bayes rule in inductive inference. Inductive inference, by adding up the predictions of all models describing a particular sequence, using suitable weights based on the lengths of those models, gets the probability distribution for the extension of that sequence. This method of prediction has since become known as Solomonoff Induction.
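The weighting idea in this paragraph can be sketched in code, with a strong caveat: true Algorithmic Probability sums over all programs for a universal machine and is incomputable. The sketch below replaces that with a tiny, hypothetical model class (repeating binary patterns up to a fixed length) and gives each pattern of n bits the weight 2^-n. The function names, the model class and the max_len cutoff are assumptions made for illustration.

```python
from itertools import product


def patterns(max_len):
    """Enumerate every binary pattern of 1..max_len bits (the toy 'descriptions')."""
    for n in range(1, max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)


def predict_next(observed, max_len=8):
    """Return P(next bit == '1') under a length-weighted mixture of periodic models."""
    weight = {"0": 0.0, "1": 0.0}
    for p in patterns(max_len):
        # A pattern describes the data if repeating it reproduces every observed bit.
        if all(observed[i] == p[i % len(p)] for i in range(len(observed))):
            # Shorter descriptions get exponentially more weight.
            weight[p[len(observed) % len(p)]] += 2.0 ** (-len(p))
    total = weight["0"] + weight["1"]
    return weight["1"] / total if total else 0.5
```

For example, predict_next("010101") is small because the short pattern "01" dominates the mixture and predicts a 0 next, while only a few long patterns predict a 1.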
He enlarged his theory, publishing a number of reports leading up to the publications in 1964. The 1964 papers give a more detailed description of Algorithmic Probability, and Solomonoff Induction, presenting 5 different models, including the model popularly called the Universal Distribution.
Other scientists who had been at the 1956 Dartmouth Summer Conference (such as Newell and Simon) were developing the branch of Artificial Intelligence that used machines governed by fact-based if-then rules. Solomonoff was developing the branch of Artificial Intelligence that focused on probability and prediction; his specific view of A.I. described machines that were governed by the Algorithmic Probability distribution. The machine generates theories together with their associated probabilities to solve problems and, as new problems and theories develop, updates the probability distribution on the theories.
In 1968 he found a proof for the efficacy of Algorithmic Probability[13], but mainly because of lack of general interest at that time, did not publish it until 10 years later. In his report, he published the proof for the convergence theorem.
In the years following his discovery of Algorithmic Probability he focused on how to use this probability and Solomonoff Induction in actual prediction and problem solving for A.I. He also wanted to understand the deeper implications of this probability system.
One important aspect of Algorithmic Probability is that it is complete and incomputable.
In the 1968 report he shows that Algorithmic Probability is complete; that is, if there is any describable regularity in a body of data, Algorithmic Probability will eventually discover that regularity, requiring a relatively small sample of that data. Algorithmic Probability is the only probability system known to be complete in this way. As a necessary consequence of its completeness it is incomputable. It is incomputable because some algorithms - a subset of those that are partially recursive - can never be evaluated fully because it would take too long. But these programs will at least be recognized as possible solutions. On the other hand, any computable system is incomplete. There will always be descriptions outside that system's search space which will never be acknowledged or considered, even in an infinite amount of time. Computable prediction models hide this fact by ignoring such algorithms.
In many of his papers he described how to search for solutions to problems and in the 1970s and early 1980s developed what he felt was the best way to update the machine.
The use of probability in A.I., however, did not have a completely smooth path. In the early years of A.I., the relevance of probability was problematic. Many in the A.I. community felt probability was not usable in their work. The area of pattern recognition did use a form of probability, but because there was no broadly based theory of how to incorporate probability in any A.I. field, most fields did not use it at all.
There were, however, researchers such as Judea Pearl and Peter Cheeseman who argued that probability could be used in artificial intelligence.
About 1984, at an annual meeting of the American Association for Artificial Intelligence (AAAI), it was decided that probability was in no way relevant to A.I.
A protest group formed, and the next year there was a workshop at the AAAI meeting devoted to "Probability and Uncertainty in AI." This yearly workshop has continued to the present day.[14]
As part of the protest at the first workshop, Solomonoff gave a paper on how to apply the universal distribution to problems in A.I.[15] This was an early version of the system he has been developing since that time.
In that report, he described the search technique he had developed. In search problems, the best order of search is by increasing Ti/Pi, where Ti is the time needed to test trial i and Pi is the probability of success of that trial. He called this the "Conceptual Jump Size" of the problem. Levin's search technique approximates this order[16], and so Solomonoff, who had studied Levin's work, called this search technique Lsearch.
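The ordering rule can be illustrated with a minimal sketch. It is not Solomonoff's or Levin's actual procedure: it assumes a fixed, finite list of candidate trials whose test times and success probabilities are already known, and it assumes the candidates are mutually exclusive (at most one succeeds). The function names and the sample numbers are hypothetical.

```python
def search_order(trials):
    """Sort (name, time, probability) trials by increasing time/probability."""
    return sorted(trials, key=lambda t: t[1] / t[2])


def expected_time(ordered_trials):
    """Expected time until success, assuming mutually exclusive candidates."""
    elapsed = 0.0
    expected = 0.0
    for _name, time_cost, prob in ordered_trials:
        elapsed += time_cost        # time spent once this trial has been tested
        expected += prob * elapsed  # this trial succeeds with probability prob
    return expected


# Hypothetical candidates: (name, Ti, Pi)
trials = [("A", 10.0, 0.5), ("B", 1.0, 0.1), ("C", 2.0, 0.4)]
order = search_order(trials)
```

With these sample values the ratios Ti/Pi are 20, 10 and 5, so the trials are tested in the order C, B, A, which gives a lower expected time than testing them in the reverse order.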
In other papers he explored how to limit the time needed to search for solutions, writing on resource bounded search. The search space is limited by available time or computation cost rather than by cutting out search space as is done in some other prediction methods, such as Minimum Description Length.
Throughout his career Solomonoff has been concerned with the potential benefits and dangers of A.I., discussing it in many of his published reports. In 1985 he analyzed a likely evolution of A.I., giving a formula predicting when it would reach the "Infinity Point"[17]. This Infinity Point is an early version of the "Singularity" later made popular by Ray Kurzweil.
Originally algorithmic induction methods extrapolated ordered sequences of strings. Methods were needed for dealing with other kinds of data.
A 1999 report[18] generalizes the Universal Distribution and associated convergence theorems to unordered sets of strings, and a 2008 report[19] to unordered pairs of strings.
In 1997[20], 2003 and 2006 he showed that incomputability and subjectivity are both necessary and desirable characteristics of any high performance induction system.
In 1970 he formed his own one-man company, Oxbridge Research, and has continued his research there except for periods at other institutions such as MIT, University of Saarland in Germany and IDSIA in Switzerland. In 2003 he was the first recipient of the Kolmogorov Award by The Computer Learning Research Center at the Royal Holloway, University of London, where he gave the inaugural Kolmogorov Lecture. Solomonoff is currently visiting Professor at the CLRC.
In 2006 he spoke at AI@50, "Dartmouth Artificial Intelligence Conference: the Next Fifty Years" commemorating the fiftieth anniversary of the original Dartmouth summer study group. Solomonoff was one of five original participants to attend.
In Feb. 2008, he gave the keynote address at the Conference "Current Trends in the Theory and Application of Computer Science" (CTTACS), held at Notre Dame University in Lebanon. He followed this with a short series of lectures, and began research on new applications of Algorithmic Probability.
Algorithmic Probability and Solomonoff Induction have many advantages for Artificial Intelligence. Algorithmic Probability gives extremely accurate probability estimates. These estimates can be revised by a reliable method so that they continue to be acceptable. It utilizes search time in a very efficient way. In addition to probability estimates, Algorithmic Probability "has for AI another important value: its multiplicity of models gives us many different ways to understand our data;
A very conventional scientist understands his science using a single 'current paradigm' --- the way of understanding that is most in vogue at the present time. A more creative scientist understands his science in very many ways, and can more easily create new theories, new ways of understanding, when the 'current paradigm' no longer fits the current data"[21].
A description of Solomonoff's life and work prior to 1997 is in "The Discovery of Algorithmic Probability", Journal of Computer and System Sciences, Vol 55, No. 1, pp 73–88, August 1997. The paper, as well as most of the others mentioned here, is available on his website at the publications page.
See also
- Kolmogorov complexity
- Inductive inference
- Ming Li and Paul Vitanyi, An Introduction to Kolmogorov Complexity and Its Applications. Springer-Verlag, N.Y., 1997, includes historical notes on Solomonoff as well as a description and analysis of his work.
References
- ^ (pdf scanned copy of the original)
- ^ detailed description of Algorithmic Probability in Scholarpedia
- ^ Paper from conference on "Cerebral Systems and Computers", California Institute of Technology, Feb 8-11, 1960, cited in "A Formal Theory of Inductive Inference, Part I", 1964, p. 1
- ^ Solomonoff, R., "A Preliminary Report on a General Theory of Inductive Inference", Report V-131, Zator Co., Cambridge, Ma. Feb 4, 1960.
- ^ Solomonoff, R., "A Formal Theory of Inductive Inference, Part I" Information and Control, Vol 7, No. 1 pp 1-22, March 1964.
- ^ Solomonoff, R., "A Formal Theory of Inductive Inference, Part II" Information and Control, Vol 7, No. 2 pp 224-254, June 1964.
- ^ "An Exact Method for the Computation of the Connectivity of Random Nets", Bulletin of Mathematical Biophysics, Vol 14, p. 153, 1952.
- ^ (pdf scanned copy of the original)
- ^ "An Inductive Inference Machine," IRE Convention Record, Section on Information Theory, Part 2, pp. 56-62. (pdf version)
- ^ "A Progress Report on Machines to Learn to Translate Languages and Retrieve Information", Advances in Documentation and Library Science, Vol III, pt. 2, pp. 941-953. (Proceedings of a conference in Sept. 1959.)
- ^ "A Preliminary Report on a General Theory of Inductive Inference," 1960, p. 1
- ^ "A Preliminary Report on a General Theory of Inductive Inference," 1960, p. 17
- ^ "Complexity-based Induction Systems, Comparisons and Convergence Theorems," IEEE Trans. on Information Theory, Vol. IT-24, No. 4, pp. 422-432, July 1978. (pdf version)
- ^ "The Universal Distribution and Machine Learning", The Kolmogorov Lecture, Feb. 27, 2003, Royal Holloway, Univ. of London. The Computer Journal, Vol 46, No. 6, 2003.
- ^ "The Application of Algorithmic Probability to Problems in Artificial Intelligence", in Kanal and Lemmer (Eds.), Uncertainty in Artificial Intelligence, Elsevier Science Publishers B.V., pp 473-491, 1986.
- ^ Levin, L.A., "Universal Search Problems", in Problemy Peredaci Informacii 9, pp. 115-116, 1973
- ^ "The Time Scale of Artificial Intelligence: Reflections on Social Effects," Human Systems Management, Vol 5, pp. 149-153, 1985. (pdf version)
- ^ "Two Kinds of Probabilistic Induction," The Computer Journal, Vol 42, No. 4, 1999. (pdf version)
- ^ "Three Kinds of Probabilistic Induction, Universal Distributions and Convergence Theorems" 2008. (pdf version)
- ^ "The Discovery of Algorithmic Probability," Journal of Computer and System Sciences, Vol 55, No. 1, pp. 73-88 (pdf version)
- ^ "Algorithmic Probability, Theory and Applications," In Information Theory and Statistical Learning, Eds Emmert-Streib and Matthias Dehmer, Springer Science and Business Media, 2009, p. 11
External links
- Ray Solomonoff's Homepage
- For a detailed description of Algorithmic Probability, see "Algorithmic Probability" by Hutter, Legg and Vitanyi in the scholarpedia.