Share on Facebook Share on Twitter Email
Answers.com

Bootstrapping

 
Wikipedia: Bootstrapping (compilers)

Bootstrapping is a term used in computer science to describe the techniques involved in writing a compiler (or assembler) in the target programming language which it is intended to compile. This technique is also called self-hosting.

A large proportion of programming languages are bootstrapped, including BASIC, C, Pascal, Factor, Haskell, Modula-2, Oberon, OCaml, Common Lisp, Scheme, Clojure, and more.

Contents

Advantages

Bootstrapping a compiler has the following advantages:[1]

  • it is a non-trivial test of the language being compiled.
  • compiler developers only need to know the language being compiled.
  • improvements to the compiler's back-end improve not only general purpose programs but also the compiler itself.
  • it is a comprehensive consistency check as it should be able to reproduce its own object code.

The chicken and egg problem

If one needs a compiler for language X to obtain a compiler for language X (which is written in language X), how did the first compiler get written? Possible methods to solving this chicken and egg problem include:

  • Implementing an interpreter or compiler for language X in language Y. Niklaus Wirth reported that he wrote the first Pascal compiler in Fortran.
  • Another interpreter or compiler for X has already been written in another language Y; this is how Scheme is often bootstrapped.
  • Earlier versions of the compiler were written in a subset of X for which there existed some other compiler; this is how some supersets of Java, Haskell, and the Free Pascal compiler are bootstrapped.
  • The compiler for X is cross compiled from another architecture where there exists a compiler for X; this is how compilers for C are usually ported to other platforms.
  • Writing the compiler in X; then hand-compiling it from source (most likely in a non-optimized way) and running that on the code to get an optimized compiler. Donald Knuth used this for his WEB literate programming system.

Methods for distributing compilers in source code include providing a portable bytecode version of the compiler, so as to bootstrap the process of compiling the compiler with itself.

Such methods are also one way of detecting or avoiding (or both) the potential problem pointed out in Reflections on Trusting Trust.

History

The first language to provide such a bootstrap was NELIAC. The first commercial language to do so was PL/I.

The first self-hosting compiler (excluding assemblers) was written for Lisp by Hart and Levin at MIT in 1962. They wrote a Lisp compiler in Lisp, testing it inside an existing Lisp interpreter. Once they had improved the compiler to the point where it could compile its own source code, it was self-hosting.[2]

The compiler as it exists on the standard compiler tape is a machine language program that was obtained by having the S-expression definition of the compiler work on itself through the interpreter. (AI Memo 39)[2]

This technique is only possible when an interpreter already exists for the very same language that is to be compiled. It borrows directly from the notion of running a program on itself as input, which is also used in various proofs in theoretical computer science, such as the proof that the halting problem is undecidable.

See also

References

  1. ^ Compilers and Compiler Generators: An Introduction With C++. Patrick D. Terry 1997. International Thomson Computer Press. ISBN 1850322988
  2. ^ a b Tim Hart and Mike Levin. "AI Memo 39-The new compiler". ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-039.pdf. Retrieved 2008-05-23. 



Search unanswered questions...
Enter a question here...
Search: All sources Community Q&A Reference topics
 
 

 

Copyrights:

Wikipedia. This article is licensed under the Creative Commons Attribution/Share-Alike License. It uses material from the Wikipedia article "Bootstrapping (compilers)" Read more