Homology is used to describe two things that share a common evolutionary origin. In genetics and molecular biology, homology means that two different genes or two different proteins were derived from the same ancestral gene or protein, a relationship that is usually inferred from strong similarity between their sequences.
The word "homology" has several meanings in biology, each related to the word's origin, meaning "same knowledge." At a molecular level, the term "homology" describes sequences, either DNA or protein, that share a common evolutionary origin. On a larger scale, a pair of chromosomes from a diploid organism that have the same size and shape are considered homologous chromosomes. Regions of each member of a chromosome pair that carry the same set of genes are homologous regions. Finally, physical features with a common evolutionary origin, such as the wing of a bat and the hand of a human, are homologous structures.
Diversity and Natural Selection
Biologists have long been fascinated by the diversity of life. The amazing variety of living things makes it natural to wonder how so many different life-forms came to be. Physical characteristics that could be easily observed, such as the shape of a wing, the structure of a shell, or the size of a beak, provided the first means to search for an answer. Recognition of the variation within a species (imagine a Chihuahua and a Great Dane) led Charles Darwin to propose that new species emerge when selection favors certain traits within a population.
Today's biologists continue to study the effects of natural selection on the evolution of species, but they are no longer limited to beak size and wing shape. Now they can compare the positions of genes on chromosomes, the amino acid sequences of proteins, and the nucleotide sequences of genes. With DNA or protein sequences from over 133,000 species represented in the taxonomy database at the National Center for Biotechnology Information (NCBI) and over 800 genome sequences either published or in progress, researchers have an unprecedented opportunity to study evolution at a molecular level.
Homology and Computer Analysis
To study homologous sequences, researchers use computer programs, such as BLAST (Basic Local Alignment Search Tool), to compare a DNA or protein sequence with a collection of other sequences. One such collection is GenBank, the genetic sequence database operated by the NCBI that contains all publicly available DNA and protein sequences. Biologists use databases such as GenBank to find out if a test sequence matches any known sequences, how well it matches, and which portions of the sequence match.
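The three questions above (does the sequence match, how well, and where) can be illustrated with a small local alignment. BLAST itself relies on a faster heuristic search; the sketch below instead uses Smith-Waterman dynamic programming, an exact local alignment method, and is not the BLAST algorithm. The function name, the scoring values (match, mismatch, gap), and the example sequences are assumptions chosen only for illustration.

```python
def local_alignment(query, subject, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment with a linear gap penalty.

    Returns (score, aligned_query, aligned_subject, query_start, subject_start).
    The start values are 0-based offsets of the aligned region.
    The scoring values are illustrative assumptions, not BLAST defaults.
    """
    rows, cols = len(query) + 1, len(subject) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the matrix. No cell drops below zero, which lets an alignment
    # begin and end anywhere in either sequence (local, not global).
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if query[i - 1] == subject[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + step,  # align the two residues
                score[i - 1][j] + gap,       # gap in the subject
                score[i][j - 1] + gap,       # gap in the query
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best-scoring cell until a zero is reached.
    i, j = best_pos
    aligned_q, aligned_s = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        step = match if query[i - 1] == subject[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + step:
            aligned_q.append(query[i - 1])
            aligned_s.append(subject[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            aligned_q.append(query[i - 1])
            aligned_s.append("-")
            i -= 1
        else:
            aligned_q.append("-")
            aligned_s.append(subject[j - 1])
            j -= 1
    return best, "".join(reversed(aligned_q)), "".join(reversed(aligned_s)), i, j


# Invented test sequence and "database" entry, for illustration only.
score, q_aln, s_aln, q_start, s_start = local_alignment("ACGTTGCA", "TTTACGTTGCATTT")
print(score, q_aln, s_aln, q_start, s_start)
```

The score says how well the sequences match, and the returned offsets say which portion of each sequence is involved. A high score reports similarity only; whether the similarity reflects common ancestry is a separate judgment.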
Computer programs identify matching sequences by similarity. However, similar sequences are not always homologous, because they may not have a common origin. Although many sequences that show similarity did evolve from a common ancestor, the appearance of similar sequences can also result from independent events. For example, mutations frequently occur in the gene for the envelope protein of the AIDS virus, HIV-1, changing the amino acid sequence of the protein. The human immune system recognizes and destroys unmutated viruses, while leaving unharmed (selecting for) those viruses that contain mutations that make them unrecognizable. As a result, viruses from different patients can show identical mutations in the envelope protein, even though the patients were infected by different strains of the virus.
Exploring the Mechanisms of Mutation
The ability to compare protein and DNA sequences not only shows us where evolution has occurred but provides insight into its mechanisms. By comparing genomes, we find that mutations can occur on a small scale: Even a single nucleotide change is a mutation. They can also occur on a large scale, as happens when sequences are inserted, deleted, duplicated, or moved between chromosomes.
Many mutations that replace single nucleotides have no effect because of the "degeneracy" or redundancy of the genetic code. The genetic code has more codons (sixty-four) than amino acids (twenty). As a consequence, most amino acids are specified by two to four different codons. Because of this, some mutations can be "silent," with one nucleotide replacing another but without changing the specified amino acid. Other mutations are said to be "conservative." This occurs when a mutation replaces one amino acid with another that has similar properties: the two may be chemically similar, sharing the same charge, shape, or polarity.
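The redundancy of the code, and the difference between silent and conservative changes, can be made concrete with a short sketch. It builds the standard genetic code, counts the codons for each amino acid, and labels a single-codon change. The grouping of amino acids into four property classes is a simplified textbook-style assumption (other groupings are in use), and the function and variable names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons listed in TCAG order at each position.
# "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumption: a coarse four-class grouping by side-chain chemistry.
PROPERTY_GROUPS = {
    "nonpolar": "GAVLIMFWP",
    "polar": "STCYNQ",
    "positive": "KRH",
    "negative": "DE",
}
GROUP_OF = {aa: name for name, members in PROPERTY_GROUPS.items() for aa in members}


def classify_substitution(codon_before, codon_after):
    """Label a codon change as silent, conservative, or nonconservative."""
    before, after = CODON_TABLE[codon_before], CODON_TABLE[codon_after]
    if before == after:
        return "silent"
    if "*" in (before, after):
        return "stop gained or lost"
    if GROUP_OF[before] == GROUP_OF[after]:
        return "conservative"
    return "nonconservative"


# Number of codons that specify each amino acid (stop codons excluded).
codons_per_amino_acid = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

print(sorted(codons_per_amino_acid.items()))
print(classify_substitution("GAA", "GAG"))  # glutamate to glutamate
print(classify_substitution("GAT", "GAA"))  # aspartate to glutamate
print(classify_substitution("GAG", "GTG"))  # glutamate to valine
```

In the standard code, three of the sixty-four codons are stop signals, so sixty-one codons are shared among the twenty amino acids. The counts range from one codon (methionine and tryptophan) to six (leucine, serine, and arginine).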
If, however, the mutation affects the function of an important protein, that mutation may result in an evolutionary dead end, because it is less likely to be passed on to a future generation. As a result, important sequences show fewer mutations, whereas less important sequences show more change. Such properties can be deduced by comparing sequences from different organisms. Proteins that interact with other molecules, such as DNA or RNA, tolerate fewer changes in structure, and show little change through evolution. The histone proteins that form the backbone of the eukaryotic chromosome are important examples.
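One simple way to see which positions tolerate change is to score each column of a set of aligned sequences. The sketch below is a minimal illustration under stated assumptions: the sequences are already aligned to equal length, conservation is measured as the fraction of sequences sharing the most common residue, and the short peptide sequences are invented rather than taken from real proteins.

```python
from collections import Counter


def column_conservation(aligned_sequences):
    """Fraction of sequences sharing the most common residue at each position.

    Assumes the sequences are already aligned; a gap character, if present,
    is counted like any other symbol.
    """
    lengths = {len(seq) for seq in aligned_sequences}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    scores = []
    for column in zip(*aligned_sequences):
        most_common_count = Counter(column).most_common(1)[0][1]
        scores.append(most_common_count / len(column))
    return scores


# Invented toy alignment, one sequence per hypothetical species.
toy_alignment = ["MKVLA", "MKILA", "MRVLG", "MKVLS"]
conservation = column_conservation(toy_alignment)
print(conservation)
```

Positions that score near 1.0 are unchanged across the sequences, which is the pattern expected for functionally important residues; lower scores mark positions where substitutions have been tolerated.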
Evolutionary Relatedness
The number and types of differences that accumulate between genes or proteins of two different species can be used to assess their evolutionary relatedness and the amount of time since they diverged from a common ancestor. Such studies, termed "molecular systematics," can be used to show that humans are more closely related to chimps than to gorillas, for instance, and to estimate how long ago these lineages split.
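Counting differences between aligned sequences can be sketched in a few lines. The example computes the proportion of differing sites (often called the p-distance) and applies the Jukes-Cantor correction, one standard adjustment for multiple substitutions at the same nucleotide site. The three sequences and their species labels are invented for illustration, and real studies use much longer alignments and more detailed models.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(a != b for a, b in zip(seq_a, seq_b))
    return differences / len(seq_a)


def jukes_cantor(p):
    """Jukes-Cantor estimate of substitutions per nucleotide site."""
    if p >= 0.75:
        raise ValueError("correction is undefined for p >= 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)


# Invented aligned DNA sequences from three hypothetical species.
species_a = "ATGGCCAAGTTCGACCTGAA"
species_b = "ATGGCCAAGTTTGACCTGAA"
species_c = "ATGGCTAAGTTTGATCTGAG"

for name, (x, y) in {
    "a-b": (species_a, species_b),
    "a-c": (species_a, species_c),
    "b-c": (species_b, species_c),
}.items():
    p = p_distance(x, y)
    print(name, p, jukes_cantor(p))
```

In this toy data the smallest distance is between species a and b, which would suggest that they share the most recent common ancestor of the three. Converting a distance into a divergence time requires an additional assumption about the rate at which substitutions accumulate.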
Homologous proteins that perform the same function in different species are called orthologs. For example, hemoglobin, a protein that transports oxygen, has a similar amino acid sequence in both horses and dogs. If the predicted amino acid sequence of a newly discovered protein is similar to that of a known protein in another species, researchers can make guesses about the function of the newly discovered gene. If the sequence of a newly discovered protein were similar to that of hemoglobin, one might guess that the new protein is able to bind oxygen and function in transporting it. In this way, orthologs help researchers learn about the functions of newly discovered genes.
Natural selection acts against harmful mutations in critical genes. Gene duplication, however, creates extra copies of a gene, and because these copies are less critical they are freer to acquire mutations. The resulting members of a gene family are known as paralogs. Researchers look for paralogs in order to find proteins with new abilities. Cytokine genes, for example, are all derived from the same ancestral gene and share common sequence motifs, yet they fill a variety of roles in the immune system. New members of the cytokine family might be valuable tools for fighting disease. Just as species diverge and fill new biological niches, genes become duplicated and acquire new functions. On a molecular scale, the evolution of the genome reflects the evolution of all living things.
Internet Resource
National Center for Biotechnology Information. http://www.ncbi.nlm.nih.gov.
—Sandra G. Porter