STAN 3

From Micro-Evolution to Macro-Evolution:   Beneficial Mutations, the Pace of Evolution, and Increasing Genome Complexity

PREFACE (for blog) to Letter “STAN 3”:  As noted in the letter below, Stan had continued our correspondence with a letter in October, questioning the legitimacy of extrapolating from the small mutations observed in laboratory experiments (which typically span only a few years or decades) to the huge transformations implicated in macroevolution over billions of years. That seemed a fair question, so I did a lot of reading in the area, going as much as possible to primary sources, so I could understand the context of whatever data or conclusions were presented. This turned into a multi-month endeavor, issuing in this letter which is something like a review article. The treatment of beneficial mutations here should be thorough enough to put to rest claims like “Mutations are not really beneficial” or “Mutations only destroy information”.  A reasonably good match was found between the rates of genetic and morphological changes observable today, and rates of change over the past millions of years, as inferred from the fossil record. The increase in genome size and complexity over the course of evolution is understandable in the context of gene duplication and other insertion mutations. 

I hope this letter proves helpful to readers interested in the plausibility of macroevolution. The contents are:

SECTION 1. BENEFICIAL MUTATIONS: DEFINITION AND FREQUENCY

Fitness Depends on Environment

Some Mutants Can Maintain Wild-Type Vigor

Frequency of Beneficial Mutations

Lenski’s E. coli Long-Term Evolution Experiment

Summary on Beneficial Mutations

SECTION 2: THE PACE OF EVOLUTION: MICRO VS. MACRO

Rate of Morphological Changes

Current and Historical Mutation Rates

Rate of Adaptive Genetic Changes

Novelty in Evolution

Summary on Rates of Change of Phenotype and Genotype

SECTION 3.  INCREASING GENOME SIZE AND COMPLEXITY

Types of Gene Duplication and Subsequent Development

Examples of Recent Beneficial Gene Duplications

Historical Reconstructions of Gene Duplication and Modification

Role of Regulatory Elements in Evolution

Analogies Between Human Language and Genomes

Parable of the Two Engineers

Summary on Increasing Genome Size and Complexity

CLOSING THOUGHTS

**********************************************************************************

Hello Stan,                                                                                           July, 2009

We have established a stimulating line of correspondence now.  I think it started with your letter of  March of last year, where you critiqued Francis Collins’ The Language of God, and kindly enclosed a copy of John Sanford’s Genetic Entropy and the Mystery of the Genome.

I wrote back in April, supplying some examples of beneficial mutations from Ken Miller’s Finding Darwin’s God, citing (from The Language of God) some examples of genomic features in mice, chimps, and humans that suggest common ancestry, and noting that in general, the sequence of fossils in the rock layers (especially the reptile-to-mammal transition) conforms to evolutionary expectations.

You invited comments on your anti-evolution packet, leading to my letter in May which gave additional examples of beneficial mutation and noted that there is no reason to suppose that the modest mutational changes observed in short term experiments cannot be extrapolated over time to larger macroevolutionary changes. Some other topics included speciation, thermodynamics, the fossil record, cytochrome-c comparisons, and the age of the earth.

In your long and thoughtful letter of October of last year, you called attention to the micro-nature of most observed (recent) beneficial mutations, and questioned whether it is reasonable to extrapolate from these mutations to large changes, especially to large increases in genetic information.  You also referred to Michael Behe’s claim to have detected a limit to evolution, and to John Sanford’s claim that beneficial mutations must always be overwhelmed by harmful mutations, at least for non-microbes.

I will offer some thoughts that relate to your October letter, organized as listed below.  There are a number of references listed, since I decided to go to the primary literature, instead of relying on reviews by other writers. In most cases an internet link is included in the references. The main issue addressed is whether the changes in genes and in physical structure/function we can observe in controlled scientific studies (which typically span a few months or few decades) are consistent with the overall expectations of the theory of evolution as it is generally understood today.

This is an assessment of the self-consistency of evolution (not a proof of evolution), so we will provisionally accept the time-scales and inferred rates of change in genotypes and phenotypes that are part of the current evolutionary thinking. This is not circular reasoning: the faunal successions in the rock layers were established before Darwin’s 1859 publication, and the antiquity of the rock layers is established by radioisotopic dating and other means, independent of assumptions of organic evolution.  Here we are simply asking whether it is reasonable to extrapolate the small changes that can be observed in today’s short-term experiments back in time to explain the larger changes and transitions which are believed to have occurred over billions of years. That seems to be the core of the concerns about evolution expressed in your letter.

SECTIONS BELOW

SECTION 1. BENEFICIAL MUTATIONS: DEFINITION AND FREQUENCY

SECTION 2: THE PACE OF EVOLUTION: MICRO VS. MACRO

SECTION 3.  INCREASING GENOME SIZE AND COMPLEXITY

SECTION 1. BENEFICIAL MUTATIONS: DEFINITION AND FREQUENCY

In our previous correspondence, we have discussed a number of observed mutations which evolutionists cite as examples of beneficial mutations.  These include:

–  The well-known capacity of microbes and insects to evolve resistance to antibiotics and insecticides.

– The “nylon bug” – – a bacterium which evolved the ability to metabolize the nylon-related waste chemicals in a pond by a Japanese chemical plant. [1]

– Barry Hall’s “lac bug” – – a bacterium which evolved the ability to metabolize lactose, after its normal means of utilizing lactose had been removed by genetic engineering. [2]

– Several further examples of the evolution of new metabolic pathways [3]

You raised a number of objections to the assertion that these observed adaptations support the concept of macroevolution. These objections include:

(a) These mutations often appear in specific settings. As you put it in your letter:

Evolutionists tend to allow the definition of the term “beneficial” to rest on “environmental conditions”. I believe, as mentioned earlier, that most of the mutations described as “beneficial” may permit “survival” in a highly selective environment.

 (b) These mutations often involve tradeoffs – – the mutant organisms that thrive in the new environment are typically less competitive in their original setting. You interpret that to mean that the new strains are “genetically inferior to the wild type.”

(c) Beneficial mutations are very rare:

 The vast majority of mutations are either neutral, mildly harmful, or seriously deleterious…No scientific “case” is ever made using “rare exceptions” as foundational data. Using the mutation/selection mechanism to underpin macroevolution does just that.

 (d)  The observed mutations are very modest in terms of both genotype and phenotype. They typically involve one or two point substitutions which may affect a metabolic pathway but don’t generate some new complex organ. It’s not immediately obvious that these mutations (microevolution) could enable the (macro)evolution of animals from single-celled ancestors.

(e)  These observed mutations may not add new information to the genome, whereas an increase in genetic complexity is implied in macroevolution, especially in the origin of eukaryotes and animals/humans.

In the present section (“Beneficial Mutations”), I will offer some observations on objections (a), (b), and (c). Items (d) and (e) are addressed in later sections.

Fitness Depends on Environment

First, as to the propriety of defining fitness in terms of environment – – this approach is accepted in the world of biology as the only rational thing to do.  Anywhere I looked to find commentary on the subject, researchers noted that fitness can only be defined or measured with reference to some particular environment. Here is one such statement [4]:

Fitness describes the overall ability of an organism to survive and reproduce and can be measured in bacteria by calculating growth rate. The environments in which a bacterium must survive have been shown to affect fitness values, such that the same bacterium can have different fitness values in different environments.

There is no reason to require that a mutant which has become the most fit in environment Y should also remain the most fit in environment X. There is no “standard” environment (temperature, atmosphere, pH) in which every organism should be rated. Is an E. coli in an animal gut more “fit” than a thermophile living in a hot spring? Neither one would likely survive for long if they swapped environments.

An original strain may well be superior in the original environment, while the mutant is superior in the new environment. That does not make the mutant “genetically inferior.”  If the mutant strain multiplies and occupies the new environment, it is in fact “thriving”, not merely “surviving.”

From the viewpoint of mutation/selection being a success, all that matters is the current environment to which the population is exposed [5]:

Adaptation can be quantified by measuring changes in fitness in the experimental environment, in which fitness reflects the propensity to leave descendants….Of course, relative fitness depends not only on the genotypes but also on the environment in which it is measured. As discussed later, it is possible to test the specificity of adaptation that occurred in an evolution experiment by measuring fitness in different environments. Unless otherwise specified, however, it should be understood that fitness is measured under conditions that are similar or identical to those that prevailed during an evolution experiment.

This is not some sleight of hand by evolutionists. It is a practical truism. In the wild, environments do change (climates get hotter, vegetation changes, new predators appear), and organisms must adapt to the new circumstances. If these new circumstances become entrenched, then fitness in the old environment becomes irrelevant. In fact, changes in environment are seen in evolutionary theory as a key driver for evolution. Insecticide-resistant insect mutants are now common in “the wild”, due to the widespread use of insecticides.

Using computer simulations of natural evolution, Kashtan et al. [6] found that

Evolution toward goals that change over time can, in certain cases, dramatically speed up evolution compared with evolution toward a fixed goal. The highest speedup is found under modularly varying goals, in which goals change over time such that each new goal shares some of the subproblems with the previous goal. The speedup increases with the complexity of the goal: the harder the problem, the larger the speedup. Modularly varying goals seem to push populations away from local fitness maxima, and guide them toward evolvable and modular solutions. This study suggests that varying environments might significantly contribute to the speed of natural evolution.

Kimura [7] summarized the long story of vertebrate evolution as indicated by the fossil record over the past 500 million years, from jawless fishes, through jawed fishes, amphibians, reptiles, early mammals and on to modern mammals.  He calls attention to the influence of environmental drivers in the evolutionary process: “The history of vertebrate evolution summarized above clearly shows that ecological opportunities play an essential role in rapid phenotypic evolution, and that progressive evolution is nearly always brought about as a result of organisms’ response to environmental challenge.”  For instance, the proliferation of land vegetation 300-400 million years ago facilitated the spread of fully terrestrial reptiles; the sudden extinction of dinosaurs (possibly due to an asteroid impact) at the end of the Cretaceous (65 million years ago) gave scope for mammals to take over on land; the emergence of bipedal hominids in eastern Africa was likely driven in part by the disappearance of trees in that region as rainfall patterns shifted.

Some Mutants Can Maintain Wild-Type Vigor

For the reasons discussed above, it would not tarnish the credibility of evolution if every adaptation to a new environment involved loss of fitness in the old environment. It is well-known that mutations which are beneficial in one environment may be detrimental in another [5]:

In principle, several mechanisms can produce tradeoffs. The simplest mechanism is antagonistic pleiotropy (AP), in which a particular mutation that is beneficial in one environment is harmful in the other. A second mechanism is mutation accumulation (MA), in which mutations accumulate by drift in genes the products of which are not used in one environment but are useful in another. These mutations are, therefore, neutral in the environment in which they were substituted, but deleterious in the other environment. The third mechanism that can produce tradeoffs is the independent adaptation of organisms to alternative environments. If each of two populations substitutes a mutation that is beneficial in one environment and neutral in the other, then each population will be more fit in one environment than the other. A population does not suffer a decline in fitness relative to its progenitor under this third mechanism, unlike the first two. Under all three mechanisms, the net effect is a tradeoff in which different genotypes, populations or species are maximally fit in alternative environments. Although tradeoffs are widespread in nature, the underlying mechanisms are rarely known.

However, this tendency towards fitness tradeoffs is not universal.  While the mutations which confer antibiotic resistance on bacteria often leave them less competitive in a permissive (i.e. antibiotic-free) environment, this is not always true. It is not uncommon for these bacteria to continue to evolve, and to pick up compensatory mutations which restore their vigor. In some cases, even the initial antibiotic-resistant mutant is not disadvantaged in the permissive environment.  These “superbugs” do not simply die out when no antibiotics are present. This is a serious medical problem, not a contrived genetics experiment.

Bjorkman et al. [8] cited various studies showing that antibiotic resistance mutants can be or become competitive with non-resistant strains:

It has been shown in Escherichia coli that carriage of resistance genes on a plasmid is associated with a decreased growth rate, and that these strains can accumulate chromosomal compensatory mutations that, by an unknown mechanism, compensate for the growth rate decrease (2, 3). Likewise, it has been shown that slow-growing streptomycin-resistant mutants of E. coli can accumulate compensatory mutations that restore rapid growth under laboratory conditions without affecting the resistance (4). Interestingly, these compensatory mutations appear to create a genetic background in which the streptomycin-sensitive revertants have a strong selective disadvantage, implying that it would be difficult for an evolved resistant strain to become sensitive even in the absence of the antibiotic (5). There are also animal data which indicate that tetracycline-resistant E. coli persist in pigs long after the antibiotic has been removed, suggesting that the added burden of this particular resistance in vivo is in fact small (6).

In their experiments with bacteria hosted by mice, Bjorkman et al. [8] found that some antibiotic-resistant salmonella start off competitive with wild-type strains, and that many of the antibiotic-resistant strains that were initially noncompetitive became competitive (virulent) via compensatory mutations:

In this study we examined the fitness of antibiotic-resistant S. typhimurium in mice. Our results indicate that most resistant mutants are less virulent than the wild type. However, the avirulent mutants rapidly accumulate various types of compensatory mutations that restore virulence to wild-type levels without loss of high-level resistance.

Of seven resistant mutants examined, six were avirulent and one was similar to the wild type in competition experiments in mice. The avirulent-resistant mutants rapidly accumulated various types of compensatory mutations that restored virulence without concomitant loss of resistance. Such second-site compensatory mutations were more common than reversion to the sensitive wild type. We infer from these results that a reduction in the use of antibiotics might not result in the disappearance of the resistant bacteria already present in human and environmental reservoirs. Thus, second-site compensatory mutations could increase the fitness of resistant bacteria and allow them to persist and compete successfully with sensitive strains even in an antibiotic-free environment.

Kassen and Bataillon [9] took a wild-type Pseudomonas fluorescens bacterium, and exposed it to the antibiotic nalidixic acid in multiple populations. A total of 673 separate strains of antibiotic-resistant P. fluorescens were obtained. These were inferred to all involve single point mutations, with an estimated frequency of 2.4 x 10^-9 beneficial mutations per cell division. 665 of these antibiotic-resistant strains were assayed for fitness in the presence and in the absence of the antibiotic, and compared with the parent wild-type. In the presence of antibiotic, the wild-type could not survive, so the mutants were all judged (by cell density measurements) more fit. In the permissive environment (i.e. no antibiotic), the mutants showed a distribution of fitness effects. Most mutants were less fit than the wild-type, but 28 of them showed fitness greater than the wild-type.  The researchers did a detailed (8 repeats) reassay on the top 40 antibiotic-resistant mutants, and in this test identified 18 mutants which were more fit than the parent wild-type in the absence of antibiotic.  Thus, 100% of these selected mutants were superior to the parent in the presence of antibiotic, and at least 18 out of 665 mutants (2.7%) were superior in the absence of antibiotic as well.

There are two main ways to study the impact of mutations. One approach, used in the studies above, is to let random natural mutations accumulate in a population (with or without natural selection operating), and observe the changes in phenotype. There are several practical problems with this method. First, it can be difficult to determine what genetic changes took place. It is prohibitively expensive to run lots of complete genome sequences on a population. Second, a number of different mutations may be detected, and it may not be clear which one or ones caused the observed effects on the phenotype.

This leads to a second approach to studying mutations, which is to deliberately introduce known mutations, then observe their effect. This approach allows for just one, known mutation at a time. Note that this is not genetic engineering or intelligent design. A genetic engineer makes a genetic change, knowing or believing it will have a particular effect. Also, the genetic changes which a genetic engineer makes are typically not like the mutations usually seen in nature, but often involve transplanting large, sometimes foreign sections of DNA. In contrast, the genetic researcher in these mutational studies makes changes without knowing their effect ahead of time, and typically these changes are of the type that are usually seen in nature (e.g. single point substitutions or indels).

Sander et al. [10] used defined Mycobacterium smegmatis laboratory mutants to study the effects of point mutations on fitness. While most mutations which conferred antibiotic resistance reduced fitness in the absence of antibiotic, several effectively no-cost antibiotic-resistant mutations were identified. This study concludes that the virulent antibiotic-resistant strains of this bacterium which appear clinically are likely derived from these low-cost/no-cost initial mutations, as opposed to evolving from initially high-cost mutations via further compensatory mutations.

Similarly, Luo et al. [11] report that an antibiotic-resistant mutant can sometimes be the more fit strain, even apart from compensatory mutations:

Campylobacter jejuni, a major foodborne human pathogen, has become increasingly resistant to fluoroquinolone (FQ) antimicrobials. By using clonally related isolates and genetically defined mutants, we determined the fitness of FQ-resistant Campylobacter in chicken (a natural host and a major reservoir for C. jejuni) in the absence of antibiotic selection pressure. When monoinoculated into the host, FQ-resistant and FQ-susceptible Campylobacter displayed similar levels of colonization and persistence in the absence of FQ antimicrobials. The prolonged colonization in chickens did not result in loss of the FQ resistance and the resistance-conferring point mutation (C257 → T) in the gyrA gene. Strikingly, when coinoculated into chickens, the FQ-resistant Campylobacter isolates outcompeted the majority of the FQ-susceptible strains, indicating that the resistant Campylobacter was biologically fit in the chicken host. The fitness advantage was not due to compensatory mutations in the genes targeted by FQ and was linked directly to the single point mutation in gyrA, which confers on Campylobacter a high-level resistance to FQ antimicrobials.

These four studies, plus similar work cited in these papers, disprove the assertion that essentially all mutations involve loss of overall function.

Frequency of Beneficial Mutations

There is a wide range of literature estimates of the frequency of beneficial mutations. Several estimates are listed below. Some of the lower frequencies were for populations bred under competitive conditions. In such cases, the beneficial mutations that are detected by fixation are typically a small percentage of the total beneficial mutations that occurred. Most beneficial mutations have a modest effect and will therefore be eliminated by genetic drift and/or by clonal interference (i.e. competition among clones carrying different beneficial mutations). Thus, competitive breeding experiments tend to greatly underestimate the actual number of beneficial mutations.
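To give a feel for how easily a modest beneficial mutation can be lost by drift, here is a minimal sketch in Python. It is not taken from any of the cited papers. It assumes a textbook branching-process model in which each cell carrying the new mutation leaves a Poisson-distributed number of descendants with mean 1 + s, where s is the selective advantage. The function name and the example values of s are my own illustrative choices. Under these assumptions the chance of escaping early loss comes out close to 2s (Haldane's classic approximation), so a mutation with a 1% advantage is lost about 98% of the time, even before clonal interference is considered.

```python
import math


def survival_probability(s, tolerance=1e-15, max_iterations=1_000_000):
    """Chance that one new mutant lineage escapes loss by genetic drift.

    ASSUMPTION: a simple branching process in which every mutant cell
    leaves a Poisson-distributed number of descendants with mean 1 + s.
    Clonal interference and finite population size are ignored.
    """
    # q is the probability that the lineage eventually dies out. It is the
    # smallest solution of q = exp((1 + s) * (q - 1)), which we reach by
    # iterating upward from q = 0.
    q = 0.0
    for _ in range(max_iterations):
        q_next = math.exp((1.0 + s) * (q - 1.0))
        if abs(q_next - q) < tolerance:
            break
        q = q_next
    return 1.0 - q_next


# Illustrative selective advantages (hypothetical values, not from the papers).
for s in (0.01, 0.05, 0.10):
    p = survival_probability(s)
    print(f"s = {s:.2f}: survival probability = {p:.4f} (2s = {2 * s:.2f})")
```

The point of the sketch is only qualitative: most beneficial mutations of small effect disappear by chance, so counting fixed mutations undercounts the beneficial mutations that actually arose.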

We noted above that Kassen and Bataillon estimated a frequency of 2.4 x 10^-9 (fixed) beneficial mutations per cell division, for Pseudomonas fluorescens bacteria exposed to antibiotic in competitive populations [9].  Imhof and Schlotterer [12] directly assayed the alleles in ten parallel cultures of E. coli, where each culture was propagated for 1000 generations in a non-restrictive medium. From the 66 adaptive events identified, they estimated the rate of (fixed) beneficial mutations to be 4 x 10^-9 per cell per generation.

In a study of mutations in E. coli, Perfeito et al. [13] used small population sizes to minimize the effect of clonal interference. They found about 10^-5 beneficial mutations per genome per generation, and that 1 out of every 150 new mutations was beneficial:

We measured the genomic mutation rate that generates beneficial mutations and their effects on fitness in Escherichia coli under conditions in which the effect of competition between lineages carrying different beneficial mutations is minimized. We found a rate on the order of 10^-5 per genome per generation, which is 1000 times as high as previous estimates, and a mean selective advantage of 1%. Such a high rate of adaptive evolution has implications for the evolution of antibiotic resistance and pathogenicity.

In most of the studies discussed so far, the organisms exist in laboratory populations of typically thousands to many millions, and the organisms breed under some sort of competitive conditions where natural selection can operate to weed out the deleterious mutants and reward the mutants who are better suited to the experimental conditions. A different type of experiment is the “mutation accumulation” (MA) study, where the breeding lines are frequently put through bottlenecks, such as randomly selecting a breeding pair or single organism. In this case, mutations just accumulate without being weeded out or rewarded. In most such studies, the average fitness of the populations declines. This is consistent with the view that most mutations are deleterious or neutral, and that beneficial mutations are relatively rare. However, rare does not mean absent. In some cases, the fraction of beneficial mutations measured by these experiments is appreciable.

For instance, Joseph and Hall [14] write:

We performed a 1012-generation mutation-accumulation (MA) experiment in the yeast, Saccharomyces cerevisiae. The MA lines exhibited a significant reduction in mean fitness and a significant increase in variance in fitness. We found that 5.75% of the fitness-altering mutations accumulated were beneficial. This finding contradicts the widely held belief that nearly all fitness-altering mutations are deleterious. The mutation rate was estimated as 6.3 x 10^-5 mutations per haploid genome per generation and the average heterozygous fitness effect of a mutation as 0.061. These estimates are compatible with previous estimates in yeast.

To firm up these results, they performed a follow-up study [15], extending the MA experiment for an additional 1050 generations and re-estimating the mutation parameters. From these augmented experiments they estimate that 13% of the mutations accumulated during this study are beneficial, that the genome-wide mutation rate is 13.7 x 10^-5 mutations per haploid genome per cell generation, and that the absolute value of the average heterozygous effect of a mutation is 7.3%.

In another MA study, Dickinson [16] accumulated spontaneous mutations for ~ 4800 generations in 48 lines of yeast which were protected from effective selection by frequent passage through single-cell bottlenecks. The fitness of all lines was measured periodically. Figure 1 in the paper shows fitness results from the start of the experiment, and at three more times during the study, for each line. Most of the changes in fitness were downward, and the average fitness declined by 5% over the course of the experiment. However, a full 25% of the recorded changes gave an increase in fitness. Many of these improvements appeared after a prior drop in fitness, but in several of these MA lines, the fitness exceeded that of the parent by a small amount (2-4%). When 24 large (to allow natural selection) populations were allowed to adapt to the assay conditions, the average fitness improvement across all lines was 8%, and the maximum was 12%. Dickinson notes that these results provide empirical confirmation of the expectation from evolutionary theory that, while nearly all mutations of large effect will be deleterious, a significant fraction of small-effect mutations may be beneficial.

Shaw et al. [17] note that, while most MA experiments show an average decline in fitness, most of these studies do not distinguish the distribution of deleterious and beneficial effects. Thus (as illustrated with the Dickinson study above), there may be an appreciable number of beneficial mutations along with the majority deleterious mutations.  Shaw et al. [18] conducted a MA study of the plant Arabidopsis thaliana which was initiated from a single inbred founder, where 120 lines were established and advanced 17 generations by single-seed descent. They assayed reproductive traits as measures of fitness. For three of these fitness measures (mean number of seeds per fruit, number of fruits, and dry mass of the infructescence), the means did not shift appreciably among generations. That means that for this experiment approximately 50% of the observed mutational changes were beneficial, and 50% were deleterious.

Colby [19] notes:

One example of a beneficial mutation comes from the mosquito Culex pipiens. In this organism, a gene that was involved with breaking down organophosphates – common insecticide ingredients -became duplicated. Progeny of the organism with this mutation quickly swept across the worldwide mosquito population. There are numerous examples of insects developing resistance to chemicals, especially DDT which was once heavily used in this country. And, most importantly, even though “good” mutations happen much less frequently than “bad” ones, organisms with “good” mutations thrive while organisms with “bad” ones die out.

If beneficial mutants arise infrequently, the only fitness differences in a population will be due to new deleterious mutants and the deleterious recessives. Selection will simply be weeding out unfit variants. Only occasionally will a beneficial allele be sweeping through a population. The general lack of large fitness differences segregating in natural populations argues that beneficial mutants do indeed arise infrequently. However, the impact of a beneficial mutant on the level of variation at a locus can be large and lasting. It takes many generations for a locus to regain appreciable levels of heterozygosity following a selective sweep.

Colby brings out a couple of relevant points. First, it was a change in the environment which drove this change in the population genetic composition. Second, the distribution of differences in natural populations is consistent with the picture which emerges from laboratory studies: effective beneficial mutations are rare, but not so rare as to preclude populations from significant adaptive improvements in fitness.

Lenski’s E. coli Long-Term Evolution Experiment

Richard Lenski’s group at Michigan State University has been running a long-term evolution experiment on asexual E. coli since 1988. Twelve populations were started from a single cell. Each population is kept in a flask at 37 C with 10 ml of a growth medium which contains only enough glucose to support about 5 x 10^8 cells per culture. Each day, 0.1 ml of the previous day’s culture is transferred into 9.9 ml of fresh growth medium, and the cells reproduce up to the limit of the nutrients. The result is about 6.64 generations per day, or 2400 generations per year. Samples of each population are cryo-preserved every 500 generations. These preserved cells can be revived and further studied as needed.  Fitness is periodically assayed by measuring growth rates versus an ancestral strain.
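The figure of 6.64 generations per day follows directly from the daily 100-fold dilution: to regrow to the same density, the cells must double log2(100) times. A short sketch of that arithmetic in Python, using only the volumes quoted above (the variable names are mine):

```python
import math

# Volumes from the description of the daily transfer.
transfer_volume_ml = 0.1
fresh_medium_ml = 9.9

# 0.1 ml of old culture in a 10 ml total is a 100-fold dilution.
dilution_factor = (transfer_volume_ml + fresh_medium_ml) / transfer_volume_ml

# Regrowing to the previous density takes log2(dilution) doublings.
generations_per_day = math.log2(dilution_factor)
generations_per_year = generations_per_day * 365

print(f"dilution factor:      {dilution_factor:.0f}")
print(f"generations per day:  {generations_per_day:.2f}")
print(f"generations per year: {generations_per_year:.0f}")
```

The yearly total comes out slightly above 2400, which is the rounded figure usually quoted for the experiment.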

Over the first 10,000 generations, each of the twelve populations demonstrated significant increases in fitness. [20] The fitness tended to increase in a series of steps, indicating the successive sweeps of individual beneficial mutations through a population every several hundred generations or so. Different populations showed different steps at different times, showing a variety of approaches towards increased fitness. The rate of change was fastest in the first 2000 generations, and then it slowed appreciably.

Elena and Lenski [5] review the genetic changes in the populations through the 20,000 generation mark. By this point, the average fitness of the populations was about 1.7 times that of the ancestral population. Fitness was still increasing, but at an ever-diminishing rate. A prior estimate of the mutation rate for E. coli is 5 x 10^-10 per base pair per generation, while the results from their long-term experiment suggest more like 1 x 10^-10. The genome length for E. coli is about 5 x 10^6 base pairs. Thus, the total number of point mutations expected in each population over 20,000 generations is 3 x 10^8 to 1.5 x 10^9, so each population has had most point mutations represented many times over. In addition, other mutations (insertions, deletions, inversions) presumably occurred.
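The range of 3 x 10^8 to 1.5 x 10^9 total point mutations can be roughly reproduced with a few lines of Python. The mutation rates, genome length, and generation count are the figures quoted above. The effective population size of about 3.3 x 10^7 cells is my assumption, chosen because it reproduces the quoted range; it is not stated in this letter, and the names below are illustrative only.

```python
# Figures quoted in the paragraph above.
rate_per_bp_low = 1e-10    # per base pair per generation (long-term experiment)
rate_per_bp_high = 5e-10   # per base pair per generation (prior estimate)
genome_length_bp = 5e6
generations = 20_000

# ASSUMPTION (not stated in the letter): an effective population size of
# about 3.3e7 cells, which allows for the daily 100-fold bottleneck.
effective_population = 3.3e7


def total_point_mutations(rate_per_bp):
    """Expected point mutations arising in one population over the run."""
    per_genome_per_generation = rate_per_bp * genome_length_bp
    return per_genome_per_generation * effective_population * generations


low = total_point_mutations(rate_per_bp_low)
high = total_point_mutations(rate_per_bp_high)

# Each site can change to any of 3 other bases.
possible_point_mutations = 3 * genome_length_bp

print(f"expected point mutations: {low:.1e} to {high:.1e}")
print(f"possible point mutations: {possible_point_mutations:.1e}")
print(f"average coverage (low estimate): {low / possible_point_mutations:.0f}x")
```

Even at the lower mutation rate, the expected number of mutations exceeds the number of possible single-base changes by more than tenfold, which is the sense in which most point mutations were sampled many times over.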

In each population, they estimated that 10 to 20 beneficial mutations had achieved fixation, while the total number of beneficial mutations (fixed and unfixed) was many times higher. Regarding total fixed mutations, they state: “Combining beneficial and neutral substitutions, it seems probable that fewer than 100 mutations could have been substituted in each population, out of the billion or so that occurred. Thus, the bacterial genomes will have changed very little in the broad scheme of things, which is reassuring given that a decade is a mere ‘drop in the bucket’ in terms of molecular evolution.” Using the lower estimate of 10 fixed beneficial mutations out of a total of a billion mutations, only one out of every 10^8 mutations was a beneficial mutation which was strong (and lucky) enough to get fixed, despite clonal interference and genetic drift. However, even that minuscule proportion of fixed beneficial mutations was sufficient to reliably ratchet up the fitness of the population.

Barrick et al. [67] have identified the specific mutations that accumulated in the genome of one of these twelve populations, up through 20,000 generations. They found that mutations continued to appear which by themselves were beneficial, but that the rate of increase in population fitness dropped with time. Clonal interference (new beneficial mutations must compete with previous beneficial mutations) is likely part of the explanation here. In several of the Lenski E. coli populations, strains have arisen with higher than normal mutation rates. The appearance of hyper-mutable strains has been observed elsewhere for populations of bacteria under stress. [68]

The rate of adaptation in Lenski’s main evolution experiment slowed dramatically after the first 2,000 generations.  In 1994, after 10,000 generations, the populations all seemed to be approaching an asymptote in fitness.  Lenski and Travisano [20] admitted that they had not seen much of a step change in the bacteria, but held out hope that it might yet occur:

We saw no compelling evidence for any more radical punctuation, such when one adaptive change sets off a cascade of further changes…Such an event might have been manifest by a period of renewed, rapid evolutionary change in a population that had previously been at or near stasis. Perhaps 12 populations and 10,000 generations were too few to see such rare events.

Many years later, their faith in evolution was rewarded. After about 33,127 generations, one of the twelve populations displayed significantly elevated turbidity, which continued to rise for several days [21]. The Wikipedia entry [22] has recent color photos of the twelve flasks, showing an obvious difference for flask A-3. Clearly, the cell density is much higher in that population, which after nearly two decades finally evolved the ability to metabolize citrate.

Citrate is present along with glucose in the growth medium used in the Lenski experiment, but normally E. coli is unable to make use of it. The metabolic machinery for utilizing citrate exists in E. coli, and it can ferment citrate under anaerobic conditions in the presence of a reducing substrate, but it cannot transport citrate under oxic conditions. Atypical strains of E. coli have been found that can grow aerobically on citrate, but there is reason to believe they acquired this ability by plasmids transferred from other species. There is only one other documented case of spontaneous mutation in E. coli to allow citrate utilization, which was inferred to involve some complex mutational pathway [23]. So it appears to be difficult for E. coli to develop this ability.

How the citrate ability arose in the Lenski study is instructive. Blount et al. [21] discuss many experiments designed to eliminate or confirm various hypotheses:

 No population evolved the capacity to exploit citrate for >30,000 generations, although each population tested billions of mutations. A citrate-using (Cit(+)) variant finally evolved in one population by 31,500 generations, causing an increase in population size and diversity. The long-delayed and unique evolution of this function might indicate the involvement of some extremely rare mutation. Alternately, it may involve an ordinary mutation, but one whose physical occurrence or phenotypic expression is contingent on prior mutations in that population. We tested these hypotheses in experiments that “replayed” evolution from different points in that population’s history.

For instance, they resurrected (thawed) cells from previous generations in that population and let them evolve, to see whether these parallel lines would also develop the ability to metabolize citrate. A few of the lines restarted from generation 20,000 evolved the ability to utilize citrate, whereas none of the lines restarted from generation 15,000 could.

It seems that some key but essentially neutral mutation occurred between generation 15,000 and 20,000. This potentiating mutation (or complex of mutations) had no distinctive effect on fitness, but it enabled some further mutation to produce a weakly effective Cit+ variant by generation 31,500. This variant constituted about 0.5% of the cells in the population at generation 31,500, and rose to 15% and 19% at generations 32,000 and 32,500. At generation 33,000, the dominance of the Cit+ variant plummeted to 1.1%, presumably because the Cit- subpopulation produced a beneficial mutation which allowed it to out-compete the emerging Cit+ subpopulation. However, after a few hundred more generations the optical turbidity of the population took a step change to a higher level, indicating some further mutation at that point had increased the viability of the Cit+ subpopulation to a very high level.

Lenski’s group has also done research with introducing artificial mutations to study their effects. Elena et al. [24] took an E. coli cell from one of the populations which had been evolved for 10,000 generations, and constructed 226 mutants. Each mutant contained a single random insertion of one of three transposons. Each of these transposons encoded resistance to one of three different antibiotics. A phage was used as the delivery vehicle. These mutants were assayed for fitness relative to a common competitor. None of the mutations had a significant positive effect on fitness in their standard glucose-limited environment, whereas 80% had a significant negative effect (average 3% fitness reduction).  The starting cell had already been heavily adapted to the particular test environment to the point where the millions of naturally-occurring mutations were conferring little if any further benefit, so it is not surprising that none of these few hundred random mutations tested here were beneficial.

Out of these 226 mutants, Remold and Lenski [25] randomly chose 9 mutants from each of the three antibiotic resistance classes. Of these 27 strains, one was eliminated due to contamination. The remaining 26 strains were evaluated for fitness in 4 different assay environments: glucose medium at 28 C and at 37 C, and maltose medium at 28 C and at 37 C. In glucose, the mutants showed fitness within about 2% of the progenitor, whereas in maltose, there was a much wider spread of fitness. In maltose, most of the 26 mutations were deleterious, but at least 3 (i.e. 12% of the total 26 mutations represented) were significantly beneficial. This is another example of beneficial mutations not being vanishingly rare, and again shows the importance of environment in defining whether or not a mutation is beneficial.

Summary on Beneficial Mutations

Logically, the fitness effect of a mutation can only be determined in some specified environment. Many organisms are found in more than one “wild” environment.  Most wild environments can change with time, and these changes are likely to be key drivers for evolution. Thus, there is no substance to the objection that most observed beneficial mutations are in defined laboratory settings. These experimental conditions simply make it possible to discern the beneficial mutations, which otherwise may be impossible to detect.

The mutant organisms which thrive in a new environment are often less fit in the old environment, but there is no necessity in evolution to require otherwise. This answers the objection that mutations often involve tradeoffs. In some important cases, the mutant organisms are (or can further evolve to be) competitive in the old environment as well as the new environment. This directly contradicts the notion that effectively all mutations are degenerative.

Finally, beneficial mutations are generally rare, yet that is no problem for evolution: they are sufficiently numerous to effect genetic change in populations, as demonstrated in all kinds of experiments. Also, in some studies cited above, beneficial mutations are not particularly rare. It is entirely appropriate to use rare beneficial mutations to explain evolutionary change. Evolution does not require a high percentage of beneficial mutations, since the absolute number of mutations is astronomical over time and space, and since natural selection tends to favor their fixation over time. A rare beneficial mutation can sweep through a population, following well-known laws of population genetics. Many concrete instances are cited above and below of such sweeps. Thus, to object to evolution on the basis that “The vast majority of mutations are either neutral, mildly harmful, or seriously deleterious” reflects a categorical confusion of “rare” with “absent”, or a lack of understanding of the mathematics of population genetics.
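To make the “laws of population genetics” remark concrete, here is a minimal sketch of a textbook deterministic selection model for a haploid, asexual population. It is not taken from any of the studies cited in this letter. The selection coefficient `s`, the starting frequency, and Haldane’s classic approximation that a new beneficial mutation escapes random loss with probability of roughly 2s are all standard simplifying assumptions, and the model ignores drift and clonal interference.

```python
def generations_to_sweep(s: float, start_freq: float, end_freq: float = 0.99) -> int:
    """Generations for a beneficial allele to rise from start_freq to end_freq.

    Uses the deterministic haploid recursion p' = p(1 + s) / (1 + p*s),
    where s is the fitness advantage of the mutant over the resident type.
    """
    p = start_freq
    generations = 0
    while p < end_freq:
        p = p * (1 + s) / (1 + p * s)
        generations += 1
    return generations

def approx_fixation_probability(s: float) -> float:
    """Haldane's approximation: a new beneficial mutation survives drift with p ~ 2s."""
    return 2 * s

# Hypothetical numbers for illustration: one mutant per million cells.
strong = generations_to_sweep(s=0.10, start_freq=1e-6)
weak = generations_to_sweep(s=0.01, start_freq=1e-6)

print(f"10% advantage: about {strong} generations to reach 99%")
print(f" 1% advantage: about {weak} generations to reach 99%")
print(f"chance a single new 10% mutant escapes loss: {approx_fixation_probability(0.10):.0%}")
```

The point of the sketch is only qualitative: a mutation that starts out in one cell in a million can still take over within a few hundred generations if it confers a sizable advantage, while most individual beneficial mutants are lost by chance before they get going.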

The appearance and fixation of beneficial adaptations are fairly reliable, if the population size and number of generations are high enough. In the Lenski long-term experiments, only about 1 in 100 million mutations in a given population was a beneficial mutation which became fixed, yet that was sufficient to drive each of the twelve populations to a much better fitness in the glucose-limited environment.

SECTION 2. THE PACE OF EVOLUTION: MICRO VS. MACRO

With organisms that reproduce quickly (e.g. in hours) and in large numbers, in a few weeks or years we can run experiments which assess the effectiveness of random mutations plus natural selection to increase the fitness of the organisms in their environment. As discussed above, in these sorts of experiments with microbes, we routinely see these improvements in fitness. However, the magnitude of the changes in the genome is relatively small: often just a couple of nucleotides substituted or some genes duplicated. Likewise, the changes in the physical form (phenotype) that we see due to these mutations are often modest. Does this constitute a problem for evolution?

In this section, I have pulled together some information on the historic rates of change of phenotype and genotype as indicated by the fossil record and comparative genomics. These rates are then compared to the rates of change observed in shorter-term experiments, to see whether the rates of evolution observed in today’s laboratory, if continued for eons, can account for the huge changes in life-forms which appear in the fossil record.

Rate of Morphological Changes

The fossil record shows a succession of life forms, moving from lower (older) to higher (more recent) rock layers.  The ages of the rock layers can be determined by non-biological means, e.g. by radioactive dating of igneous intrusions into sedimentary rock layers. We then can examine the fossils in those layers, and get a sense of the rate of physical changes in populations that take place over time.

For larger organisms that reproduce more slowly (months-decades) and in smaller numbers, the evidence from the fossil record indicates that evolutionary changes proceed relatively slowly. It took something like 50 million years to evolve today’s large horses from different and much smaller ancestors. It is estimated that there are seven successive genera in the branchy lineage from Hyracotherium to present Equus [26]. This gives an average phyletic taxonomic rate of 0.13 genera per million years, or 7.5 million years per genus. Triassic and earlier ammonites evolved at a rate of 0.05 genera per million years. The last common ancestor of hominids and chimpanzees is considered to have lived around 6 million years ago. As far as I know, none of these transitions involved significant whole new organs, yet they still spread over millions of years.

The reptile to mammal transition was a more significant transition, including many changes to the reproductive system and elsewhere, but these were still tetrapods with two eyes and two ears. In the fossil record, this transition is spread over some 100 million years.  The transition from prokaryotes to the first unicellular eukaryotes was huge, but it may have spread over a billion years. Another billion years probably passed before multicellular life appeared. The picture is complicated by the fact that the rate of evolution in the fossils may have varied from time to time and species to species, and by the intrinsic incompleteness of the fossil record.

The average time it takes for a species to attain full reproductive isolation may be around 1 million years. Some examples in Ken Miller’s book Finding Darwin’s God [2] show speciation occurring across 200,000 years for a diatom (p.45), and across some 15,000 years for a snail (pp. 118-121). Given these numbers, it is not reasonable to expect to see many large morphological changes (corresponding to major genetic changes) in laboratory experiments that operate over a few decades, or even over the last 400 years of scientific observation. We might expect to see a few modest speciations, and this is what in fact has been observed for speciation events in recent years: not absent, but few and modest [27]. They are expected to be modest (in the sense of morphological differences between the two new species) because if a species has only recently separated from another species, the two species will necessarily still be quite similar. Only after many millennia of reproductive isolation would we expect to see substantial differences appear.

Historical rates of morphological change are slow in general; for some creatures, they are nearly zero. For instance, the opossum and the crocodile have been largely unchanged for more than 60 million years. The brachiopod sea shell Lingula has hardly changed in 400 million years. Similarly, modern species of bacteria can maintain their identity over thousands or millions of generations.

A quantitative approach to the question of rates of morphological change was taken by Philip Gingerich [28]. One website [29] summarizes Gingerich’s findings:

In 1983, Phillip Gingerich published a famous study analyzing 512 different observed rates of evolution (Gingerich 1983). The study centered on rates observed from three classes of data: (1) lab experiments, (2) historical colonization events, and (3) the fossil record. A useful measure of evolutionary rate is the darwin, which is defined as a change in an organism’s character by a factor of e per million years (where e is the base of natural log). The average rate observed in the fossil record was 0.6 darwins; the fastest rate was 32 darwins. The latter is the most important number for comparison; rates of evolution observed in modern populations should be equal to or greater than this rate.

The average rate of evolution observed in historical colonization events in the wild was 370 darwins—over 10 times the required minimum rate. In fact, the fastest rate found in colonization events was 80,000 darwins, or 2500 times the required rate. Observed rates of evolution in lab experiments are even more impressive, averaging 60,000 darwins and as high as 200,000 darwins (or over 6000 times the required rate).

It’s a tricky thing to measure, with measured rates depending on the time scale of measurement, but any way we slice it, the observed rates of physical change in recent times/lab experiments are more than enough to account for the changes in the fossil record [30]. So the modest phenotype changes we see in a 10-year or even a 1000-year observation period accord with evolutionary theory and fossil data.
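The darwin is easy to compute from its definition, and the ratios quoted above can be checked the same way. The sketch below uses the definition given in the quoted summary (a change by a factor of e per million years). The function name and the 1%-in-a-century example are hypothetical illustrations, not data from Gingerich.

```python
import math

def darwins(x_start: float, x_end: float, years: float) -> float:
    """Rate of change in darwins: a factor-of-e change per million years equals 1."""
    return abs(math.log(x_end / x_start)) / (years / 1_000_000)

FASTEST_FOSSIL_RATE = 32.0  # darwins, from the quoted summary

quoted_rates = {
    "colonization, average": 370.0,
    "colonization, fastest": 80_000.0,
    "lab experiments, fastest": 200_000.0,
}
ratios = {label: rate / FASTEST_FOSSIL_RATE for label, rate in quoted_rates.items()}
for label, ratio in ratios.items():
    print(f"{label}: {ratio:,.1f} times the fastest fossil rate")

# Hypothetical illustration (not from the source): a trait that grows 1% in 100 years.
example = darwins(100.0, 101.0, 100)
print(f"hypothetical 1% change in 100 years: {example:.0f} darwins")
```

Note how a change that would be barely measurable over a century (1%) already corresponds to a rate about three times the fastest rate in the fossil record, which is one reason measured rates depend so strongly on the time scale of measurement.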

Current and Historical Mutation Rates

What about the rate of genotype change?  Lots of genetic changes are observed in the laboratory:  “Extremely extensive genetic change has been observed, both in the lab and in the wild. We have seen genomes irreversibly and heritably altered by numerous phenomena, including gene flow, random genetic drift, natural selection, and mutation. Observed mutations have occurred by mobile introns, gene duplications, recombination, transpositions, retroviral insertions (horizontal gene transfer), base substitutions, base deletions, base insertions, and chromosomal rearrangements. Chromosomal rearrangements include genome duplication (e.g. polyploidy), unequal crossing over, inversions, translocations, fissions, fusions, chromosome duplications and chromosome deletions.”[31]

The question here is whether rates of today’s genetic changes, as measured by nucleotide substitutions, are consistent with the rates required from the time allowed in the fossil record and the sequence differences observed between species. A discussion from the internet regarding mammals [32] is copied below. I bolded the key conclusions:

What we must compare are the data from three independent sources: (1) fossil record estimates of the time of divergence of species, (2) nucleotide differences between species, and (3) the observed rates of mutation in modern species. The overall conclusion is that these three are entirely consistent with one another.

For example, consider the human/chimp divergence, one of the most well-studied evolutionary relationships. Chimpanzees and humans are thought to have diverged, or shared a common ancestor, about 6 Mya, based on the fossil record (Stewart and Disotell 1998). The genomes of chimpanzees and humans are very similar; their DNA sequences overall are 98% identical (King and Wilson 1975; Sverdlov 2000). The greatest differences between these genomes are found in pseudogenes, non-translated sequences, and fourfold degenerate third-base codon positions. All of these are very free from selection constraints, since changes in them have virtually no functional or phenotypic effect, and thus most mutational changes are incorporated and retained in their sequences. For these reasons, they should represent the background rate of spontaneous mutation in the genome. These regions with the highest sequence dissimilarity are what should be compared between species, since they will provide an upper limit on the rate of evolutionary change.

Given a divergence date of 6 Mya, the maximum inferred rate of nucleotide substitution in the most divergent regions of DNA in humans and chimps is ~1.3 x 10^-9 base substitutions per site per year. Given a generation time of 15-20 years, this is equivalent to a substitution rate of ~2 x 10^-8 per site per generation (Crowe 1993; Futuyma 1998, p. 273).

Background spontaneous mutation rates are extremely important for cancer research, and they have been studied extensively in humans. A review of the spontaneous mutation rate observed in several genes in humans has found an average background mutation rate of 1-5 x 10^-8 base substitutions per site per generation. This rate is a very minimum, because its value does not include insertions, deletions, or other base substitution mutations that can destroy the function of these genes (Giannelli et al. 1999; Mohrenweiser 1994, pp. 128-129). Thus, the fit amongst these three independent sources of data is extremely impressive.

Similar results have been found for many other species (Kumar and Subramanian 2002; Li 1997, pp. 180-181, 191). In short, the observed genetic rates of mutation closely match inferred rates based on paleological divergence times and genetic genomic differences. Therefore, the observed rates of mutation can easily account for the genetic differences observed between species as different as mice, chimpanzees, and humans.
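The comparison in the passage above comes down to a unit conversion: the rate inferred from the fossil record is per year, while the rate measured in living humans is per generation. A minimal sketch of that check, using only the figures quoted above (the variable names are illustrative):

```python
# Inferred from human/chimp sequence divergence and a 6 Mya split (quoted above).
INFERRED_PER_SITE_PER_YEAR = 1.3e-9

# Directly observed in humans, per site per generation (quoted above).
OBSERVED_LOW, OBSERVED_HIGH = 1e-8, 5e-8

inferred_per_generation = {
    gen_time: INFERRED_PER_SITE_PER_YEAR * gen_time
    for gen_time in (15, 20)  # generation times in years, as quoted
}

for gen_time, rate in inferred_per_generation.items():
    within = OBSERVED_LOW <= rate <= OBSERVED_HIGH
    print(f"{gen_time}-year generations: {rate:.2e} per site per generation "
          f"(within observed range: {within})")
```

Both generation times put the inferred rate at roughly 2 x 10^-8 per site per generation, inside the observed 1-5 x 10^-8 range.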

Another study of mutation rates was done in the laboratory with nematodes [33]. The result for these animals was about 2 x 10^-8 mutations per base pair per generation, which is consistent with the rates cited above. The mutation rate is generally lower for prokaryotes than for eukaryotes. Other estimates of mutation rates are available. The field is complicated, partly because some zones of the genome mutate much faster than average.

Rate of Adaptive Genetic Changes

Long-term phenotype changes in a population come from allelic frequency shifts and mutations which become fixed in the population. Significantly beneficial mutations tend to get retained, and sweep through a population, due to natural selection. Putting that together with the slow pace of phenotype change, the prediction from evolutionary theory is that large, beneficial mutations must be rare. The observation that they are rare confirms evolutionary thinking, rather than overturning it.

We can come at this another way. Suppose really large and beneficial mutations were common, such as occurring in 5 % of the individuals in a population, in every generation. The effect would be for a species to rapidly and continuously morph into something else. The human species would not be able to retain its identity for thousands of years. But that is not what the fossil record indicates. Rather, it shows very slow change. A species may arise in tens or hundreds of thousands of years, then remain largely unchanged for more than a million years. Thus, if large beneficial mutations were common, that would constitute a serious challenge to evolutionary theory.

The estimates of raw nucleotide changes are relatively straightforward. The discussion above indicates a reasonable match between rates of total nucleotide changes currently measured in the laboratory with estimated historical rates. A different issue is the estimation of the rates of fixed beneficial mutations. This is much more difficult, because in most cases we simply don’t know which mutations are beneficial and in which circumstances. Most studies in this area rely on a number of assumptions and models.

Given those caveats, here are two references which bear on this issue. Smith and Eyre-Walker [34] write:

For over 30 years a central question in molecular evolution has been whether natural selection plays a substantial role in evolution at the DNA sequence level. Evidence has accumulated over the last decade that adaptive evolution does occur at the protein level, but it has remained unclear how prevalent adaptive evolution is. Here we present a simple method by which the number of adaptive substitutions can be estimated and apply it to data from Drosophila simulans and D. yakuba. We estimate that 45% of all amino-acid substitutions have been fixed by natural selection, and that on average one adaptive substitution occurs every 45 years in these species.

This suggests a fairly slow rate of fixed beneficial mutations in Drosophila, considering their generation time. I suspect it would be challenging to detect one adaptive amino acid substitution in a 45 year study of a large wild population.

Bakewell et al. [35] studied 13,888 genes in the human genome, and estimated that 154 of them had been positively selected since the human/chimp split which is believed to have occurred about 6 million years ago. If the same proportion holds across all 22,000 or so of our genes, and we double the estimate to compensate for this study likely missing some adapted genes, and double that again to cover possible positive selection in regulatory sequences, we might expect a total of roughly 960 fixed beneficial mutations that separate us from the common ancestor. This works out to about one new beneficial gene or regulatory sequence every 6000 years or so. This is so slow that it is beyond the horizon of most direct scientific measurements, yet fast enough to drive hominid evolution.
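The extrapolation in the preceding paragraph can be laid out step by step. This is just the back-of-the-envelope arithmetic from the text, including its two admittedly rough doubling factors. The variable names are illustrative.

```python
GENES_STUDIED = 13_888
POSITIVELY_SELECTED = 154
TOTAL_GENES = 22_000            # approximate human gene count used in the text
YEARS_SINCE_SPLIT = 6_000_000   # human/chimp divergence

# Scale the observed proportion up to the whole gene set.
selected_genome_wide = POSITIVELY_SELECTED / GENES_STUDIED * TOTAL_GENES

# Rough correction factors from the text: x2 for adapted genes the study
# probably missed, x2 again for selection on regulatory sequences.
estimated_total = selected_genome_wide * 2 * 2

years_per_change = YEARS_SINCE_SPLIT / estimated_total

print(f"genome-wide positively selected genes: about {selected_genome_wide:.0f}")
print(f"with both doublings: about {estimated_total:.0f}")
print(f"one fixed beneficial change every {years_per_change:,.0f} years or so")
```

Without rounding the intermediate step, the total comes out slightly higher than the rounded figure in the text, which makes no difference to the conclusion: about one fixed beneficial change every six thousand years.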

In the Lenski long-term experiments, each of the twelve populations evolved to a significantly better fitness in the glucose-limited environment within 2000 generations. Comparing to humans, 2000 generations times 25 years per generation would amount to 50,000 years. Each of Lenski’s populations numbered between 5 and 500 million cells. This is on the order of human populations today, although any effective human breeding population would be some orders of magnitude smaller. By a very crude analogy, we might expect several beneficial mutations to have appeared in humans over the last 50,000 years, and maybe one or two in the last say 10,000 years.

Have any beneficial mutations appeared in humans in the past ten millennia? It is hard to answer that question with high confidence, because a number of factors conspire to make it very difficult to detect beneficial mutations in humans. First, we have limited access to the genetic material of humans of past ages.  We have a few sets of remains dating back a few thousand years, but these are of course an incomplete sample of all human genomes of past ages. Even if beneficial mutations occur in our century, it is unlikely that they can be detected. Any single fixed beneficial mutation is likely to be of modest effect. If a child is born possessing unusually high intelligence or health, it would be both impractical and unethical to run an extensive breeding program to hunt down the exact genetic origin of this trait and to assess whether it came from a mutation as opposed to normal shuffling of alleles. Since it is intrinsically unlikely that we would be able to detect a beneficial mutation in humans, the fact that few or none have been observed does not mean that none have occurred.

There is indirect evidence that the allele for lactose tolerance in northern European populations arose within the last 10,000 years. Most humans stop expressing lactase after childhood, but over 90% of people in northern European populations now show lactase persistence. This trait is advantageous in a culture with domesticated dairy animals. Burger et al. [36] sampled DNA from seven skeletons dated 5000-6000 years old, gathered from sites in northern Europe, and none of them displayed evidence of lactase persistence. Thus, this beneficial mutation has only become widely fixed in the population in the past few thousand years. However, for the reasons discussed above, it’s impossible to be certain when the mutation itself first occurred.

A more recent beneficial mutation is the “Milano” mutation which helps prevent hardening of arteries (atherosclerosis) [37].  This case is unusual for humans in that we know when and where this mutation arose. This mutation has been traced to a specific couple in a north Italian village, living in the eighteenth century [38].

Novelty in Evolution

In your letter you stated:

The mutation/selection mechanism can at best “tinker” with existing genomic codes. It is insufficient to write genuinely new code or to generate new code “from scratch.”

This statement confuses two issues. On the one hand, it is correct that evolution only acts on pre-existing genetic code.  So, yes, mutation/selection does not generate new code “from scratch.”

However, that does not imply that mutation/selection cannot produce new code at all. Chandrasekaran and Betrán [69] provide an introduction to how new genes arise. Mutation/selection takes existing code, and tinkers, and tinkers, and tinkers, and tinkers, and tinkers, and tinkers, for millions of generations, across millions of individuals and hundreds of millions of base pairs per individual. This tinkering includes duplication and insertion of big and small chunks of genetic strands, as well as point mutations. There is nothing in chemistry or biology that defines or justifies a hard limit to this process of molecular modification. As discussed in the next section, gene duplication with subsequent mutation is a straightforward means to generate additional, different genes. Less common are new genes originating from ancestrally non-coding DNA sequences, but three such human genes have recently been identified by Knowles and McLysaght [64].

How did the first functioning code in the first living cells come about?  This is the topic of abiogenesis, not evolution. At present we cannot posit a step-by-step mechanism, using tight inferences from today’s observations, which would account for the development of RNA/DNA/protein-based organisms from molecules likely to be present on the primitive earth. Some theists call for a god-of-the-gaps miracle to bridge this lack in our understanding. Dogmatic atheists insist that a natural explanation must exist for the origin of life, no matter how clueless we are. There is so little hard data available, that a person is free to stake out a position on abiogenesis based on his or her ideology, with little fear of contravention. This is not true for evolution. Once the first cells with DNA-based replication became functional, the path to higher life-forms via evolution is understandable in terms of genetic mutations and natural selection. While many, many details remain to be learned, the specific objections raised by creationists against evolution can be tested and refuted. Conversely, to the extent that creationism can be pinned down to specific predictions, these can likewise be subjected to reality checks.

Summary on Rates of Change of Phenotype and Genotype

(1) The rate of physical changes to organisms over the geological ages is very, very slow; the rate of physical changes from generation to generation in short-term modern experiments is more than enough to account for the historic changes in morphology.

(2) The rate of mutations (i.e. nucleotide substitutions) measured in modern short-term experiments closely matches the historical rates inferred to have taken place over the past millions of years.

(3) These consistencies support the mainstream scientific view that the large changes seen in the fossil record can be accounted for by extrapolating the present-day rates of morphological and genetic changes backwards over hundreds of millions of years.

(4) The slow pace of historic evolution (according to the fossil record) implies that it is inherently improbable to detect large beneficial mutations on the timescale of practical experiments or observations. On the human scale of observation, all we should normally observe is micro-evolution.  The fact that micro-evolution is all we do normally observe is a point in favor of evolution, not a point against it.

SECTION 3.   INCREASING GENOME SIZE AND COMPLEXITY

In the fossil record, the earliest organisms tend to be the simplest. In the lowest rock layers, believed to be over a billion years old, there are fossilized stromatolites, which are the remains of colonies of single-celled algae. By Cambrian times (c. 540 million years ago), arthropods were crawling around the ocean floor. In higher rock layers, jawless fishes appear, followed (in the vertebrate line) by jawed fishes, amphibians, reptiles, early mammals, and primates. Some representatives of the earlier life-forms usually continue to survive alongside the newer models. Thus, fishes and amphibians are still with us.

There is a general, though not universal, trend of increasing size of the total genome and number of genes as the so-called higher life-forms appear. Bacteria typically have 1000-4000 genes; yeast has about 6000, fruit flies have about 13,000 and humans have something over 20,000 genes. The total size of the genome (number of base pairs) also increases in the same order. Evolutionary theory must therefore propose a plausible mechanism for an increase in genome size and complexity.

Most of the beneficial mutations discussed in Section 1 involved very localized changes to DNA, such as a nucleotide base substitution, or a single insertion or deletion. These changes don’t have much of an effect on genome size. Is there a plausible mechanism for adding large chunks of genetic material, including the generation of additional genes? The answer is yes:  gene duplication is well-known in laboratory studies, and in its various forms, it provides a clear basis for increasing genome size and complexity.

Types of Gene Duplication and Subsequent Development

Gene duplication and other shuffling of large and small genomic domains are routinely observed in the laboratory. Duplication of entire genomes is common in plants. For instance, bread wheat (hexaploid) has six sets of its chromosomes. In one example of recent speciation, a new, tetraploid  (four chromosome sets) species of the goatsbeard (salsify) plant formed in the wild within fifty years, from the hybridization of two diploid species [62].

Some examples of beneficial (adaptive) gene duplication are noted below. Like all parts of the genome, this duplicated or shuffled genetic material is subject to further point mutations and further shuffling. As a duplicated gene is modified, it becomes a new, different gene.

The classic model is the duplication of a gene, including its regulatory elements, to form an exact, functioning copy of the original gene. However, there are other types of duplication and shuffling of genetic segments. A complete new gene may be formed, but with some differences from the original gene. For instance, only part of a gene may get duplicated.  A “chimeric” gene may incorporate portions of two or more different parent genes or other gene segments. Groups of genes or whole chromosomes may get duplicated. When the mechanism for duplication is unequal crossing over, the new genes are usually linked in tandem with the original genes, and preserve the introns. With retroposition, new genes or gene segments are inserted in various places, and typically lack the original introns and regulatory sequences.

There are several possible fates for the new gene. If the new gene or gene segment is not functional (e.g. doesn’t have adequate regulatory sequences), it will just be a dead pseudogene. Since most major mutations are deleterious, the most likely outcome for a functional new gene is to be silenced. That can happen slowly via natural selection over many generations, or quickly if the gene duplication is so deleterious that the organism fails to thrive.   Some gene duplicates will be near-neutral or (rarely) beneficial. The odds of any new near-neutral mutation being fixed are low, going as 1/2N (where N is population size) in a diploid population, and taking on average 4N generations to become fixed. But there are plenty of individuals, plenty of duplicated genes, and plenty of generations to allow some duplicated genes to become fixed and to further evolve. Computer analysis of yeast, fruitfly, and human genomes indicates that 30-40% of their genes are duplicates [39].
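The 1/2N and 4N figures quoted above can be checked with a small simulation. What follows is only a minimal sketch, assuming an idealized Wright-Fisher population (constant size, non-overlapping generations, random mating, no selection); the function names and the small population size are my own choices for illustration and do not come from any of the cited studies.

```python
import random

def simulate_neutral_allele(n_diploid, rng):
    """Follow one new neutral mutation until it is fixed or lost.

    Returns (fixed, generations).
    """
    total_copies = 2 * n_diploid   # gene copies in a diploid population
    count = 1                      # a new mutation starts as a single copy
    generations = 0
    while 0 < count < total_copies:
        freq = count / total_copies
        # Each gene copy in the next generation is drawn at random
        # from the current gene pool (pure genetic drift).
        count = sum(1 for _ in range(total_copies) if rng.random() < freq)
        generations += 1
    return count == total_copies, generations

def estimate_fixation(n_diploid, trials, seed=0):
    """Return (fraction of trials fixed, mean generations among fixed trials)."""
    rng = random.Random(seed)
    fixed_times = []
    for _ in range(trials):
        fixed, gens = simulate_neutral_allele(n_diploid, rng)
        if fixed:
            fixed_times.append(gens)
    fraction_fixed = len(fixed_times) / trials
    mean_time = sum(fixed_times) / len(fixed_times) if fixed_times else float("nan")
    return fraction_fixed, mean_time

if __name__ == "__main__":
    n = 10  # deliberately tiny so the run finishes quickly
    frac, mean_t = estimate_fixation(n, trials=20000)
    print(f"expected fixation probability 1/(2N) = {1 / (2 * n):.3f}")
    print(f"simulated fraction fixed             = {frac:.3f}")
    print(f"mean generations to fixation         = {mean_t:.1f} (4N = {4 * n})")
```

With a population this small the simulated values should land close to, though not exactly on, 1/2N and 4N, since both figures are large-population approximations.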

If the new gene is a precise duplicate, then new functionality can arise through the usual process of point and indel (frameshift) mutations. If one duplicate copy attains a new function, while the other retains the original function, this is termed neofunctionalization. Often, the original gene serves more than one function. After gene duplication, the organism can sustain mildly deleterious mutations in both gene copies, such that each copy comes to specialize in one of the two original functions. This “subfunctionalization” is more likely than neofunctionalization, since degenerative mutations are common. Duplicated genes can also undergo further movements of various sized chunks of DNA. Small mutations in regulatory regions can have big effects on gene expression.

When the new gene is not an exact replica of the parent, then significant novelty can be introduced right at the time of duplication. Also, when the two gene copies do not start off with identical functionality, it becomes easier for them to further diverge in function. In a study of genes in nematodes, Katju and Lynch [40] found that:

More than 50% of newborn duplicates in Caenorhabditis elegans had unique exons in one or both members of a duplicate pair, indicating that many duplicates are not functionally identical to the progenitor copy at birth. Both partial and chimeric gene duplications contribute to the formation of novel genes. For chimeric duplications, the genomic sources of unique exons are diverse, including genic and intergenic regions, as well as repetitive elements. These novel genes derived from partial and chimeric duplications are equally likely to be transcriptionally active as copies derived from complete duplications of the ancestral gene.

Gene duplications can confer an immediate advantage to an organism, if the additional protein expression is beneficial. Some recent examples are cited below. Gene duplication can also help to protect against knockout mutations: if one copy of the gene loses function, the other copy can carry on.  Gu et al. [61] studied this effect in yeast, finding that deletion of a duplicated gene was less than half as likely to be lethal as the deletion of a singleton gene  (12.4% versus 29.0%).

In frameshift mutations, the “reading frame” for the DNA bases is shifted, resulting in a very different protein being expressed. This can be a simple means to generate a radically different genetic function, and thus may be a powerful engine for evolutionary change.  Usually (not always) frameshift mutations are enormously harmful to an organism, because the original protein expression is lost. However, if the frameshift mutation occurs on one copy of a gene duplicate, the other copy can continue to express the original protein. Studies by Raes and Van de Peer [41] and by Okamura, et al. [42] indicate that frameshift mutations on a duplicated gene copy are more likely to be tolerated. This can enable a new, radically different gene to enter the gene pool.
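To make the idea of a shifted reading frame concrete, here is a minimal sketch that translates a short DNA string with the standard genetic code, then deletes one base and translates it again. The 24-base "gene" is invented purely for illustration and corresponds to no real sequence; the helper names are likewise my own.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna):
    """Read DNA three bases at a time, stopping at the first stop codon ('*')."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

def delete_base(dna, position):
    """Return the sequence with the base at `position` removed."""
    return dna[:position] + dna[position + 1:]

# Hypothetical gene, invented for illustration only.
ORIGINAL_GENE = "ATGGCTGAAACCGGTTTACATTGA"
# Removing a single base shifts every codon downstream of the deletion.
SHIFTED_GENE = delete_base(ORIGINAL_GENE, 3)

if __name__ == "__main__":
    print("original protein:    ", translate(ORIGINAL_GENE))
    print("frameshifted protein:", translate(SHIFTED_GENE))
    # With a duplicate present, one copy can shift while the other is unchanged.
    duplicated_genome = [ORIGINAL_GENE, SHIFTED_GENE]
    print("proteins with duplicate:", [translate(g) for g in duplicated_genome])
```

In the duplicated case the original protein is still produced alongside the new one, which is the situation in which a frameshift is most likely to be tolerated.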

Work by Cairns et al. and Henderson et al. [43] showed how duplication of genes can increase the number of targets for beneficial frameshift mutations. They started with a strain of E. coli which had a frameshifted, weakly functioning copy of the lacZ gene, which encodes beta-galactosidase. Analysis of the results indicated that duplication of the weak lacZ was favored as an early evolutionary response by this organism to boost the beta-galactosidase activity. Having more copies of the weak lacZ gene present then increased the odds that at least one of these genes would undergo a favorable frameshift mutation to recover full beta-galactosidase activity.

Examples of Recent Beneficial Gene Duplications

The mosquito gene duplication was mentioned earlier [19]: “One example of a beneficial mutation comes from the mosquito Culex pipiens. In this organism, a gene that was involved with breaking down organophosphates – common insecticide ingredients – became duplicated. Progeny of the organism with this mutation quickly swept across the worldwide mosquito population.” Further details on this mutation are available [44]. It involves multiple duplications of two genes that generate carboxylesterases. As with the two examples below, gene duplication gave increased expression of certain enzyme(s), which increased the fitness of the organism. Natural selection would then favor the retention of the additional genes.

Riehle et al. [45] ran six lines of E. coli under stressful high temperature conditions, and then scanned their genomes for fixed duplication/deletion events. A total of 5 such events were detected, spread over 3 of the lines. Three of the duplications were at the same place in the E. coli chromosome, indicating replicability of this adaptation to high temperature. Two of these cases were studied in more detail. They both provided significant increases in fitness: “In both of these cases, the model for the origin of the duplication is a complex recombination event involving insertion sequences and repeat sequences. These results provide additional evidence for the idea that gene duplication plays an integral role in adaptation, specifically as a means for gene amplification.”

Brown et al. [46] let baker’s yeast (Saccharomyces cerevisiae) evolve for 450 generations under glucose-limited conditions:

 Relative to the strain used as the inoculum, the predominant cell type at the end of this experiment sustains growth at significantly lower steady-state glucose concentrations and demonstrates markedly enhanced cell yield per mole glucose, significantly enhanced high-affinity glucose transport, and greater relative fitness in pairwise competition. These changes are correlated with increased levels of mRNA hybridizing to probe generated from the hexose transport locus HXT6. Further analysis of the evolved strain reveals the existence of multiple tandem duplications involving two highly similar, high-affinity hexose transport loci, HXT6  and HXT7. Selection appears to have favored changes that result in the formation of more than three chimeric genes derived from the upstream promoter of the HXT7 gene and the coding sequence of HXT6.

The evolved yeast strain remained competitive at the original conditions: “these results suggest that physiological tradeoffs are not required for the evolution of enhanced substrate uptake and assimilation efficiency in yeast.” This is yet another example that disproves the notion that all mutations represent degeneration relative to the earlier or “wild” strain. Here we have an increase in the number of functioning genes, which is beneficial to the organism. Moreover, the duplicated genes are chimeric: they combine the coding sequence of one parent gene with the promoter sequence of a different gene. Hence, the new genes are not completely identical to either of the parent genes. As discussed above, this gives some novelty and provides additional opportunity for modification of these genes. The observed duplications give insight into likely evolutionary pathways [46]:

The observation that multiple duplications involving HXT6 and HXT7 have arisen under selection has interesting implications concerning the evolution of the HXT gene family. It can be reasonably speculated that the entire hexose transport multigene family in yeast arose through a series of gene duplication and divergence events. Saccharomyces cerevisiae has 20 gene sequences that can be grouped together as hexose transport genes on the basis of amino acid sequence similarity, the conservation of 11 transmembrane domains, and the presence of 2 sugar-transport motifs that are conserved in eukaryotes. HXT8 through HXT17 were detected solely by their appearance in the Saccharomyces Genome Database, and may be important in the transport of exotic hexoses encountered by yeast in natural environments. Within the known glucose transporters, a gene duplication event has been inferred leading to the tandemly arrayed HXT1/3 and HXT4/6 genes. Duplications of whole segments of chromosomes led to two sets of tandemly arrayed genes; HXT3 and HXT6 are paralogs of HXT1 and HXT4, respectively. HXT7 was then duplicated from HXT6.

Historical Reconstructions of Gene Duplication and Modification

It is inherently difficult to reconstruct the detailed evolution of a gene. The field of molecular biology is extremely complex, our understanding is still in its infancy, and we have only limited information available to work with. Only in the last few years have scientists been able to determine the sequences of complete genomes for several organisms. That is not as useful as it may seem. We don’t have the genome sequences from 1000 generations ago, 2000 generations ago, 3000 generations ago, and so on. If we had those historical genomes, we could trace the changes in DNA with time with certainty. Since those ancestral genomes are not available, the best we can do is propose scenarios which are consistent with the types of spontaneous duplications and other mutations which are observed in the laboratory.

However, since the number of possible mutational pathways to arrive at any given current gene is astronomical, and we don’t know the full effect of almost any single nucleotide change, it should not be expected that biologists will be able to furnish complete, unassailable scenarios for how today’s genes assumed their present form. This does not reflect on the coherence of evolutionary theory. It is merely a consequence of the level of our knowledge and the complexity of the problem.

In some cases, biologists can use comparisons of the genes of similar organisms to infer the genes of a common ancestor. This allows the proposal of detailed scenarios of evolution at the nucleotide level. This is a new but burgeoning field, enabled by the increasing availability of DNA sequencing information for various organisms. Dozens of articles a year are published in this area, in journals such as Genetics, PNAS, and Journal of Molecular Evolution.

Some recent reviews on gene duplication include:

Long, et al., “The Origin of New Genes: Glimpses From The Young and Old” (2003) [47]

Zhang, “Evolution by gene duplication: an update” (2003) [39]

Taylor and Raes, “Small-Scale Gene Duplications” (2005) (chapter in The Evolution of the Genome, ed T. R. Gregory, Elsevier, 2005).   [48]

Conant and Wolfe, “Turning a hobby into a job: How duplicated genes find new functions”(2008) [49]

Filatov gives a short on-line Powerpoint presentation on “Genome Evolution” with good graphics and references [50]

 These reviews summarize some of the examples where scientists have reconstructed the origins of some present-day genes. Since gene duplication and subsequent evolution is key to increasing genomic complexity, and since this is one of the most contentious issues in the creation/evolution debate, I recommend you read two or three of these reviews.

Long et al. [47] describe the mechanisms of gene duplication and exon shuffling, and pick out 22 genes in various species whose history has been reconstructed in some level of detail. I’ll mention three of these cases. Long et al. [47] depict the evolution of the “Jingwei” gene in African Drosophila fruitflies about 2 million years ago. First, an ancestral gene was duplicated. Then, a section of a different (“Adh”) gene was inserted (retroposed) into the middle of the duplicated gene to create the final Jingwei gene. In this gene, the protein coding comes from the inserted section plus the original DNA on one side of the insertion. The DNA on the other side of the insertion has been deactivated. The new chimeric Jingwei gene codes for an alcohol dehydrogenase that preferentially oxidizes longer-chain alcohols. This is the opposite of the preference of the parent Adh gene. This is a straightforward example of an increase in genetic complexity: a new gene is added, with differing functionality than its parents.

The Antarctic notothenioid fishes have an unusual “antifreeze” glycoprotein (AFGP) which inhibits the formation of ice crystals in their bodies in the sub-freezing (-1.9 C) Antarctic waters. The AFGP is a polymer of a Thr-Ala-Ala glycopeptide monomer. It appears that a trypsinogen protease gene was duplicated, and in one copy the Thr-Ala-Ala region was expanded through multiple internal duplications. The exons coding for the protease sequences were lost, to yield the present form of the AFGP gene. Thus, a gene with an entirely different function was evolved. Internal molecular clocks indicate this gene arose around 5-14 million years ago. This is in remarkable agreement with independent estimates of when the Antarctic ocean temperatures dropped below freezing, due to shifts in ocean currents. Examination of the oxygen isotopic ratios in sea-bottom deposits indicates this drop in temperature occurred some 10-14 million years ago, which would provide an environmental driver for the evolution of the antifreeze [51].

Unlike other primates, colobine monkeys such as the douc langur eat leaves as their primary food source. The leaves are fermented in their foregut by symbiotic bacteria. The bacteria are then digested by the colobine monkey in its small intestine. Bacteria have a high ratio of RNA-nitrogen to total nitrogen, and the douc langur has an extra copy of an RNase gene which helps it digest the bacteria. Primates generally have one copy of this gene, known as RNase1. RNase1 has a second enzyme activity in degrading double-stranded RNA, which might help to defend against viral infection. The douc langur has RNase1, plus a slightly different version, RNase1B. RNase1B differs by nine amino acids from RNase1, and was found to be optimally suited to the relatively low pH (6-7) in the douc langur small intestine. RNase1B thus contributes to digesting bacteria, but through these nine amino acid changes it has lost the ability to degrade double-stranded RNA. RNase1B was free to evolve toward a more effective bacteria-digesting function around 4.2 million years ago without jeopardizing the organism by loss of the second enzyme activity, because the duplicate RNase1 gene was still present to perform the ancestral functions [52].

Reaching further back in time, analysis of the human genome by Steven Salzberg and co-workers as part of the human genome project found several hundred regions of ancient large-scale duplications, involving several thousand genes [59]. The scientists inferred that these genes were duplicated before the mammalian lineages diverged, so on the basis of evolutionary theory they predicted that these same genes should exist in duplicate in other mammals such as mice. This prediction was borne out when the mouse genome was sequenced two years later [60]. This illustrates the scientific usefulness of evolutionary theory. Computer analysis of animal genes suggests that two duplications of whole genomes took place early in the vertebrate ancestry [63], which would give scope for additional modified genes to develop.

Many other examples could be cited, but the point is made that we are finally beginning to explain the detailed origins of today’s genes in terms of step by step modifications from ancestral genes.  This is difficult work, given the state of our knowledge and the lack of access to genomes of extinct organisms. However, we expect that a steady stream of such reconstructions will be forthcoming from the research community.

Role of Regulatory Elements in Evolution

As you have noted, some regions of the genome that used to be considered “junk” DNA (because they don’t directly code for proteins) can play an important role in metabolism. A burgeoning new area of research concerns the role of mechanisms which regulate gene expression. There are a number of such mechanisms. Some of these can be tied to specific sites in the genome which may express, for instance, micro-RNA. This offers a particularly parsimonious means for evolution to occur. Alterations in regulatory regions can introduce novelty in the phenotype without the need to evolve completely new genes. This helps explain why humans share so many genes with mice, yet look so different.

Current thinking is “that altering the control region of a gene to change some feature of an animal or plant would produce fewer side effects than tinkering with the proteins that direct construction of the features. So an organism can change one part of its body without affecting the rest simply by adding a few more switches and buttons to its control panel (or taking some away), or by rewiring a switch to work at a different time or govern development in a new location” [53].

A primitive bony paddlefish, thought to be a proxy for a fish/amphibian transitional species, was found to have a similar pattern of gene expression in the growth of its fins as occurs in the limb development of today’s terrestrial animal embryos. The key difference is that it is turned on for a longer time in the terrestrial animals, according to research by Neil Shubin and colleagues [54]:

The discovery overturned the long-held notion that the acquisition of limbs required a radical evolutionary event. “It turns out that the genetic machinery needed to make limbs was already present in fins,” says Shubin. “It did not involve the origin of new genes and developmental processes. It involved the redeployment of old genetic recipes in new ways.”

At this early stage, it is hard to make a full assessment of the impact of these discoveries. They indicate that there are multiple genetic pathways to achieve evolutionary changes and also show that there is much yet to learn about molecular biology.

Analogies Between Human Language and Genomes

When writers try to explain or discredit evolution, they often invoke analogies between human language, with its letters, words, and meanings, and the genetic code embodied in DNA. For instance, theistic evolutionist Darrel Falk develops a scenario in which two non-coding DNA sequences (represented as abcdefguvwxyz and zzzzzzzzzzzzzzzzzzzz) are inserted into a set of instructions, and then some instructions from a virus genome are inserted into that non-coding section. Both types of insertions are well-known from laboratory observations [55]. The original instruction set was whimsically chosen as, “Put down this book, go to the refrigerator and fix yourself a strawberry sundae”. The final sentence, after all three insertions of extraneous DNA, is “Put down this book, abcdefgzzzzzzzzzzzzzzhow to build viruses just like mezzzzzzzzzzzzzzzuvwxyz go to the refrigerator and fix yourself a strawberry sundae.” Non-functional DNA sequences like these can be used to trace common ancestry among organisms.

Falk’s example illustrates two important aspects of the genetic code. First, the genome is like a recipe or other set of instructions. It should not be treated like a novel or newspaper article. This is especially important in evaluating the impact of gene duplication.

Second, despite its interlocking complexity, the genome can be more tolerant of variants than formal English prose. If we started with a well-written essay or story, and randomly changed letters or words, almost any change would be deleterious. This is not true in general for changes to DNA. As best we can tell, most of the genome in, for instance, mammals, does not code for or regulate protein expression.  In the case of Falk’s example, the actual instruction set is still functional, even after the insertion of extraneous sequences. Great caution is required before writing off any part of the genome as without function. Even if we currently do not know of a function for a segment of DNA, it is possible that a function may be discovered in the future. However, we can directly test the utility of segments of DNA by deleting them from the genome and monitoring the physical effects on the organism. For instance, Nobrega et al. [66] deleted two large chunks of non-coding DNA (1,511 kilobases and 845 kilobases in length) from the mouse genome. Homozygous descendants were generated, and found to be indistinguishable from wild-type littermates across a range of measured fitness parameters. This strongly suggests that these chunks of DNA have little current function, yet they can function as a reservoir of genetic material which might become mutated to gain some functionality.

Even within the coding for proteins, many of the amino acids can be altered without appreciable harm [65]. Lynch [56] reviews this area, concluding that “10-50% of replacement mutations are capable of being maintained within populations at moderate frequencies by selection-mutation balance and/or going to fixation.” Among other things, this gives additional scope for mutations which are currently neutral but which may be beneficial if the environment changes or if some subsequent mutation occurs.

This is not to over-simplify the issues. Many nucleotide and amino acid alterations can be significant. There are genetic diseases which are caused by one differing nucleotide. Lactate dehydrogenase can be changed into malate dehydrogenase by replacing just one of its 317 amino acids [57]. This may or may not be a good thing for the organism. Most mutations which affect the organism are harmful, but (if harmful enough) they are diminished by natural selection, while beneficial mutations are retained.

Creationist John Sanford attempts to ridicule the efficacy of gene duplication, using several rhetorical examples [58]. First, he asks, “Iff I repeaaat a lletttter, does it immmmprove my sentenccce?”  Here Sanford deliberately chooses a “deleterious” example of letter duplication. However, it is easy to come up with a counter-example where letter or word duplications give more emphatic and thus more meaningful sentences: “I reeeeally like dark chocolate. I like it very, very much.”

Sanford makes a more fundamental error when he asks, “If I repeat my sentence, do I tell you more? If I repeat my sentence, do I tell you more?” and concludes that “Obviously, all these types of duplications are deleterious, regardless of the scale.” This fails to recognize the literary genre of the genome. It may be true that repeating a statement like “The sky was blue” adds nothing to a descriptive essay. However, repeating a line like “Add 1 teaspoon salt” in a cake recipe would lead to a physically different cake. The resulting cake with 2 teaspoons of salt may be better, worse, or about the same as the original cake with only 1 teaspoon, but there is no doubt that duplicating the instruction made a substantive difference.

We could take this cake recipe example further by noting that one of the duplicate instructions might be altered by one word  to read, “Add 1 teaspoon cinnamon,” while retaining the original 1 teaspoon of salt. Now we are well on our way to a new species of cake.  It is obvious that by adding a new and different line (“Add 1 teaspoon cinnamon”) the complexity and information content of the recipe has been increased.

Of course, if we wanted to inject more evolutionary realism into this example, we might have to posit a large number of other instances of instruction duplication and word substitution in the recipe, with an equally large number of cakes made and sampled by taste-testers, who might feed back approval or disapproval to the baker, in order to evolve a better cake. But those numbers don’t change the fundamental fact that duplication of instructions, with subsequent (or simultaneous) modification of one of the duplicates, is a clear path to increasing genetic complexity.

Turning to the analogy you offered in your letter:

An oversimplified analogy might be to take a one thousand word text in a computer file, override the spell check on the word processor, and have someone with “average” keyboard skills continually re-type the same text over and over again (billions or trillions of times). Then have a…child read each new text and preserve any new words (most of which will be nonsense or misspellings) which may have been preserved by mistake. On rare occasions an intelligible new word may appear, but integrating it and thousands of other such randomly produced words into meaningful texts (containing up to 30,000 words) which “make sense” is “beyond unlikely.” Engaging in this process for “millions or billions of years” will not improve the likelihood of a meaningful outcome.

This characterization of mutation/natural selection fails to take into account the points noted above:

(1) The genomic code is like a recipe, not a novel.

(2) Besides point mutations (analogous to changing one letter at a time), duplication of whole segments of DNA occurs, which is the equivalent of having your typist duplicate whole sentences or paragraphs. This would be an effective means of building up the text from 1000 to 30,000 words.

(3) There is a sufficient number of gene duplications and mutations over the past several billion years to account for the evolution of today’s complex genomes. Your assessment of “beyond unlikely” is not mathematically grounded. The likelihood that any particular square foot of my township will be struck by lightning this year is exceedingly low, yet the chance that lightning will strike somewhere is substantial. Similarly, given some early functioning life-forms, it is entirely probable that some improved genomes would appear by mutation/selection, although the probability of producing any particular pre-specified genome may be vanishingly small. Confusing these two issues is a common creationist mistake.
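The arithmetic behind the lightning comparison in point (3) can be sketched in a few lines. The numbers are placeholders I have assumed purely for illustration (a one-in-a-billion yearly chance per square foot, and a township of 36 square miles), and each square foot is treated as an independent trial, which real lightning certainly is not. The point is only the contrast between "this particular spot" and "some spot".

```python
import math

def prob_at_least_one(p_single, n_sites):
    """Chance that at least one of n independent sites is hit,
    when each site is hit with probability p_single."""
    # Equivalent to 1 - (1 - p)^n, written with log1p/expm1 so that
    # very small probabilities are not lost to rounding.
    return -math.expm1(n_sites * math.log1p(-p_single))

if __name__ == "__main__":
    p_square_foot = 1e-9              # assumed yearly chance for one square foot
    square_feet = 36 * 5280 ** 2      # assumed 36-square-mile township
    somewhere = prob_at_least_one(p_square_foot, square_feet)
    print(f"one pre-specified square foot: {p_square_foot:.1e}")
    print(f"somewhere in the township:     {somewhere:.2f}")
```

Under these assumed numbers the chance for any one pre-specified square foot is negligible, while the chance of a strike somewhere in the township comes out well above one half.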

Parable of the Two Engineers

As a lead-in to your text-typing analogy above, you stated:

Natural selection doesn’t “think” or “plan”, it only “filters” the information passed on to it. Mutations are correctly described…as random events. Mutations cannot “think” or “problem solve” and have no idea what the genotypic and phenotypic needs of any given organism might be. That is, the collective I.Q. of the  mutation/selection mechanism is zero.

I cannot make sense of this statement. It is undeniable that mutation and natural selection operate in vats of bacteria to produce modified genomes that are better adapted to their environments. The modified genomes are “new”, in the sense of not existing previously in that particular form.  Yet there is no thinking or “I.Q.” involved here.

Furthermore, this viewpoint fails to appreciate the wisdom, foresight, and skill of the Creator of the world as it is, which includes evolution.  To illustrate this point, I will offer my own analogy.

Once there were two engineers, plodding Pete and Walt the wiz. Each was tasked with designing an airplane wing of known length and width. They had to design the profile of the airfoil along the wing, to minimize drag while maintaining lift.

Pete went to his computer console and programmed in the known laws of aerodynamics. He then sat at the console for hours a day, manually entering in different airfoil shapes, one at a time. He retained improved designs to build on, and after a week of intense labor managed to produce a reasonably optimized design.

Walt took a more automated approach. He set up a program that started with a first guess at the wing profile, and then kept making random, mainly small changes to it. If the change made an improvement, it was retained, and the new shape became the basis for further variations. This problem-solving approach is well-known in engineering. In fact, it is called a “genetic algorithm” because of the analogy with the variation/selection mechanism in genetic evolution. Walt started the program running on Monday afternoon, and went to the beach on Tuesday. He came in Wednesday to find that the program had converged on a wing profile which was as good as Pete’s.
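For the curious, the loop Walt set running can be sketched in a few lines. This is a toy under stated assumptions, not a real aerodynamic design code: the "score" below simply measures closeness to a made-up target profile, standing in for the drag and lift calculation, and every name and parameter is hypothetical. Strictly speaking, a single design mutated and kept only when it improves is the simplest relative of a genetic algorithm (a full one would maintain a population and recombine its members), but the variation-plus-selection logic is the same.

```python
import random

def score(profile, target):
    """Toy stand-in for an aerodynamic evaluation: higher is better."""
    return -sum((p - t) ** 2 for p, t in zip(profile, target))

def evolve(target, steps=5000, step_size=0.05, seed=0):
    """Repeatedly mutate one point of the profile, keeping only improvements."""
    rng = random.Random(seed)
    profile = [0.0] * len(target)              # first guess: a flat profile
    best = score(profile, target)
    for _ in range(steps):
        candidate = list(profile)
        i = rng.randrange(len(candidate))      # pick one point on the profile
        candidate[i] += rng.gauss(0.0, step_size)  # random, mainly small change
        s = score(candidate, target)
        if s > best:                           # selection: retain improvements only
            profile, best = candidate, s
    return profile, best

if __name__ == "__main__":
    # Made-up target shape, used only so the toy score has an optimum.
    target = [0.0, 0.6, 0.9, 1.0, 0.8, 0.5, 0.2, 0.0]
    start = score([0.0] * len(target), target)
    final_profile, final = evolve(target)
    print(f"score before: {start:.4f}")
    print(f"score after:  {final:.6f}")
    print("evolved profile:", [round(x, 2) for x in final_profile])
```

Nothing in the loop knows what the finished profile should look like; it only compares each random variant with the current design and keeps the better of the two.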

Anyone walking by Pete’s console that week would see that he was personally present, giving constant, extra input into the process. In contrast, an observer walking past Walt’s console on Tuesday would not see Walt at all. The process Walt set up would be chugging along, combining random, blind variations with a selection criterion. No consciousness, no I.Q. was involved in this ongoing process. Even if the observer peered inside the computer housing, he would still not see Walt. A skeptical observer might doubt that Walt had anything to do with the unfolding design. Yet Walt’s skill is demonstrated in how well the process progressed without him continually reaching in to tweak it.

I leave it to you: Which of these two engineers is the more impressive designer?

Summary on Increasing Genome Size and Complexity

The duplication and insertion of various-sized lengths of DNA, including whole, functioning genes, is clearly observed in today’s organisms. As usual with large-scale mutations, most such events are likely to be deleterious. Some are effectively neutral and a few are beneficial. Several examples of beneficial gene duplication were given.

The usual processes of point and other mutations can operate on one or both of the duplicate genes, resulting in additional, different genes compared to the original.  In some cases, the duplicate gene copies are different from the parent gene right from the start. Having two gene copies available opens up additional pathways for mutations to operate.

The occurrence of any beneficial mutation is typically rare, so it is difficult to observe in the lab a survivable gene duplication plus subsequent substantive, beneficial gene modifications. However, no biochemical barrier exists for the mutational modification of duplicated genes. They are just genes like any other, subject to the same types of mutations. This militates against the creationist assertion that evolution must be limited to micro-evolution. Anti-evolutionists cannot refute this lack of biochemical limits on gene duplication/subsequent mutation, and thus rarely engage this issue directly. Rather, they raise doubts by proposing verbal analogies or invoking inferences from information theory.  On inspection, these objections are all seen to be flawed.

Tracking the progress of gene duplication and subsequent differentiation is hampered by the lack of access to the genomes of organisms that lived millions of years ago. Only recently has our understanding of molecular biology and determination of whole genomes of current organisms allowed the step-by-step reconstruction of the development of today’s genes from an earlier common ancestor.  These reconstructions are of course tentative, but they show that the appearance of current genes from earlier ones can be explained in terms of known, natural molecular processes.

Closing Thoughts

I have attempted to investigate each of the concerns you raised about evolution in your letter, using mainly primary data. I found that none of these concerns is justified. To the extent that we can put it to the test, we find a generally good match between the nature and rates of today’s changes in genotype and phenotype, and those inferred from geological measurements, fossil discoveries, and comparative genomics. The number and nature of beneficial, neutral, and deleterious mutations are adequate to supply the variations needed for evolution to progress. Duplication and insertion of various-sized chunks of DNA, including whole genes, provide a path for increasing genetic complexity.

I’ll offer some brief comments on Behe and Sanford, since you refer to their books. Behe’s The Edge of Evolution is a generally scholarly work. However, it makes a fundamental mistake which vitiates its entire thesis: it confuses the probability of obtaining one specific pair of point mutations (such as the pair that confers chloroquine resistance on the malaria parasite) with the probability of obtaining any possible beneficial pair of mutations anywhere in the genome. The first probability can be tiny while the second is far larger, because there are many possible pairs. Thus, no edge to evolution was demonstrated.
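A rough numerical sketch shows why this distinction matters. The values of `p_specific` and `n_useful_pairs` below are purely illustrative assumptions, not figures from Behe or from any dataset. Treating the candidate pairs as independent and equally likely is a further simplification.

```python
import math

def prob_at_least_one(p_each, n_targets):
    """Chance that at least one of n_targets independent, equally likely
    events occurs, each with probability p_each: 1 - (1 - p_each)**n_targets.
    Written with log1p/expm1 so that very small probabilities stay accurate."""
    return -math.expm1(n_targets * math.log1p(-p_each))

# Illustrative assumptions only (hypothetical values):
p_specific = 1e-12       # chance of one particular pair of point mutations
n_useful_pairs = 10**6   # assumed number of distinct pairs that would help

p_any = prob_at_least_one(p_specific, n_useful_pairs)
ratio = p_any / p_specific  # how much likelier "any useful pair" is
```

When the individual probability is small, `p_any` is close to `n_useful_pairs * p_specific`, so the odds of some useful pair arising scale with the number of pairs that would do the job.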

Sanford’s Genetic Entropy, on the other hand, is simply wrong from beginning to end. It misrepresents everything it touches: beneficial and deleterious mutations, gene duplication, natural selection, and (most crucially) synergistic epistasis. In all these areas, Sanford avoids engaging the large body of work which directly refutes his viewpoint, and instead cherry-picks a few references that seem to point his way, usually misinterpreting them in the process.

It has taken me some months to complete this letter, since it involved a lot of internet searching on the occasional evening and weekend.  However, it has been very educational, helping me better appreciate the coherence of the theory and the facts of evolution. I hope it proves useful in your thinking, as well. I would be happy to answer any further questions you have in this area.

All the best,

Scott

ENDNOTES

(Note: The formats here are not consistent; I mainly copied reference descriptions directly from the internet. All or nearly all the links should work if you paste them into your browser window.)

[1] http://www.nmsr.org/nylon.htm

[2] Barry Hall’s lac bug experiments are described in Kenneth Miller, Finding Darwin’s God (New York: Perennial/HarperCollins, 1999), pp. 145-146.

[3]  http://www.gate.net/~rwms/EvoMutations.html “Examples of Beneficial Mutations and Natural Selection”

This is part of the larger “Evolution Evidence” page by Robert Williams, which has many links and articles on biology and genetics in general, plus articles on creation/evolution topics: http://www.gate.net/~rwms/EvoEvidence.html

[4]  Lisa M. Durso, David Smith, and Robert W. Hutkins, “Measurements of Fitness and Competition in Commensal Escherichia coli and E. coli O157:H7 Strains”

Applied and Environmental Microbiology, November 2004, p. 6466-6472, Vol. 70, No. 11

http://aem.asm.org/cgi/content/full/70/11/6466

[5]  Santiago F. Elena and Richard E. Lenski, “Evolution Experiments With Microorganisms: The Dynamics And Genetic Bases Of Adaptation”

Nature Reviews Genetics, 2003 Vol 4, page 457

http://www.oeb.harvard.edu/faculty/marx/Manuscripts/Elena-Lenski-2003.pdf

[6] Nadav Kashtan, Elad Noor, and Uri Alon.  “Varying environments can speed up evolution”

PNAS August 21, 2007 vol. 104 no. 34 13711-13716

http://www.pnas.org/content/104/34/13711.full  

[7] Motoo Kimura, The Neutral Theory of Molecular Evolution (Cambridge: Cambridge University Press, 1983) pp. 56-61

[8] Johanna Björkman, Diarmaid Hughes, and Dan I. Andersson, “Virulence of antibiotic-resistant Salmonella typhimurium”, PNAS March 31, 1998 vol. 95 no. 7 3949-3953

http://www.pnas.org/content/95/7/3949.full           

[9]  Rees Kassen and  Thomas Bataillon, “Distribution of fitness effects among beneficial mutations before selection in experimental populations of bacteria “,  Nature Genetics 38, 484 – 488 (2006)

Abstract: http://www.nature.com/ng/journal/v38/n4/abs/ng1751.html

Full Article: http://www.daimi.au.dk/~tbata/tap/ng1751.pdf

[10]   P. Sander, B. Springer, T. Prammananan, A. Sturmfels, M. Kappler, M. Pletschette, and E. C. Bottger     “Fitness Cost of Chromosomal Drug Resistance-Conferring Mutations”.

Antimicrob. Agents Chemother. 46: 1204-1211   (2002)

http://aac.asm.org/cgi/content/full/46/5/1204

[11]  Naidan Luo, Sonia Pereira, Orhan Sahin, Jun Lin, Shouxiong Huang, Linda Michel, and Qijing Zhang

“Enhanced in vivo fitness of fluoroquinolone-resistant Campylobacter jejuni in the absence of antibiotic selection pressure”,  PNAS January 18, 2005 vol. 102 no. 3 541-546

http://www.pnas.org/content/102/3/541.abstract

[12] M. Imhof and C. Schlotterer, “Fitness effects of advantageous mutations in evolving Escherichia coli populations,” Proc. Natl. Acad. Sci. USA 98:1113-1117 (2001)

http://www.pnas.org/content/98/3/1113.abstract?ijkey=cb3808c26f8d3de7637f9ac55be814feb7d5b246&keytype2=tf_ipsecsha

[13]  Perfeito, Lilia, Lisete Fernandes, Catarina Mota and Isabel Gordo.    2007.  “Adaptive mutations in bacteria: High rate and small effects.”              Science 317: 813-815.

http://www.sciencemag.org/cgi/content/abstract/sci;317/5839/813

[14]  Sarah B. Joseph and David W. Hall,”Spontaneous Mutations in Diploid Saccharomyces cerevisiae More Beneficial Than Expected,” Genetics, Vol. 168, 1817-1825, December 2004

http://www.genetics.org/cgi/content/abstract/168/4/1817

[15]  David W. Hall, Rod Mahmoudizad, Andrew W. Hurd, and Sarah B. Joseph, “Spontaneous mutations in diploid Saccharomyces cerevisiae: another thousand cell generations,” Genetics Research (2008), 90: 229-241   http://journals.cambridge.org/action/displayAbstract?fromPage=online&aid=1919588

[16] W. Joseph Dickinson , “Synergistic Fitness Interactions and a High Frequency of Beneficial Changes Among Mutations Accumulated Under Relaxed Selection in Saccharomyces cerevisiae,”      Genetics, Vol. 178, 1571-1578, March 2008,    http://www.genetics.org/cgi/content/full/178/3/1571

[17]   Ruth G. Shaw, Frank H. Shaw, Charles Geyer, “What Fraction Of Mutations Reduces Fitness? A Reply To Keightley And Lynch,”    Evolution 57(3):686-689. 2003  http://www.bioone.org/doi/full/10.1554/00143820%282003%29057%5B0686%3AWFOMRF%5D2.0.CO%3B2

[18] Ruth G. Shaw, Diane L. Byers, and Elizabeth Darmo , “Spontaneous Mutational Effects on Reproductive Traits of Arabidopsis thaliana,” Genetics, Vol. 155, 369-378, May 2000

http://www.genetics.org/cgi/content/abstract/155/1/369

[19] http://toarchive.org/faqs/faq-intro-to-biology.html#mutation

“Introduction to Evolutionary Biology “

Version 2  © 1996-1997 by Chris Colby

[20] R E Lenski and M Travisano,”Dynamics of adaptation and diversification: a 10,000-generation experiment with bacterial populations,” PNAS July 19, 1994 vol. 91 no. 15 6808-6814

http://www.pnas.org/content/91/15/6808.full.pdf+html

[21] Blount, Zachary D.; Christina Z. Borland, Richard E. Lenski (2008-06-10). “Inaugural Article: Historical contingency and the evolution of a key innovation in an experimental population of Escherichia coli”. Proceedings of the National Academy of Sciences 105 (23): 7899–7906.  http://www.pnas.org/content/105/23/7899.full

Also available at: http://myxo.css.msu.edu/lenski/pdf/2008,%20PNAS,%20Blount%20et%20al.pdf

[22] http://en.wikipedia.org/wiki/E._coli_long-term_evolution_experiment

[23] B. G. Hall (1982) J. Bacteriol 151: 269-273

[24]  Santiago F. Elena, Lynette Ekunwe, Neerja Hajela, Shenandoah A. Oden and Richard E. Lenski, “Distribution of Fitness Effects Caused By Random Insertion Mutations in E. Coli.”  Genetica 102/103:349-358. (1998) http://www.springerlink.com/content/r37w1hrq5l0q3832/

[25]   Susanna K. Remold and Richard E. Lenski, “Contribution of individual random mutations to genotype-by-environment interactions in Escherichia coli”   PNAS September 25, 2001 vol. 98 no. 20 11388-11393

http://www.pnas.org/content/98/20/11388.full?sid=aa94d97a-0302-417a-a669-d8684cf0be99

[26] Motoo Kimura, The Neutral Theory of Molecular Evolution (Cambridge: Cambridge University Press, 1983), p. 63.  Equus itself is excluded from the count because its span is incomplete.

[27]  Lists of observed speciations appear in  http://www.talkorigins.org/faqs/faq-speciation.html

and http://www.talkorigins.org/faqs/speciation.html

[28] P. Gingerich, “Rates of Evolution: Effects of Time and Temporal Scaling,” Science 222 (1983): 159-161.

[29] http://www.talkorigins.org/faqs/comdesc/section5.html

29+ Evidences for Macroevolution, by Douglas Theobald (1999-2004)

Section 5.7, “Morphological rates of change”

[30] The same point is made in Ken Miller, Finding Darwin’s God, pp. 109-111.

[31]  http://www.talkorigins.org/faqs/comdesc/section5.html

Section 5.1, “Genetic change”

[32] http://www.talkorigins.org/faqs/comdesc/section5.html

Section 5.8, “Genetic rates of change”

[33] http://users.rcn.com/jkimball.ma.ultranet/BiologyPages/M/Mutations.html

[34] N. G. Smith and A. Eyre-Walker, “Adaptive protein evolution in Drosophila” Nature. 2002 Feb 28;415(6875):1022-4.

http://www.ncbi.nlm.nih.gov/pubmed/11875568?dopt=Abstract

[35]  Margaret A. Bakewell, Peng Shi, and Jianzhi Zhang,”More genes underwent positive selection in chimpanzee evolution than in human evolution”  PNAS May 1, 2007 vol. 104 no. 18 7489-7494

Our treatment here follows http://pandasthumb.org/archives/2007/07/haldanes-nondil.html

[36]  J. Burger, M. Kirchner, B. Bramanti, W. Haak, and M. G. Thomas,

“Absence of the lactase-persistence-associated allele in early Neolithic Europeans”

PNAS March 6, 2007 vol. 104 no. 10 3736-3741

http://www.pnas.org/content/104/10/3736.full.pdf+html

[37] http://www.talkorigins.org/faqs/information/apolipoprotein.html

” Apolipoprotein AI Mutations and Information”

This link describes the mutation, its metabolic implications, and debunks creationist claims that this mutation involves a loss of information.

[38] Primary study: Gualandri V, Franceschini G, Sirtori CR, Gianfranceschi G, Orsini GB, Cerrone A, Menotti A., ”AIMilano apoprotein identification of the complete kindred and evidence of a dominant genetic transmission,”  Am J Hum Genet,  1985 Nov;37(6):1083-97

http://www.ncbi.nlm.nih.gov/pubmed/3936350?dopt=Abstract

[39]  Jianzhi Zhang, “Evolution by gene duplication: an update,” Trends in Ecology & Evolution      Volume 18, Issue 6, June 2003, Pages 292-298

http://www3.botany.ubc.ca/biol430/Zhang_gene_duplication.pdf

[40]  Vaishali Katju and Michael Lynch, “On the Formation of Novel Genes by Duplication in the Caenorhabditis elegans Genome,” Molecular Biology and Evolution 2006 23(5):1056-1067

http://mbe.oxfordjournals.org/cgi/content/abstract/23/5/1056

[41]  Jeroen Raes and Yves Van de Peer, “Functional divergence of proteins through frameshift mutations ,”

Trends in Genetics Volume 21, Issue 8, August 2005, Pages 428-431  http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6TCY-4GCX00S-1&_user=10&_coverDate=08%2F31%2F2005&_fmt=abstract&_orig=search&_cdi=5183&view=c&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=889651ab0c5395308fe311653669b6ae&ref=full

[42] Kohji Okamura, Lars Feuk, Tomàs Marquès-Bonet, Arcadi Navarro, and Stephen W. Scherer, “Frequent appearance of novel protein-coding sequences by frameshift translation,”

Genomics    Volume 88, Issue 6, December 2006, Pages 690-697

http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6WG1-4KJV32X-2&_user=10&_rdoc=1&_fmt=&_orig=search&_sort=d&view=c&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=8cbcef5e1865ac2864bf746e692fc4f7

[43] See writeup in Box 2 of Conant and Wolfe, “Turning a hobby into a job: How duplicated genes find new functions” (Reference [49])

[44] See http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1218568  , and

  http://www.ncbi.nlm.nih.gov/pubmed/10657234

[45] M. M. Riehle, A. F. Bennett, and A. D. Long, “Genetic architecture of thermal adaptation in Escherichia coli,”   PNAS January 16, 2001 vol. 98 no. 2 525-530

http://www.pnas.org/content/98/2/525.full

[46] Celeste J. Brown, Kristy M. Todd, and R. Frank Rosenzweig, “Multiple Duplications of Yeast Hexose Transport Genes in Response to Selection in a Glucose-Limited Environment,”

Molecular Biology and Evolution, Vol 15, 931-942 (1998)

http://mbe.oxfordjournals.org/cgi/reprint/15/8/931.pdf

[47] Manyuan Long, Esther Betrán, Kevin Thornton & Wen Wang, “The origin of new genes: glimpses from the young and old,” Nature Reviews Genetics 4, 865-875 (November 2003)

http://bbs.cst.sh.cn/cgi-bin/bbs/bbsfile/Travel/1086338972/Origin%20of%20new%20genes%20Glimpses%20from%20the%20young%20and%20old.pdf

[48] John S. Taylor and Jeroen Raes, “Small-Scale Gene Duplications” (2005) [chapter 5 in The Evolution of the Genome, ed T. R. Gregory, Elsevier (2005) ]

http://books.google.com/books?id=8HtPZP9VSiMC&pg=PA289&lpg=PA289&dq=Taylor+and+Raes,+%E2%80%9CSmall-Scale+Gene+Duplications%E2%80%9D&source=bl&ots=Dp2akA5Qou&sig=m3BRTJ_Q6k2__CwW2YBq8PE35vE&hl=en&ei=FMdPSs2vHYSMtgfpoY2jBA&sa=X&oi=book_result&ct=result&resnum=7

[49]  Gavin C. Conant & Kenneth H. Wolfe, “Turning a hobby into a job: How duplicated genes find new functions,” Nature Reviews Genetics 9, 938-950 (December 2008)

Email to request reprint: conantg@missouri.edu

[50] weblearn.ox.ac.uk/site/mathsphys/biology/ugradbiol/yr2/evolsys/2_4/ME5_GenomeEvol.ppt

[51] Described by Long [47], and Taylor and Raes [48]. A primary paper is:

Chen et al. (1997): ‘Evolution of antifreeze glycoprotein gene from a trypsinogen gene in Antarctic notothenioid fish’. PNAS 1997; 94; 3811-3816  http://www.pnas.org/content/94/8/3811.full

[52] From  reference [39], referring to:  J. Zhang et al. (2002), Adaptive evolution of a duplicated pancreatic ribonuclease gene in a leaf-eating monkey. Nat. Genet. 30, 411-415

[53] http://www.sciencenews.org/view/feature/id/40006/title/Molecular_Evolution

[54] National Geographic, February 2009, pp. 36-73, “Recent Evolution”

[55] Darrel Falk, Coming to Peace With Science.  Intervarsity Press, Downers Grove (2004), p. 193

[56]   Michael Lynch, “Simple evolutionary pathways to complex proteins,” Protein Science (2005), 14:2217-2225

http://www.proteinscience.org/cgi/content/full/14/9/2217

[57]  HM Wilks, KW Hart, R Feeney, CR Dunn, H Muirhead, WN Chia, DA Barstow, T Atkinson, AR Clarke, and JJ Holbrook, “A specific, highly active malate dehydrogenase by redesign of a lactate dehydrogenase framework,”  Science, Vol 242, Issue 4885, 1541-1544

http://www.sciencemag.org/cgi/content/abstract/242/4885/1541

[58]  John Sanford, Genetic Entropy and the Mystery of the Genome. Elim Publishing, Lima, NY (2005), p. 189.

[59] The 2001 Venter et al. article is:

“The Sequence of the Human Genome “

Science 16 February 2001:   Vol. 291. no. 5507, pp. 1304 – 1351 http://www.sciencemag.org/cgi/content/full/291/5507/1304?sendit.y=7&sendit.x=37&gca=291%2F5507%2F1304&

Informative figures of the duplications are on Salzberg’s web site; one needs to refer to the captions in the original Venter et al. paper to interpret these figures.

http://cbcb.umd.edu/~salzberg/docs/Duplications-p1.pdf

http://cbcb.umd.edu/~salzberg/docs/Duplications-p2.pdf

[60] Steven Salzberg, personal communication, December 2, 2008.

[61]  Zhenglong Gu, Lars M. Steinmetz, Xun Gu, Curt Scharfe, Ronald W. Davis, and Wen-Hsiung Li, “Role of duplicate genes in genetic robustness against null mutations,” Nature 421, 63-66 (2 January 2003)

http://www.nature.com/nature/journal/v421/n6918/full/nature01198.html

[62] http://en.wikipedia.org/wiki/Salsify

[63] Paramvir Dehal and Jeffrey L. Boore

“Two Rounds of Whole Genome Duplication in the Ancestral Vertebrate”

PLoS Biol 3(10): e314 October, 2005.

http://www.plosbiology.org/article/info:doi/10.1371/journal.pbio.0030314

[64] D. G. Knowles and A.  McLysaght  “Recent de novo origin of human protein-coding genes”

Genome Res. 2009 Oct;19(10):1752-9.

http://www.ncbi.nlm.nih.gov/sites/entrez/19726446

Steven Salzberg commented in his “Faculty of 1000” review of this article:

http://facultyof1000.com/

Where do genes come from? Almost all reports of “new” genes describe gene creation by duplication, but somehow genes must arise in the first place from DNA that was non-coding. This fascinating finding describes the discovery of three novel genes that have arisen from non-coding DNA in the human genome after our divergence from other primates.

Knowles and McLysaght have done a comprehensive, careful analysis looking for protein-coding genes that exist in the human genome and that are missing in the chimp, gorilla, gibbon, and macaque. They found three genes that appear to have evolved protein-coding function entirely de novo: the DNA sequence of these genes is present in other primates, but contains multiple non-sense mutations that make it clear it is non-functional. They also found evidence of both transcription (from expressed sequence tag [EST] data) and translation (from mass spectrometry data) for all three proteins in multiple tissues. Although I was skeptical at first, the thoroughness of the analysis convinced me that these three genes are very likely genuine de novo genes. They also answered the question of how the genes came to be transcribed: all three are found right next to (and slightly overlapping) existing, older genes. Thus, the transcription machinery can be borrowed from the neighboring genes. This result is one answer to an objection I’ve heard from anti-evolutionists (creationists and intelligent design-ists): that we can’t show any genes that have arisen from previously non-coding DNA. Well, here they are.

[65] http://pandasthumb.org/archives/2004/08/meyers-hopeless-1.html

This link claims up to 80% amino acid replacement has been observed in a protein without its losing function.  This is not surprising, since for a long protein only a small fraction of the amino acids are in the vicinity of the active site or sites for catalysis. The remaining amino acids are likely to have little effect unless they adversely impact the protein folding.

[66] Nóbrega MA, Zhu Y, Plajzer-Frick I, Afzal V, Rubin EM., “Megabase deletions of gene deserts result in viable mice”,   Nature. 2004 Oct 21; 431(7011):988-93.

http://www.ncbi.nlm.nih.gov/pubmed/15496924?ordinalpos=4&itool=EntrezSystem2.PEntrez.Pubmed.Pubmed_ResultsPanel.Pubmed_DefaultReportPanel.Pubmed_RVDocSum

[67] Jeffrey E. Barrick, Dong Su Yu, Sung Ho Yoon, Haeyoung Jeong, Tae Kwang Oh, Dominique Schneider, Richard E. Lenski & Jihyun F. Kim, “Genome evolution and adaptation in a long-term experiment with Escherichia coli,” Nature 461, 1243-1247 (29 October 2009).    http://www.nature.com/nature/journal/v461/n7268/abs/nature08480.html   (Abstract)

http://pds16.egloos.com/pds/201001/20/79/nature08480.pdf

[68] http://en.wikipedia.org/wiki/Adaptive_mutation

[69] Chandrasekaran, C. & Betrán, E. (2008) Origins of new genes and pseudogenes. Nature Education 1(1)   http://www.nature.com/scitable/topicpage/origins-of-new-genes-and-pseudogenes-835

COPYRIGHT   SCOTT BUCHANAN 2010

Permission is granted for reproduction for non-commercial use as long as the contents are not altered and attribution is given.