Holey Fitness Landscapes
Fitness landscapes and speciation
The term "fitness landscape" (FL) was coined by Wright [1] to represent the fitness of all conceivable individuals in relation to their traits. He envisaged a rugged landscape, where peaks represented combinations of traits with high fitness separated by valleys of low-fitness trait combinations. On this landscape, selection drives populations uphill. Since Wright's work, several critiques of the FL concept have been made. Fitness landscapes are usually treated as static networks, but in reality, fitness is the ability to survive and reproduce in a dynamic environment that is constantly changing through co-evolutionary dynamics and external disturbances [2]. Some models account for this by using a FL that changes with time to reflect changes in the environment (e.g. [3]). However, the effects of genes underlying species differences and reproductive isolation are, in general, not strongly affected by the environment and FLs are, therefore, widely accepted as a useful abstraction in theoretical biology [4].
In terms of FLs, the problem of speciation is that part of a population located at a fitness peak must cross the fitness valley surrounding the peak in order to ensure that the diverged genes are not selected out. Stochastic factors such as genetic drift may act against natural selection and help populations to cross fitness valleys, particularly when populations are small. However, such factors can only account for certain types of speciation. It has been shown that speciation due to stochastic crossing of fitness valleys is, in general, extremely unlikely [4, 5].
Peaks in low-dimensional spaces become saddle points in higher-dimensional spaces. This led to the suggestion that highly multi-dimensional biological FLs may actually possess a single global maximum that can be reached by hill climbing from (almost) any point [6]. Although this model is useful in some cases, it does not apply in general: the local-maxima-to-saddle-point transformations are outnumbered by the appearance of new peaks in higher dimensions [4].
On a biochemical level, most genetic changes are fitness-neutral. This led to the suggestion that fitness landscapes may be largely flat [7] and that the main force behind speciation is stochastic genetic divergence, i.e. genetic drift. However, an overwhelming proportion of biochemically conceivable genotypes are, in fact, inviable because they contain deleterious genes or groups of incompatible genes. Neutral fitness landscapes fail to account for this fact.
Holey Fitness Landscapes
A genetic model that accounts for the above limitations is the Holey Fitness Landscape (HFL) introduced by Gavrilets [4, 5, 8]. Generally, an HFL is "an adaptive landscape where relatively infrequent high-fitness genotypes form a contiguous set that expands throughout the genotype space" [5].
To build some intuition for this model, recall a few results from percolation theory, which plays an important role in the analytical treatment of HFLs. Consider a 2-dimensional lattice of cells, each of which can assume one of two states: "black" or "white". Let every cell be black with some probability p, independently of all other cells, or white with probability 1 – p. If p is small, the lattice will contain a few black cells, which may be grouped in a number of small, isolated clusters. As p increases, these clusters grow and merge. Once p crosses a certain threshold p_c, most of the black cells merge into a single giant cluster that percolates through the whole lattice. For a 2-dimensional square lattice this percolation threshold is known to be p_c ≈ 0.5927 [9]. For lattices of higher dimensions, however, the percolation threshold lies around the reciprocal of the lattice dimension [10], meaning that in a high-dimensional lattice a small proportion of black cells is sufficient for the emergence of a giant percolating cluster of connected black cells.
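The lattice experiment described above can be sketched numerically. The following is a minimal illustration rather than a method taken from the cited literature: the function name, the lattice size, the fixed random seed and the use of 4-neighbour connectivity on a finite lattice with open boundaries are all assumptions made for the example. It colours each cell black with probability p and reports what fraction of the black cells belongs to the largest connected cluster.

```python
import random
from collections import deque


def largest_cluster_fraction(size, p, seed=0):
    """Fraction of black cells lying in the largest 4-connected cluster.

    Illustrative helper (not from the cited sources): a size x size square
    lattice in which every cell is black with probability p, independently.
    """
    rng = random.Random(seed)
    black = [[rng.random() < p for _ in range(size)] for _ in range(size)]
    total_black = sum(row.count(True) for row in black)
    if total_black == 0:
        return 0.0

    seen = [[False] * size for _ in range(size)]
    largest = 0
    for r in range(size):
        for c in range(size):
            if not black[r][c] or seen[r][c]:
                continue
            # Breadth-first search over one cluster of black cells.
            seen[r][c] = True
            queue = deque([(r, c)])
            cluster = 0
            while queue:
                y, x = queue.popleft()
                cluster += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < size and 0 <= nx < size
                            and black[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            largest = max(largest, cluster)
    return largest / total_black


if __name__ == "__main__":
    # Sweep p across the square-lattice threshold quoted in the text.
    for p in (0.3, 0.5, 0.6, 0.7, 0.8):
        print(p, round(largest_cluster_fraction(60, p), 3))
```

On a finite lattice the transition is smoothed out, so the sweep should show a gradual but steep rise of the largest-cluster fraction around the threshold rather than a sharp jump.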
For the HFL model, assume that a genotype is viable with probability p, independently of all other genotypes, and inviable with probability 1 – p. For the purpose of this discussion, the exact fitness of a genotype is irrelevant; thus, without loss of generality, let the fitness of all viable and inviable genotypes be 1 and 0, respectively. Assume that all possible genotypes are arranged in an abstract genotype space in which the distance between two genotypes describes the probability or ease of transformation from one genotype to the other. Distance 1 means that two genotypes can be transformed into each other through a single one-point mutation. Consider the space of all possible haploid genotypes with L loci and A alleles at each locus (note that for the purposes of this model, a diploid genotype with L loci can be represented as a haploid genotype with 2·L loci [11]; thus only haploid genotypes are considered for simplicity). The dimensionality of this genotype space is D = L·(A – 1), and the corresponding percolation threshold is p_c = 1/D. Note that even for short (on biological scales) genotypes a relatively small value of p will result in an extensive network of high-fitness ridges extending through the genotype space (e.g. for L = 10^5 and A = 5, D = 4·10^5 and p_c = 2.5·10^-6). The traditional picture of rugged high-dimensional FLs is therefore misleading, as these landscapes are characterised by the existence of percolating nearly neutral networks. It can be shown [4, chap. 4] that if the fitness of the genotypes is not restricted to 1 or 0, a large number of such networks emerge, each containing genotypes from a narrow fitness band. Among these networks, those with high fitness are particularly important, as adaptive walks along such networks can proceed very far without any substantial loss of fitness.
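The quantities in this construction are easy to evaluate directly. The sketch below is illustrative only: all function names are hypothetical, and the second part restricts itself to binary genotypes (A = 2) with a small number of loci so that the whole genotype space can be enumerated, which is an assumption made for the example and not part of the model in [4, 5]. It computes D = L·(A – 1) and 1/D, and then measures how many viable genotypes are connected through chains of single one-point mutations.

```python
import random
from collections import deque


def genotype_space_dimension(num_loci, num_alleles):
    """D = L * (A - 1): the number of one-point mutants of any genotype."""
    return num_loci * (num_alleles - 1)


def percolation_threshold(num_loci, num_alleles):
    """Approximate threshold p_c = 1 / D as used in the text."""
    return 1.0 / genotype_space_dimension(num_loci, num_alleles)


def largest_viable_fraction(num_loci, p, seed=0):
    """Share of viable genotypes in the largest mutationally connected set.

    Assumes binary haploid genotypes (A = 2), encoded as the integers
    0 .. 2**L - 1, so that flipping one bit is a one-point mutation.
    Each genotype is viable (fitness 1) with probability p, else inviable.
    """
    rng = random.Random(seed)
    n = 1 << num_loci
    viable = [rng.random() < p for _ in range(n)]
    total = sum(viable)
    if total == 0:
        return 0.0

    seen = [False] * n
    largest = 0
    for start in range(n):
        if not viable[start] or seen[start]:
            continue
        # Breadth-first search over viable genotypes at distance 1.
        seen[start] = True
        queue = deque([start])
        size = 0
        while queue:
            genotype = queue.popleft()
            size += 1
            for locus in range(num_loci):
                mutant = genotype ^ (1 << locus)
                if viable[mutant] and not seen[mutant]:
                    seen[mutant] = True
                    queue.append(mutant)
        largest = max(largest, size)
    return largest / total


if __name__ == "__main__":
    print(genotype_space_dimension(10**5, 5), percolation_threshold(10**5, 5))
    # For L = 12 and A = 2, D = 12, so 1/D is roughly 0.083.
    for p in (0.02, 0.1, 0.3):
        print(p, round(largest_viable_fraction(12, p), 3))
```

Because 1/D is an approximation that holds for large D, a space as small as the 12-locus example should only be expected to show the qualitative effect: well below 1/D the viable genotypes form small isolated clusters, and well above it most of them join one connected network.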
Holey Fitness Landscapes in simulations
A detailed discussion of Holey Fitness Landscapes can be found in [4]. There are a number of analytic models of adaptive radiation based on HFLs (e.g. see [4, part 1]); however, incorporating HFLs into individual-based simulation models presents significant difficulties. An analysis of these difficulties and a discussion of how they may be overcome has been published in [11]. That paper also provides a detailed numerical analysis of the HFL structure under a range of parameters. The strategy for incorporating HFLs into individual-based simulations presented in [11] enabled numerical work based on HFL genetics to be undertaken. For instance, this work includes a numerical investigation [12] of the ability of HFL genetics to maintain postzygotic reproductive isolation under migration and mixing. Similar ideas have also been applied outside traditional evolutionary biology. For instance, the potential of using HFL-like genetics to prevent premature convergence in evolutionary optimisation algorithms has been investigated [13].