The explosion of biological sequence data poses problems that require tremendous computational resources to solve exactly. Indeed, many of the interesting problems arising in the analysis of biological sequence data are NP-hard. Evolutionary algorithms are one possible tool for addressing such problems. These algorithms use the principles of natural selection and survival of the fittest to evolve solutions to particular problem instances.
Many sequence analysis problems are optimization problems. An optimization problem is formulated in terms of two components: the search space, which is the set of all possible solutions, and the fitness function, which measures how good a particular solution is.
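To make the two components concrete, here is a minimal sketch on a toy problem chosen purely for illustration (it is not taken from the text above): finding the string that is closest, in total Hamming distance, to a set of equal-length DNA sequences. The names `total_hamming`, `fitness`, and `exhaustive_best` are hypothetical. The search space is every string of the given length over the alphabet, and the fitness function scores one candidate.

```python
from itertools import product


def total_hamming(candidate, sequences):
    """Sum of mismatches between the candidate and every input sequence."""
    return sum(
        sum(a != b for a, b in zip(candidate, seq))
        for seq in sequences
    )


def fitness(candidate, sequences):
    """Higher is better, so negate the total distance (illustrative choice)."""
    return -total_hamming(candidate, sequences)


def exhaustive_best(sequences, alphabet="ACGT"):
    """Score every point in the search space and keep the fittest one."""
    length = len(sequences[0])
    # The search space: all len(alphabet) ** length candidate strings.
    search_space = ("".join(p) for p in product(alphabet, repeat=length))
    return max(search_space, key=lambda c: fitness(c, sequences))


if __name__ == "__main__":
    seqs = ["ACGT", "ACGA", "ACCT"]
    best = exhaustive_best(seqs)
    print(best, fitness(best, seqs))
```

Exhaustive enumeration works here only because the space has 4^4 = 256 candidates. The space grows as 4^n with the string length n, which is why exact search quickly becomes impractical and heuristic search becomes attractive.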
Evolutionary approaches explore the search space of a problem by working with a population of individuals. Each individual encodes a potential solution. The algorithm modifies individuals using artificial operators inspired by natural mechanisms. The individuals then compete on the basis of their value under the fitness function. Finally, the selected ones reproduce and survive into the next generation.
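The cycle described above can be sketched as a short program. This is one possible arrangement under stated assumptions, not a definitive implementation: the toy goal (evolving a DNA string toward a known target), the operators (point mutation, one-point crossover, tournament selection, and keeping the single best individual), and all names and parameter values are illustrative choices that the text does not prescribe.

```python
import random

ALPHABET = "ACGT"


def fitness(individual, target):
    """Toy fitness: number of positions that match the target string."""
    return sum(a == b for a, b in zip(individual, target))


def mutate(individual, rate, rng):
    """Replace each base with a random one with probability `rate`."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else base
        for base in individual
    )


def crossover(parent_a, parent_b, rng):
    """One-point crossover: prefix of one parent, suffix of the other."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]


def tournament(population, target, rng, k=3):
    """Competition: the fittest of k randomly drawn individuals wins."""
    contenders = rng.sample(population, k)
    return max(contenders, key=lambda ind: fitness(ind, target))


def evolve(target, pop_size=50, generations=200, mutation_rate=0.05, seed=0):
    """Run the generational loop and return the fittest individual found."""
    rng = random.Random(seed)
    population = [
        "".join(rng.choice(ALPHABET) for _ in target)
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        best = max(population, key=lambda ind: fitness(ind, target))
        if best == target:
            break
        next_gen = [best]  # assumption: carry the best individual over unchanged
        while len(next_gen) < pop_size:
            parent_a = tournament(population, target, rng)
            parent_b = tournament(population, target, rng)
            child = crossover(parent_a, parent_b, rng)
            next_gen.append(mutate(child, mutation_rate, rng))
        population = next_gen
    return max(population, key=lambda ind: fitness(ind, target))


if __name__ == "__main__":
    goal = "ACGTACGTAC"
    winner = evolve(goal)
    print(winner, fitness(winner, goal))
```

In a real sequence analysis problem the target is of course unknown, and the fitness function would instead score a candidate against the data, for example an alignment score.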
The fitness function embodies the essential aspects of the problem to be solved. It is desirable for individuals with significant shared characteristics to have similar fitness values. The fitness function should point the genetic algorithm (GA) toward the correct solution, rather than away from it. Choosing a representation for a problem is an equally critical design decision, because the representation defines the search space. In particular, the representation should help preserve the building blocks of the problem.
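The difference between a fitness function that guides the search and one that does not can be shown with a small sketch. Both functions below are hypothetical examples on a toy string-matching task. The first rewards only an exact match, so every near miss looks as bad as a random guess. The second gives similar scores to individuals that share many characters, which is the property described above.

```python
def all_or_nothing(candidate, target):
    """Uninformative fitness: 1 for an exact match, 0 for everything else."""
    return 1 if candidate == target else 0


def graded(candidate, target):
    """Informative fitness: counts matching positions, so near misses rank high."""
    return sum(a == b for a, b in zip(candidate, target))


if __name__ == "__main__":
    target = "ACGT"
    for candidate in ("ACGA", "TTTT"):
        print(candidate, all_or_nothing(candidate, target), graded(candidate, target))
```

Under `all_or_nothing`, selection cannot tell "ACGA" (one base away from the target) from "TTTT", so the search has no gradient to follow. Under `graded`, the closer candidate scores higher and is more likely to be selected.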
The components of a GA interact, and this interaction affects the GA's ability to search the space of available solutions. Designing an efficient GA for a particular problem therefore requires some understanding of how the individual components work together. Their interaction is the primary driver of effective performance. The complex interactions of the components and the generality of the approach are both a strength and a weakness. A proper understanding of the approach allows one to exploit the strength and avoid the weakness.