Experimental biologists and biomedical scientists who are considering the use of microarrays in their research programs have often asked us questions of the following form: "How do I best design an experiment with the following mutant mouse?", or, "We have a budget for 20 microarrays to analyze this system. How can we complete the experiment within budget?" Unlike the typical biomedical study, many of the investigations that these experimentalists are contemplating are driven not by the testing of a particular hypothesis, but by the hope of generating hypotheses that will then lead to a directed and relevant experiment, which in turn will yield an interesting biological or biomedical insight.
We find it even more challenging when we are asked secondary questions about the subsequent microarray data analyses, such as, "What is the right machine-learning technique or clustering technique to apply to a set of data to best obtain an answer about 'some biological mechanism'?" The questioner often goes on to ask whether one particular software package works better than another, is more reliable, or is worth its cost. We view these questions as well-intentioned but fundamentally flawed and ill-posed. Although a particular machine-learning algorithm may be slightly superior to another in one set of circumstances (see section 2.2.2 on the relative merits of different clustering algorithms), the key to a successful functional genomic investigation is to start with an experimental design that maximizes the possibility of observing gene expression patterns that are relevant to and informative of the biological aspect or question under investigation. If the experimental design achieves this, then most of the common machine-learning algorithms will reveal those patterns.
So how does one arrive at an effective experimental design for measuring the expression of thousands of genes in the hope of answering a biological question? We have found it useful to frame our answers within the concepts of experiment design space and expression space. Let us start with the experiment design space.