This chapter begins by surveying the components of molecular biology that can be analyzed. We then define genomic data mining and discuss what it means biologically to apply clustering techniques in this domain. Although many analytic techniques are currently available, they should not be applied blindly without a hypothesis, so we discuss broad hypotheses that can serve as starting points. We then cover techniques for data reduction and filtering. Using examples written in pseudocode, we cover the most commonly used techniques and discuss the advantages and disadvantages of each. We then address the post-analysis process, including determining significance through permutation testing. Finally, we end with some thoughts on the automatic determination of genetic networks, including Bayesian networks.