We recognize that the readership of this book will be varied due to the intrinsically multidisciplinary nature of the functional genomics enterprise (as will be emphasized in the introductory chapter). Accordingly we outline the content of the following chapters so that readers may choose for themselves the path that suits them. Nonetheless, our intent and contention is that the current ordering of the chapters provides the most efficient way of acquiring the content of this book.
Introduction. Here we establish the motivation and the scope of this book and touch upon substantial obstacles to the successful application of bioinformatics to an integrative genomics. The notion of an interdisciplinary functional genomics pipeline is introduced. We also review which kinds of readers might find this book worthwhile. The promise and limitations of functional genomics techniques, the nature of various kinds of genomic data, and the central role played by the discipline of bioinformatics are outlined. For those who have a limited background in biological sciences, there is a subsection on the basic minimum of molecular biology concepts that will be needed to grasp the following chapters.
Chapter 2. Experimental Design. This chapter develops a framework for approaching the design of microarray-driven functional genomics experiments. Very little here is quantitative or mathematical. Rather, the emphasis is on ways of thinking about the design of experiments and how it might impact the yield of these experiments. We address challenges that are particular to computer scientists (e.g., defining a figure of merit for the performance of the bioinformatics algorithms) and to biologists (e.g., discarding potentially valuable data using formal decision theory because of the scale issues in massively parallel data acquisition using noisy measurement devices), respectively. In exploring the design issues we introduce the functional genomics clustering dogma, the broad machine-learning categories of supervised and unsupervised learning, and the nature of the analyses developed using these techniques.
Chapter 3. Microarray Measurements to Analyses. We lay the foundations for performing analyses of microarray data sets. This is the first of the more quantitative and mathematical chapters. We start with a discussion of the acquisition of digital data from the two most widely employed classes of microarrays. Then we consider the two most generic problems: comparing gene expression within a single microarray, i.e., intra-array analyses, and comparing expression across microarrays, i.e., inter-array analyses. In so doing, we introduce the fundamental concept of similarity and dissimilarity measures and the several kinds of such measures. These measures become the building blocks for the genomic data-mining techniques described in the following chapter.
Chapter 4. Genomic Data-Mining Techniques. When gene expression is measured in more than two samples, gene expression patterns have to be analyzed using methods that consider the coordinated interactions of genes across multiple conditions. This chapter assesses the components of biomedical experiments that can be included in a data-mining investigation. We then cover the most commonly used analytic techniques, discussing the advantages and disadvantages of each technique, as well as the postanalysis process. Where appropriate, we provide pseudocode that will allow readers with some training in computer science to understand the details of the most often used and cited data-mining algorithms. The emerging field of genetic network reverse engineering is also introduced here.
Chapter 5. Bio-Ontologies, Data Models, Nomenclature. This chapter addresses possibly the least exciting but the most pressing bioinformatics need for genomic research: creating and using comprehensive annotations of gene function, storing and organizing microarray expression data, and ensuring standardized access to these data. We review current efforts to create formalized systems of description of gene function and the various kinds of "ontologies" that support these descriptions. The challenge of designing "standardized data models" for the storage of microarray data is addressed, and the principal contenders claiming to be this standard are reviewed. Naming schemes (nomenclatures) most applicable to gene expression studies are described. Nomenclatures, data models, and ontologies are placed in the perspective of the general problem of analyzing the results of functional genomics experiments. Tools that leverage these standardization efforts and the on-line published literature are also described.
Chapter 6. From Functional Genomics to Clinical Relevance: Getting the Phenotype Right. Here we address the process of translating the functional genomics research agenda into one of clinical relevance. We place in this perspective the value and deficiencies of electronic medical records and standardized clinical vocabularies. Although by no means comprehensive, we provide the highlights of the privacy issues (e.g., the implications of the Health Insurance Portability and Accountability Act, anonymization, cryptographic identifiers, etc.) that are most likely to have an impact on the clinical application of genomic technologies.
Chapter 7. The Near Future. As the techniques and goals of functional genomics are in rapid flux, we engage in some short-term forecasting to guide readers' planning in this time window. Microarray technologies being developed and recently released are previewed. In this context, the problem of comparing expression measurements across generations of microarray measurement platforms is appraised. More broadly, the different kinds of software required for a successful functional genomics enterprise are described. Finally, a model to meet the training needs of this new discipline is outlined.