Index

Page numbers in italics indicate figures and those in bold indicate tables, where these are separated from their main reference in the text.

Affymetrix GeneChip 29-30, 33, 47-8, 147
alternative splicing 31, 143
amplification, RNA 28
analysis of variance (ANOVA) 67-9
angle distance 83, 84, 87
annotated gene expression data matrix 73, 95-7
annotation 9, 73, 95-7
  experimental condition 97
  gene 11, 73, 95-6, 97
  sample 77, 73, 95-6, 97
anticorrelation 87
Arabidopsis thaliana controls 23, 24
arcs 95
average...

Choice of array platform

The method of microarray manufacture and the nature of the substrate can be used to categorise arrays. First, there is a distinction between the so-called cDNA arrays, where polymerase chain reaction (PCR) products are typically used as sequences on the array features (spots), and oligonucleotide arrays, or oligo-arrays, where the features (spots) are made up of oligonucleotides. Oligo-arrays may be either spotted or synthesised in situ, i.e. the features can be made from presynthesised oligomers...

Data generation, processing and analysis: an overview

Every high-throughput experiment consists of two major parts: (i) material processing and data collection, and (ii) information processing. In a microarray experiment, material processing and data collection can be broken down into five steps: (2) preparation of the biological samples to be studied; (3) extraction and labelling of the RNA (or a representation of the RNA) from the samples; (4) hybridisation of the labelled extracts to the array; (5) scanning of the hybridised array. The scanned image is the...

A1 Statistical analysis: BCLUST

A program to assess the reliability of gene clusters from expression data by using a consensus tree and a bootstrap resampling method, as described by Zhang and Zhao (2000).

BRB ArrayTools is an integrated package, developed by Richard Simon and Amy Peng at the National Cancer Institute, for the visualisation and statistical analysis of DNA microarray gene expression data. The package uses an Excel front end with integrated analytical and visualisation tools.

Time series analysis

Time series experiments provide a particular type of gene expression profile, revealing information about the order and the time scale of the expression events. There are a number of ways one can treat time series that would not be meaningful for other types of expression profile. An obvious example of time series analysis is directed towards finding periodicity or a trend (Figure 4.18a). This approach was used by Spellman et al. (1998) for identifying genes whose expression correlates with...
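One simple way to look for periodicity in a single expression profile is to measure how strongly it projects onto a sine and cosine of a chosen period. The sketch below illustrates that idea only; it is not a reconstruction of the procedure used by Spellman et al. (1998), and the function name `periodicity_score` and its parameters are assumptions made for this example.

```python
import math


def periodicity_score(profile, times, period):
    """Magnitude of the profile's Fourier component at the given period.

    `profile` holds expression values, `times` the matching time points.
    The profile is mean-centred first so that a constant expression
    level does not contribute to the score.
    """
    mean = sum(profile) / len(profile)
    omega = 2.0 * math.pi / period
    cos_part = sum((x - mean) * math.cos(omega * t)
                   for x, t in zip(profile, times))
    sin_part = sum((x - mean) * math.sin(omega * t)
                   for x, t in zip(profile, times))
    # Combine both projections so the score does not depend on phase.
    return math.hypot(cos_part, sin_part) / len(profile)


# Hypothetical profile: 32 time points, oscillating with period 8.
times = list(range(32))
profile = [math.sin(2.0 * math.pi * t / 8.0) for t in times]
score_at_8 = periodicity_score(profile, times, 8.0)
score_at_4 = periodicity_score(profile, times, 4.0)
```

A profile that oscillates with the tested period scores high, while a flat profile or one oscillating at an unrelated period scores close to zero. In practice the score would be compared across many genes and candidate periods.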

Hierarchical agglomerative clustering

Hierarchical agglomerative clustering is a process in which the data are successively fused, typically until all the data points are included. For hierarchical agglomerative clustering, usually all the pair-wise distances between objects need to be defined. An agglomerative process typically starts by considering each object (data point) as a separate, or singleton, cluster. Starting with n objects, the result of the first iteration of clustering is that the two objects that are most similar are...
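The agglomerative process can be sketched in a few lines. The version below starts from singleton clusters and repeatedly fuses the two closest clusters until one remains. The choice of Euclidean distance and of single linkage (distance between the closest pair of members) are assumptions made for this illustration, as are the function names; the text does not prescribe them.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerate(points, distance=euclidean):
    """Fuse clusters pairwise until one remains; return the merge history.

    Each history entry is (first_cluster, second_cluster, distance),
    where a cluster is a tuple of indices into `points`.
    Single linkage is an assumption of this sketch.
    """
    # Start with every object as its own singleton cluster.
    clusters = [(i,) for i in range(len(points))]
    history = []
    while len(clusters) > 1:
        best = None
        for c1, c2 in combinations(clusters, 2):
            # Single linkage: distance between the closest pair of members.
            d = min(distance(points[i], points[j]) for i in c1 for j in c2)
            if best is None or d < best[0]:
                best = (d, c1, c2)
        d, c1, c2 = best
        # Replace the two fused clusters with their union.
        clusters = [c for c in clusters if c not in (c1, c2)]
        clusters.append(tuple(sorted(c1 + c2)))
        history.append((c1, c2, d))
    return history


# Hypothetical two-dimensional profiles.
points = [[0, 0], [0, 1], [5, 5], [5, 6], [20, 20]]
history = agglomerate(points)
```

Starting with n objects, the history always contains n - 1 fusions, and reading it in order gives the information needed to draw a dendrogram. This naive search over all cluster pairs is slow for large n and is meant only to make the steps explicit.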

Distance and similarity measures in expression space

Euclidean distance

Most of the gene expression data analysis methods are based on comparisons between the gene or sample expression profiles. In order to make these comparisons, we first need a way to measure similarity or dissimilarity between these objects, i.e. between vectors representing genes or samples. Often it is easier to measure the distance between the objects (vectors in our case) instead of the similarity, though one can be transformed into the other. The distance between A and B, D(A, B), is said to...
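As a concrete illustration, Euclidean distance is one common choice of D(A, B) for two expression profiles treated as vectors. The sketch below also shows one of many possible ways to turn a distance into a similarity; the transform 1 / (1 + d) and the function names are assumptions chosen for this example, not something the text specifies.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two expression profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of measurements")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similarity_from_distance(d):
    """Map a distance to a similarity in (0, 1].

    Identical profiles (d = 0) give 1; the value falls towards 0
    as the distance grows. This is only one possible transform.
    """
    return 1.0 / (1.0 + d)


# Hypothetical profiles for two genes measured under three conditions.
gene_a = [1.0, 2.0, 3.0]
gene_b = [4.0, 6.0, 3.0]
d = euclidean_distance(gene_a, gene_b)  # sqrt(9 + 16 + 0) = 5.0
s = similarity_from_distance(d)         # 1 / 6
```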

CLUSFAVOR

CLUSter and Factor Analysis using Varimax Orthogonal Rotation performs cluster and factor analysis of gene expression data obtained from cDNA microarrays (Peterson, 2002). The user can perform cluster analysis and varimax orthogonal rotation, view dendrograms, and run factor analysis on selected cluster-specific genes. An optional output contains matrices for the input data, distance matrices, factor loadings, eigenvalues, eigenvectors, and the percentage of total variation for genes within a...

What are microarrays and how do they work?

A microarray is typically a glass or polymer slide, onto which DNA molecules are attached at fixed locations called spots or features (in the context of microarrays these will be treated as synonyms). There may be tens of thousands of spots on an array, each containing tens of millions of identical DNA molecules (or fragments of identical molecules), of lengths from tens to hundreds of nucleotides. For gene expression studies, each of these molecules should identify a single mRNA molecule, or...

Principal component analysis, eigenvectors and eigengenes

Principal component analysis (PCA) is one of the most common methods used for gene expression data analysis, primarily to reduce the dimensionality of data and to find combinations of experiments or genes that jointly contribute most to variability in the data. Although the techniques used in PCA are not simple, the underlying idea is quite intuitive: it is based on finding the directions in multidimensional vector space that have the largest amplitude in the dispersion of data points. These...
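The idea of finding the direction of largest dispersion can be made concrete with a small sketch. The code below mean-centres the data, builds the covariance matrix, and uses power iteration to approximate its leading eigenvector, which is the first principal component. Power iteration is used here only as a simple stand-in for a full eigendecomposition, and the function name, the fixed random seed and the iteration count are assumptions of this example.

```python
import math
import random


def first_principal_component(rows, iterations=200):
    """Return (direction, variance) of the first principal component.

    `rows` is a list of equal-length data points. The direction is a
    unit vector; the variance is the data variance along it.
    """
    n, dim = len(rows), len(rows[0])
    # Mean-centre each dimension.
    means = [sum(r[j] for r in rows) / n for j in range(dim)]
    centred = [[r[j] - means[j] for j in range(dim)] for r in rows]
    # Sample covariance matrix.
    cov = [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(dim)]
           for i in range(dim)]
    # Random start (fixed seed) so the vector is very unlikely to be
    # exactly orthogonal to the leading direction.
    rng = random.Random(0)
    v = [rng.random() + 0.1 for _ in range(dim)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance in the data
        v = [x / norm for x in w]
    variance = sum(v[i] * cov[i][j] * v[j]
                   for i in range(dim) for j in range(dim))
    return v, variance


# Hypothetical data lying exactly on the line y = 2x.
rows = [[-2.0, -4.0], [-1.0, -2.0], [0.0, 0.0], [1.0, 2.0], [2.0, 4.0]]
direction, variance = first_principal_component(rows)
```

For the example data all of the dispersion lies along the line y = 2x, so the returned direction is proportional to (1, 2). Further components could be found by removing the projection onto this direction and repeating, although a numerical library would normally be used for real expression matrices.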