One way to prevent degenerate clustering is to remove genes or samples that are changing "too little." A variation filter is used to ensure that genes or samples have a wide enough dynamic range of measurements. Dynamism can be measured in both absolute and relative terms. For example, a gene could be removed from further analysis if there is less than a two fold difference in the measurements across the samples (a relative term) or if there is less than 35 expression units of difference in the measurements across the samples (an absolute term).
make a new empty list called post_filter loop through each gene G to be filtered find the minimum expression level for G find the maximum expression level for G set the flag keep_gene to false if maximum - minimum is over 35
set keep_gene to true if maximum / minimum is over 2
set keep_gene to true if the flag keep_gene is true add this gene G to the post_filter list end loop
There are those investigators who will object to this kind of filter because there may well be biologically important changes in expression that are very small and will be removed by this variation filter. However, if functional genomics experiments are seen as a funnel (see figure 1.6) for determining which are the high-yield hypotheses to pursue for biological testing, then the high false-positive rate of genes hypothesized wrongly to be biologically relevant may prove to be costly (see figure 2.9 in section 2.1.4).
Was this article helpful?