Data Storage and Reanalysis

Microarrays generate a vast quantity of data, and interpreting these data is a complex subject in its own right. Often the results of experiments such as those above are open to question. Reinterpretation of microarray data will be key in the future as we learn the best ways to handle them. Reinterpretation is not possible, however, if the original data values for the array are not released when the results are published. Currently there are no universal guidelines from journals on what data must accompany research submitted for publication, although it is common (but not compulsory) for the complete data set to be published online for access by other researchers. The format of the data is extremely varied, and this is a major hurdle to their reinterpretation.

Several groups have been set up that aim to standardize what is required of microarray data when they are published. The foremost of these efforts is MIAME (Minimum Information About a Microarray Experiment), which aims to facilitate access to, and usability of, microarray data. Standardizing microarray data, and the format in which they are presented, will facilitate reanalysis and allow other researchers to use the data for meta-analyses and similar studies. Centralized databases capable of compiling and storing raw microarray data from many sources, the conclusions drawn from the data, and the experimental conditions under which they were derived are slowly being developed. Greater use of these databases, perhaps made mandatory, is required if they are to be of real use to researchers in the future. In the meantime, however, simple guidelines for microarray experiments have been proposed [46]:

• Data should be in a tab-delimited, easily accessible format, such as a text (.txt) file, and not in a PDF (.pdf) file. This would allow them to be imported easily into databases and analysis software.

• The data should include the definitive accession number for each gene in addition to (or instead of) common gene names, which are often inconsistent and change regularly.

• Data that are not included in the analysis (e.g., values for spots lower than the background) should be identified and not merely left out.

If followed, these guidelines would make the reuse of microarray data much easier.
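As a rough illustration of the three guidelines, the following Python sketch writes a small set of spot measurements to tab-delimited text, keys each row by an accession identifier, and flags spots whose signal falls below background instead of dropping them. The column names, the `annotate` and `to_tsv` helpers, and all identifiers and intensity values are hypothetical placeholders invented for this example; they are not taken from [46] or from any particular microarray platform.

```python
import csv
import io

# Hypothetical spot records: the accession IDs, gene names, and intensities
# below are invented placeholders, not real measurements.
SPOTS = [
    {"accession": "ACC_0001", "gene_name": "geneA", "signal": 1520.0, "background": 210.0},
    {"accession": "ACC_0002", "gene_name": "geneB", "signal": 180.0, "background": 230.0},
    {"accession": "ACC_0003", "gene_name": "geneC", "signal": 940.0, "background": 205.0},
]

# Assumed column layout; a real submission would follow the target database's schema.
FIELDS = ["accession", "gene_name", "signal", "background", "included", "exclusion_reason"]


def annotate(spot):
    """Return a copy of the spot with an explicit inclusion flag."""
    row = dict(spot)
    below_background = spot["signal"] < spot["background"]
    row["included"] = "no" if below_background else "yes"
    row["exclusion_reason"] = "signal_below_background" if below_background else ""
    return row


def to_tsv(spots):
    """Serialize every spot, including excluded ones, as tab-delimited text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    for spot in spots:
        writer.writerow(annotate(spot))
    return buffer.getvalue()


if __name__ == "__main__":
    # Print the table; redirect to a .txt file to share it alongside a publication.
    print(to_tsv(SPOTS), end="")
```

Keeping the excluded spot in the file with a stated reason lets a later reader apply a different background threshold, which would not be possible had the row simply been removed.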
