Identification of Protein Function

The next task is to determine the complete repertoire of protein functions. The first step is to establish all molecular functions of each protein. This means taking stock of all the different domains contained in an individual protein using tools such as SMART (annotation of functional domains) [11] and COG (clusters of orthologous groups, i.e., groups of genes with similar function) [12]. As another example of a tool, the AnDOM ("annotation of domains") server analyzes a given protein sequence for structural domains, identifying the parts of the sequence that are homologous to a known three-dimensional structure [13]. It uses position-specific scoring matrices (PSSMs), each built from a large alignment of sequences homologous to an individual structural domain of known experimental structure, with the PDB (Protein Data Bank) as the reference database. A structural domain is a part of a protein that folds as an independent unit and has a specific function such as catalysis or cofactor binding. Comparing a query sequence to the stored PSSMs allows rapid identification of any structural domains homologous to a known structural domain according to SCOP, the structural classification of proteins [14] (

