## Automated Analysis of Data with an Unknown Number of Exponentials

A frequently encountered problem in luminescence or radioactive decay experiments is that the number of decaying species in a sample is not known. However, the number of decays can be found by a computer program. An automated method for analyzing exponential decay data with up to three contributing components was developed  based on deviation pattern recognition.

Deviation pattern recognition was developed by Meites  as a basis for decisions concerning the best model for a set of data. The technique employs the shapes of residual or deviation plots obtained after a regression analysis (see Chapters 2 and 3) to make automated decisions about goodness of fit. Recall that residual plots of dev/SD vs. Xj will have a random distribution of points around devy/SD = 0 if the regression model fits the data within the standard error in measuring the signal, y. If the model does not describe the data correctly, the residual plot will have a distinct nonrandom pattern. Scatter will be superimposed on this pattern reflecting the random noise in the signal.

Deviation patterns for fits onto a given model can be generated by nonlinear regression analysis of noise-free theoretical data sets representing other models in a library of models . Information about these can be coded into the computer and used as a basis for construction the basis of a "decision tree" on which to base decisions in an expert system .

The multiexponential decay problem does not require the full power of deviation pattern recognition, because it is a linear classification problem. That is, we can test models with one, two, and three exponentials in sequence and accept the first model that gives an acceptable fit to the data based on deviation plots and other statistical criteria (Section C.3.1).

A partial program listing for automated analysis of exponential decay data is given in Box 13.1. This program  assumes that the data were collected by using a counting detector, and the regression analysis is therefore weighted for Poisson statistics (Section 2.B.5). It also assumes that all extraneous signals, such as a source light pulse for example, have been removed prior to the analysis of the data. The computer program sequentially fits models to the experimental data involving one, two, and three exponentials, stopping when an acceptable model is found. The choice of the correct model is based on achieving predefined criteria of randomness in smoothed residual plots.

The program in Box 13.1 estimates initial values of yt and r, by preliminary linear regressions. At the beginning of the program, a linear regression of all data as In y vs. t are used to estimate initial values of yx and tj, and these values are employed to begin the nonlinear regression analysis onto the model in Table 13.1 with p = 1.

Box 13.1 FORTRAN code for automated analysis of exponential decay data. (This listing should be modified for linkage to an appropriate nonlinear regression subroutine. The nonlinear regression program referred to in the listing is denoted NONLINEAR REGRESSION.)

 LINE(S) STATEMENT 1 IMPLICIT INTEGER (T) 2 INTEGER P,PLOT 3 LOGICAL*l HEAD(80) 4 DOUBLE PRECISION X,Y,Q,V,W,Y1,U,G, D,DO, DABS,DLOG, 5 LDSIGN,DFLOAT,DEXP, DSQRT,C,SS,S1,M,S,SS2, AOUT1, AOUT3 , 6 LR,DARSIN,DIFF,ZX,S9 7 COMMON X(IOO),Y(100),Q(100),V(12),W(100),Yl(100), 8 LPT{lOO),U(12),DIFF(100),S,IO,N1,T,Tl,Q1,JA,DO,J8,IWT,19,J9 9 DATA LINP/5/,LOUT/6/ 10 1 READ (LINP,10,END=15) HEAD 11 10 FORMAT(80A1) 12 9100 FORMAT(311) 13 9200 FORMAT(213,14,D6.4) 14 READ(LINP,9100) T,Tl,PLOT 15 READ(LINP,9200) P,IO,Nl,DO 16 READ(LINP,*) IWT 17 J9=0 18 JA=0 19 DO 70 N=l,Nl 20 70 READ(LINP,*) Y(N) 21 DO 90 N=10, 64 22 90 X(N-9)=N*20. + 10. 23 CALL GRAPH 24 75 CALL LINREG 25 CALL NONLINEAR REGRESSION (P,DO,HEAD, PLOT) ( SUBROUTINE FOR NONLINEAR REGRESSION) 26 GO TO 74 27 74 DO 76 N=1,N1 28 X(N)=X(N) 29 IF(N)=Y(N) 30 76 CONTINUE 31 10=10 + 2 32 IF(10-7) 79,79,15 33 79 DO 80 1=1,10 34 D(I)=DABS(V(I))/2 35 80 CONTINUE 36 J9=J9 + 1 37 GO TO 75
 38 15 WRITE(LOUT,200) 39 CALL EXIT 40 200 FORMAT('0+++++++++PR0GRAM '11TERMINATION++++++++'/ 41 1 •O END OF FILE ENCOUNTERED DURING DATA READ') 42 STOP 43 END 44 SUBROUTINE GRAPH 45 LOGICAL *1 LINE165),BLANK,STAR,CHRS(IO),HEAD(80) 46 INTEGER N,PT,J,IDIG 48 DOUBLE PRECISION X, Y, Q, V, W, Y1, U, G, D, DO , DABS, DLOG, LDSIGN,DFLOAT,DEXP,DSQRT,C,SS,SI,M,S,SS2,AOUTl,AOUT3, 49 LR,DARSIN,DIFF,ZX,S9 50 COMMON X(IOO),Y(100),Q(100),V(12),W(100),Y1(100), 51 LPT(100),U(12),DIFF(100),S,10,N1,T,Tl,Ql,JA,DO,J8,IWT,19,J9 52 COMMON/COMDEV/AOUT2 {100 ) 53 DATA CHRS/,0,,'1',,2',13,,'4,,,5 ,'6', '7', '8', '9', 54 L'*'/ 55 DATA BLANK/' '/,STAR/'*'/ 56 DATA LINP/5/,LOUT/6/ 57 WRITE(LOUT,2) 58 2 FORMAT(5X,22HGRAPH OF LN(I) VS TIME) 59 DO 1 N=1,N1 60 1 AOUT2(N)=DLOG(Y(N)) 61 I8=60/INT((AOUT2(I)-AOUT2(Nl))*10.) 62 N3=N1 + 1 63 DO 840 N2=1,N3 64 DO 810 J=1,65 65 810 LINE(J)=BLANK 66 N=Nl-N2 + 1 67 IF(N) 827,827,812 68 812 PT(N)=INT(AOUT2(N)*10) 63 LINE(62)=STAR 70 I9=INT(ABS(AOUT2(I)*10))*I8 71 17=19-60 72 IF (I9-IABS(PT(N))) 825,820,820 73 820 J=PT(N)»I8-I7 74 LINE(J)=STAR 75 825 IF (N-(N/10)*10) 830,826,830 76 826 LINE(62)=CHRS(N/10 + 1) 77 LINE(63)=CHRS(1) 78 GO TO 830 79 827 DO 829 J=1,62 80 829 LINE(J)=STAR 81 830 WRITE(LOUT,855) LINE 82 855 FORMAT(1H,6X,65A1)

Box 13,1 (continues)

 83 840 CONTINUE 84 STOP 85 END 86 SUBROUTINE LINREG 88 DOUBLE PRECISION X,Y,Q,V,W,Y1,U,G,D,DO , DABS,DLOG, LDSIGN, DFLOAT,DEXP,DSQRT,C,SS,S1,M,S,SS2,AOUTl,AOUT3, 89 LR,DARSIN,DIFF,ZX,S9 90 COMMON X(IOO),Y(100),Q(100),V(12) , W(100),Yl(100), 91 LPT(100),U(12),DIFF(100),S,10,N1,T,T1,Q1,JA,DO,J8,IWR,19,J9 92 10 IF (J9) 13,13,21 33 21 IF(J9-1) 80,11,22 94 22 IF(J9-2) 80,14,23 95 23 IF(J9-3) 80,15,24 56 24 IF (J9-4) 80,11,25 97 25 IF(J9-5) 80,16,26 98 26 IF(J9-6) 80,17,27 99 27 IF(J9-7) 80,18,28 100 28 IF(J9-8) 80,11,11 101 13 N2=l. 102 N3=N1 103 GO TO 12 104 14 N2 = l 105 N3=Nl/2 106 GO TO 12 107 15 N2=Nl/2 108 N3=N1 109 GO TO 12 110 16 N2 = l 111 N3=Nl/3 112 GO TO 12 113 17 N2=Nl/3 114 N3=2*Nl/3 115 GO TO 12 116 18 N2=2*Nl/3 117 N3=N1 118 GO TO 12 119 11 RETURN 120 12 DO 20 I=N2,N3 121 Yl(I)=DLOG(Y(I)) 122 W(I)=1. 123 IF(IWT.EQ.l) GO TO 20 124 W(I)=1./Y1(I) 125 20 CONTINUE
 126 GO TO 70 127 70 SUM=0 128 SUMX=0. 129 SUMY=0 130 SUMXY=0. 131 SUMXX=0. 132 SUMYY=0. 133 DO 100 I=N2,N3 134 SUM=SUM+W(I) 135 SUMX=SUMX+X

Box 13.1 (continues)

172 GO TO 50

175 WRITE{6,66)V(1),STDA,V(2),STDB,SIGMA,SRHO

177 GO TO 10

178 80 STOP

179 57 FORMATi//' STATISTICAL WEIGHTING, W(I)=1./Y(I)')

181 B6 FORMAT (//' FOR Y=A + BX, A= ' , 1PD12 . 3 ,

182 1 ' +/- '1PD10.2,//21X, B=',1PD12.3,' +/- ,1PD10.2.

184 3 //' AN.D THE CORRELATION COEFFICIENT IS ',1PD12.3//}

186 END

187 SUBROUTINE NONLINEAR REGRESSION (P, DO , HEAD, PLOT) 352 456 AOUT2 (N) =AOUTl/!SS2*DSQRT (Y(N) ) )

360 DO 500 1=1,N5

361 SUM=0

362 SUM=SUM + AOUT2(I) + AOUT2Q + 1) + AOUT2(I + 2)

363 SUM=ABS(SUM)

364 WRITE(6,1461) SUM

365 IF(SUM-3.75) 500,500,490

(THRESHOLD TEST - THRESHOLD MAY BE CHANGED BY THE USER)

366 1461 FORMAT (5X,lPD13. 6)

367 500 CONTINUE

368 JA=JA+1

423 SUBROUTINE ERRSUM(II,*)

(USE THE FOLLOWING MODIFICATION FOR POISSON WEIGHTING)

(A SUBROUTINE IS REQUIRED HERE TO MAKE A GRAPH OF THE DEVIATION PLOT)

 490 (TO BE INSERTED INTO THE NONLINEAR REGRESSION SUBROUTINE, V(K) ARE THE MODEL PARAMETERS) 491 DOUBLE PRECISION X,Y,Q,V,W,Y1,U,G,D,DO,DABS,DLOG, 492 LDSIGN, DFLOAT, DEXP, DSQRT,C , SS, SI,M, S.SS2 , AOUTl,AOUT3, 493 LR, DARSIN, DIFF, ZX, S9 494 COMMON X{100),Y(100),Q(100),V(12),W(100),Y1(100), 495 LPT(100),U(12),DIFF(100),S,10,N1,T,Tl, Q1, JA,DO,J8,IWT,19,J9 496 IF(10-3) 24,24,27 497 24 IF(V(2)-10.) 3,3,25 498 3 DO 4 N=1,N1 499 4 Q(N)=Q(N)*1.01 500 25 DO 26 N=1,N1 501 26 Q(N)=V(1)*DEXP(-X(N)/V(2)) 502 RETURN 503 27 IF(10-5) 29,29,33 504 29 IF(V(2)-10.) 6,6,5 505 5 IF(V(3)-.001) 6,6,8 506 8 IF(V(4)-10.) 6,6,30 507 6 DO 7 N=1,N1 508 7 Q(N)=Q(N)*1.01 509 30 DO 31 N=1,N1 510 31 Q(N)=V(1)*DEXP(-X(N)/V(2)) + V(3)»DEXP(-X(N)/V ( 4) ) 511 RETURN 512 33 IF(IO-7} 37,37,36 513 37 IF(V(2)-10.) 9,9,12 514 9 V(2)=DABS(V(2)*50.) 515 12 IF(V(4)-10.) 10,10,13 516 10 V(4)=DABS(V(4)*50.) 517 13 IF(V(6)-10.) 11,11,34 518 11 V(6)=DABS(V(6)* 50.) 519 34 DO 35 N=1,N1 520 35 Q(N)=V(1)*DEXP(-X(N)/V(2)) + V(3)*DEXP(-X(N)/V(4)) + 521 LV(5)*DEXP(-X(N)/V(6)) 522 RETURN 523 36 STOP 524 END

Note: that Subroutine CALC is to be inserted into the nonlinear regression subroutine used with this listing

Box 13.1 (continues)

DOCUMENTATION:

1 MAIN PROGRAM

2-9 DEFINE VARIABLES IN DOUBLE PRECISION. COMMON

STATEMENT. ETC.

11-13 FORMAT STATEMENTS FOR HEADING, T, T1, PLOT. AND P, 10, Nl, DO.

(RELATIVE ERRORS) Tl=l (SIGNS CERTAIN) T1=0 (SIGNS

UNCERTAIN P=1 (DEVIATION PLOT DESIRED) P=0 (NO DEVIATION PLOT).

15 READ P. 10, Nl, DO. P=NUMBER OF CYCLES BETWEEN PRINTOUT OF THE ERRORSUM; 10 REPRESENTS THE NUMBER OF PARAMETERS; Nl THE TOTAL NUMBER OF DATA POINTS; DO THE ESTIMATED RELATIVE ERROR IN THE PARAMETERS.

16 READ IWT, A WEIGHTING FACTOR; 1=UNIT WEIGHTING, 0=STATISTICAL WEIGHTING.

17 READ J9. THE COUNTER THAT INDICATES WHICH SECTION OF DATA TO USE FOR LINEAR REGRESSION. I.E. FIRST 1/3. SECOND 1/3, OR THIRD 1/3 OF DATA TO DETERMINE THE PARAMETER VALUES V(l), V(2); V(3), V(4); AND V(5). VI6), FOR A THREE EXPONENTIAL CURVEFIT.

18 READ JA. THE COUNTER FOR THE NUMBER OF TIMES NONLINEAR REGRESSION HAS BEEN PERFORMED.

### 19-22 READ Y DATA. CALCULATE X DATA ACCORDING TO

CHANNEL NUMBER AND THE TIME EACH CHANNEL REPRESENTS. EXAMPLE: IF EACH CHANNEL REPRESENTS 10 MSEC AND CHANNELS 90 TO 190 ARE BEING READ, THE COMPUTER STATEMENT SHOULD READ: DO 90 N=90,190 90 X(N-89)=N*10.

23 CALL "GRAPH" TO GRAPH LOG-(INTENSITY) VS^. TIME.

THIS MAY BE AN AID TO DETERMINING HOW MANY

EXPONENTIAL DECAYS THE DATA CONTAINS.

24 CALL "LINREG" WHICH DOES A LINEAR REGRESSION OF

LOG,,(INTENSITY) VS^. TIME USING ALL THE DATA TO DETERMINE THE PRE-EXPONENTIAL FACTOR AND FLUORESCENCE LIFETIME (FOR I EXPONENTIAL DECAY); THE DATA IS DIVIDED IN HALF TO CALCULATE TWO PRE-EXPONENTIAL FACTORS AND TWO FLUORESCENCE LIFETIMES (FOR 2 EXPONENTIAL DECAY); THE DATA IS DIVIDED INTO THIRDS TO CALCULATE 3 PRE-EXPONENTIAL FACTORS AND 3 FLUORESCENCE LIFETIMES (FOR 3 EXPONENTIAL DECAY).

25 AFTER DETERMINING ROUGH VALUES FOR PRE-EXPONENTIAL FACTORS AND FLUORESCENCE LIFETIMES. SEND THE INFORMATION TO (NONLINEAR REGRESSION SUBROUTINE).

26 AFTER CURVEFIT HAS BEEN COMPLETED, GO TO STATMENT 74.

27 - 30 (STATEMENT 74) ESTABLISH VALUES FOR X(N), AND Y(N).

31 INCREASE BY TWO THE NUMBER OF PARAMETERS.

32 IF THE NUMBER OF PARAMETERS IS GREATER THAN 7. PRINT "PROGRAM TERMINATION", IF NOT, GO TO STATEMENT 79.

33 - 35 (STATEMENT 79) ESTABLISH NEW VALUES FOR RELATIVE

ERROR IN THE PARAMETERS.

36 INCREMENT COUNTER FOR LINEAR REGRESSIONS

37 START FITTING SERIES AGAIN 38-43 STOP PROGRAM

44 SUBROUTINE GRAPH (GRAPHS LOG£(INTENSITY) VS. TIME)

45 THESE ARE LOGICAL VARIABLES USED IN GRAPH SUBROUTINE

46 DEFINE INTEGERS

47 - 52 DEFINE DOUBLE PRECISION VARIABLES

53 & 54 DEFINE CHARACTERS USED IN GRAPH SUBROUTINE

Box 13.1 (continues)

55 DEFINE BLANK AND STAR

56 LINP=5, LOUT-6

59 S 60 DEFINE A0UT21N) = LOGgfINTENSITY)

61 CALCULATE 18. THAT IS USED TO SCALE THE GRAPH.

62 CALCULATE N3 = TOTAL NUMBER OF LINES IN THE GRAPH

63 START THE LOOP TO PRINT THE GRAPH.

64 & 65 LEAVE LINE BLANK.

66 CALCULATE A VALUE FOR N, THE DATA POINT NUMBER.

67 IF N IS LESS THAN 0 GO TO STATEMENT 827 WHICH SAYS TO PRINT A LINE OF ASTARISKS. OF N IS GREATER THAN 0 GO TO STATEMENT 812.

68 MAKE AN INTEGER OUT OF LOGE<INTENSITY)*10.=PT(N)

69 POSITION AN ASTARISK IN THE RIGHT HAND MARGIN OF THE GRAPH.

70 - 73 SCALE THE DATA TO FIT THE GRAPH.

74 POSITION THE ASTARISK WHERE THE DATA POINT LIES ON

THE GRAPH.

75-77 PRINT THE DATA POINT NUMBER IN THE MARGIN EVERY 10TH POINT.

78 GO TO 830 WHICH SAYS TO PRINT "LINE". LINE CONSISTS OF AN ASTARISK AT THE CORRECT POSITION FOR THE DATA POINT VALUE AND AN ASTARISK (OR NUMBER) IN THE MARGIN.

79 & 80 IF "LINE" IS THE LAST LINE, PRINT A LINE OF

ASTARISKS AT THE END OF THE GRAPH.

81 & 82 (STATEMENT 830) PRINT "LINE".

83 CONTINUE PROCEDURE TO CALCULATE AND PRINT ANOTHER

LINE.

84 & 85 END PROGRAM

86 SUBROUTINE LINREG ( DETERMINES PRE-EXPONENTIAL FACTORS AND FLUORESCENCE LIFETIMES)

87 - 91 DOUBLE PRECISION VARIABLES AND COMMON STATEMENT.

92 - 119 THESE STATEMENTS DIRECT THE PROGRAM TO DIVIDE THE DATA ACCORDING TO THE NUMBER OF NONLINEAR REGRESSIONS WHICH HAVE BEEN DONE.

120 - 126 DETERMINE THE WEIGHTING FACTORS.

149 - 172 CALCULATE VALUES FOR PRE-EXPONENTIAL FACTORS AND FLUORESCENCE LIFETIMES.

173 - 186 OUTPUT FOR THE SUBROUTINE. GIVE THE TYPE OF WEIGHTING, INCREMENT J9 - THE COUNTER WHICH INDICATES WHICH SECTION OF DATA TO USE.

187 SUBROUTINE CFT4A - STATEMENTS ARE THE SAME AS THEY

ARE IN THE ORIGINAL EXCEPT FOR THE LINES INDICATED BELOW.

352 A0UT2(N) = (Q(N)-Y<N))/SS2*SQRT(Y<N)) WHERE SS2 IS

THE STANDARD DEVIATION OF REGRESSION. DIVISION BY SQRT(Y(N)) IS FOR STATISTICAL WEIGHTING.

360 - 368 SUM UP 3 SUCCESSIVE DEVIATIONS AND COMPARE THE SUM TO 4. IF THE SUM > 4, THIS INDICATES THAT THE PLOT IS NONRANDOM AND THE PROGRAM GOES TO STATEMENT 490 (WHICH INCREMENTS COUNTER JA, EQUAL TO THE NUMBER OF CURVEFITS, BY 1 AND RETURNS TO THE MAIN PROGRAM TO START THE NEXT CURVEFIT). IF THE SUM < 4, INDICATING THAT THE DEVIATION PLOT IS RANDOM, THE PROGRAM GOES TO STATEMENT 500 (WHICH INCREMENTS COUNTER JA, BY 1 AND CONTINUES ON TO PRINT OUT THE STATISTICS AND THE DEVIATION PLOT).

423 SUBROUTINE ERRSUM

Box 13.1 (continues)

437 THIS LINE IS MODIFIED FOR STATISTICAL WEIGHTING

449 SUBROUTINE PLOT - UNCHANGED

490 SUBROUTINE CALC

491 - 495 DEFINE VARIABLE TYPES. DIMENSION AND COMMON

STATEMENTS

496 DETERMINE THE NUMBER OF PARAMETERS. IF < 3. GO TO STATEMENT 24. IF > 3, GO TO STATEMENT 27.

497 (STATEMENT 24) IF PARAMETER V(2) IS < 10 GO TO STATEMENT 3 WHICH INCREASES THE ERROR SUM. IF THE PARAMETER > 10 GO TO STATEMENT 25.

498 & 499 (STATEMENT 3) INCREASE THE ERROR SUM.

500 - 502 (STATEMENT 25) CALCULATE THE FLUORESCENCE INTENSITY FOR 1 EXPONENTIAL DECAY AND RETURN TO CFT4A SUBROUTINE.

503 (STATEMENT 27) CHECK THE NUMBER OF PARAMETERS. IF THE NUMBER < 5 GO TO STATEMENT 29. IF THE NUMER > 5 GO TO STATEMENT 33.

504 - 508 (STATEMENT 29 AND SEQ.) CHECK PARAMETERS TO BE SURE

THEY AREN'T UNREALISTICALLY SMALL OR NEGATIVE. IF THEY ARE. THE ERROR SUM IS INCREASED.

509 - 511 CALCULATE FLUORESCENCE INTENSITY FOR 2 EXPONENTIAL DECAY AND RETURN TO CFT4A SUBROUTINE.

512 CHECK TOE NUMBER OF PARAMETERS. IF THE NUMBER

< 7 GO TO STATEMENT 37. IF THE NUMBER > 7 GO TO STATEMENT 36.

513 - 518 CHECK EXPONENTIAL LIFETIME PARAMETERS. IF THEY ARE

UNREASONABLY SMALL THEIR VALUES ARE INCREASED.

519 - 522 CALCULATE FLUORESCENCE INTENSITY FOR 3 EXPONENTIAL DECAY AND RETURN TO CFT4A SUBROUTINE.

523 & 524 (STATEMENT 36) TERMINATE THE PROGRAM IF THE DATA

HAS BEEN FIT TO 3 EXPONENTIALS AND NEITHER THAT NOR THE PREVIOUS DEVIATION PLOTS HAVE BEEN RANDOM.

Note that subroutine CALC is to be inserted into the nonlinear regression subroutine used with this listing

Box 13.1 (continued)

Following this first nonlinear regression, a weighted residual or deviation plot of dev, = [y;(meas) - y;(calc)]/[SD{_y;(meas)}1/2] is constructed. From this plot, a smoothed deviation plot is constructed by using a moving three point sum. That is, smoothing defined by Ds k = dcv*. , + dev* + dev*+] for the &th data point is done throughout the entire set of devy vs. t. An example of such a plot is given in Figure 13.1. This plot denotes a poor lit, as indicated by its nonrandom pattern.

Smoothing is used in the program to minimize the influence of noise. It allows the computer to identify nonrandom deviation plots more efficiently. The program was given a definition of a peak in a smoothed deviation plot to identify nonrandom plots. Based on studies with noisy theoretical data, a peak is defined within the program as any value of Ds which exceeds 3.75 SD. This is called the threshold value for a peak. Statistical considerations based on the normal curve of error showed that the probability of finding one Ds not representing a true peak but exceeding 3.75 SD was <1%. Hence, there is about a 1% chance for classifying a plot as nonrandom when in fact it is random.

The absolute values of Ds are compared to the threshold of 3.75 SD. If the threshold is not exceeded in a smoothed deviation plot, it is considered random and the p = 1 hypothesis is accepted. However, if the threshold is exceeded by one or more Ds, the smoothed deviation plot is considered nonrandom, and the p = 1 hypothesis is rejected. This is seen to be the case for the data in Figure 13.1, where many Ds values are smaller than -3.75 SD in the region near 600 |iis.

Suppose that the preceding scenario has been followed and the single exponential (p = 1) fit has been rejected. In preparation for the regression analysis onto the double exponential (p = 2) model, the data are arbitrarily split into two parts along the t axis. Semilogarithmic linear regressions are o