## Info

Step 2

Identify the proportion of the population that carries each allele being tested. To do this, a population database is created or a pre-existing one is identified. The product rule method assumes that the population is in Hardy-Weinberg equilibrium: mating is random, so alleles are drawn independently from a common gene pool and their associations are statistically independent. This assumption rests on the Hardy-Weinberg Principle, an elementary formula of population genetics. A chi-squared analysis can be used to test whether the population is in Hardy-Weinberg equilibrium. Such a test cannot prove independence, but it can detect dependence if it exists.
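As an illustration of the chi-squared check mentioned above, the following minimal sketch tests one biallelic locus against Hardy-Weinberg expectations. The function name `hwe_chi_squared` and the genotype counts are hypothetical and do not come from any reference database; the sketch assumes both alleles are observed in the sample and uses one degree of freedom (three genotype classes, minus one, minus one estimated allele frequency).

```python
import math

def hwe_chi_squared(n_AA, n_Aa, n_aa):
    """Chi-squared goodness-of-fit test of observed genotype counts against
    Hardy-Weinberg expectations at a biallelic locus (1 degree of freedom).
    Assumes both alleles occur in the sample, so no expected count is zero."""
    n = n_AA + n_Aa + n_aa
    # Frequency of allele A, estimated from the sample itself
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, n * 2 * p * q, n * q * q)
    observed = (n_AA, n_Aa, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-squared variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical genotype counts, for illustration only
chi2, p_value = hwe_chi_squared(30, 50, 20)
print(f"chi-squared = {chi2:.4f}, p-value = {p_value:.4f}")
```

A small p-value signals a departure from equilibrium; a large one only means that no dependence was detected, not that independence has been proven.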

Thus, as you see in Table 12.4, the three possible genotype frequencies in the offspring are:

Table 12.4 Punnett square for Hardy-Weinberg equilibrium for alleles 'A' & 'a' at a given locus

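| | Female gamete A ($p$) | Female gamete a ($q$) |
|---|---|---|
| Male gamete A ($p$) | AA ($p^2$) | Aa ($pq$) |
| Male gamete a ($q$) | Aa ($pq$) | aa ($q^2$) |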

Therefore, the equation for genotype frequencies is: $p^2 + 2pq + q^2 = 1$
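The equation can be read as the expansion of a binomial square. Assuming that $p$ and $q$ denote the frequencies of alleles 'A' and 'a' (labels adopted here for illustration) and that these are the only two alleles at the locus, so that $p + q = 1$:

$$
(p + q)^2 = \underbrace{p^2}_{\text{AA}} + \underbrace{2pq}_{\text{Aa}} + \underbrace{q^2}_{\text{aa}} = 1
$$

The heterozygote term carries the factor 2 because the genotype Aa can arise in two ways: 'A' from one parent and 'a' from the other, or the reverse.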

Several DNA databases are currently in use, the two largest being the UK National DNA Database (>3.4 million profiles) and the US FBI CODIS (Combined DNA Index System) database (>3.5 million profiles). The CODIS system tests for 13 STRs and the amelogenin sex test. Recently, a 16-locus multiplex system has been introduced as a possible upgrade with more loci (Greenspoon et al., 2004). The CODIS system includes at least four population substructure reference databases.

Step 3

Calculate the frequency for each locus. For a homozygous genotype: $P = p^2$

For the data in Table 12.3, the frequency of the genotype (15, 15) at locus D3S1358 is calculated as follows. From the population reference database, the frequency of allele 15 is 17.3%, therefore:

$P = p^2 = 17.3\% \times 17.3\% = 0.173 \times 0.173 = 0.173^2 \approx 0.030 = 3.0\%$

For a heterozygous genotype: $P = 2pq$

For the data in Table 12.3, the frequency of the genotype (14, 16) at locus vWA is calculated as follows. From the population reference database, the frequency of allele 14 is estimated to be 15.7% and that of allele 16 is 22.7%, therefore:

$P = 2pq = 2(15.7\%)(22.7\%) = 2(0.157)(0.227) \approx 0.071 = 7.1\%$

The two Step 3 frequency estimates are shown in Table 12.5; the same process is repeated for each locus.
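The Step 3 arithmetic can be sketched in a few lines of code. The function name `genotype_frequency` and the dictionary layout are illustrative assumptions; the allele frequencies are the ones quoted in the text for D3S1358 and vWA.

```python
def genotype_frequency(allele_1, allele_2, allele_freqs):
    """Expected genotype frequency at one locus under Hardy-Weinberg equilibrium."""
    p = allele_freqs[allele_1]
    if allele_1 == allele_2:
        return p * p      # homozygous genotype: p^2
    q = allele_freqs[allele_2]
    return 2 * p * q      # heterozygous genotype: 2pq

# Allele frequencies quoted in the text
d3s1358 = {15: 0.173}
vwa = {14: 0.157, 16: 0.227}

print(round(genotype_frequency(15, 15, d3s1358), 3))  # 0.03
print(round(genotype_frequency(14, 16, vwa), 3))      # 0.071
```

Passing the allele frequencies as a lookup table keeps the same function usable for every locus in a profile.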

Calculating the frequency for every locus gives a profile such as the one in Table 12.6.
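Once every locus has a frequency, the product rule combines them by multiplication, which is valid only if the loci are statistically independent of one another. The sketch below is a minimal illustration, not the full profile of Table 12.6: it uses only the two per-locus estimates worked out in the text, and the function name `profile_frequency` is hypothetical.

```python
import math

def profile_frequency(locus_frequencies):
    """Combine per-locus genotype frequencies with the product rule.
    Assumes the loci are statistically independent of one another."""
    return math.prod(locus_frequencies)

# Only the two per-locus estimates derived in the text; a full profile
# would contain one entry for every locus tested.
two_loci = {
    "D3S1358": 0.173 ** 2,       # homozygous (15, 15): p^2
    "vWA": 2 * 0.157 * 0.227,    # heterozygous (14, 16): 2pq
}

combined = profile_frequency(two_loci.values())
print(f"combined frequency for two loci: {combined:.5f}")
print(f"roughly 1 in {1 / combined:,.0f}")
```

Each additional locus multiplies the combined frequency by another value below 1, which is why a profile with many loci becomes so rare in the population.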

Table 12.5 Profile with population frequencies estimated for two loci

Locus