Analysis of STR data

The review of STR data can be relatively straightforward for pristine, single-source specimens such as those collected from convicted offenders. Still, regardless of sample type, a number of samples will present anomalous features that may interfere with the final allele call and genotype assignment. Most anomalies (e.g. pull-up, saturated and split peaks, elevated baseline, heterozygote ratio imbalance, elevated stutter, profile slope) present electrophoretic data signatures that can be recognized by suitably designed algorithms and quality metrics. In fact, automated quantitative assessment of quality metrics is likely to enforce quality control thresholds more efficiently than human review. However, not all qualitative problem scenarios can be anticipated and coded into quality control algorithms intended for automated data review systems. To ensure that all anomalous results are scrutinized, the experience of trained analysts must be brought to bear on the final interpretation of the data. Historically, it has been customary for all data to be subjected to dual review by separate analysts, although the practice can become a throughput-limiting factor in certain environments.

Many large data banks have developed automated data review packages, or 'expert systems', integrated in their LIS, and several software packages are commercially available as well. Quantitative measurements used to affirm quality control thresholds form the basis of these review applications. Detection of the presence of mixtures, genuine or due to cross-contamination within a processing batch, is an additional feature available in current applications. Electrophoretic data signatures of frequently encountered anomalies, commonly referred to as 'rules', are used to filter data sets and flag electropherograms that require a second, human review. The available published data (Kadash et al., 2004; http://www.cstl.nist.gov/biotech/strbase/pub_pres/NIST_FSSi3_Mar2006.pdf) on the performance of these systems on convicted offender type samples indicate that, on average, 30% of allele calls are flagged by at least one rule. This significant number of flagged allele calls may reflect the removal of the subjectivity of human review from the process, resulting in more consistent examination of every locus, as well as the enforcement of more stringent thresholds to ensure the capture of all problematic data. However, very high allele call concordance (>99.9% in most studies) between manual and automated review results was reported, which supports the feasibility of replacing a two-person review process with a single reviewer assisted by a computer algorithm.
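The way 'rules' filter a data set can be illustrated with a minimal Python sketch. It is not taken from any of the expert systems discussed here: the rule names, the `Peak` structure and all four threshold values are hypothetical, chosen only to show how quantitative checks flag a locus for human review. Operational thresholds are established by each laboratory's own validation.

```python
from dataclasses import dataclass

# Hypothetical thresholds, for illustration only.
MIN_HEIGHT_RFU = 150     # below this, a called peak is considered weak
MAX_HEIGHT_RFU = 7000    # above this, the detector may be saturated
MIN_HET_RATIO = 0.6      # minimum smaller/larger peak height ratio


@dataclass
class Peak:
    allele: float   # repeat number, e.g. 9.3
    height: float   # peak height in RFU


def flag_locus(peaks):
    """Return the names of the rules fired by the called peaks at one locus."""
    if not peaks:
        return ["no_call"]
    flags = []
    heights = [p.height for p in peaks]
    if min(heights) < MIN_HEIGHT_RFU:
        flags.append("low_signal")
    if max(heights) > MAX_HEIGHT_RFU:
        flags.append("saturated")
    if len(peaks) > 2:
        # More than two alleles cannot come from a single diploid source.
        flags.append("possible_mixture")
    if len(peaks) == 2 and min(heights) / max(heights) < MIN_HET_RATIO:
        flags.append("heterozygote_imbalance")
    return flags


def review_profile(profile):
    """Map each flagged locus to its fired rules; clean loci are omitted."""
    flagged = {}
    for locus, peaks in profile.items():
        flags = flag_locus(peaks)
        if flags:
            flagged[locus] = flags
    return flagged
```

Only the loci returned by `review_profile` would be routed to an analyst; loci that fire no rule would be accepted on the strength of the automated call.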

Casework data review presents additional complexities linked to the often-compromised nature of the casework specimens. Increased slopes across profiles, dropped-out alleles and mixtures are regularly encountered with these specimens. An automated data review system configured for databanking needs may indicate which casework samples need further review.

Analysis of mixtures

From a data analysis standpoint, some of the most complex crime scene evidence samples contain biological material from multiple contributors, a situation typically encountered with sexual assault evidence. The simplest and most common mixture scenario stems from contamination of the differential lysis sperm fraction with the epithelial fraction originating from the victim. These are the easiest analytical circumstances, since the known genotype of the victim may be subtracted from the mixed profile to establish a list of obligate alleles for the perpetrator. If the male contributor represents the major profile in the mixture, it may prove possible to deduce a complete male genotype, which is ideally suited for a search of both the convicted offender and crime scene indices of data banks.
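The subtraction of a known victim genotype described above amounts to a per-locus set difference. The following Python sketch assumes a simple, hypothetical representation in which each profile maps a locus name to a list of allele designations; it ignores peak heights, stutter and allele drop-out.

```python
def obligate_alleles(mixture, victim):
    """Per locus, the mixture alleles that the victim cannot account for."""
    return {
        locus: sorted(set(alleles) - set(victim.get(locus, ())))
        for locus, alleles in mixture.items()
    }


# Illustrative (invented) two-person mixture and known victim genotype.
mixture = {"D8S1179": [12, 13, 14], "TH01": [6, 9.3], "FGA": [21, 22, 23, 24]}
victim = {"D8S1179": [12, 13], "TH01": [6, 9.3], "FGA": [21, 24]}

print(obligate_alleles(mixture, victim))
# {'D8S1179': [14], 'TH01': [], 'FGA': [22, 23]}
```

An empty list, as at TH01 in this example, means that the second contributor shares all of his alleles with the victim at that locus. A single obligate allele, as at D8S1179, leaves the second allele undetermined: it may be a second copy of the same allele or one of the victim's alleles.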

The next most common mixture situation involves a sperm fraction holding two male profiles, the perpetrator's and that of the victim's consensual partner, reflecting consensual intercourse in the hours or days preceding the assault. In the much rarer instance of collective sexual assaults, multiple male contributors may be recovered from the sperm fraction, generating a much more complex STR profile that may prove impossible to deconvolute with technologies available at this time. Under these last two scenarios, no procedure akin to differential lysis can alter the major:minor contributor ratio to facilitate discrimination between contributors.

When the investigation has produced one or more individuals suspected of having contributed to a mixture, a first investigative step is to attempt to exclude the suspect(s) as contributor(s) to the mixture. In the absence of suspects, the perpetrators' genotypes must be dissected out from the mixed profile to allow for a search against a convicted offender data bank. The deconvolution of a mixture involves three steps: the ascertainment of the number of contributors, the estimation of the proportion of the individual contributions to the mixture, and the establishment of a list of possible contributing genotypes that could explain the mixed profile, along with probability estimates. When the ratio of components of a two-contributor mixture is more unbalanced than 1:3, peak height or peak area information may allow an analyst to visually resolve the major and minor components of the mixture. With other ratios of contributors, the ascertainment of the number of contributors can prove a challenge in itself, and an incorrect assessment may have dramatic effects on the interpretation of testing results (Paoletti et al., 2005). Many approaches to mixture deconvolution have been proposed over the years (Weir et al., 1997; Clayton et al., 1998; Evett et al., 1998a, 1998b; Gill et al., 1998; Curran et al., 1999; Perlin and Szabady, 2001; Fung and Hu, 2002; Wang et al., 2002; Cowell, 2003; Mortera et al., 2003; Bill et al., 2005; Curran et al., 2005; Cowell et al., 2006), but a consensus on mixture interpretation guidelines has yet to emerge (Gill et al., 2006; Schneider et al., 2006).
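Two of the steps above lend themselves to a short Python sketch: a lower bound on the number of contributors, and a first-pass exclusion test for a suspect. Both functions are hypothetical simplifications, not published methods. The bound relies only on the fact that each contributor carries at most two alleles per locus, and the exclusion test assumes that no allele has dropped out, an assumption that often fails with compromised casework specimens.

```python
import math


def min_contributors(mixture):
    """Lower bound on the number of contributors.

    Each person contributes at most two alleles per locus, so the locus with
    the most distinct alleles sets the minimum. Allele sharing between
    contributors can hide additional people, so this is only a lower bound.
    """
    most = max((len(set(alleles)) for alleles in mixture.values()), default=0)
    return math.ceil(most / 2)


def is_excluded(mixture, suspect):
    """True if any suspect allele is absent from the mixture.

    Assumes no allele drop-out: a missing allele is treated as an exclusion.
    """
    return any(
        not set(alleles) <= set(mixture.get(locus, ()))
        for locus, alleles in suspect.items()
    )


# Illustrative (invented) mixed profile.
mixture = {"D8S1179": [12, 13, 14], "TH01": [6, 9.3], "FGA": [21, 22, 23, 24]}

print(min_contributors(mixture))                               # 2
print(is_excluded(mixture, {"D8S1179": [13, 14], "FGA": [22, 23]}))  # False
print(is_excluded(mixture, {"D8S1179": [13, 15]}))             # True
```

Because the allele count gives only a minimum, a profile with no more than four alleles at any locus may still hold three or more contributors, which is the kind of misassessment the text warns about.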

Several systems designed to assist with mixture deconvolution are being evaluated in the community. Bill et al. (2005) proposed a computerized algorithm that estimates the proportion of the individual contributions in two-person mixtures and ranks the genotype combinations by minimizing a residual sum of squares, eliminating unreasonable genotypic combinations. Perlin and Szabady (2001) and Wang et al. (2002) have proposed linear mixture analysis and least-square deconvolution models, respectively, to estimate the mixture proportion and enumerate a complete set of possible genotypes that may explain the mixed profiles. Cowell et al. (2006) have proposed a single Bayesian network model that unifies many of the elements of the above-mentioned models. No model currently takes into account all potential technical complications such as dropped-out alleles, stutter and excessive profile slopes.
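The general idea of ranking genotype combinations by a residual sum of squares can be sketched in Python for a single locus of a two-person mixture. This is a deliberately simplified illustration and not the algorithm of Bill et al. (2005) or of any other cited author. It assumes that peak height is proportional to the amount of DNA contributed, that there is no stutter or drop-out, and that a coarse grid search over the mixture proportion is adequate. The function names and the grid are hypothetical.

```python
from itertools import combinations_with_replacement


def expected_heights(g1, g2, m, alleles, total):
    """Expected peak heights if contributor 1 supplies a fraction m of the DNA."""
    return [
        total * (m * g1.count(a) + (1 - m) * g2.count(a)) / 2
        for a in alleles
    ]


def rank_genotype_pairs(peaks):
    """Rank genotype pairs at one locus by residual sum of squares (RSS).

    peaks maps allele -> observed peak height. Returns a list of
    (rss, m, major_genotype, minor_genotype), best fit first, where m is the
    proportion of the major contributor searched on a grid from 0.50 to 0.99.
    """
    alleles = sorted(peaks)
    observed = [peaks[a] for a in alleles]
    total = sum(observed)
    genotypes = list(combinations_with_replacement(alleles, 2))
    ranked = []
    for g1 in genotypes:
        for g2 in genotypes:
            if set(g1) | set(g2) != set(alleles):
                continue  # this pair leaves an observed peak unexplained
            fits = []
            for i in range(50, 100):
                m = i / 100
                expected = expected_heights(g1, g2, m, alleles, total)
                rss = sum((o - e) ** 2 for o, e in zip(observed, expected))
                fits.append((rss, m))
            rss, m = min(fits)
            ranked.append((rss, m, g1, g2))
    ranked.sort()
    return ranked


# Illustrative (invented) four-peak locus with two tall and two short peaks.
ranking = rank_genotype_pairs({14: 1500, 15: 1450, 16: 520, 17: 480})
best_rss, best_m, major, minor = ranking[0]
print(major, minor, best_m)  # (14, 15) (16, 17) 0.75
```

In this example the best fit pairs the two tall peaks as the major genotype and the two short peaks as the minor genotype, at a major proportion of about three quarters. Combinations with a large residual would be the candidates for elimination as unreasonable, and a fuller treatment would combine the evidence across loci rather than fit each locus in isolation.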
