Bioinformatics

The comparison of genotypes is an intrinsic component of all forensic genetic analysis. Comparative analysis can link a crime scene to a suspect or a victim to a relative, and can reveal cases that share perpetrators. Most computations performed in data banks involve searches for perfect or partial genetic matches between a query and an entry in one of the data bank indices. System innovations currently focus on 'familial searching' algorithms (Bieber and Lazer, 2004; Bieber, 2006; Bieber et al., 2006). Nearly half of jailed inmates in the USA are reported to have at least one close relative who has been incarcerated (Bieber et al., 2006). Using likelihood ratio computations to detect potential parent-child or sibling relationships between an unknown perpetrator's genotype recovered from crime scene evidence and that of a catalogued offender therefore offers new investigative possibilities and, if implemented, could lead to a substantial increase in cold hit rates.
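As a rough illustration of the likelihood ratio computations mentioned above, the following Python sketch scores a pair of STR genotypes under a claimed relationship against the hypothesis that the two individuals are unrelated. It is a textbook-style calculation, not the method used by any particular data bank: it assumes Hardy-Weinberg proportions, independent loci, no mutation and no subpopulation correction, and every function name and allele frequency in it is hypothetical.

```python
from typing import Dict, Tuple

Genotype = Tuple[str, str]  # unordered pair of allele labels at one locus

# Probabilities of sharing 0, 1 or 2 alleles identical by descent (IBD).
PARENT_CHILD = (0.0, 1.0, 0.0)
FULL_SIBLINGS = (0.25, 0.5, 0.25)


def genotype_prob(g: Genotype, freq: Dict[str, float]) -> float:
    """Hardy-Weinberg probability of an unordered genotype."""
    a, b = g
    return freq[a] ** 2 if a == b else 2.0 * freq[a] * freq[b]


def prob_given_one_ibd(g2: Genotype, g1: Genotype, freq: Dict[str, float]) -> float:
    """P(g2 | g1) when exactly one allele of g2 is IBD with an allele of g1."""
    c, d = g2
    total = 0.0
    for shared in g1:  # either allele of g1 is the shared one, each with prob. 1/2
        if c == d:
            p = freq[c] if shared == c else 0.0
        else:
            p = (freq[d] if shared == c else 0.0) + (freq[c] if shared == d else 0.0)
        total += 0.5 * p
    return total


def locus_lr(g1: Genotype, g2: Genotype, freq: Dict[str, float],
             k: Tuple[float, float, float]) -> float:
    """Single-locus likelihood ratio: claimed relationship vs. unrelated."""
    k0, k1, k2 = k
    p2 = genotype_prob(g2, freq)
    identical = 1.0 if sorted(g1) == sorted(g2) else 0.0
    return k0 + k1 * prob_given_one_ibd(g2, g1, freq) / p2 + k2 * identical / p2


def profile_lr(p1: Dict[str, Genotype], p2: Dict[str, Genotype],
               freqs: Dict[str, Dict[str, float]],
               k: Tuple[float, float, float]) -> float:
    """Product of single-locus ratios over loci typed in both profiles."""
    lr = 1.0
    for locus in p1.keys() & p2.keys():
        lr *= locus_lr(p1[locus], p2[locus], freqs[locus], k)
    return lr
```

A familial search built on this idea would rank catalogued offenders by `profile_lr` against the crime scene profile and report those above a chosen threshold; how that threshold is set is a policy question outside this sketch.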

Over the last decade, the forensic community has been confronted with several challenges, unique in scope of work and technical complexity, in the wake of large-scale mass fatality incidents (MFIs). From airliner mishaps to the World Trade Center attacks to tsunamis, large increases in the number of victims and extreme fragmentation and/or decay of recovered remains have led to a considerable paradigm shift in the way that identification initiatives are conducted and in the laboratory infrastructure needed to handle such events. DNA typing has taken a major role in large-scale events involving high body fragmentation. The complexity of these initiatives is such that bioinformatics has become a crucial component for identification mandates to be met (Brenner and Weir, 2003; Cash et al., 2003; Leclair, 2004; Leclair et al., 2004a; Biesecker et al., 2005; Budowle, 2005; Leclair et al., 2007).

All MFIs are unique in the circumstances of the incident, and the required identification solutions will vary between incidents. There are nonetheless common limitations to these events that create unique demands on LIS applications to support DNA-based identifications. The first limitation is the unavailability of a reference genotype for a proportion of the victims. Such a genotype is usually generated from trace biological material recovered from a personal effect (e.g. personal hygiene items), and probative personal effects are often destroyed in the event itself (e.g. air crashes). This limitation makes it necessary to supplement direct matching algorithms with computationally intensive large-scale kinship analysis and parentage trio searching routines. The second limitation is linked to the obligation of performing parentage trio searching routines, and results from the inaccuracy of some reported biological relationships as a consequence of incorrect information capture. Traditional triangulation methods, by which the third member of a parentage trio can be quickly located within a genotype data set with the help of a list of obligate alleles derived from the two surviving members of the trio, rely on the accuracy of next-of-kin self-reported biological relationships. An alternative procedure is required to detect parentage trios solely on a genetic basis, immune to errors in the information accompanying the samples. This is especially important for missing persons databasing (MPD) initiatives as, unlike MFIs, MPDs are open-ended, long-term initiatives and, as such, problematic data may go undetected indefinitely. The alternative procedure is also crucial for events where families are among the victims (e.g. most airliner mishaps) as pedigrees must be re-assembled, solely on a genetic basis, from within the victims' genotype data set (Leclair et al., 2004a).
This alternative procedure calls for the evaluation of every possible parentage trio involving each victim and every pair of related and unrelated next-of-kin, an evaluation that may amount to a very substantial and escalating computing workload.
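The two approaches described above can be sketched in Python as follows. `obligate_alleles` illustrates the triangulation idea (which alleles a missing parent must carry, given a child and one known parent), while `find_trios` illustrates the exhaustive, purely genetic search over every victim and every pair of next-of-kin. This is a minimal sketch under stated assumptions, not the published procedure: it applies strict Mendelian consistency with no allowance for mutation or typing error, it treats each victim only as the child of the trio, and it skips loci that are missing from any of the three profiles. All names and the `min_loci` parameter are hypothetical.

```python
from itertools import combinations
from typing import Dict, Iterator, Set, Tuple

Genotype = Tuple[str, str]
Profile = Dict[str, Genotype]  # locus name -> genotype; untyped loci are absent


def obligate_alleles(child: Genotype, known_parent: Genotype) -> Set[str]:
    """Alleles of which the missing parent must carry at least one."""
    c1, c2 = child
    obligate = set()
    if c1 in known_parent:  # known parent may have given c1, so other gave c2
        obligate.add(c2)
    if c2 in known_parent:  # known parent may have given c2, so other gave c1
        obligate.add(c1)
    return obligate  # empty set: the known parent is itself excluded


def locus_consistent(child: Genotype, p1: Genotype, p2: Genotype) -> bool:
    """True if the child can inherit one allele from each candidate parent."""
    return any(allele in p2 for allele in obligate_alleles(child, p1))


def trio_consistent(child: Profile, p1: Profile, p2: Profile,
                    min_loci: int = 1) -> bool:
    """Check every locus typed in all three profiles (partial profiles allowed)."""
    shared = child.keys() & p1.keys() & p2.keys()
    if len(shared) < min_loci:
        return False  # too little data to support an inference
    return all(locus_consistent(child[l], p1[l], p2[l]) for l in shared)


def find_trios(victims: Dict[str, Profile], relatives: Dict[str, Profile],
               min_loci: int = 1) -> Iterator[Tuple[str, str, str]]:
    """Evaluate each victim against every pair of next-of-kin profiles."""
    for vid, victim in victims.items():
        for (id1, r1), (id2, r2) in combinations(relatives.items(), 2):
            if trio_consistent(victim, r1, r2, min_loci):
                yield vid, id1, id2
```

Because `find_trios` ignores the reported relationships entirely, it cannot be misled by errors in them; the price is the number of combinations it has to visit. A fuller version would also test the victim in the parental role and score surviving trios with a likelihood ratio rather than a pass/fail check.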

Many additional event-related limitations add more complexity still. More often than not in MFIs, remains recovery is partial; a proportion of the recovered remains has undergone significant thermal, chemical or bacterial decay, leading to the production of partial genotypes during analysis; many potential contributors to parentage trios involving older victims may be pre-deceased; and many victims may have few next-of-kin who can be used as genotypic references. Computing solutions are expected to provide ways to mitigate the lack of complete STR profiles from the recovered remains and the absence of important reference genotypes.

Finally, the scale of the incident has a significant impact on computing parameters. As pair-wise comparisons are the mainstay of computing efforts in MFI victim identification, the computing load grows much faster than the number of victims: roughly quadratically for pair-wise comparisons, and faster still when every candidate parentage trio must be evaluated. These complicating factors add considerable complexity to the design of bioinformatics tools required to produce the necessary identification inferences.
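To make the scaling concrete, the short Python sketch below counts the comparisons implied by a given number of victim and reference profiles. It is simple combinatorics, not a model of any real system, and the assumption in the usage loop of two reference profiles per victim is purely illustrative.

```python
from math import comb


def comparison_counts(n_victims: int, n_references: int) -> dict:
    """Number of genotype comparisons implied by exhaustive searching."""
    return {
        # every victim profile against every reference profile
        "victim_vs_reference_pairs": n_victims * n_references,
        # every victim profile against every other (re-association, families)
        "victim_vs_victim_pairs": comb(n_victims, 2),
        # every victim against every pair of next-of-kin (candidate trios)
        "candidate_trios": n_victims * comb(n_references, 2),
    }


if __name__ == "__main__":
    for n in (2_749, 50_000, 100_000):
        # hypothetical: two reference profiles collected per victim
        print(n, comparison_counts(n, 2 * n))
```

If the number of references grows in proportion to the number of victims, the pair-wise counts grow with the square of the incident size and the trio count with its cube.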

Although the World Trade Center appears to have been the most demanding identification initiative to date, different circumstances could have made the DNA-based identifications substantially more complex. The data processing contingencies would have been very different if the collapse of the World Trade Center towers had trapped the normal weekday occupancy of 50 000 people instead of 2749, or had included families (Leclair et al., 2007). Incidents with high body fragmentation and high remains dispersal involving 100 000 or 1 000 000 casualties, whether natural, accidental or the result of terrorist activity, are within the realm of possibility. Bioinformatic tools that can handle this sample load have been developed; however, it is unclear whether the STR loci currently in use in forensics would provide sufficient discrimination power to support large-scale kinship analysis and parentage trio searching algorithms. These scenarios can be simulated and conclusions drawn as to the genetic marker set and computing capabilities that would be required in varying circumstances.
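One way to get a first feel for the discrimination question is to estimate how many unrelated victim-reference pairs would pass a parent-child screen purely by chance. The Python sketch below uses an exact calculation in place of a Monte Carlo simulation, and it is an assumption-laden illustration only: it assumes Hardy-Weinberg proportions, independent loci, complete profiles and no mutation, and the allele frequencies in the usage example are invented for the purpose, not taken from any real STR marker.

```python
from itertools import combinations
from typing import Dict, List


def prob_share_allele(freq: Dict[str, float]) -> float:
    """P(two unrelated genotypes share at least one allele at one locus)."""
    alleles = list(freq)
    p_no_share = 0.0
    for a in alleles:  # first genotype homozygous a/a
        p_no_share += freq[a] ** 2 * (1.0 - freq[a]) ** 2
    for a, b in combinations(alleles, 2):  # first genotype heterozygous a/b
        p_no_share += 2.0 * freq[a] * freq[b] * (1.0 - freq[a] - freq[b]) ** 2
    return 1.0 - p_no_share


def expected_false_parent_child(n_victims: int, n_references: int,
                                loci: List[Dict[str, float]]) -> float:
    """Expected number of unrelated pairs not excluded as parent-child."""
    p_all_loci = 1.0
    for freq in loci:
        p_all_loci *= prob_share_allele(freq)
    return n_victims * n_references * p_all_loci


if __name__ == "__main__":
    # hypothetical locus with ten equally frequent alleles
    locus = {f"allele{i}": 0.1 for i in range(10)}
    for n_loci in (13, 20, 30):
        print(n_loci, expected_false_parent_child(100_000, 200_000, [locus] * n_loci))
```

Replacing the invented frequencies with published population data, and the pass/fail screen with likelihood ratios, would turn this into the kind of scenario analysis the paragraph describes.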
