Math could help solve forensic genetic cases 10 times faster

Researchers have a new strategy that could speed up cold case investigations.

Solving crimes with forensic genetic genealogy is slow and complicated. The researchers’ new mathematical analysis could decipher cases 10 times faster.

For nearly 37 years, she was known as the Buckskin Girl, an unnamed young murder victim found outside of Dayton, Ohio, wearing a deerskin poncho. Then, in April 2018, police announced that the mystery of her identity had been solved. Her name was Marcia L. King, and she had been identified by linking a snippet of her DNA to one of her cousins.

It was one of the first high-profile cases in which this investigative method had been used to identify an unidentified body. Two weeks after King’s name was revealed, California police announced that they had used similar techniques to track down the Golden State Killer. Suddenly, the combination of genetic sampling, genealogical research and old-fashioned sleuthing was hailed as a revolutionary breakthrough that would crack hundreds of cold cases.

Since then, forensic genetic genealogy has solved over 400 cases in the United States. Yet this detective work is complex and time-consuming.

While King was identified after only a few hours of detective work, most cases take much longer. On average, they take over a year to resolve. Many remain unsolved: law enforcement agencies can run out of funding before a person is identified, and investigators can give up if they hit too many dead ends.

To develop the new mathematical search method, Lawrence Wein, a professor of operations, information, and technology at Stanford University’s Graduate School of Business, and Mine Su Ertürk, a doctoral student, teamed up with the DNA Doe Project, a California nonprofit that has solved more than 65 cases of unidentified remains, including the King case.

The DNA Doe Project provided the researchers with data on 17 cases, eight of which were unsolved at the time. “That’s pretty similar to the historical average of cases they’ve solved,” Wein says. “There is therefore no reason to suspect that these cases are much more difficult or much easier than randomly selected cases.”

Using this real-world data, Wein and Ertürk examined how forensic genealogy searches are commonly done, then tested their own method, which aims to maximize the likelihood of finding a solution as quickly as possible.

“It turns out to be much faster,” Wein says of the new approach, which is nearly 10 times faster. “If they only solve a small number of cases using the current method, and we can get them to solve them 10 times faster, then they could solve many more cases.”

Family Tree Forensics

A typical genetic genealogy investigation begins with a DNA sample from a “target” such as an unidentified body or a murder suspect. It is uploaded to a DNA database such as GEDmatch or FamilyTreeDNA, which generates a list of “matches” – people who share pieces of the target’s genome.

A search can reveal hundreds of such matches, usually distant cousins whose common ancestors may have died more than a century ago. The cases analyzed by Wein and Ertürk had between 200 and 5,000 matches.

That’s just the beginning: drawing a line between those distant relatives and the target requires building a family tree that includes as many family members as possible. Here too, the magnitude of the problem is formidable.

“These are huge trees,” Wein says. “It’s really hard to visually present something larger than twenty people.” As the tree grows, the chances of identifying the target improve, but the search time also increases.

Next, the relevant people in the tree must be identified. It requires scouring public records, genealogy sites, and social media—time-consuming work that combines intuition and skill. “It’s quite an art,” says Wein.

“Using marriage records, death records, birth records, Facebook, and all kinds of different records to try to figure out who people are and who their ancestors and offspring are.”

It is not immediately obvious which matches will provide the best path to the target. Investigators’ strategies for following up on these leads tend to be decentralized, Wein says. “You have a team of people doing this and they will each decide to take a match to investigate and then they will go off on their own to try and build a family tree over time from each match. They don’t think about the big picture holistically.”

By stepping back and assessing the whole problem, Wein and Ertürk provide a roadmap for genetic genealogists looking for the most efficient path to an unidentified target.

“Basically, we’re telling them, ‘Given where you are in the search right now, this is what you should do next,’” Wein says.

Untangling the probabilities

Explaining the difference between the new search method and the standard, or “reference,” method is complicated, but Wein sums it up this way: “The reference method searches for common ancestors between different matches. What you really want to find is the most recent common ancestor between a match and the unknown target, and that’s a slightly different issue.”

The most recent common ancestor of first cousins, for example, is a grandparent; second cousins share a great-grandparent, and so on.
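The cousin rule can be written down compactly: cousins of degree k share a most recent common ancestor k + 1 generations above them. The short Python sketch below only illustrates that rule. The function names are hypothetical and do not come from the researchers' work, and it assumes full cousins with no "removed" offset.

```python
def mrca_generations_back(cousin_degree: int) -> int:
    """Generations from each cousin up to their most recent common ancestor.

    Assumes cousins of the same generation (no 'once removed' offset):
    first cousins (degree 1) share a grandparent, two generations up.
    """
    if cousin_degree < 1:
        raise ValueError("cousin_degree must be at least 1")
    return cousin_degree + 1


def ancestor_label(generations_back: int) -> str:
    """Name the ancestor a given number of generations up (1 = parent)."""
    if generations_back < 1:
        raise ValueError("generations_back must be at least 1")
    if generations_back == 1:
        return "parent"
    # Each generation beyond the grandparent adds one 'great-' prefix.
    return "great-" * (generations_back - 2) + "grandparent"


for degree in (1, 2, 3):
    print(degree, ancestor_label(mrca_generations_back(degree)))
```

Because database matches are often distant cousins, the shared ancestor can sit many generations back, which is part of why the trees grow so large.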

After identifying a list of possible most recent common ancestors, Wein and Ertürk’s method “aggressively” populates the family tree with their descendants, even if there is only a slim chance that an ancestor of the target is on the list.

This leap is accomplished by using probability theory to track the progress of the search. “We do this by describing the reconstructed family tree as a set of probabilities that represent the probability that each person on our tree is a correct ancestor of the target,” Ertürk explains. “Then by looking at those probabilities, you can tell which parts of the tree you should explore more.”
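To make the idea of probability-guided exploration concrete, here is a deliberately simplified Python sketch. It is not the researchers' algorithm, whose details the article does not give. It only shows a greedy rule in which each candidate ancestor carries an assumed probability of being a true ancestor of the target, and the next branch to investigate is the unexplored one with the highest probability. All names and numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str                # hypothetical label for a person or couple in the tree
    p_ancestor: float        # assumed probability of being an ancestor of the target
    explored: bool = False   # whether this branch has already been investigated


def next_to_explore(candidates):
    """Return the unexplored candidate with the highest probability, or None."""
    unexplored = [c for c in candidates if not c.explored]
    if not unexplored:
        return None
    return max(unexplored, key=lambda c: c.p_ancestor)


# Illustrative values only; these are not data from any real case.
tree = [
    Candidate("couple A", 0.05),
    Candidate("couple B", 0.40),
    Candidate("couple C", 0.15),
]

first = next_to_explore(tree)
first.explored = True
second = next_to_explore(tree)
print(first.name, second.name)
```

In a real search the probabilities would be updated as new records and relatives are found, so the ranking would change after each step rather than stay fixed as it does here.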

This approach proves effective even with smaller family trees, which means faster resolution times. After performing hundreds of simulated searches, Wein and Ertürk conclude that their method can solve a case with a family tree of 7,500 people about 94% of the time. The success rate of the standard method in these cases is about 4%.

Wein hopes these findings will help the DNA Doe Project and other investigators refine their approach and solve more cases. He notes that his analysis ignores some of the “tricks” used by genetic genealogists to narrow their searches, such as focusing on family members who lived in a particular location.

“Our algorithm is in no way intended to replace genealogists,” he says. “But if they’re really stuck, it will give them ideas that may not be obvious.”

Wein sees forensic genetic genealogy as another crime-solving tool that can be improved so that it can deliver on its promise.

“It’s an interesting field that combines probability and statistics and optimization and sometimes game theory,” he says. “That’s how, from a mathematical point of view, I was drawn to these problems.”

Source: Dave Gilson for Stanford University

This article originally appeared in Futurity. It has been republished under the Attribution 4.0 International license.
