Researchers create software that improves the effectiveness of computational genetic analysis models for identifying missing people.
After the high-profile case of the attempted rape and death of teenager Ángeles Rawson in Argentina in 2013, the National Registry of Genetic Data (RNDG) was created in the country. The registry began storing information on those convicted of sexual crimes so that it could be compared with the evidence (DNA samples) collected at scenes of crimes against sexual integrity. Almost in parallel, an interdisciplinary team of Argentine scientists developed GENis, locally developed software for storing, sharing and comparing genetic profiles. It is already installed in 20 Argentine provinces, and last year it began to be used by the Forensic Medical Corps, which reports to the Supreme Court of the Nation. It is also used by the Attorney General's Office of Mexico City.
In its 10 years of existence, GENis has evolved, and each significant improvement has been shared with the global scientific community in academic articles. The latest advance has now been published in the academic journal Forensic Science International: Genetics. In it, the Argentine scientists propose a novel approach based on information theory to make algorithms more efficient and improve the results of computational models when identifying missing people.
In these cases, a genetic kinship test is usually required to determine the relationship between an unidentified individual and the relatives of the person being sought. When not enough genetic evidence has been collected, the low statistical power of those tests can lead to unreliable results, especially when only a few distant relatives are available for genotyping.
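To illustrate why weak evidence leaves a kinship test inconclusive, the following minimal Python sketch combines per-marker likelihood ratios with a prior probability using Bayes' rule. It is not the method used in GENis or in the published study: the function name, the likelihood ratio values and the prior are hypothetical, and the markers are assumed to be independent.

```python
import math


def posterior_probability(locus_lrs, prior):
    """Combine per-locus likelihood ratios with a prior probability.

    Assumes the loci are independent, so the combined likelihood ratio
    is the product of the per-locus values. Returns the combined
    likelihood ratio and the posterior probability of kinship.
    """
    combined_lr = math.prod(locus_lrs)
    prior_odds = prior / (1.0 - prior)
    posterior_odds = combined_lr * prior_odds
    return combined_lr, posterior_odds / (1.0 + posterior_odds)


# Hypothetical values, chosen only for illustration.
close_relatives = [4.0, 6.5, 3.2, 8.0, 5.1, 7.3]    # e.g. parents available
distant_relatives = [1.2, 0.9, 1.5, 1.1, 1.3, 0.8]  # e.g. only cousins available
prior = 1 / 1000  # assumed prior probability that the candidate is the missing person

for label, lrs in [("close", close_relatives), ("distant", distant_relatives)]:
    lr, post = posterior_probability(lrs, prior)
    print(f"{label:8s} combined LR = {lr:10.2f}  posterior = {post:.4f}")
```

With likelihood ratios close to 1, as is typical when only distant relatives can be genotyped, the posterior barely moves away from the prior, which is the situation the paragraph above describes.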
“What we did was add information theory to quantify and evaluate the forensic genetic evidence that we have. This allows us to characterize the genetic family tree (pedigree) of the missing person, and to know in advance how much information the available relatives can provide before launching a search: if we realize that there is not enough data to obtain a decisive result, the effort to find new contributors can be increased,” physicist Ariel Chernomoretz explained to the CyTA-Leloir Agency. Chernomoretz is a professor at the Faculty of Exact and Natural Sciences of the University of Buenos Aires (UBA), a researcher at the National Scientific and Technical Research Council (CONICET) and head of the Integrative Systems Biology Laboratory of the Instituto Leloir Foundation (FIL), in Argentina.
Since carrying out these studies is very expensive, in this way it is possible to prioritize who to analyze and who not to analyze, he added.
Information theory (IT) is a broad field proposed in the mid-20th century that establishes fundamental principles of communication systems and is also applied in physics and biology; it is a branch of mathematical probability theory. The authors of the new study chose that theoretical framework because it is “a powerful and elegant tool for quantifying concepts such as uncertainty and information” and used that approach to compare parentage probability distributions before and after incorporating genetic evidence.
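The idea of comparing probability distributions before and after incorporating evidence can be sketched with two standard information-theoretic quantities: Shannon entropy (uncertainty) and Kullback-Leibler divergence (information gained). The Python example below is a generic illustration, not the algorithm from the study. The hypotheses, the prior and the likelihood values are hypothetical.

```python
import math


def entropy(dist):
    """Shannon entropy in bits of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def kl_divergence(posterior, prior):
    """Kullback-Leibler divergence D(posterior || prior) in bits.

    Requires prior > 0 wherever posterior > 0, which always holds
    when the posterior comes from a Bayesian update of that prior.
    """
    return sum(q * math.log2(q / p) for q, p in zip(posterior, prior) if q > 0)


def bayes_update(prior, likelihoods):
    """Posterior over hypotheses given the likelihood of the evidence under each."""
    joint = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical example: four candidate hypotheses about how an
# unidentified person relates to the reference family.
hypotheses = ["child", "grandchild", "nephew/niece", "unrelated"]
prior = [0.25, 0.25, 0.25, 0.25]
# Assumed likelihood of the observed genetic evidence under each hypothesis.
likelihoods = [0.08, 0.02, 0.01, 0.001]

posterior = bayes_update(prior, likelihoods)
print("uncertainty before evidence (bits):", entropy(prior))
print("uncertainty after evidence (bits): ", entropy(posterior))
print("information gained (bits):         ", kl_divergence(posterior, prior))
```

In this framing, a pedigree whose available relatives yield little expected information gain is a candidate for recruiting additional contributors before a search is launched.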
“These methodological improvements result in a faster and more optimized calculation process, which allows us to handle large volumes of data and perform analyses more efficiently,” they say. The scientists added that the methodology was specifically designed to be implemented in GENis, the first software made in Latin America to compare DNA profiles from biological samples in order to identify criminals, missing people or victims of disasters. The technology can also be easily incorporated into systems that use other algorithms to perform forensic analysis.
Developed by the Sadosky Foundation, together with the Argentine Society of Forensic Genetics, CONICET, the Leloir Institute Foundation and the Argentine Association of Bioinformatics and Computational Biology, among other institutions, GENis is free and open source, and allows comparisons of forensic genetic profiles stored in databases of different jurisdictions.
“In Argentina, each province has its own legal system, its laws, and so to implement a national registry you have to make many different aspects compatible, which makes it very complex. But a tool like GENis, which standardizes things, which establishes parameters for uploading and processing information, serves as the backbone for that,” Chernomoretz described.
GENis is a digital public good, accessible to the entire Ibero-American community of forensic geneticists to use and improve together. It is available in a GitHub repository and has been translated into English, and queries have even been received from countries in Asia and Africa. To support its implementation in judicial organizations, the Sadosky Foundation provides users with technical training and support to establish the initial parameters.
Ariel Chernomoretz, CONICET researcher at the Leloir Institute. (Photo: CyTA-Leloir Agency / CONICET).
Tracing genetic fingerprints
Within an individual's DNA there are small fragments of non-coding DNA (also known as “junk” DNA) that are repeated many times. The number of repeats in these regions, which are called microsatellites, varies from person to person and is used like a genetic fingerprint. Among other things, it makes it possible to establish identities and perform kinship analysis.
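As a rough illustration of how microsatellite repeat counts can act as a fingerprint, the Python sketch below compares two profiles marker by marker. It is a deliberate simplification and not how GENis works: forensic software relies on likelihood ratios, population allele frequencies and mutation models. The profile layout, the function names and the repeat counts are hypothetical; the marker names are commonly used forensic loci.

```python
from collections import Counter


def shared_alleles(genotype_a, genotype_b):
    """Number of alleles two genotypes share at one locus (0, 1 or 2)."""
    common = Counter(genotype_a) & Counter(genotype_b)
    return sum(common.values())


def compare_profiles(profile_a, profile_b):
    """Shared-allele count for every locus typed in both profiles."""
    loci = sorted(set(profile_a) & set(profile_b))
    return {locus: shared_alleles(profile_a[locus], profile_b[locus]) for locus in loci}


def same_profile(profile_a, profile_b):
    """True if both alleles match at every locus the profiles have in common."""
    shared = compare_profiles(profile_a, profile_b)
    return bool(shared) and all(n == 2 for n in shared.values())


def compatible_parent_child(profile_a, profile_b):
    """True if at least one allele is shared at every common locus.

    Simplified rule that ignores mutations and genotyping errors.
    """
    shared = compare_profiles(profile_a, profile_b)
    return bool(shared) and all(n >= 1 for n in shared.values())


# Hypothetical profiles: locus name -> pair of repeat counts (one per chromosome).
parent = {"D8S1179": (12, 14), "TH01": (6, 9.3), "vWA": (16, 18), "FGA": (21, 24)}
child = {"D8S1179": (14, 15), "TH01": (9.3, 7), "vWA": (16, 16), "FGA": (24, 22)}
stranger = {"D8S1179": (10, 11), "TH01": (8, 8), "vWA": (14, 19), "FGA": (20, 23)}

print(compare_profiles(parent, child))
print(compatible_parent_child(parent, child))
print(compatible_parent_child(parent, stranger))
print(same_profile(parent, parent))
```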
“The basis of GENis is that it uses the information from these microsatellites to establish a genetic identity and compare it with the information we have stored. This way we can establish the most probable profile of a missing person being sought and their possible links to others,” Chernomoretz highlighted.
Beyond the RNDG, which contains information on those convicted of sexual crimes, since 1987 Argentina has had a National Genetic Data Bank (BNDG), which was a pioneer in the world and was established to contribute to the identification of children born in captivity during the last military dictatorship. In 2017, at the request of the BNDG, the GENis development team began building a special module for searching for missing persons.
The cases that this institution usually handles have very incomplete pedigrees, which makes the statistical calculations needed to establish kinship considerably more complex. After the work of various scientific organizations and universities, the MPI module was incorporated into the local software, which allows the BNDG to identify people seeking their biological identity much more quickly. (Source: CyTA-Leloir Agency)