In Genetic Data, Gaps That Affect Indigenous Communities

When Andres Moreno-Estrada began studying genetics back in the early 2000s, the high cost of sequencing DNA was the biggest barrier to understanding the role of genes in human health and disease. But with time, the problems shifted.

“Technology is no longer the limit,” said Moreno-Estrada, now a population geneticist at the National Laboratory of Genomics for Biodiversity in Mexico. “Sequencing or getting genetic data is cheaper than before. The problem is in the unbalanced way this genetic information is being generated worldwide.” Researchers today rely on genetic data that’s disproportionately drawn from people with European ancestry, and mounting analyses suggest that their databases fail to capture the full scope of human genetic diversity. The result is a set of clinical tools that may not work as well for people whose ancestors lived outside of Europe.

Those issues are especially acute in Latin America, where new research suggests that more robust genetic data could allow physicians to better target certain medical treatments, especially for Indigenous groups.

At stake is the practice of precision or personalized medicine, which uses individual variability, including genes, to make decisions regarding diagnosis or treatments of health conditions. A certain medication, for example, may be highly effective for people carrying one version of a gene but may not work, or could even be harmful, for people with another version. In an ideal world, physicians would simply find out which specific version of the gene each patient has, and then give them the right drug with the right dosage. In the absence of that kind of personalized data, they typically rely on other information, such as a patient’s ethnic identity, that allows them to make an informed guess about whether a particular genetic variant is likely to be present.

But when physicians don’t have detailed genetic information available for certain communities, they can’t make those kinds of informed guesses.

Consequently, communities that are underrepresented in these biobanks are left behind in terms of care, said Eduardo Tarazona-Santos, a human geneticist at the Federal University of Minas Gerais in Brazil. And labeling people as belonging to a broader group can miss subtle, important patterns in genetic variation that could help clinicians make better decisions.

A new analysis from Tarazona-Santos’ team, published in the journal Cell, highlights how certain populations thought to be homogeneous differ in genes related to drug responses. The analysis revealed that Andean and Amazonian individuals in Peru, some coming from communities that are only about a hundred miles apart, tend to differ in key genes that influence how individuals metabolize and respond to heart medications.

Tarazona-Santos, who himself has Indigenous ancestry, is worried about the dearth of data. Certain genes, his team has found, don’t look the same even in Indigenous populations that are geographically close.

The paper examined samples from 294 individuals, some from the arid Andean highlands and some from the Amazon. The researchers looked at genetic variants involved in responses to rosuvastatin and warfarin, two drugs that can be used to reduce the risk of heart attack and stroke, among other issues.

Even though they’re relatively close geographically, these two populations showed greater differences in genes affecting their response to rosuvastatin than those observed between Europeans and East Asians. For some traits, ethnic groups that researchers label and think of as homogeneous may indeed be genetically similar. But in some instances, they are not, Tarazona-Santos said.

The findings could affect decisions that doctors make when treating patients. For example, one genetic variant is associated with a better response to rosuvastatin but also more side effects, while another may be linked to a reduced response to the therapy. Based on the frequency of variants in the populations, around 16 percent of Amazonians but only 2 percent of Andeans would qualify for a lower initial dosage of rosuvastatin or even another statin altogether. Similarly, the team calculated that 93 percent of Amazonian individuals and 69 percent of individuals from Andean populations likely required lower doses of warfarin.

Using that information, a clinician can make informed guesses about which dose might be best for a patient. But without more genetic data from Indigenous groups, clinicians can’t employ precision medicine strategies and may not be treating individuals with the appropriate doses, some experts warn. “Previously, we just don’t know any genetic information about, say, people from Peru, so we can’t make those personalized decisions about dosing drugs like warfarin,” said Mashaal Sohail, a population geneticist at the National Autonomous University of Mexico, who was not involved with the recent analysis.

The study is “presenting something to get us out of a way of thinking that we’ve been stuck in for a long time,” she said. More fine-grained groupings could provide more accurate information and better care.

Still, Sohail warned, even with better data and even for those ancestries that are represented in biobanks, knowing a person’s ancestry is always an imperfect proxy for understanding their unique genetic code. “No ancestry is homogenous,” Sohail said.

“It’s really just labels we’ve assigned,” she added. “And then within that, you can have a lot of diversity as well.”

Across Latin America, efforts are underway to collect more genetic data, with an eye toward more effective precision medicine. As of August, according to a study posted on the preprint server medRxiv that has not been peer-reviewed, the Peruvian Genome Project has collected samples from 30 communities, while the Mexican Biobank, which Moreno-Estrada helps lead, has collected 40,000 DNA samples across all 32 states in Mexico. Because of the cost, Moreno-Estrada’s team has only sequenced about 6,000 of these, but they have observed similar patterns to the ones identified by Tarazona-Santos: The Indigenous populations in Mexico seemed to be highly diverse. For example, individuals with Mayan ancestry are genetically closer to other Indigenous groups in the Gulf of Mexico and Central Mexico, and more distinct from Indigenous populations in northern Mexico and the central highlands, said Sohail. The differences between some people with ancestry in the Yucatán Peninsula and some people in Sonora are larger than those typically seen between someone with Japanese ancestry and someone with ancestry in Finland, said Moreno-Estrada, mirroring the findings from Tarazona-Santos.

Capturing that diversity in global biobanks, experts say, can improve genetic tools for everyone. More diverse biobanks can outperform the classical biobanks for some traits, a 2023 study by Sohail, Moreno-Estrada, and others showed, doing a better job of linking variation in traits such as cholesterol levels to a person’s unique configuration of genes.

Experts underscore the need to continue finding, collecting, and storing the unique genetic signals that characterize populations. “We need to find these examples because they have a clinical significance,” said Tarazona-Santos.

Outside of Latin America, researchers are also trying to create more diverse biobanks that can be used for clinical research. Segun Fatumo, a professor of genomic diversity at Queen Mary University of London, points out that the proportion of individuals with African ancestry actually decreased across genomic studies in the last few years. Efforts like the Nigerian 100K Genome Project, which Fatumo co-leads, are working to expand African representation in global genetic databases.

However, because Indigenous and other underrepresented populations have long been excluded and harmed by science, there is often mistrust that can hamper data collection for biobanks, experts say. One response is to ensure that local scientists drive the research. Additionally, there need to be clear protocols for data sharing, ownership, and benefit-sharing with communities, said Sohail. For example, researchers behind another database of Mexican individuals stated that, for the first two years, they would make its data freely accessible only to researchers in Mexico.

Moreno-Estrada and others are part of an initiative called the LatinGenomes network, which aims to help establish a network of cohorts and biobanks across Latin America.

“I think the LatinGenomes network may be a way to raise our voice louder,” Moreno-Estrada said.

Claudia López Lloreda is a senior contributor at Undark and a freelance science journalist covering life sciences, health care, and medicine.