Genome studies have collected data primarily from those of European descent since the field was founded. Learn about the organizations trying to change that.
How precise can precision medicine be if researchers are only studying and developing treatments based on a certain part of the population?
This is the billion-dollar question healthcare professionals and researchers around the world and at events like the J.P. Morgan Healthcare Conference are asking.
In 2009, 96 percent of data in genome-wide association studies (GWAS) came from people of European descent. Overall diversity has increased since then, in part because many Asian countries, most notably China, have increased their spending and efforts on genetic studies.
However, whites of European ancestry still make up the vast majority in large genetic studies: over 80 percent.
But other large population groups including Africans, Latin Americans, and native or indigenous people have been largely left out of genomics research. This is a dangerous precedent that sets up researchers, doctors, and, ultimately, patients, to fail.
When studies miss key genetic factors that play a role in disease susceptibility and drug response among different population groups, lives are literally at stake.
Why the GWAS Isn’t Cutting It
GWAS involves rapidly scanning markers across the complete sets of DNA, or genomes, of many people to find genetic variations associated with a particular disease. With the completion of the Human Genome Project in 2003 and the International HapMap Project in 2005, researchers now have a set of research tools that make it possible to find the genetic contributions to common diseases.
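To make the idea of "scanning markers" concrete, here is a minimal sketch of the kind of test a scan repeats at every marker: comparing how often a variant appears in people with a disease (cases) and without it (controls). The marker names and allele counts are hypothetical, and the plain Pearson chi-square shown here is only one simple choice of statistic. Real studies use dedicated software, far more markers, and corrections for multiple testing.

```python
def allelic_chi_square(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square statistic for a 2x2 table of allele counts.

    Rows are cases and controls; columns are the variant (alt) allele
    and the reference allele. Assumes every row and column total is
    non-zero, otherwise an expected count of zero would divide by zero.
    """
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    total = case_alt + case_ref + control_alt + control_ref
    row_totals = [sum(row) for row in table]
    col_totals = [case_alt + control_alt, case_ref + control_ref]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected if the allele were unrelated to the disease.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2


# Hypothetical allele counts: (case_alt, case_ref, control_alt, control_ref).
markers = {
    "marker_A": (300, 700, 200, 800),  # variant more common in cases
    "marker_B": (250, 750, 250, 750),  # same frequency in both groups
}

# The "scan": run the same test at every marker and compare the scores.
scan = {name: allelic_chi_square(*counts) for name, counts in markers.items()}

for name, score in scan.items():
    print(f"{name}: chi-square = {score:.2f}")
```

A larger statistic means the variant's frequency differs more between cases and controls than chance alone would suggest. In this toy data, marker_A stands out and marker_B does not.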
Many researchers in the field believe the GWAS Catalog is the most important tool in modern genetics. However, it's nowhere near complete.
Through these studies, researchers aim to explain the missing heritability problem: the fact that single genetic variations, also known as common variants, cannot account for much of the heritability of diseases, behaviors, and other phenotypes.
As studies turn to rare variants, which tend to be group specific (for example, variants linked to diabetes in Europeans aren't found in other groups), researchers are realizing that there is not enough data.
A Weighted Focus on European Descent
There are several reasons why genomic data comes mostly from people of European descent.
The United States and Western Europe are leaders in the study of genomics. Since most people in both regions have European ancestry, they are a conveniently accessible group.
Each of these studies takes an incredible amount of infrastructure, and the more populations that are included in a study, the more variables there are to control for. To keep things simple, researchers often go where there is a strong base of existing cohorts, which means the pools they are pulling from are mostly of European descent.
Repeat sampling is also part of the problem, albeit to a small degree: established cohorts are reused across many studies, while newer cohorts with a more racially diverse makeup haven't yet been reused as often.
Before 2010, technology wasn’t quite up to par in studying the mixed ancestry of many of the world’s groups. Because genetic differences in ancestry can mask relationships between mutations and diseases, those with diverse backgrounds were more difficult to study. This is not the case anymore.
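A small worked example shows why mixed ancestry is tricky. In this sketch, assuming entirely hypothetical allele counts for two ancestry groups, a variant has no relationship to disease within either group, yet pooling the groups makes it look strongly associated. The same mechanism can also hide a real association. The statistical methods used today are more sophisticated than this, and the group names and numbers below are illustrative only.

```python
def odds_ratio(case_alt, case_ref, control_alt, control_ref):
    """Odds of carrying the variant allele in cases relative to controls.

    A value of 1.0 means the variant is equally common in both.
    Assumes case_ref and control_alt are non-zero.
    """
    return (case_alt * control_ref) / (case_ref * control_alt)


# Hypothetical allele counts: (case_alt, case_ref, control_alt, control_ref).
# group_1: variant frequency 0.8 and many cases.
# group_2: variant frequency 0.2 and few cases.
groups = {
    "group_1": (400, 100, 80, 20),
    "group_2": (20, 80, 100, 400),
}

# Within each ancestry group, the variant is unrelated to disease.
within = {name: odds_ratio(*counts) for name, counts in groups.items()}

# Ignoring ancestry and adding the counts together tells a different story.
pooled_counts = tuple(sum(column) for column in zip(*groups.values()))
pooled = odds_ratio(*pooled_counts)

for name, value in within.items():
    print(f"{name}: odds ratio = {value:.2f}")
print(f"pooled: odds ratio = {pooled:.2f}")
```

The pooled odds ratio is well above 1 only because the group with more cases also happens to carry the variant more often. Analyzing each group separately, or adjusting for ancestry, removes the false signal.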
In certain parts of the world, limited access to medical centers, along with cultural or historical reasons, leads to some populations being bypassed for these studies.
Ignorance within the research community is also to blame. Many used to assume that poor countries didn't necessarily require genomic studies because the biggest killers there were seen as infectious diseases. Now researchers are realizing that's not the case, as the rise of chronic diseases like diabetes and heart disease in Asia, Africa, and Central and South America shows.
These diseases “are now on the rise in lower-income countries too, and Adeyemo [deputy director of the Center for Research on Genomics and Global Health at the National Human Genome Research Institute] says more genome studies are needed to understand different populations’ risk factors for these conditions,” shared Emily Mullin of MIT Technology Review.
Another component is the racial makeup of employees in biomedical institutions worldwide. In the United States in 2012, less than four percent of the tenured and tenure-track faculty members in research-intensive biomedical departments were African American, Hispanic, or Native American.
But these justifications for avoiding diversity no longer hold. There are powerful statistical methods today for handling mixed ancestry populations, and more comprehensive DNA analysis technologies.
Right now, the message the scientific and medical genomics community is sending to the rest of the world is a harmful and misleading one: the genomes of European descendants matter the most.
But there are a handful of organizations trying to change the rhetoric.
Groups Challenging the Status Quo
The National Institutes of Health mandated diversity in federally funded genetics studies in 1993, although that hasn't been followed to the letter. But genome sequencing costs are plummeting, and there's no longer an excuse for remaining exclusive.
Researchers have to want to do research in non-European populations, and scientific organizations around the world have to provide funding to support those kinds of studies.
Luckily there is a good collection of organizations pursuing diverse data.
Human Heredity and Health In Africa (H3Africa) Initiative, Illumina, and 23andMe are operating in Africa, collecting data from genome-wide association studies as well as genome sequencing data—the entire readout of a person’s DNA—from thousands of participants.
The Hispanic Community Health Study/Study of Latinos was launched in 2013 by the National Institutes of Health (NIH) and focuses on establishing the risk factors for cardiovascular disease, pulmonary disease, and other chronic diseases in Latin Americans. The NIH is also in the middle of a 30-year-long study of American Indian people and cardiovascular disease.
The NIH is also supporting long-term, resource-intensive study cohorts that better represent minority populations, such as the Multi-Ethnic Study of Atherosclerosis and the Hispanic Community Health Study.
The National Cancer Institute recently instituted an effort to collect information on breast cancer genetics in African-American women.
There’s Still Work to Do
In their article for Nature, researchers Alice B. Popejoy and Stephanie M. Fullerton lay out steps that research organizations can take to create more diverse genomic studies.
Funding agencies can develop financial incentives for the creation of diverse cohorts of study participants. The authors also note that training programs and new infrastructure, such as good healthcare clinics that provide genetic testing in predominantly black or Hispanic neighborhoods, could enhance trust and allow people to engage in projects as stakeholders rather than only as study participants.
“A culture shift is required at every level,” the researchers wrote. “Efforts to recruit participants for biomedical research in underrepresented communities have been most successful when conducted by investigators of concordant racial or ethnic background, and in partnership with institutions trusted by those communities—such as historically black colleges and universities in the United States.”
If geneticists continue to conduct their research in the European bubble, they will continue to miss important information about disease biology, and potentially delay important treatments for the diverse populations of the world.