Race & Genetics FAQ

About the Program

NCHPEG developed this 90-minute CME live discussion for broadcast via the Voluntary Hospital Satellite Network. It includes six case-based interactions between patients and providers, with commentary by three panelists: Gary Gibbons, MD (Morehouse School of Medicine); Howard Levy, MD, PhD (Johns Hopkins); and Charmaine Royal, PhD (Howard University).

The following cases are used to discuss the pertinent issues:

  • Metabolic syndrome
  • Prostate cancer screening
  • G6PD deficiency
  • BiDil
  • Tay-Sachs disease
  • Hemochromatosis

The Robert Wood Johnson Foundation provided the funding for this program.

How do we measure genetic variation within and between populations?

The most direct way to measure genetic differences, or genetic variation, is to estimate how often two individuals differ at a specific site in their DNA sequences -- that is, whether they have a different nucleotide base pair at a specific location in their DNA. First, DNA sequences are obtained from a sample of individuals. The sequences of all possible pairs of individuals are then compared to see how often each nucleotide differs. When this is done for a sample of humans, the result is that individuals differ, on average, at only about one in 1300 DNA base pairs. In other words, any two humans are about 99.9 percent identical in terms of their DNA sequences.
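
The pairwise comparison described above can be sketched in a few lines of Python. This is only an illustration: the function name pairwise_diversity and the four short sequences are made up for the example, and the sketch assumes the sequences are already aligned to the same length.

```python
from itertools import combinations


def pairwise_diversity(sequences):
    """Average proportion of sites that differ, over all pairs of aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be non-empty and aligned to the same length")

    total = 0.0
    pairs = 0
    # Compare every possible pair of individuals, as described in the text.
    for a, b in combinations(sequences, 2):
        differences = sum(1 for x, y in zip(a, b) if x != y)
        total += differences / length
        pairs += 1
    return total / pairs


# Hypothetical toy data (not real human sequences): four 10-base sequences.
sample = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC", "ACGTACGTAA"]
diversity = pairwise_diversity(sample)
print(f"average pairwise difference: {diversity:.3f}")
```

In this toy sample the pairs differ at 0.1 of sites on average, which is far higher than the roughly 1 in 1300 figure reported for humans; the short sequences were chosen only to keep the arithmetic easy to follow.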

During the past several years, a new type of genetic variation has been studied extensively in humans: copy number variants (CNVs) are DNA sequences, 1,000 base pairs or larger, that are deleted, duplicated, or inverted in some individuals but not others. Several thousand CNVs have been discovered in humans, indicating that at least 4 million nucleotides of the human genome (and perhaps several times more) vary in copy number among individuals. CNVs thus represent another important class of genetic variation and contribute to at least an additional 0.1% difference, on average, between individuals.

Comparisons of DNA sequences can be done for pairs of individuals from the same population or for pairs of individuals from different populations. Populations can be defined in various ways; one common way is to group individuals into populations according to a continent of origin. Using this definition, individuals from different populations have roughly 10 percent to 15 percent more sequence differences than do individuals from the same population (this estimate is approximately the same for both SNPs – see below – and CNVs). In other words, people from different populations are slightly more different at the DNA level than are people from the same population. The slightness of this difference supports the conclusion that all humans are genetically quite similar to one another, irrespective of their geographic ancestry.

Because it is still fairly expensive to assess DNA sequences on a large scale, investigators often study genetic variation at specific sites that are known to vary among individuals. Suppose that a specific site in the DNA sequence harbors an A in some individuals’ DNA sequences and a G in others. This is a single nucleotide polymorphism (SNP), where polymorphism refers to a genetic site that exists in multiple forms. The proportion of individuals who have an A and the proportion who have a G give the frequency of each form, or allele, and this frequency can be estimated for a sample of individuals from a population. If the frequencies of A in three different populations are .10, .20, and .50, the genetic distance between the first two populations is smaller than that between the third population and the first two. On the basis of this assessment, the first two populations are genetically more similar than either is to the third. To get a more accurate picture of genetic differences, hundreds or thousands of SNP frequencies would be assessed to yield the average genetic difference among pairs of populations.
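
The allele-frequency comparison above can be sketched as follows. The text does not name a particular distance measure, so this example assumes the simplest one, the absolute difference in allele frequency averaged over SNPs; the function names and the single-SNP data are hypothetical.

```python
def allele_frequency(alleles, allele):
    """Proportion of sampled alleles that match the given form (e.g. 'A')."""
    if not alleles:
        raise ValueError("empty sample")
    return alleles.count(allele) / len(alleles)


def mean_frequency_difference(freqs_a, freqs_b):
    """Assumed distance measure: mean absolute allele-frequency difference across SNPs."""
    if len(freqs_a) != len(freqs_b) or not freqs_a:
        raise ValueError("need the same non-empty set of SNPs for both populations")
    return sum(abs(a - b) for a, b in zip(freqs_a, freqs_b)) / len(freqs_a)


# Frequencies of allele A at one SNP in three populations, as in the text.
pop1, pop2, pop3 = [0.10], [0.20], [0.50]

d12 = mean_frequency_difference(pop1, pop2)
d13 = mean_frequency_difference(pop1, pop3)
d23 = mean_frequency_difference(pop2, pop3)
print(d12 < d13 and d12 < d23)  # the first two populations are the closest pair

# Estimating a frequency from a (made-up) sample of ten alleles.
print(allele_frequency(list("AGGGGGGGGG"), "A"))
```

With hundreds or thousands of SNPs, each list would simply hold one frequency per SNP, and the same function would return the average difference between the two populations.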

These comparisons can be summarized graphically in a variety of ways. An example is given in Figure 1, which portrays population differences measured in approximately 11,000 SNPs throughout the human genome. This diagram shows that populations that are geographically closer together tend to be genetically more similar to one another. This is expected, because geographic neighbors are more likely to have historical connections and to exchange mates.

Figure 1. A display of population differences. See the text for explanation. SNP data from Shriver et al. 2005. Human Genomics 2: 81-89.

How is genetic variation related to disease?

Nearly all human diseases are influenced by genes. Because individuals have different variants of genes, it follows that the risk of developing various diseases will also differ among individuals. Consider a simple example. Jim Fixx, a well-known runner and fitness enthusiast, died of a heart attack at the age of 52. Sir Winston Churchill, who was renowned for his abhorrence of exercise and his love of food, drink, and tobacco, lived to the age of 90. It is plausible that genetic differences between Fixx and Churchill were responsible, at least in part, for the paradoxical difference in their life spans. (Indeed, Jim Fixx’s father had a heart attack at the age of 35 and died of another heart attack at the age of 43.)

Because genes are passed down from parents to offspring, diseases tend to “cluster” in families. For example, if an individual has had a heart attack, the risk that his or her close relatives (offspring or siblings) will have a heart attack is two to three times higher than that of the general population. Similar levels of increased risk among family members are seen for colon cancer, breast cancer, prostate cancer, type 2 diabetes, and many other diseases. This clustering in families is partly the result of shared non-genetic factors (e.g., families tend to be similar in terms of their dietary and exercise habits) and partly the result of shared genes.

As we have seen, populations differ somewhat in their genetic backgrounds. It is thus possible that genetic differences could be partly responsible for differences in disease prevalence. For many disorders caused by genetic changes in single genes, these differences are readily apparent. Cystic fibrosis, for example, is seen in about one in 2,500 Europeans but only in one in 90,000 Asians. Sickle-cell disease is much more common in individuals of African and Mediterranean descent than in others, although it is found in lower frequency in many other populations due to migration and intermarriage (Figure 2).

Figure 2. Prevalence of several single-gene disorders (per 10,000 births) in a series of human populations. The prevalence of Tay-Sachs disease in the Ashkenazi Jewish population refers to pre-1980 data; widespread genetic screening has greatly reduced the prevalence of this disease in the Ashkenazi population. Source: Jorde LB. 2007. Human Genetic Variation and Disease. In Meyers RA (ed.), Genomics and Genetics: From Molecular Details to Analysis and Techniques, pp. 939-953. Weinheim: Wiley-VCH Publishers, used by permission.

These differences in prevalence can be attributed to the evolutionary factors that influence genetic variation in general. Mutation is the ultimate source of all genetic variation. In some cases, such as hemochromatosis in Europeans and sickle-cell disease in Africans, the responsible mutations have arisen within the last few thousand years, helping to account for a fairly restricted distribution of the disease. Natural selection also plays a role in population differences in some genetic diseases. For sickle-cell disease and related diseases known as the thalassemias, heterozygotes (those who carry a single copy of the disease-causing mutation) are relatively resistant to the malaria parasite. Cystic fibrosis heterozygotes are resistant to typhoid fever, and hemochromatosis heterozygotes absorb iron more readily, perhaps protecting them against anemia. Also, the process of genetic drift, which is accentuated in small populations, can raise the frequencies of disease-causing mutations quickly just by chance (e.g., Ellis-van Creveld syndrome, a reduced-stature disorder, is unusually common among the Old Order Amish of Pennsylvania). In contrast to the effects of natural selection and genetic drift, which tend to promote population differences in disease prevalence, gene flow (the exchange of DNA among populations) tends to decrease differences among populations. With the enhanced mobility of populations worldwide, gene flow is thought to be increasing steadily.

These same factors can affect common diseases such as cancer, diabetes, hypertension, and heart disease, but the picture is more complex because these diseases are influenced by multiple genetic and non-genetic factors. Common diseases do vary in frequency among populations: hypertension occurs more frequently in African-Americans than European-Americans, and type 2 diabetes is especially common among Hispanic and Native-American populations.

Although genes clearly play a role in causing common diseases, it is less clear that genetic differences between populations play a significant role in causing differences in prevalence rates among populations. Consider another example: the Pima Native American population in the southwestern United States now has one of the highest known rates of type 2 diabetes in the world. About half of adult Pimas are affected. Yet this disease was virtually unknown in this population prior to World War II. Obviously the Pimas’ genes have not changed much during the past 50 or so years. Their environment, however, has changed dramatically with the adoption of a “Western” high-calorie, high-fat diet and a decrease in physical exercise. In this case, it is almost certain that the rapid increase in type 2 diabetes prevalence has much more to do with non-genetic than genetic causes.

But why does a Western diet seem to have a greater effect on some populations than others? Perhaps differences in genetic background, interacting with dietary and other lifestyle changes, help to account for this variation. As additional genes that influence susceptibility to common diseases are discovered, and as the roles of non-genetic factors are also taken into account, it is likely that this picture will become clearer.

What is the population history of modern Homo sapiens, and how did that history contribute to the current structure of human populations?

Information about the history of the human species comes from two main sources: bones and artifacts gathered from archaeological sites, and the distribution of genetic variants in the human population today. Both sources of information are fragmentary, but both are converging on the same general story.

The earliest fossil skull with features similar to those of anatomically modern humans (including a rounded braincase, reduced brow ridges, and a distinct chin) comes from Ethiopia’s Omo River and is estimated to be about 190,000 years old. Later fossils (with estimated ages) that have at least some modern characteristics have been found elsewhere in Ethiopia (150,000 years), in the Middle East (100,000 years), in southern Africa (100,000 years), in Australia (40,000 years), in eastern Europe (35,000 years), and in the Americas (13,000 years). These discoveries suggest that anatomically modern humans evolved in eastern Africa and then spread out to occupy the rest of Africa, Asia, Europe, and the Americas (see Figure 3).

Figure 3. A map showing the dispersal of modern humans out of Africa. Source: Hedges SB. 2000. A start for population genetics. Nature 408(6813):652-653, used by permission of the author.

Genetic evidence supports this conclusion. The genetic diversity of indigenous human populations drops with increasing geographic distance from eastern Africa. One would expect this pattern if groups of migrants moving away from Africa carried with them just part of the genetic variation existing in Africa. Consistent with this picture, the broad patterns of genetic variation found outside Africa tend to be a subset of those found inside Africa.

Populations of archaic humans lived in Africa and Eurasia as modern humans expanded outward from eastern Africa, including Neanderthals in Europe and Homo erectus in Asia. Some fossil and genetic evidence suggests that modern humans interbred with these archaic humans during their expansion, so that some genetic variants from these populations may still be present in modern populations. But the genetic evidence also indicates that the amount of interbreeding must have been small if it occurred at all. Most of the genetic differences between human populations today appear to have developed as populations of modern humans became widely dispersed during their worldwide expansion.

How much genetic variation is there in Homo sapiens? How different are human populations from each other genetically, and what is the meaning of those differences biologically?

Populations of humans from different parts of the world are surprisingly similar genetically, given our large numbers and worldwide distribution. This low level of variation suggests that the size of the human population was much smaller – perhaps just a few thousand people – in the relatively recent past. This finding further supports the idea that modern humans evolved as a relatively small group in eastern Africa within the past 200,000 years and then spread out to occupy the rest of the world, with little or no interbreeding between modern humans and the archaic humans that they gradually replaced.

When populations become dispersed, individuals tend to mate with others who are geographically nearby. In this way, new genetic variants that appear in a population tend to become localized, and geographically separated populations gradually diversify genetically. However, the diversification of the human species has been limited by the recency of our common ancestry and by continued migration between separated populations.

When averaged over the entire genome, about 85 to 90 percent of the genetic diversity present in the human species can be found in any human group. Thus, two individuals chosen from different continents would be expected to differ genetically by just 10 to 15 percent more than two individuals chosen at random from the same continent. However, this level of differentiation generally is still large enough for geneticists to make broad estimates of where an individual’s ancestors lived.

Genetic differences between populations can be amplified by natural selection. For example, the different skin colors seen in human populations today appear to have resulted from natural selection maintaining dark skin in areas of intense sun (to protect the body from sunburn, skin cancer, and other harmful effects) and perhaps favoring light skin in areas of less intense sun (for example, to allow the body to produce sufficient vitamin D to maintain health). Natural selection related to regional differences in foods, parasites, climate, and other environmental factors may have contributed to other genetic differences between populations, but the extent to which these genetic differences affect health is largely unknown.

Why does research on genes and disease so often involve specific racial and ethnic groups?

As we have seen, many diseases vary in prevalence among populations, which are sometimes defined as ethnic groups or races. This variation reflects population differences in genetic and non-genetic factors, which tend to be shared within populations. For example, the foods eaten in Germany are quite different from those eaten in Japan, Afghanistan, Ghana, or Papua New Guinea. Similarly, population differences are seen in typical leisure activities, occupations, and exposure to harmful substances such as tobacco smoke. Accordingly, it is often illuminating to compare the frequencies of diseases and susceptibility factors among populations.

An instructive example is given by Alzheimer disease (AD), the most common cause of dementia in older members of many populations. A major genetic risk factor for AD is the ε4 allele of the apolipoprotein E locus. Among Europeans, a large study showed that individuals who inherit two copies of the ε4 allele have a 15-fold elevation in the risk of developing Alzheimer disease. Among Japanese individuals who have two copies of ε4, the increase in risk is even greater: 33-fold. Among Hispanics and African-Americans, the elevation in risk is substantially lower than in Europeans. Thus, the risk conferred by this disease-causing allele varies greatly, depending on other factors that vary among these populations. These factors include other genetic variants that may contribute to AD susceptibility as well as non-genetic factors such as diet, exercise, socioeconomic level, and access to health care. Interestingly, different African-American populations vary substantially in the level of risk conferred by the ε4 allele, showing that specific segments of this population are likely to have quite different risk factors. Further comparisons among populations may help to pinpoint these factors.

In some cases, a disease-causing variant that is present in some populations may be rare or absent in others. An example is given by Crohn disease, a form of inflammatory bowel disease. Three disease-causing variants of the CARD15 locus have been associated with Crohn disease in European populations, but none of these variants has been found in Asian populations. This is thought to reflect a relatively recent origin of these variants in European populations, such that the variants have not yet had the opportunity to be incorporated in significant numbers into Asian populations.

In the United States, there are substantial differences among populations in terms of access to education, economic opportunities, and health care. These disparities can have substantial impact on common diseases such as cancer, heart disease, and diabetes. Partly because of this, investigators funded by government agencies such as the National Institutes of Health are expected to include members of all populations or ethnic groups in their studies. Understanding the causes of disease in all populations will help to ensure that advances in medical research will benefit everyone.

Although population comparisons can often be informative, it is important that investigators take care to avoid misunderstanding or stereotyping. This is especially critical when studying traits or diseases that may be socially or politically sensitive (e.g., addiction to alcohol or other drugs, susceptibility to infectious agents such as HIV, or variation in behavioral traits).

How does variation in Homo sapiens compare to variation in other species?

The fact that any two humans are approximately 99.9 percent identical at the DNA sequence level indicates that we are genetically quite similar to one another. To put this into perspective, however, one needs to compare the average nucleotide diversity in humans (about 1/1300 base pairs) to the levels of nucleotide diversity in other species. Such comparisons have been done using blood group and protein polymorphisms, but these comparisons suffer from potential biases: the loci were identified originally on the basis of their extensive variation in one species (typically humans), so they often exhibit more variation in that species.

A better approach is to identify a specific region of DNA and then sequence it in large, random samples from multiple species. This has been done, to some extent, with mitochondrial DNA, where comparisons suggest that humans are substantially less variable than other primates. (On the other hand, humans have more diversity than animals, such as cheetahs, that have undergone substantial population bottlenecks.) However, the mitochondrial genome is potentially subject to the effects of natural selection and is more strongly affected by the random process of genetic drift than is nuclear DNA. Thus, these comparisons must be regarded with caution.

Noncoding nuclear DNA sequences, which are less affected by selection and drift, offer perhaps the best assessment of genetic variation within species. Although an increasing number of species are being sequenced, only a few studies have reported large-scale sequencing of multiple individuals from the same species (a requirement for estimating nucleotide diversity). Studies of West African chimpanzees and bonobos (“pygmy chimpanzees”) reveal nucleotide diversity levels similar to those in humans, while diversity in central African chimpanzees and in gorillas is roughly twice that in humans. An estimate of genetic diversity in wild-derived mouse (Mus musculus) strains, based on the number of variants found in a series of DNA sequences, reveals that these strains exhibit about ten times as much diversity as humans. Similarly, a comparison of humans and fruit flies (Drosophila pseudoobscura) suggested that these flies harbor about ten times as much genetic diversity as do humans.

These findings are limited in scope and need further corroboration with additional species and more DNA sequences. They offer a preliminary assessment, however, that humans may be less variable genetically than many other species. This is thought to reflect a recent, common origin of anatomically modern humans from a relatively small number of founding individuals in Africa. Subsequently, humans likely underwent mild to moderate bottlenecks in population size as they ventured out of Africa some 50,000 to 100,000 years ago to colonize the rest of the world. These historical factors help to account for an apparent moderate reduction of genetic variation in our species.

How and when did the biological concept of race develop? On what was it based, and what were some of the social and political implications?

Humans have always separated themselves into groups based on shared characteristics. But the characteristics used to make these distinctions and interpretations of these characteristics have varied in different places. Classical civilizations from Rome to China did not see membership within socially delineated groups as hereditary and unchangeable. In many ancient societies, individuals with widely varying ancestries or physical appearances could become accepted members of a society by growing up within that society or by adopting the society’s cultural norms.

The modern concept of race began to take shape during the European era of exploration. As Europeans encountered people from different parts of the world, they began to sort themselves and others into groups characterized by physical appearance. The English word “race” – which may be derived from the Spanish word raza, meaning breed or stock – first appeared in the late sixteenth century and referred to groups of people united by common descent or shared features. Over the next two centuries, various folk beliefs associated innate intellectual, behavioral, and moral qualities with groups distinguished by physical characteristics, and scientific investigations that began in the second half of the eighteenth century sought to buttress these beliefs.

The ideas associated with “race” have had a profound effect on modern history. They helped justify the barbarous treatment of some groups – most notably, slaves taken from Africa, who increased in numbers in the seventeenth and eighteenth centuries as a preexisting trade in slaves from elsewhere in the world declined. In the late nineteenth and early twentieth centuries, eugenicists cited these ideas about race to argue for the biological inferiority of particular groups. Campaigns of oppression and genocide throughout the twentieth century have used supposed racial differences to motivate inhuman acts against others.

Today, ideas about race remain powerful lenses through which people view the world. People tend to attribute certain characteristics or shared experiences to others on the basis of their physical appearance (which in some cases becomes a self-fulfilling prophecy). Even when those characteristics are thought to be the product of a shared culture rather than biology, these cultural influences can be seen as so overwhelming that they are essentially equivalent to inherited differences.

How is the concept of race used elsewhere in biology? How do these uses compare to its use in Homo sapiens?

In non-human biology, the term “race” is not used as commonly as “subspecies,” and the two terms tend to be used interchangeably. Some biologists, however, regard races as less differentiated than subspecies. Although there is no accepted quantitative definition of subspecies or race, subspecies have traditionally been defined qualitatively on the basis of morphological similarities and a common geographic location. The appropriate definitions of geographic boundaries and key morphological characters are both open to debate. More recently, the subspecies definition has been sharpened somewhat by including an expectation of “the concordant distribution of multiple, independent genetically based traits” (O’Brien and Mayr, 1991). The use of genetic traits, such as DNA sequences, should increase the objectivity of these definitions, but arguments can be made about the relative weighting, for example, of non-coding versus coding DNA sequence variation.

The term subspecies has seldom been applied to humans, although its use was advocated by the eminent population geneticist Sewall Wright. In contrast, much discussion and debate has been focused on the existence and definition of human “races.” Considering the lack of specificity in definitions, it is not surprising that estimates of the number of human races vary from zero (i.e., there are no biological races in our species), to several (e.g., Africans, Asians, Europeans, Australians, and Native Americans), to dozens or hundreds. The confusion is magnified by the fact that physical differences among human populations are correlated, to some extent, with geographic location and with attributes such as culture, religion, and language (hence historical references to a “Jewish race” or a “Hispanic race”). As human populations become increasingly mobile and their members intermarry, any definition of race becomes less and less precise.

One way to assess the amount of genetic differentiation among populations is to measure the proportion of genetic variation that can be attributed to population subdivision. This proportion is estimated by the statistic, Fst, which is equal to (Ht – Hs)/Ht, where Ht is the total amount of genetic variation (heterozygosity) in a population and Hs is the average amount of variation within each subdivision of the population. If the human population is subdivided into the major continents, Fst is consistently estimated to be around 10 percent to 15 percent. This means that 85 percent to 90 percent of genetic variation exists between individuals from the same continental population, while the remaining 10 percent to 15 percent of variation is due to differences between continental populations.
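
The Fst formula above can be worked through for a single two-allele site. This sketch makes several assumptions that the text does not state: heterozygosity is taken as 2p(1 – p), where p is the allele frequency; the subdivisions are treated as equal in size; and the three frequencies are illustrative values rather than real data. The function names are hypothetical.

```python
def heterozygosity(p):
    """Expected heterozygosity at a two-allele site with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst(subpopulation_freqs):
    """Fst = (Ht - Hs) / Ht for one biallelic site, assuming equal-sized subdivisions."""
    if not subpopulation_freqs:
        raise ValueError("need at least one subdivision")
    n = len(subpopulation_freqs)
    mean_p = sum(subpopulation_freqs) / n
    ht = heterozygosity(mean_p)  # total variation in the pooled population
    hs = sum(heterozygosity(p) for p in subpopulation_freqs) / n  # average within subdivisions
    if ht == 0:
        raise ValueError("site is not variable; Fst is undefined")
    return (ht - hs) / ht


# Illustrative allele frequencies in three hypothetical subdivisions.
freqs = [0.10, 0.20, 0.50]
print(f"Fst = {fst(freqs):.3f}")
```

For these made-up frequencies, Ht is 88/225 and Hs is 1/3, so Fst is 13/88, or about 0.148. A genome-wide estimate would combine many sites rather than relying on one, and, as the text notes, the result depends on how the subdivisions are defined and sampled.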

Can this statistic shed light on the question of human races? A reasonable approach would be to estimate Fst for a series of other species in which races or subspecies have been defined and to compare this with the estimate of Fst in humans. A legitimate comparison would require that the same DNA regions be sequenced and compared in many individuals from each species, an exercise that has not yet been done. Even if this comparison were done, it would not necessarily solve the question whether there is any biological basis for human races. Fst can vary substantially depending on how the subdivisions are defined and how individuals from each subdivision are sampled. For example, to which subdivision should African-Americans or Hispanic-Americans be assigned, given that each group is a heterogeneous collection of individuals whose DNA derives from multiple continents? These types of questions illustrate some of the limitations and difficulties encountered when using genetic data to address complex issues related to biological classification.

How and why has biological thinking about race changed during the last ten years? How does it compare to the social meaning of race today?

During the past decade, a heated discussion has taken place about whether the ideas associated with race are either accurate or useful. Some argue that humans can be divided into relatively bounded groups with distinctive physical appearances (and, presumably, other biological differences) and that “race” remains a valid way of describing these differences. Others contend that the distribution of genetic variation in humans is too complex to be captured by a term as categorical as “race.” They also argue that the term is hopelessly compromised because it uses both social and biological factors to make racial distinctions, often without distinguishing between the two.

Most people in the United States (and in many other countries) see such discussions as beside the point, since they readily identify the groups known as races and sort people into those groups. But attitudes vary widely as to whether the members of those groups share distinctive characteristics (beyond their physical features) as a result of their genetic inheritance. Focus groups show that most people attribute an individual’s behaviors to culture, to how that person was raised, or to a person’s “willpower” or personal “essence.” But history demonstrates that ideas about the shared cultural distinctiveness of a group can easily transition into a belief that genetic factors contribute directly to a group’s cultural distinctiveness.

Can one draw significant and reliable conclusions about individual members of a given population on the basis of information about the population to which he or she belongs?

For almost all traits influenced by genetics, it is not possible to predict reliably an individual’s characteristics based on that person’s membership in a group defined by race. Because approximately 85 percent of the genetic variation present in the human population can be found in any relatively large group, each group has approximately the same range of variation in biological traits.

A prominent exception involves biological traits that have been under intense selective pressure, such as skin color in people living near the equator, structural abnormalities of hemoglobin molecules in people living in malarial regions, and the ability to digest milk among adults in pastoral populations. Other exceptions consist of genetic variants that are concentrated in particular groups for historical or cultural reasons. Groups descended largely from a relatively small founding population may have genetic disorders that were present among the founders. Examples include lysosomal storage diseases (like Tay-Sachs disease) among Ashkenazi Jews, Ellis-van Creveld syndrome among the Old Order Amish, and a collection of rare genetic diseases among the Finns. However, diseases caused by single genetic variants account for a small percentage of the total burden of disease among humans.

Some genetic variants involved in more common diseases are present in varying percentages from group to group. Examples include genes that, when present in particular forms, contribute to Alzheimer disease, asthma, breast cancer, and diabetes. But these variants tend to be just one of many genetic and environmental factors involved in the development of a disease. As a result, the frequency of a genetic variant in a group is not necessarily closely tied to the incidence of the associated disease in that group.

As Francis Collins, director of the National Human Genome Research Institute, has stated: “‘Race’ and ‘ethnicity’ are poorly defined terms that serve as flawed surrogates for multiple environmental and genetic factors in disease causation.... Research must move beyond these weak and imperfect proxy relationships to define the more proximate factors that influence health.”

How can we use our understanding of human genetic variation to inform our understanding of race and to improve personal and public health?

A wealth of new genetic data supports several broad conclusions about human genetic variation. Most genetic variation is found within major human populations, and genetic variants typically are shared among populations. While these variants differ in their frequencies among populations, very few are found exclusively and commonly in only one specific population (a pattern noted for physical traits more than a century ago by Charles Darwin). Because populations located close to one another tend to share histories and mates, such populations tend to be slightly more similar genetically than are populations from geographically distant locations. A correlation between geographic distance and genetic similarity is found in humans and most other species, and it is unsurprising to biologists. At the same time, it is difficult to delineate precise boundaries between populations, because of our history of migration and mate exchange. For example, there is no clear genetic boundary that separates European from Asian populations. Instead, variation is distributed in a gradual, clinal pattern as one proceeds eastward from Europe to Asia. This argues against a discrete typology of humans.

By examining large numbers of genetic variants, it is now possible to infer – at least roughly – the genetic ancestry of individual humans. Several studies have shown that one’s continent of origin can be inferred with substantial accuracy if 100 to 200 loci are examined. Another study of 377 loci in 3,600 Americans was able to classify all but six of them accurately into broad categories: European-American, African-American, Asian-American, and Hispanic-American. Importantly, these exercises seldom allocate individuals to a single population with 100 percent probability. Typically, there is at least a small probability that an individual could be allocated to another group, reflecting a mixed genetic heritage in many or most humans and again providing evidence against a discrete typology.
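A toy simulation can show why many loci are needed and why assignments are rarely absolute. The sketch below is not the method used in the studies mentioned above. It assumes two hypothetical populations, "A" and "B", whose allele frequencies are invented and differ only modestly at each locus, and it assumes that loci are independent and in Hardy-Weinberg proportions. All function names are made up for illustration.

```python
import math
import random


def genotype_log_likelihood(genotype, freqs):
    """Log-probability of a genotype (0, 1, or 2 copies of an allele per locus)
    given per-locus allele frequencies, assuming independent loci."""
    total = 0.0
    for copies, p in zip(genotype, freqs):
        # Binomial probability of carrying `copies` of the allele out of 2
        total += math.log(math.comb(2, copies) * p**copies * (1 - p) ** (2 - copies))
    return total


def posterior_membership(genotype, pop_freqs):
    """Probability that the genotype came from each population (equal priors)."""
    logs = {name: genotype_log_likelihood(genotype, f) for name, f in pop_freqs.items()}
    top = max(logs.values())  # subtract the maximum for numerical stability
    weights = {name: math.exp(value - top) for name, value in logs.items()}
    norm = sum(weights.values())
    return {name: w / norm for name, w in weights.items()}


def simulate_individual(n_loci, rng):
    """Invent allele frequencies for populations A and B, then draw one
    individual from population A."""
    freqs_a = [rng.uniform(0.2, 0.8) for _ in range(n_loci)]
    # Population B differs from A by at most 0.15 at each locus (an assumption)
    freqs_b = [min(0.95, max(0.05, p + rng.uniform(-0.15, 0.15))) for p in freqs_a]
    genotype = [int(rng.random() < p) + int(rng.random() < p) for p in freqs_a]
    return genotype, {"A": freqs_a, "B": freqs_b}


def mean_posterior_for_true_group(n_loci, trials=200, seed=0):
    """Average probability assigned to the correct population over many trials."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        genotype, pop_freqs = simulate_individual(n_loci, rng)
        total += posterior_membership(genotype, pop_freqs)["A"]
    return total / trials


if __name__ == "__main__":
    for n_loci in (5, 50, 200):
        print(n_loci, "loci:", round(mean_posterior_for_true_group(n_loci), 3))
```

In this toy model, a handful of loci does little better than a coin toss, and the average probability assigned to the correct group rises as loci are added. Each individual still receives a probability for every group rather than a single hard label.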

These results do not necessarily indicate that population affiliation is an accurate indicator of important biomedical parameters, such as response to a therapeutic drug. Such responses are likely to be mediated principally by a relatively small number of genes, as well as environmental factors. When only a small number of variants are considered, an individual from one population is often genetically more similar to someone from a different population than to someone from his own population. Consider the relationships shown in Figure 4, where genetic similarities for 14-kb sequences of the angiotensinogen gene (associated with susceptibility to hypertension) are shown for a collection of Europeans, Asians, and Africans. For this medically significant gene, a European is often more similar to an African or Asian than to another European. This illustrates in part a history of population mixture, and it implies that population affiliation would not be a good predictor of variation in this hypertension-susceptibility gene.

Figure 4. Genetic similarities among individuals from Africa, Asia, and Europe, based on a 14-kb DNA sequence in and near the angiotensinogen gene. Reprinted by permission from Macmillan Publishers Ltd: Jorde LB, Wooding SP. 2004. Nature Genetics 36(11s):S28-S33.

Another example is given by the drug gefitinib, which down-regulates epidermal growth factor receptor (EGFR) and is used to treat non-small cell lung cancer. Gefitinib is effective in about 30 percent of Asian patients but only 10 percent of European patients. This suggests that continental ancestry might help to decide who should get this drug. Irrespective of population affiliation, however, 80 percent of patients with somatic mutations in EGFR respond to the drug, while only 10 percent of those without mutations respond. Thus, direct examination of an important gene in each individual patient is a far better predictor of drug response than is one’s population affiliation. This is likely to be true for many medications and other therapeutic regimens.
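The arithmetic behind this comparison can be sketched as a simple weighted average. The snippet below is a back-of-the-envelope illustration, not a clinical model. It assumes that a group's overall response rate is just a blend of the two genotype-specific rates quoted above (80 percent and 10 percent), and the function names are hypothetical.

```python
RESPONSE_WITH_MUTATION = 0.80     # response rate quoted for patients with EGFR mutations
RESPONSE_WITHOUT_MUTATION = 0.10  # response rate quoted for patients without them


def overall_response(mutation_fraction):
    """Group-level response rate if a given fraction of patients carries a mutation."""
    return (mutation_fraction * RESPONSE_WITH_MUTATION
            + (1.0 - mutation_fraction) * RESPONSE_WITHOUT_MUTATION)


def implied_mutation_fraction(overall):
    """Fraction of mutation carriers that would produce a given overall rate
    under this simple two-rate blend."""
    return ((overall - RESPONSE_WITHOUT_MUTATION)
            / (RESPONSE_WITH_MUTATION - RESPONSE_WITHOUT_MUTATION))


if __name__ == "__main__":
    for label, rate in (("30 percent overall", 0.30), ("10 percent overall", 0.10)):
        print(label, "-> implied carrier fraction:",
              round(implied_mutation_fraction(rate), 3))
```

Under this simplification, a 30 percent group-level response rate corresponds to roughly 29 percent of patients carrying a mutation. A 10 percent rate sits at the floor of the model, so the rounded figures should not be read too literally. Knowing a patient's group only shifts the odds of carrying a mutation; testing the gene itself separates an 80 percent chance of response from a 10 percent one.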

One can argue that population membership, in spite of its imprecision, does provide some useful clinical information. Disease incidences and drug responses sometimes vary among populations, just as they often vary by age or sex. And medical decision making is typically a probabilistic process: certain diseases are more likely to occur in one sex or the other, or in older versus younger individuals. Thus, age, sex, and population affiliation can provide information that guides physicians as they consider possible diagnoses and treatments. But population membership must be evaluated cautiously. The complex and intertwined history of humankind guarantees that traditional designations of “race” will be imprecise and often misleading. Because human populations share most significant genetic variants, mere population membership is an inefficient predictor of biomedical phenotypes like drug response. As our ability to survey individual genomes improves, it is likely that medical care focused on the genes and environment of each individual patient (“personalized medicine”) will gradually supplant the use of inexact categories such as population or race in medical decision making. Because of the enormous and often harmful social baggage associated with “race,” this would be a significant improvement for everybody.

How might a better understanding of human genetic history help to inform discussions about cultural perspectives on race and their relationship to personal and human health?

A person’s race influences health through a complex interplay of social, psychological, environmental, biological, and genetic factors. When the complexity of this relationship is ignored, misinformation and outright prejudice can replace informed thinking.

For example, attributing differences in the health of groups (or other kinds of biological differences) solely to genetic factors overlooks the social processes through which groups are created and treated collectively within society. This process of “racial formation” can cause the members of a group to have subtly different experiences or environmental exposures that can result in health outcomes that seem genetically based. For example, many disadvantaged groups have higher rates of diabetes and obesity than does the general population. Searches for genetic factors contributing either to obesity or diabetes can overlook the complex social processes that concentrate these conditions within particular groups.

In most cases, the origins of group differences in health outcomes remain unknown. Keeping this uncertainty in mind can guard against the temptation to attribute group differences in health to one factor and not another.

Why do some people object to the study of human variation within and between populations?

The history of misuse of racial ideas urges great caution in using racial categories in health care or research. Yet investigation of human genetic differences has the potential to save lives by demonstrating why some people get sick and others don’t. Given this potential, research into human genetic variation – including the differences between human groups – can be expected to accelerate.

The findings of genetics can help dispel racial stereotypes by revealing the complex origins of traits and the close biological affinities among all human groups. Genetics research has shown that people are much more closely related than they think. For example, because of the way genetic variation is distributed in the human population, compatible blood and tissue donors are almost as likely to come from another racial group as from one’s own.

Stories in newspapers, magazines, and television shows can strongly influence how people think about these issues. The media tend to emphasize the discovery of genetic variants that differ in percentage among groups and appear to contribute to disease or to human traits. Follow-up findings that reveal the complexity (or, sometimes, the nonexistence) of these linkages get much less play. A healthy skepticism among reporters could lessen the chance that their work will intensify misconceptions about race.

The existing evidence shows that most of the health differences among groups arise through the effects of discrimination, differences in treatment, poverty, lack of access to health care, health-related behaviors, stress, and other socially mediated forces. The effects of genetic variation on group health disparities remain largely unknown but appear to be small.

What is the role of behavioral and environmental factors in group differences in the prevalence of disease?

Behavioral and environmental factors play enormous roles in the causation of common diseases. It is estimated, for example, that one-third of U.S. cancer deaths can be attributed to tobacco use, while another one-third are caused by dietary factors. Mortality due to coronary heart disease has decreased by more than 50 percent during the past 50 years because of improvements in diet, exercise, and medical treatment. Meanwhile, the incidences of obesity and type 2 diabetes have increased dramatically and threaten to reverse some of the progress made in preventing heart disease and other ailments. Considering these statistics, it is likely that non-genetic factors are far more important than genes in determining susceptibility to common diseases.

The prevalence of most common diseases varies among populations. Hypertension is approximately 50 percent more common among African-Americans than European-Americans, and type 2 diabetes is seen in nearly half of Pima Native American adults. Again, differences in behavior and environment probably account for most of this variation. For example, the prevalence of hypertension varies substantially among populations of African origin, depending on their environment. About 40 percent of African-American adults are hypertensive, while only a few percent of rural Cameroonians have this disease (reflecting a much lower incidence of obesity and other risk factors).

In assessing the effects of environmental factors on disease, it is especially instructive to examine disease prevalence in individuals and families who have moved from one setting to another. A classic example is the risk of specific cancers in Japanese individuals who moved from Japan to Hawaii and then to the United States. The lifetime risk of colon cancer in the U.S. population is about 5 percent, while – until recently – it was only 0.5 percent in Japan. Among first-generation Japanese individuals in Hawaii, the colon cancer risk rose several-fold, and, among second-generation Japanese who moved to the United States, the risk rose to 5 percent. Meanwhile, the risk of stomach cancer, which is relatively common in Japan but rarer in the United States, declined steeply in the Japanese migrants. (Not surprisingly, the risk of colon cancer is rising in Japan as this population has begun to consume a more “Western” diet.)

These examples illustrate the important effects of lifestyle factors on disease prevalence. Moreover, they lend support to the notion that appropriate lifestyle modifications can substantially improve the health of all populations.


Glossary

Bottleneck – a dramatic decline in population size that causes the gene pool to become less diverse.

Clinal pattern – a gradient in gene frequencies from one region of a population to another.

DNA (deoxyribonucleic acid) – the genetic material; the information molecule that carries hereditary information from one generation of cells to the next and from one generation of individuals to the next. Genes are made of DNA, which is the molecular basis of heredity.

Gene – a segment of DNA that contains instructions for making a specific protein or proteins required by the body. Human beings have about 20,000 to 25,000 genes.

Gene flow – the movement of genes that results from the movement of people from one region to another.

Genetic distance – a measure of the evolutionary relatedness of two populations based on accumulated genetic differences.

Genetic drift – the random fluctuations of gene frequencies caused by sampling errors. Although drift occurs in all populations, its effects are most evident in very small populations.

Fst – a measurement of the genetic distance between populations; sometimes used to determine the existence of a separate subspecies in a population.

Mitochondrial DNA (mtDNA) – the genetic material of the mitochondria, the organelles that generate energy for the cell. Mitochondria, which have their own DNA, are transmitted from one generation to the next only by females. This makes mtDNA quite useful for analyses of population history.

Mutation – a change in DNA. Mutations can be beneficial, neutral, or harmful (i.e., disease-causing).

Natural selection – a mechanism of evolution whereby members of a population with the most successful adaptations to their environment are most likely to survive and reproduce.

Polymorphism – a common variation in the sequence of DNA among individuals.

Race – a group that possesses characteristic traits and gene frequencies that distinguish it from other groups in the same species. Racial designations are arbitrary and have not held up well when applied to Homo sapiens.

SNP (single nucleotide polymorphism) – a variation in a single nucleotide at a given position in the genome that differs among members of a population. SNPs are unique and efficient “signposts” that are useful in scanning an entire genome for important mutations.