When the sequence of the Human Genome Project was declared complete in 2003, seventy-five-year-old James Watson—the co-discoverer of the structure of DNA—dreamt of soon finding the fundamental causes of all human outcomes in its string of letters. Indeed, scientists thought it would be just a few short years before the five genes behind heart disease, the dozen or so implicated in schizophrenia, and the twenty involved in cognitive ability would be discovered in this so-called dictionary of life. This optimism seemed well founded.
The latest medical genetics had already delivered the genes that were the most common causes of intellectual disability (FMR1), of breast cancer (BRCA1 and BRCA2), and of cognitive decline due to Alzheimer’s (APOE4). Once we knew the genes that lay behind the plaque in our arteries, various neurological and blood disorders, and even our tendency to smoke or drink too much, these conditions would become relics of the past. As it turned out, it took a decade to learn that most traits and diseases were influenced not by a handful of genetic differences, but by thousands. Most traits or diseases were highly polygenic.
At first, geneticists despaired: How could they understand the fundamental biology of, say, high blood pressure when it involved a multitude of genes? Moreover, what hope was there of ever devising a cure for schizophrenia if there wasn’t one, or even a dozen, but rather a thousand genes that a pharmaceutical needed to mimic or block in order to treat the disease—especially when those thousand genes were implicated in many other biological processes in the body?
There would not be just a couple of proteins that were the keystones to combating heart disease or memory decline, proteins whose actions could be blocked (or enhanced) by drugs or, today, by mRNA vaccines. There would be no simple pill to eliminate diabetes. No obvious gene therapy for depression or for cognitive enhancement, for that matter. The 2011 Bradley Cooper film Limitless, about a struggling writer who takes a new drug and ends up a financial wizard, would remain science fiction for the foreseeable future.
Without going too deep into technical terminology: we all have three billion base pairs in our genome. These base pairs are nucleotides that bond exclusively in pairs and are often referenced by the first letters of their names: (A)denine, (C)ytosine, (G)uanine, and (T)hymine. A always bonds with T, and C goes with G—hence the term base pairs. Only about 0.1 percent, or one in a thousand beads on that string of DNA, differs across people. Hence the common adage that we are all 99.9 percent identical. Understanding how variation in that 0.1 percent affects who we become required measuring DNA bead by bead across all three billion strung on our twenty-three pairs of chromosomes.
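That pairing rule is simple enough to capture in a few lines of Python. This is a toy illustration of the rule described above, not a bioinformatics tool, and the sequence is made up:

```python
# The base-pairing rule described above: A bonds with T, and C bonds with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complementary_strand(strand):
    """Return the sequence of bases that would pair with the given strand."""
    return "".join(COMPLEMENT[base] for base in strand)

# A short invented sequence and the strand it would bond to:
print(complementary_strand("GATTACA"))  # -> CTAATGT
```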
The good news was that once cheap genome-wide data became available for large numbers of people, we could test almost all those beads, and not just a handful we suspected might be associated with disease, using a method called a genome-wide association study or GWAS (pronounced “g-wass”). The first GWAS was run in 2005 on the now puny sample of 96 cases of macular degeneration along with 50 controls. With just those 146 subjects, the authors of that pioneering study were still able to locate a key gene that increased risk for this eye disease more than sevenfold.
Another early and important GWAS was conducted on schizophrenia—a condition that bedevils about 1 percent of the population; its symptoms are devastating to sufferers (and their families). The typical schizophrenic lives twelve to fifteen years less than a non-sufferer. And the quality of those years is much reduced, as anyone with a relative or friend who has the disease well knows. Schizophrenia’s onset in early adulthood is difficult to predict.
But scientists have known for a long time that it is highly heritable—that is, that the likelihood of getting it is influenced by one’s genetic makeup. Most twin and adoption studies had arrived at a figure of 80 percent for its heritability, meaning that four-fifths of the variation in incidence in a population is due to genetic differences within that population. Some scholars thought that schizophrenia resulted from rare mutations that had big effects; others posited that it resulted from thousands of tiny effects across all the chromosomes.
In 2009, Shaun Purcell and a vast team from the Psychiatric Genetics Consortium published a paper in one of the leading science journals, Nature, arguing that it was not rare mutations but many common variants with small effects that explained the genetic risk for this devastating illness. To support this claim, they developed the first wide-net polygenic index based on 8,008 cases and 19,077 controls. Polygenic indices, such as those created by Purcell and his colleagues, simply sum the GWAS results for all the beads together into a single number for each individual that predicts their likelihood of, in this case, schizophrenia.
Soon the polygenic index approach was employed for a wide range of phenotypes. A phenotype is any outcome you can measure—a trait or a disease. Polygenic indices (PGIs) have been developed for phenotypes ranging from height to blood pressure to education. Each one is calculated based on the same DNA loci in our genome. It’s just that the value or weight of each locus differs depending on which PGI we are calculating. PGI construction is kind of like a cookbook that tells you how to make one thousand recipes all from the same ingredients by merely adjusting the amount of each one.
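That summing procedure can be sketched in a few lines of Python. Everything here is invented for illustration: four loci instead of millions, and made-up effect sizes standing in for real GWAS weights:

```python
# Toy sketch of a polygenic index: a weighted sum of allele counts.
# The weights below are invented; in practice they come from GWAS results.

# Each person's genotype: the count of the "effect" allele (0, 1, or 2
# copies) at each measured locus.
genotypes = {
    "person_a": [0, 1, 2, 1],
    "person_b": [2, 0, 1, 0],
}

# One set of loci, a different set of weights per phenotype -- the "same
# ingredients, different amounts" idea from the cookbook analogy.
weights = {
    "height":    [0.05, -0.02, 0.10, 0.01],
    "education": [0.01,  0.03, -0.04, 0.02],
}

def polygenic_index(allele_counts, effect_sizes):
    """Sum each locus's allele count times its GWAS effect size."""
    return sum(g * w for g, w in zip(allele_counts, effect_sizes))

for person, g in genotypes.items():
    for trait, w in weights.items():
        print(person, trait, round(polygenic_index(g, w), 3))
```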
Fast-forward a decade and a half, and over six thousand GWAS have been run on over 3,500 traits or diseases to calculate polygenic indices. As sample sizes increased, predictive power improved. And as that power improved, polygenic prediction soon spread across the field of human genetics like wildfire. In the years since the first polygenic risk score paper was published, over twenty-five thousand more have appeared using the terms polygenic risk score (PRS), genome-wide risk score (GRS), polygenic score (PGS), or polygenic index (PGI). Over six thousand scientific articles appeared just in the last twelve months. The deployment of this tool shows no signs of abating.
If you can measure a trait in children or adults, you can calculate a PGI for it—anything from cleft palate to sleep chronotype to handedness. So, while the molecular genomics revolution has not yet led to a suite of customized pharmaceuticals to make us taller, leaner, smarter, and healthier, it has led to a new science of prediction. Today, we can predict a US child’s (or embryo’s) adult height, how far the child will go in school, and whether that child will be overweight as an adult—all from a cheek swab, finger prick, or vial of saliva.
Take education: the first polygenic index trained to predict how far someone went in school was calculated in 2013 based on analysis of 126,559 subjects and only explained 3 percent of the variation. But by 2022, the fourth iteration (EA4 as we called it) could explain 16 percent. As noisy as that is, it’s still a powerful tool for prediction: someone in the bottom 10 percent of the education PGI ranking has about an 8 percent likelihood of completing a four-year degree; meanwhile, an individual in the top tenth has about a 70 percent chance of graduating with a bachelor’s degree.
If you asked me to tell you whether your kid is going to graduate from college based on their PGI, I would have a good chance of being wrong. But give me one hundred kids to test and rank order by education PGI, and the odds of predicting graduation outcomes for the top and bottom groups start tilting in my favor based on these stark average differences.
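A quick simulation makes the individual-versus-group point concrete. The 8 percent and 70 percent figures come from the passage above; the coin-flip model and sample sizes are stand-ins, not the EA4 data:

```python
# Illustration of why group-level prediction beats individual prediction,
# using the article's rough figures: an 8% graduation rate in the bottom
# education-PGI decile and 70% in the top decile.
import random

random.seed(0)  # make the simulation reproducible

def simulate_decile(grad_prob, n=100_000):
    """Fraction of a simulated decile that graduates, given a probability."""
    return sum(random.random() < grad_prob for _ in range(n)) / n

bottom = simulate_decile(0.08)
top = simulate_decile(0.70)

# For any ONE child in the top decile, predicting "will graduate" is still
# wrong roughly 30% of the time; but the ordering of the two groups is
# essentially certain.
print(f"bottom decile: {bottom:.2%} graduated")
print(f"top decile:    {top:.2%} graduated")
```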
The main way that polygenic indices have been useful to clinicians so far is as a predictive tool—much like family history. Medical professionals are already attempting to triage risk for, say, heart disease, based on polygenic indices. The idea is to prescribe statins—which lower cholesterol levels—much earlier in life to those patients with an extremely high PGI for cardiovascular disease. That’s not nothing, in terms of medical utility, but it’s a lot less than we had hoped for when we embarked on the Human Genome Project. The lack of an obvious pharmaceutical translation of recent genetic discoveries means that the biggest consequences of the PGI revolution are likely to be felt in the public health and social scientific landscapes rather than in medical treatment regimes. Indeed, that is what is happening now.
*
But not without serious questions about how such work will be used.
An adage that has been attributed to Irving Kristol, one of the fathers of modern American conservatism, says that a “neoconservative is a liberal who has been mugged by reality.” If that’s the case, someone must have really done a job on both Richard Herrnstein and Charles Murray, authors of The Bell Curve, published in 1994. Both would have seemed, early in their careers, to be unlikely candidates to author one of the most controversial books of the twentieth century.
Herrnstein, a Harvard psychology professor, has been described as B. F. Skinner’s star student. Along with John Watson and Ivan Pavlov, Skinner is considered a founder of behaviorism, a strand of psychology positing that all our actions can be explained by costs and benefits of those actions that we have learned through reinforcement. Put simply, when we get rewarded for doing something, we do it more, and when we are punished, we do it less often. We are totally conditioned by the environment. Skinner went so far as to argue that free will was an illusion. He supported his theories through numerous experiments on lab animals that he conditioned in his eponymous Skinner Box, technically known as an operant conditioning chamber.
A Skinner Box is an environmentally controlled dark chamber meant to house a mouse or other lab animal, where the only stimuli the animal receives are those controlled by the experimenter. The box contains levers for the animal to press in response to the stimuli. A food dispenser rewards the subject, and an electrified grid in the floor of the box punishes the caged animal.
As a research assistant to Skinner, Herrnstein conducted experiments using the operant conditioning chamber. (He oversaw the pigeons.) His doctoral thesis showed with mathematical elegance how the frequency of actions taken was in direct proportion to the rewards associated with those actions. This steeping in behaviorism wouldn’t have seemed to set Herrnstein on a path to coauthor a book that argued for the primacy of genetics in determining who got ahead in American life.
Nonetheless, in 1971, Herrnstein waded into the uproar caused by a 1969 academic journal article on the heritability of IQ. Herrnstein penned a piece in the Atlantic Monthly arguing that IQ was largely inherited (biologically)—and thus that efforts to improve it, or to ameliorate gaps between social groups, were largely for naught. For the rest of the decade, his classes were often interrupted by student protestors. Undeterred, he used the positions he established in his Atlantic article to form the core of what would become The Bell Curve’s argument.
Unlike Herrnstein, Charles Murray may have always been a conservative deep down. But his early support for labor unionism and the fact that he joined the Peace Corps as a volunteer in 1965 might have led an observer to peg him as liberal. The Peace Corps sent him to Thailand. He stayed there for a few years after his assignment ended and soured on development efforts: he came to see aid programs as counterproductive because they privileged the goals of bureaucrats over those of local citizens and, moreover, because the rapid change brought by such programs undermined local norms and traditions that had evolved over the years. These observations informed his thinking in graduate school at MIT and his 1974 doctoral thesis, “Investment and Tithing in Thai Villages: A Behavioral Study of Rural Modernization.”
Murray’s breakthrough moment came in 1984 when he published Losing Ground: American Social Policy 1950–1980. The book transposed many of the same arguments Murray had applied to Thai villages to the modern-day United States. Murray argued that welfare incentivized long-term poverty. Simply put, if you pay the poor, you induce more people to join their ranks. If you make marriage and job-seeking costly by reducing benefits when people wed or find employment, then you get less marriage and fewer job seekers. Over the long term, social norms—like the expectation to marry or the stigma attached to remaining unemployed for long periods—erode, and a culture of dependency emerges.
Not surprisingly, much debate ensued, and Murray became a superstar on the right. Losing Ground shook the foundations of the U.S. social policy debate and directly contributed to welfare reform legislation President Bill Clinton signed a dozen years after its publication. The new law imposed work requirements on recipients that were meant to combat what Murray had highlighted as the perverse incentives of the existing system. Twelve years may seem like a long time, but it is the blink of an eye in the domains of social science and public policy.
It’s not hard to see how such an incentive-based account attracted the attention of Herrnstein, the erstwhile operant conditioner. The pair teamed up for five years to write The Bell Curve. The 912-page tome has rightly become a byword for bad science and racism. Herrnstein and Murray’s central argument was that as a result of Civil Rights legislation and the tearing down of barriers like old boy networks at elite colleges and firms, one’s place in society was no longer determined primarily by social background (race and class). Rather, thanks to meritocracy, where we ended up on the socioeconomic ladder was now largely a result of innate ability—that is, our genes.
The authors claimed that inequality was rising not because of tax policies, the decline of labor unions, offshoring, or any of the other developments those on the political left point to. Rather, they claimed that genetic elites were marrying one another to a greater degree than ever before, leading to greater economic disparities as innate advantages were redoubled.
Moreover, Herrnstein and Murray claimed that society overall was becoming less intelligent since those with lower ability tended to have more children than those with high cognitive ability. Though most of their analysis focused on white people, Herrnstein and Murray argued that the fifteen-point gap in average test scores between Black and white individuals was due to genetic differences and thus not worth trying to narrow through government intervention. This final argument is the main reason for the book’s enduring infamy.
Scholars killed forests rebutting The Bell Curve. Some of the academic critiques were on the mark; some less so. But the academic community was united in calling the book dangerous pseudoscience. Indeed, Herrnstein and Murray had conducted quite shabby analyses. First, their claims about the increasing salience of genetics (and the corresponding decrease in the importance of social class background) were based on data across a mere seven years of birth cohorts (1957–1964).
If a dozen years is a blink of an eye for a social policy idea to jump out of a book and become law, then seven years is a nanosecond in terms of diagnosing a societal trend. It could be those seven years—the tail end of the baby boom—represented a statistical blip. Even if the change in sorting they detected was real, it certainly was not enough to explain the rapid rise in economic inequality that started at the end of the 1960s (and has continued long after The Bell Curve was published).
Second, Herrnstein and Murray never actually measured genes at all. Back in the early 1990s, the closest they could come to assessing “innate” ability was to use data from students’ performance on a cognitive test during high school. Since they measured cognitive ability among teenagers, their results did not show unmitigated genetic potential; rather, these test scores also reflected social (dis)advantages throughout childhood. Third, though at some points in the volume they conceded that heritability is a population-specific concept and contributes nothing to understanding group differences, they ignored this reality and rushed ahead to conclude that race differences in test scores were genetically based.
Regardless of the dubious scientific merits of their claims, The Bell Curve spent fifteen weeks on the New York Times bestseller list. Its pernicious influence is still evident today. The book’s impact can be seen among white supremacist groups that desperately cling to random genetic facts to claim that Europeans are innately superior to other peoples. (One common image in this world is a white person with a milk moustache, since Europeans have the highest rate of lactase persistence—the ability to drink milk into adulthood—a quality that white supremacists argue provides cognitive advantages.) The peal of The Bell Curve can also be heard in manifestos like that written by the Buffalo mass shooter in 2022, among other hate crime perpetrators.
Murray has continued to practice bad science. (Herrnstein died in 1994, within days of the book being published.) At first blush, the arrival of PGIs to the scientific scene would seem to have solved one of The Bell Curve’s major limitations—the lack of any genetic measures. Back in 1994, Herrnstein and Murray had to use test scores to make their argument that genes had eclipsed social background in determining who ended up on which rung of the socioeconomic ladder. But three decades later, we have an actual tool for measuring genetics.
It would seem an easy step from calculating a PGI for education or cognitive ability to tabulating the average PGIs for different racial groups and concluding that group differences in IQ are or are not, in fact, innate and intractable. Murray, in his 2020 follow-up to The Bell Curve, entitled Human Diversity, does back-of-the-envelope calculations to “show” that Black people (who are an admixed population of African and European ancestry, mostly) have lower average education PGI scores than non-Hispanic whites. But this is pure bunk.
As in The Bell Curve, Murray borrowed a small seed of scientific knowledge and then sowed it in toxic soil, bending and twisting it in unscientific ways to assert a political agenda that appears to have the legitimacy of scientific knowledge but really does not.
The truth is that you cannot use PGIs to make cross-group genetic inferences. That’s because genes vary in distinct ways across ancestral populations. For example, the PGI for height—calculated among those of exclusively European descent—predicts Black Americans to be substantially shorter than white Americans, which is patently wrong. In a 2000 article called “Beware the Chopsticks Gene,” Dean Hamer and L. Sirota explained part of the problem like this: Imagine that you had a sample of mixed ethnicities. In your sample, Chinese people had a high rate of having Cs at a particular location on chromosome 16—say, 70 percent. But Europeans and Africans in the data only had Cs at a 20 percent rate. This is not uncommon. There are a lot of allele frequency differences across groups due to random fluctuation over the course of many generations. If you ran a GWAS on the outcome, “knows how to use chopsticks,” you would find that locus on chromosome 16 would be highly significant in predicting chopstick skills.
But it would be a false finding. It’s likely that the Cs are not causing anything biological having to do with finger dexterity. They are merely marking an individual as coming from an Asian culture where chopstick use is common. We could actually test this. If we reran the analysis within ancestry groups—that is, conducted a separate GWAS for each of the Chinese, African, and European samples—we would likely find that the so-called chopsticks gene had no effect within any of the groups. The finding resulted from a cultural difference tagged by a genetic marker correlated with that cultural difference. To avoid this problem, researchers typically run GWAS only on a single ancestral group at a time—for instance, Han Chinese or white British or European. And when you do that, you can’t transport the results to another group that lives in a different environment and has a different genetic history.
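Hamer and Sirota’s thought experiment is easy to simulate. In the sketch below, only the 70 and 20 percent allele frequencies come from the passage above; the sample sizes and chopstick-use rates are invented. Chopstick use depends solely on which population a person belongs to, never on the allele, yet pooling the two groups manufactures a strong association:

```python
# Simulation of the "chopsticks gene" confound: an allele that differs in
# frequency between two populations, and a behavior that is purely cultural.
import random

random.seed(42)  # make the simulation reproducible

def make_population(n, allele_freq, chopstick_rate):
    # Each person: (carries the C allele?, uses chopsticks?). The two draws
    # are independent, so the allele has NO effect within the population.
    return [(random.random() < allele_freq, random.random() < chopstick_rate)
            for _ in range(n)]

def chopstick_rate_by_allele(people):
    """Chopstick-use rate among allele carriers vs. non-carriers."""
    carriers = [uses for has_c, uses in people if has_c]
    others = [uses for has_c, uses in people if not has_c]
    return sum(carriers) / len(carriers), sum(others) / len(others)

pop1 = make_population(50_000, allele_freq=0.70, chopstick_rate=0.90)
pop2 = make_population(50_000, allele_freq=0.20, chopstick_rate=0.10)

# Within each population, carriers and non-carriers look alike...
print("within pop1:", chopstick_rate_by_allele(pop1))
print("within pop2:", chopstick_rate_by_allele(pop2))

# ...but pooled together, a large (spurious) gap appears, because the
# allele merely tags population membership.
print("pooled:", chopstick_rate_by_allele(pop1 + pop2))
```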
Scholars continually emphasize that polygenic indices only work within groups, not across groups—that is, the PGIs for height and education developed within white, European samples don’t predict well for non-white groups. The result of this valid scientific concern is that over 85 percent of GWAS studies are run on samples with only people of European descent (because there are currently more data on these folks), and only 3 percent are performed on people of predominantly African ancestry. So, the GWAS results we obtain to construct PGIs are based on how those alleles work in white people. Thus, it is scientifically ridiculous and irresponsible to look at PGI differences by racial or ancestral groups because those differences are meaningless. It is easy to dismiss exercises like Murray’s on scientific grounds.
__________________________________
Adapted from The Social Genome: The New Science of Nature and Nurture by Dalton Conley. Copyright © 2025. Available from W.W. Norton & Company.