
Opinion: What the ‘Birthday Paradox’ Can Teach Us About Black History

Statistics can help fill in historical gaps, revealing surprising and empowering things about the past and present.

What happens when a provably true idea nevertheless feels like it must be impossible? This is called a veridical paradox, a situation in which our intuition fails and everyday reasoning badly underestimates how often coincidences or overlaps occur.

A famous example is the birthday paradox. It is a mainstay of introductory statistics classrooms, an example that reveals both the magic of probability and the limits of human perception. The birthday paradox can be defined many different ways, but a simple version can be described in two parts: Think back to childhood, when class birthdays were a significant event. The birthday paradox asks how large a class needs to be for it to be more likely than not that two people in the class share a birthday. (This assumes birthdays are equally likely throughout the year, rather than somewhat seasonal.)

The answer is not hard to find in cyberspace, so I will spare the suspense: The number is 23.




As described above, this feels impossible because there are 365 days in a year. And yet, only 23 random individuals are enough to make it more likely than not that two people share a birthday. The large odds of the overlap occur because of the sleight-of-hand of statistics: When we are calculating the odds of a birthday overlap, we are doing so based on comparisons among all 23 people — 253 total comparisons in the group. When we think of the situation that way, the odds of having an overlap increase in a hurry.
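For readers who want to check the arithmetic, here is a minimal sketch in Python. It assumes, as the puzzle does, that all 365 birthdays are equally likely and independent; the function names are my own, chosen only for illustration.

```python
from math import comb


def shared_birthday_probability(group_size: int, days: int = 365) -> float:
    """Chance that at least two of `group_size` people share a birthday.

    Assumes every one of `days` birthdays is equally likely and that
    people's birthdays are independent of one another.
    """
    if group_size > days:
        return 1.0  # more people than days: an overlap is guaranteed
    # Probability that everyone's birthday is different, built up
    # one person at a time.
    p_all_distinct = 1.0
    for already_seated in range(group_size):
        p_all_distinct *= (days - already_seated) / days
    return 1.0 - p_all_distinct


def smallest_group_for_majority(days: int = 365) -> int:
    """Smallest group in which an overlap is more likely than not."""
    n = 1
    while shared_birthday_probability(n, days) <= 0.5:
        n += 1
    return n


print(comb(23, 2))                                # 253 pairwise comparisons
print(round(shared_birthday_probability(23), 3))  # 0.507
print(smallest_group_for_majority())              # 23
```

The calculation works by finding the chance that no two birthdays match and subtracting it from one. Changing the `days` parameter applies the same reasoning to any "calendar" of equally likely outcomes.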

Mathematically speaking, there is nothing special about birthdays. The logic of the birthday paradox applies to any problem with the same underlying structure: What is the probability that a set of random individuals are connected through some event, whether it’s a birthday or another occasion?

A study published last year demonstrates how concepts like the birthday paradox can tell important stories about human society, how historical events shape who we are, and how we relate to one another. In that study, a team of scientists applied similar reasoning to a question of social relevance. But instead of using the birthday as the anchoring event, the research team pointed to a historical event: the transatlantic slave trade. The researchers then determined, using the parameters of the birthday paradox, the probability of two living Black Americans, descended from individuals who were enslaved in America, sharing a common ancestor.

To understand the study, we must start with facts about ancestry: The number of ancestors each person has grows exponentially with every generation you go back, while the historical population they descend from is fixed. That is, the further back you go, the more ancestors each person has, yet the pool of people who could be those ancestors does not grow.
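The doubling described above can be sketched in a few lines of Python. This is an illustration of the general arithmetic only, not the study's model: the function names are my own, and the population figure is a made-up placeholder rather than a number from the study or the historical record.

```python
def ancestor_slots(generations_back: int) -> int:
    """Positions in a family tree a given number of generations back.

    Two parents, four grandparents, eight great-grandparents, and so on.
    These are slots, not necessarily distinct people: one person can
    fill several slots in the same tree.
    """
    return 2 ** generations_back


def generations_until_slots_exceed(population: int) -> int:
    """First generation at which one tree has more slots than people."""
    g = 0
    while ancestor_slots(g) <= population:
        g += 1
    return g


for g in (1, 2, 5, 10):
    print(g, ancestor_slots(g))  # 2, 4, 32, 1024 slots

# Hypothetical population size, used purely for illustration.
HYPOTHETICAL_POPULATION = 100_000
print(generations_until_slots_exceed(HYPOTHETICAL_POPULATION))  # 17
```

Once a single tree has more slots than there were people to fill them, some individuals must appear more than once, and any two trees drawn from that same population are bound to share names well before that point.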

The paradox resides in the fact that, for Black Americans born in the 1960s, there is largely one source population — individuals brought to North America via the transatlantic slave trade — and so the family trees of those living today begin to intersect as you move backward through time. More recent immigration from the Caribbean and Africa has diversified the contemporary Black population, but for a substantial segment of Black Americans the demographic arithmetic is unchanged. As with the birthday paradox, overlap emerges far sooner than intuition would predict: Move back enough generations, and seemingly distant lineages begin to converge.

The model in the study uses demographic records and makes several assumptions to estimate how often that overlap should occur. And the numbers do tell a quiet story of convergence. For two Black Americans born in the early 1960s, researchers put the probability of shared enslaved ancestry at somewhere between 19 and 31 percent, or as high as nearly one in three. Move forward a generation, to children born to two of those Black Americans around 1990, and that figure is above 50 percent. Many of the descendants of the Middle Passage are, statistically speaking, family.


How do we interpret the findings, and what, if anything, does this say about the role of statistical modeling in our understanding of social institutions like slavery?

For one, the results of the study demonstrate how large-scale ordeals leave a meaningful signature on contemporary populations. Traumatic events are an unfortunate but very real feature of human history: War, famine, epidemics, and genocides have left an indelible mark on all aspects of human society — from how society is structured to who is rich or poor, the manner in which we govern, and what we value. Findings from the recent study reveal just how significant the slavery enterprise was and the powerful downstream effect of this forced migration in contemporary populations. Not only is slavery a defining issue of American history, but it is also reflected in the ancestral relationship between many millions of individuals.

Notably, the model cannot name specific relatives, and it does not replace genealogical and historical research. This is an important point to emphasize. I’m a computational biologist whose research investigates biological complexity across multiple scales, from the molecular to the social. The latter has afforded me the privilege of interacting with scholars of various stripes on projects that range from the study of pandemics and the criminal legal system to the nature of political identity. But multidisciplinary research is a high art, and we should be mindful of actors who (regardless of their intentions) oversimplify complex phenomena with quantitative methods.




Skepticism surrounding “scientific” approaches to studying social phenomena is especially important when considering race. Since the 18th century, the natural sciences have opined on the nature of human differences, often with embarrassing, destructive imprecision. (For example, see the eugenics movement.) The fraught relationship between statistics, race, and crime was outlined in Khalil Gibran Muhammad’s 2010 classic “The Condemnation of Blackness.” In it, Muhammad demonstrates how early social scientists used crime statistics to recast racism as objective knowledge, hardwiring the association between Blackness and criminality into public discourse. And while the book’s major lessons are focused on crime control, it offers a larger imperative when bringing statistical methods into matters related to history and race: No one should parachute mathematics into social phenomena.

Thankfully, when telling stories about American history, we need not make a hard choice between numbers and narratives. Libraries are full of books and journal articles authored by scholars who leverage all sorts of methods in their precision storytelling. There are smart ideas and stupid ideas about race and history. But they are not smart or stupid simply because they are qualitative or quantitative.

What makes the birthday paradox study a compelling example of using statistical modeling for the study of American history? For one, the model requires none of the large data or algorithms that typify many quantitative expeditions at the border between the natural and social sciences. There is something special about the use of mathematical tools that are transparent. The assumptions made in the birthday paradox model are clear, which makes it easy to replicate, debate, refute, and improve.


Another strength of the approach: It fills in gaps left by archives and historical records that can be sparse for colonized, subjugated, and enslaved populations, which makes the act of constructing cohesive stories about the past challenging. But because the model in the birthday paradox study is probabilistic, complete archives aren’t required, and its definition of a “relative” is genealogical rather than genetic. Ancestry in this sense involves connections adhering to a more general definition of “family,” and so the assumptions of standard population genetics are less relevant. That is, the study is aligned with a cultural norm commonly practiced in the Black American community: our famously informal use of relational terms like “brother,” “sister,” or “cousin.” Family is family, whatever our genetic relationship. Impressively, the birthday paradox study captures this possibility.

In the end, we are left with more questions than answers. The study’s probabilistic findings on shared ancestry don’t solve the conundrum that is race in America, nor do they replace the centrality of Black scholarship that has interrogated this history using literary criticism, archival research, ethnography, archaeology, and legal scholarship.

But because Black studies is under attack, now might be the time to add new weapons that support the work of hundreds of scholars and activists from the past and present. Many brave individuals sacrificed their lives to tell the world about the story of the transatlantic slave trade, its millions of descendants, and the America that it made. In light of this, mathematics may not lead us to the truth, but it can help.


C. Brandon Ogbunu is an associate professor in the Department of Ecology and Evolutionary Biology at Yale University, a professor at the Santa Fe Institute, and the author of Undark's Selective Pressure column.