Statistical Concepts and Intersectionality


A Literal Case Study

Sara Stoudt
Bucknell University

Can data help support or refute claims of wrongdoing? Take a case about claimed hiring discrimination. What information would you want to know about a company’s hiring practices before you made a decision? Maybe you would want to compare the demographics of the application pool to the demographics of those actually employed by the company to see if there are any discrepancies. If you found any, you might investigate further to see if those discrepancies are too large to have happened just by chance.

Statistical ideas abound in Kimberlé Crenshaw’s 1989 paper discussing antidiscrimination court cases. This paper also first defined the term “intersectionality.” Intersectionality provides a framework to explain that different elements of a person’s identity combine to create privilege or pathways to discrimination. If we try to think about this statistically, it means that for a response variable of “how people treat you” there might be an interaction effect of, for example, race and sex, as well as an additive effect of those characteristics individually. For example, a Black man may be treated differently than a Black woman. Although this term is often used in social discourse, did you know it has its origins in a legal setting?
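To make the "interaction on top of additive effects" reading concrete, here is a minimal sketch. The four scores are numbers I invented for illustration; they do not come from Crenshaw's paper or from any data set. The idea is to compare the score observed for one combination of identities with what we would predict by simply adding up the individual effects.

```python
# Hypothetical average "how people treat you" scores for each combination.
# All four numbers are made up purely to illustrate the arithmetic.
mean_score = {
    ("white", "man"): 10.0,
    ("white", "woman"): 8.0,
    ("Black", "man"): 7.0,
    ("Black", "woman"): 2.0,
}

baseline = mean_score[("white", "man")]

# Additive effects: change one characteristic at a time from the baseline.
sex_effect = mean_score[("white", "woman")] - baseline
race_effect = mean_score[("Black", "man")] - baseline

# What a purely additive model would predict for a Black woman.
additive_prediction = baseline + sex_effect + race_effect

# Whatever is left over is the interaction effect.
interaction = mean_score[("Black", "woman")] - additive_prediction

print("additive prediction:", additive_prediction)
print("interaction:", interaction)
```

A nonzero interaction means the combination cannot be explained by looking at race alone and sex alone, which is the statistical version of the intersectionality argument.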

The legal scholar Kimberlé Crenshaw, a Black woman, stands before colorful signs reading "Intersectionality is...", "Europe", "Celebrate Differences", and "Complexity Rules!"
Kimberlé Crenshaw in 2018. Photo by the Heinrich Böll Foundation, CC-BY SA 2.0.

We will look at the first two cases discussed in Crenshaw’s work and then a case raised by Ajele and McGill in a further study of intersectionality in the law, and draw some connections to statistical ideas. Full disclosure before moving forward: I do not have any legal training, just an interest in how statistics is used in the courtroom, so this is my interpretation of these court summaries. If you have extra insight to share about the legal process, please reach out!

While this post was in press, the United States Supreme Court made a decision to overturn Roe v. Wade. We acknowledge that reading about court case implications is particularly heavy at this time. We frame some of the statistical questions that these cases bring up as intellectual exercises to emphasize the way that small, seemingly abstract decisions can have huge impacts on millions of people’s lives.

Simpson’s Paradox and DeGraffenreid v. General Motors

In this case, a group of Black women alleged that General Motors’ system of using seniority as a factor in determining who was laid off during a recession continued the effects of past discrimination against Black women. Importantly, the court would not allow the class of “Black woman” to be protected but required the plaintiffs to argue a sex discrimination case or a race discrimination case, but not both. As Arehart points out, the use of “or” in the Civil Rights Act (which protects against discrimination based on race, color, religion, sex, or national origin) has led courts to require a plaintiff to choose one characteristic to focus on in their case.

There are some details that led the plaintiffs in this case to choose to pursue a sex discrimination claim. It was revealed during the case that General Motors did not hire Black women before the Civil Rights Act of 1964, so Black women at the company necessarily had less seniority and were more likely to be among those laid off when everyone hired after 1970 was let go. However, because white women were hired before 1964, there was a large enough pool of women who were not laid off that the court decided there was not enough evidence to support sex discrimination in this policy.

What does this have to do with statistics? We can frame this situation as an example of Simpson’s Paradox. When employee outcomes were examined overall, there was no evidence of discrimination between men and women. However, if employee outcomes had been further broken down by race, there would have been a very clear discrepancy between Black women and white women.

To look at it visually, there is a very narrow pathway towards remaining at the company for Black women (in purple) while there seems to be a reasonable pathway towards remaining at the company for all women (in red).

A horizontal rectangle is split into about ⅔ remaining and ⅓ laid off. A rectangle below it shows the date hired with a cutoff at 1964 and 1970. A rectangle below the second one splits both the remaining and laid off chunks into male and female categories. A final rectangle below splits each male and female category into white and Black. The path towards remaining for females is large, but the path towards remaining for Black females is small (only if hired between 1964 and 1970).
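Here is a small sketch of how the aggregate view can hide the subgroup view. The head counts are hypothetical numbers I chose so that roughly two thirds of employees remain, as in the figure; they are not taken from the court record.

```python
# Hypothetical head counts, invented for illustration only.
# (sex, race) -> (remaining, laid_off)
counts = {
    ("male", "white"): (60, 30),
    ("male", "Black"): (20, 10),
    ("female", "white"): (52, 10),
    ("female", "Black"): (2, 16),
}

def retention_rate(groups):
    """Share of employees in the given groups who were not laid off."""
    remaining = sum(counts[g][0] for g in groups)
    total = sum(sum(counts[g]) for g in groups)
    return remaining / total

men = [g for g in counts if g[0] == "male"]
women = [g for g in counts if g[0] == "female"]

# Aggregated by sex, the two rates look almost identical.
print("men:  ", round(retention_rate(men), 3))
print("women:", round(retention_rate(women), 3))

# Broken down by race within sex, the picture changes.
print("white women:", round(retention_rate([("female", "white")]), 3))
print("Black women:", round(retention_rate([("female", "Black")]), 3))
```

With these made-up counts, a comparison of men and women would show nothing unusual, even though almost no Black women remain at the company.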

Power Analysis and Moore v. Hughes Helicopter

What if a plaintiff were allowed to combine identities in a claim? In this case, the plaintiff, a Black woman, alleged discrimination based on race and sex. The court then determined that because the claim was made as a Black woman, the plaintiff could not represent all Black workers or all female workers. This limited the pool of workers that could be used in the statistics supporting the discrimination claim: the plaintiff could not draw on data for all female workers or for all Black workers, and was left with only the small number of Black women as a data pool.

Three 2x2 grids are displayed where the rows represent male and female groups while the columns represent white and Black groups. The first grid labels analysis by rows as a sex discrimination investigation. The second grid labels analysis by columns as a race discrimination investigation. The third grid labels analysis between the female and Black cell and the other three cells as a sex *and* race discrimination investigation.

By limiting the pool of people eligible to be included in an analysis, the power to detect a real discrimination effect decreases. Consider a null hypothesis that the company is not discriminating based on race and sex. The power to reject that hypothesis when it is actually false is related to the sample size of each group, making a small group size a limiting factor. The court’s decision effectively raised the probability of a false negative, i.e., falsely concluding that there was no discrimination when there actually was.

If we consider a simplified framing of this question and compare the proportion of Black women promoted with the proportion of non-Black women promoted, we can use the pwr R package to investigate the power to detect a difference in proportions with unequal sample sizes. Take this investigation by Seongyong Park. They find that if the proportions of those who were fired in two groups are 0.15 and 0.30 (one group is twice as likely to be fired as the other) and both groups have an equal number of people in them, the power to detect the difference is about 0.86. However, if one group is 10 times as large as the other, the power drops to 0.69. Go ahead and use this code to investigate other situations! What would it take for the power to drop to 0.5?
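If you do not have R at hand, here is a rough Python sketch of the same kind of calculation using only the standard library. It uses the normal approximation based on Cohen's effect size h, which, as far as I understand, is the approach behind the pwr package's two-proportion functions. The function names and the group sizes are my own placeholders, so do not expect the output to match the 0.86 and 0.69 figures quoted above.

```python
from math import asin, sqrt
from statistics import NormalDist

def cohens_h(p1, p2):
    """Effect size for two proportions (arcsine transformation)."""
    return 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))

def power_two_proportions(p1, p2, n1, n2, alpha=0.05):
    """Approximate power of a two-sided test with group sizes n1 and n2."""
    z = NormalDist()
    h = abs(cohens_h(p1, p2))
    crit = z.inv_cdf(1 - alpha / 2)
    # The effective sample size shrinks when the groups are unbalanced.
    shift = h * sqrt(n1 * n2 / (n1 + n2))
    return (1 - z.cdf(crit - shift)) + z.cdf(-crit - shift)

# Hypothetical group sizes: the same 220 people, split evenly or 10 to 1.
balanced = power_two_proportions(0.15, 0.30, n1=110, n2=110)
unbalanced = power_two_proportions(0.15, 0.30, n1=200, n2=20)

print("balanced groups:  ", round(balanced, 2))
print("unbalanced groups:", round(unbalanced, 2))
```

Holding the total number of people fixed, the unbalanced split has lower power because the smaller group dominates the uncertainty. Try changing `n1` and `n2` to see how small the minority group has to get before the power falls to 0.5.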

Interaction Effects and Love v. Alamance County Board of Education

Is there another way for a plaintiff to combine identities in a claim and still face a statistical challenge? In this case, a Black woman alleged that she was discriminated against due to her race and sex. The court did evaluate both race and sex claims, but it did so separately. The court found no evidence of race discrimination alone nor evidence of sex discrimination alone, but the interaction was not investigated.

What makes this case particularly interesting? I picked this case because of a footnote about the statistician expert witness. From the case overview:

Dr. Jane Harworth, Ph.D., an expert in statistical analysis, examined applicant flow and hiring data for 1975-1983 and performed a binomial distribution analysis. When the raw data involved small pools, she utilized the Fisher’s Exact Test, a more precise version of the Student T Test. Dr. Harworth testified that there is no statistical support for the allegations of the existence of a non-neutral policy or of a pattern or practice of discrimination against blacks or females. Her analysis showed that the actual numbers of black or female hirees were within the range of two standard deviations. She further observed that the success rate of blacks and females exceeded whites and males, respectively.

This is an interesting example of how statistics are explained to the court. Note the translation of Fisher’s Exact Test as “a more precise version of the Student T test” and the decision to focus on plus or minus two standard deviations. However, there is nothing technically preventing a Fisher’s Exact Test from being used to compare Black women to everyone else.
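To show that such a comparison is mechanically straightforward, here is a minimal standard-library sketch of a two-sided Fisher's Exact Test on a 2x2 table. The function is my own illustration, not the expert witness's analysis, and the hiring counts are invented.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Rows are the two groups, columns are the two outcomes.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x first-column outcomes in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum over all tables that are at most as likely as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-7))

# Hypothetical applicant counts, invented for illustration:
# Black women: 1 hired, 11 not hired; everyone else: 30 hired, 58 not hired.
p_value = fisher_exact_two_sided(1, 11, 30, 58)
print("p-value:", round(p_value, 3))
```

The same function works for any of the three comparisons in the grids above; only the way people are assigned to the two rows changes.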

Time for another exercise for the reader! I don’t love the binary distinctions in these court case scenarios, so let’s pick a different set of categories to work with. Consider a population of 100 people who can prefer summer or winter and who can prefer vanilla or chocolate. I’m considering this information when determining who to be friends with. Can you design a situation where it does not look like I discriminate based on season preference nor flavor preference yet it does look like I prefer to befriend a particular combination of season and flavor preference? Here’s a hint: what if the recession happened a little earlier in the DeGraffenreid example such that anyone hired after 1964 was laid off? It might be useful to sketch a 2×2 table.

Concluding Thoughts

Decisions made about how to measure discrimination involve statistical decisions behind the scenes. In fact, Crenshaw points this out in a footnote:

“A central issue in a disparate impact case is whether the impact proved is statistically significant. A related issue is how the protected group is defined. In many cases a Black female plaintiff would prefer to use statistics which include white women and/or Black men to indicate the policy in question does in fact disparately affect the protected class. If, as in Moore, the plaintiff may use only statistics involving Black women, there may not be enough Black women employees to create a statistically significant sample.”

Thinking about statistical decisions in context as well as the implications of court precedent or practice in terms of statistical concepts can help us both refine our practice of statistics and consider the consequences of our work. Real people are impacted by data-driven decisions; we must recognize and bear the responsibility of that.

