Behind Blue Eyes - Part 1 : Establishing a Baseline

April 29, 2026

Like many people, I watch movies and TV shows. A significant portion of the content available in France comes from the US. For years, I’ve often caught myself thinking: “Huh, the lead actor has blue eyes again.” Apparently, I’m not the only one wondering: people on Reddit have asked the same question.

Since this question has been nagging at me, I’d like to try to answer it as rigorously as possible. There’s some work involved, but it should be manageable. On paper, it’s fairly straightforward. The broad approach :

Establish the distribution of eye color in the US.
Establish the distribution of eye color among American productions.
Compare these distributions using a statistical test.
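As a rough illustration of the last step, the comparison could be a chi-square goodness-of-fit test. This is only an assumption about which test might be used, and both the baseline shares and the actor counts below are made-up placeholders, not results. With three classes the test has two degrees of freedom, in which case the p-value has the closed form exp(-x/2), so no statistics library is needed.

```python
import math

def chi_square_gof(observed, expected_shares):
    """Chi-square goodness-of-fit for exactly three classes (2 degrees of freedom)."""
    if len(observed) != 3 or len(expected_shares) != 3:
        raise ValueError("this sketch only handles three classes")
    n = sum(observed)
    stat = sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, expected_shares))
    # Survival function of a chi-square with 2 degrees of freedom: exp(-x / 2)
    p_value = math.exp(-stat / 2)
    return stat, p_value

# Hypothetical inputs, for illustration only
baseline = [0.30, 0.60, 0.10]  # Blue/Grey, Brown/Hazel, Green/Other
actors = [40, 50, 10]          # made-up counts for 100 lead actors

stat, p_value = chi_square_gof(actors, baseline)
print(f"chi2 = {stat:.2f}, p = {p_value:.3f}")  # chi2 = 5.00, p = 0.082
```

A small p-value would only say that the two distributions differ, not why.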

Of course, we won’t be able to go much further than “there is a difference” or “we cannot conclude.” If a difference does exist, we can only formulate hypotheses, and further analyses would be needed to attempt to validate them. Correlation is not causation!

It’s also worth noting that I do live in France, so there might be a bias regarding my observations.

Introduction

Since eye color can already be subjective, even more so when perceived through a screen, we’ll define three broad categories: Blue/Grey, Brown/Hazel and Green/Other. This helps unify figures across studies. From what I could read, confusion often occurs between blue and grey, and between brown and hazel, so each pair is grouped. When a fallback category (usually Other) is defined, it is usually too small to be meaningful on its own, so it is merged with Green to get more relevant numbers.

Finding data is hard

The scientific literature on the subject is fairly sparse. Some websites (such as Neoris ¹) provide figures, but none of them are sourced, making them essentially worthless.

A few interesting studies do exist, but they are either limited in scope or do not focus on the US, which would introduce significant bias. Furthermore, the color categories used are not always consistent across studies. Notably :

Katsara et al. ² (2019): Relatively recent, but this study focuses primarily on Europe. It does, however, confirm one thing: eye color varies considerably from country to country.
Grant et al. ³ (2002): A somewhat dated study relying on 1970s-1980s US data, covering only Caucasian populations, which makes it unrepresentative. Demographic dynamics have likely shifted since then. Not necessarily a good baseline.
Walsh et al. ⁴ (2012): More data on Europe.
Hashemi et al. ⁵ (2019): More recent, but limited to rural populations in Iran. Unfortunate.

In 2014, a survey ⁶ of 2,000 Americans yielded the following figures (using our classes). It gives a rough idea, but the cohort is probably too small to be truly reliable : Blue/Grey (27%), Brown/Hazel (63%) and Green/Other (10%).

A more serious study

Fortunately, a much more comprehensive and recent study exists: Hashem et al. ⁷ (2025). It is based on US driver’s license data from 2016, which is relatively recent and the best source I could find. The study is fairly exhaustive, aggregating over 230 million data points, and is published in a reputable journal. That said, a few limitations are worth noting:

Only 31 of the 50 US states are represented in the study, for various reasons. This could introduce bias depending on the population distribution across states.
Eye color is self-reported, which may introduce additional bias. Likely sources of confusion include hazel/green, hazel/brown, and blue/grey. Cultural influences might also lead people to self-report less than objectively.
It only covers drivers, which inherently excludes certain populations : minors under 16, urban residents who don’t drive, etc. There may therefore be a slight discrepancy with the general population, but this is likely the best proxy available.

The study distinguishes several categories: grey, blue, green, hazel, brown/dark, and other. Looking at the data, two states immediately stand out for the `other` category: Arizona at 29.5% and California at 8.6% (see paper). All other states range between 0 and 1%. The gap is large enough that `other` almost certainly encodes a different concept in those two states (likely a catch-all administrative category rather than a true eye color), so including those records at face value would corrupt the distribution.
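To make the grouping and the outlier check concrete, here is a minimal sketch. The mapping follows the three classes defined earlier; the 1% threshold and the function names are my own assumptions for illustration, and the single-state raw counts are made up. The Arizona and California figures are the `other` counts and totals quoted in this post.

```python
# Mapping from the study's raw labels to the three classes used in this post
CLASS_OF = {
    "blue": "Blue/Grey",
    "grey": "Blue/Grey",
    "brown/dark": "Brown/Hazel",
    "hazel": "Brown/Hazel",
    "green": "Green/Other",
    "other": "Green/Other",
}

def fold(raw_counts):
    """Collapse raw category counts into the three broad classes."""
    folded = {"Blue/Grey": 0, "Brown/Hazel": 0, "Green/Other": 0}
    for label, count in raw_counts.items():
        folded[CLASS_OF[label]] += count
    return folded

def is_suspect(other_count, total, threshold=0.01):
    """Flag a state whose `other` share exceeds the 0-1% seen elsewhere (assumed cutoff)."""
    return other_count / total > threshold

# Hypothetical raw counts for one state (made up for illustration)
example = {"grey": 20, "blue": 250, "green": 90, "hazel": 140, "brown/dark": 495, "other": 5}
print(fold(example))  # {'Blue/Grey': 270, 'Brown/Hazel': 635, 'Green/Other': 95}

print(is_suspect(3_285_198, 11_129_395))  # Arizona -> True
print(is_suspect(4_410_395, 51_198_859))  # California -> True
print(is_suspect(5, 1_000))               # hypothetical state above -> False
```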

That said, removing them is a significant move : California is the most populous state in the study, and together the two states account for roughly 62 million records (~26% of the original dataset). Two additional concerns are worth flagging explicitly.

Sensitivity analysis

A quick look at those two states reveals the problem immediately. Scripts used to compute the numbers can be found on GitHub : 01_sensitivity_analysis.ipynb.

State	Blue/Grey	Brown/Hazel	Green/Other	Total
Arizona	17.4%	46.0%	36.6%	11,129,395
California	15.9%	68.5%	15.6%	51,198,859

Arizona’s Green/Other at 36.6% is roughly three to four times the level typical of other states (8–13%), and about 2.5 times the highest value among the 29 remaining states (Kentucky, 14.5%). California’s is also elevated at 15.6%. These figures confirm that `other` does not mean the same thing in those two states: it absorbs a large share of records that would be distributed across the real eye-color classes elsewhere.

To quantify the impact, two scenarios are evaluated, each adding both states back to the 29-state data:

Scenario A - include as-is: naively add California and Arizona with their `other` records folded into Green/Other (which is what baseline_31a.csv does). This is clearly wrong, but it shows the worst-case distortion.
Scenario B - include, drop `other`: set the raw `other` count to zero for California and Arizona, treating those records as missing. This removes 4,410,395 records from California (51,198,859 to 46,788,464) and 3,285,198 from Arizona (11,129,395 to 7,844,197). The corrected dataset is baseline_31b.csv.

In Scenario B, we now have the following figures.

State	Blue/Grey	Brown/Hazel	Green/Other	Total
Arizona	24.8%	65.2%	10.0%	7,844,197
California	17.4%	75.0%	7.6%	46,788,464
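The Scenario B correction amounts to subtracting the raw `other` records from both Green/Other and the state total, then recomputing each share. The sketch below assumes class counts are rebuilt from the rounded percentages of the as-is table, so results can differ from the published figures by about 0.1 pp. The function name is hypothetical and the notebook may proceed differently.

```python
def drop_other(shares, total, other_count):
    """Recompute class shares after treating raw `other` records as missing."""
    counts = {name: share * total for name, share in shares.items()}
    counts["Green/Other"] -= other_count
    new_total = total - other_count
    return {name: count / new_total for name, count in counts.items()}, new_total

# Rounded shares and totals from the as-is table; `other` counts from Scenario B
corrected = {
    "Arizona": drop_other(
        {"Blue/Grey": 0.174, "Brown/Hazel": 0.460, "Green/Other": 0.366},
        total=11_129_395, other_count=3_285_198,
    ),
    "California": drop_other(
        {"Blue/Grey": 0.159, "Brown/Hazel": 0.685, "Green/Other": 0.156},
        total=51_198_859, other_count=4_410_395,
    ),
}

for state, (shares, new_total) in corrected.items():
    print(state, f"{new_total:,}", {k: f"{v:.1%}" for k, v in shares.items()})
```

Because the removed records all come out of Green/Other, the other two shares rise mechanically when the total shrinks.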

Scenario A’s jump in Green/Other (roughly +2.5 pp over the 29-state baseline) confirms the contamination. Scenario B corrects for it: after dropping raw `other`, Arizona’s Green/Other falls to 10.0% and California’s to 7.6%, both inside the 6.9–14.5% range observed across the 29 other states. This supports the correction. In the resulting picture, Blue/Grey falls from 27.3% (29 states) to 25.2% (Scenario B), a drop of about 2 percentage points. The direction of the bias is conservative: by excluding California and Arizona entirely, our 29-state baseline overstates the share of light eyes. Any overrepresentation of blue-eyed actors measured against it is therefore, if anything, an underestimate rather than a false positive.

Scenario	Blue/Grey	Brown/Hazel	Green/Other	Total
29 states (baseline)	27.3%	62.8%	9.9%	173,094,831
A - 31 states, `other` kept	24.4%	63.3%	12.3%	235,423,085
B - 31 states, `other` dropped	25.2%	65.4%	9.4%	227,727,492
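Each row of this table is a population-weighted pooling of the 29-state counts with the Arizona and California rows. The sketch below reproduces the three shares approximately, assuming the two states' counts are rebuilt from the rounded percentages in the state tables. Names are hypothetical, and the last digit may differ from the published figures.

```python
CLASSES = ("Blue/Grey", "Brown/Hazel", "Green/Other")

# 29-state counts, taken from the Total row of the final baseline table
BASE = (47_332_721, 108_750_677, 17_011_433)

# (shares, total) for Arizona then California, as published in the state tables
SCENARIO_A = [((0.174, 0.460, 0.366), 11_129_395),
              ((0.159, 0.685, 0.156), 51_198_859)]
SCENARIO_B = [((0.248, 0.652, 0.100), 7_844_197),
              ((0.174, 0.750, 0.076), 46_788_464)]

def pooled_shares(base_counts, extra_states):
    """Add approximate state counts to the base counts and return pooled shares."""
    counts = [float(c) for c in base_counts]
    grand_total = sum(base_counts)
    for shares, total in extra_states:
        grand_total += total
        for i, share in enumerate(shares):
            counts[i] += share * total
    return [c / grand_total for c in counts], grand_total

for label, extra in (("A", SCENARIO_A), ("B", SCENARIO_B)):
    shares, total = pooled_shares(BASE, extra)
    row = ", ".join(f"{name} {s:.1%}" for name, s in zip(CLASSES, shares))
    print(f"Scenario {label}: {row} (n = {total:,})")
```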

Scenario B corrects the obvious contamination in the `other` category, but if a state’s recording practices diverge enough to produce a 30% `other` rate, there’s no guarantee the remaining categories are consistently defined either. Rather than trust a partial correction, we exclude both states entirely and accept the demographic coverage trade-off.

Demographic confound

According to the 2020 US Census, California and Arizona have more diverse populations than most other states, so dropping them may itself shift the eye-color distribution. This possible bias should be acknowledged explicitly in the confounder analysis later.

Final baseline

This leaves us with a total of 173,094,831 individuals (compared to 235,423,085 in the original study), broken down as follows. You can find the data (filtered and consolidated from the tables in the study) here.

State	Blue/Grey	Proportion	Brown/Hazel	Proportion	Green/Other	Proportion	Total
Alaska	171,305	31.1%	318,925	58.0%	59,948	10.9%	550,178
Connecticut	408,657	23.4%	1,193,115	68.2%	146,947	8.4%	1,748,719
Delaware	179,891	25.6%	467,372	66.6%	54,977	7.8%	702,240
Georgia	1,641,800	22.2%	4,994,101	67.4%	769,892	10.4%	7,405,793
Idaho	745,842	36.6%	1,059,420	52.0%	232,892	11.4%	2,038,154
Illinois	3,917,952	23.3%	11,442,425	68.0%	1,469,440	8.7%	16,829,817
Indiana	2,572,846	32.9%	4,407,732	56.4%	832,163	10.7%	7,812,741
Kansas	980,851	32.3%	1,728,867	57.0%	322,978	10.6%	3,032,696
Kentucky	1,552,694	33.8%	2,379,739	51.8%	665,621	14.5%	4,598,054
Maine	752,911	38.3%	1,050,629	53.4%	164,756	8.4%	1,968,296
Michigan	2,706,298	30.0%	5,457,684	60.6%	842,956	9.4%	9,006,938
Minnesota	2,491,602	37.5%	3,414,006	51.4%	737,658	11.1%	6,643,266
Missouri	1,558,508	33.1%	2,550,401	54.1%	606,478	12.9%	4,715,387
Nebraska	472,500	35.8%	660,197	50.1%	185,348	14.1%	1,318,045
New Hampshire	489,654	34.6%	795,883	56.3%	128,588	9.1%	1,414,125
New York	3,106,752	20.3%	11,121,049	72.8%	1,054,923	6.9%	15,282,724
North Carolina	663,883	23.9%	1,851,215	66.6%	264,814	9.5%	2,779,912
Ohio	4,032,169	30.6%	7,641,671	58.0%	1,497,369	11.4%	13,171,209
Oregon	1,721,367	32.1%	3,075,731	57.3%	566,450	10.6%	5,363,548
Pennsylvania	2,578,482	29.2%	5,359,141	60.7%	894,152	10.1%	8,831,775
South Dakota	276,061	38.6%	352,985	49.3%	86,232	12.1%	715,278
Texas	4,260,320	16.8%	19,030,813	75.1%	2,057,737	8.1%	25,348,870
Utah	682,774	34.9%	1,024,876	52.4%	246,714	12.6%	1,954,364
Vermont	293,650	37.3%	422,835	53.6%	71,761	9.1%	788,246
Virginia	2,032,147	24.3%	5,542,765	66.2%	793,401	9.5%	8,368,313
Washington	3,009,059	30.9%	5,823,182	59.7%	913,725	9.4%	9,745,966
West Virginia	845,251	38.8%	1,053,838	48.4%	280,181	12.9%	2,179,270
Wisconsin	2,852,416	36.3%	4,058,535	51.7%	946,041	12.0%	7,856,992
Wyoming	335,079	36.3%	471,545	51.0%	117,291	12.7%	923,915
Total	47,332,721	27.3%	108,750,677	62.8%	17,011,433	9.9%	173,094,831

The DMV figures are consistent with the 2014 survey ⁶ (27%, 63% and 10%), though that comparison carries little weight: a 2,000-person survey has a much larger margin of error, around 1–2 percentage points per category ($1.96 \sqrt{\frac{p(1 - p)}{2000}}$), which is wide enough to hide modest differences between the two sources. The agreement is reassuring but not particularly informative.

Uncertainty in the baseline

Given the number of individuals in the study, a confidence interval computed over individuals would be ridiculously narrow. That interval answers a simple question: what if we had surveyed different people in these same 29 states? With 173 million observations, a standard binomial confidence interval on the Blue/Grey proportion is essentially a rounding error (a fraction of a basis point). Statistical precision is not the problem.
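To put numbers on this, the normal-approximation formula quoted for the 2,000-person survey can be evaluated at both sample sizes. This is a minimal sketch using the shares given in this post; the helper name is my own.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of the normal-approximation 95% interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)

# 2014 survey: 2,000 respondents
for name, p in [("Blue/Grey", 0.27), ("Brown/Hazel", 0.63), ("Green/Other", 0.10)]:
    print(f"{name}: +/- {margin_of_error(p, 2_000):.1%}")
# Blue/Grey: +/- 1.9%
# Brown/Hazel: +/- 2.1%
# Green/Other: +/- 1.3%

# 29-state baseline: Blue/Grey share over 173,094,831 records
moe = margin_of_error(0.273, 173_094_831)
print(f"Blue/Grey baseline: +/- {moe * 10_000:.2f} basis points")  # +/- 0.66 basis points
```

The interval shrinks with the square root of the sample size, which is why sampling noise is negligible at this scale while coverage and self-report biases are not.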

However, the state coverage is incomplete (only 29 of 50 states). This raises another question: what if we had a different mix of states? One tractable approach is to bootstrap over states: treat each state as a sampling unit, resample with replacement, and compute a population-weighted mean on each resample. This yields a 95% confidence interval that reflects geographic variance, that is, how much the national estimate shifts depending on which states are in the sample. With 29 states, that interval is [24.0%, 31.4%] for Blue/Grey. It does not capture self-report misclassification, but it gives an honest lower bound on baseline uncertainty. An example of such an approach is provided in 02_bootstrap_baseline.ipynb. Below are the estimated confidence intervals for each class. Keep in mind that resampling from only 29 states is not optimal, so the intervals should be interpreted as approximate.

Color	Estimate	Lower Bound	Upper Bound
Blue/Grey	27.3%	24.0%	31.4%
Brown/Hazel	62.8%	57.9%	66.8%
Green/Other	9.9%	9.1%	10.8%
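The state-level bootstrap described above can be sketched with the standard library alone. This is a percentile bootstrap written for illustration: the notebook may use a different variant, seed, or number of resamples, so the bounds it prints will not match the table exactly. Blue/Grey counts and totals are copied from the 29-state table; function names are hypothetical.

```python
import random

# (state, Blue/Grey count, total records) from the 29-state table
STATES = [
    ("Alaska", 171_305, 550_178), ("Connecticut", 408_657, 1_748_719),
    ("Delaware", 179_891, 702_240), ("Georgia", 1_641_800, 7_405_793),
    ("Idaho", 745_842, 2_038_154), ("Illinois", 3_917_952, 16_829_817),
    ("Indiana", 2_572_846, 7_812_741), ("Kansas", 980_851, 3_032_696),
    ("Kentucky", 1_552_694, 4_598_054), ("Maine", 752_911, 1_968_296),
    ("Michigan", 2_706_298, 9_006_938), ("Minnesota", 2_491_602, 6_643_266),
    ("Missouri", 1_558_508, 4_715_387), ("Nebraska", 472_500, 1_318_045),
    ("New Hampshire", 489_654, 1_414_125), ("New York", 3_106_752, 15_282_724),
    ("North Carolina", 663_883, 2_779_912), ("Ohio", 4_032_169, 13_171_209),
    ("Oregon", 1_721_367, 5_363_548), ("Pennsylvania", 2_578_482, 8_831_775),
    ("South Dakota", 276_061, 715_278), ("Texas", 4_260_320, 25_348_870),
    ("Utah", 682_774, 1_954_364), ("Vermont", 293_650, 788_246),
    ("Virginia", 2_032_147, 8_368_313), ("Washington", 3_009_059, 9_745_966),
    ("West Virginia", 845_251, 2_179_270), ("Wisconsin", 2_852_416, 7_856_992),
    ("Wyoming", 335_079, 923_915),
]

def weighted_share(sample):
    """Population-weighted Blue/Grey share: pooled count over pooled total."""
    return sum(blue for _, blue, _ in sample) / sum(total for _, _, total in sample)

def bootstrap_ci(states, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile interval from resampling whole states with replacement."""
    rng = random.Random(seed)
    estimates = sorted(
        weighted_share(rng.choices(states, k=len(states)))
        for _ in range(n_resamples)
    )
    lower = estimates[int(n_resamples * alpha / 2)]
    upper = estimates[int(n_resamples * (1 - alpha / 2)) - 1]
    return lower, upper

point = weighted_share(STATES)
lo, hi = bootstrap_ci(STATES)
print(f"Blue/Grey: {point:.1%} [{lo:.1%}, {hi:.1%}]")
```

Resampling states rather than individuals is what makes the interval wide: a resample that happens to draw Texas or New York several times pulls the weighted share down noticeably.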

Computing a confidence interval that honestly reflects all of these sources of uncertainty is non-trivial. For the purposes of this analysis, the 29-state figure of 27.3% Blue/Grey is used as the baseline, with the understanding that systematic biases (state coverage, self-report) likely push the true national proportion slightly lower. Any overrepresentation finding in movies is therefore conservative.

Conclusion

We now have something that appears to be a strong baseline : Blue/Grey (27.3%), Brown/Hazel (62.8%) and Green/Other (9.9%). We’ll use this for future work. Next time, we’ll work on the second part, which will aim to collect data about movies and TV shows.

¹ Eye colour, distribution and origin. Neoris. https://neoris-eyes.com/en/eye-colour-distribution-and-origin/
² Katsara, M.-A., & Nothnagel, M. (2019). True colors: A literature review on the spatial distribution of eye and hair pigmentation. Forensic Science International: Genetics, 39, 109–118. https://doi.org/10.1016/j.fsigen.2019.01.001
³ Grant, M. D., & Lauderdale, D. S. (2002). Cohort effects in a genetically determined trait: eye colour among US whites. Annals of Human Biology, 29(6), 657–666. https://doi.org/10.1080/03014460210157394
⁴ Walsh, S., Wollstein, A., Liu, F., Chakravarthy, U., Rahu, M., Seland, J. H., Soubrane, G., Tomazzoli, L., Topouzis, F., Vingerling, J. R., Vioque, J., Fletcher, A. E., Ballantyne, K. N., & Kayser, M. (2012). DNA-based eye colour prediction across Europe with the IrisPlex system. Forensic Science International: Genetics, 6(3), 330–340. https://doi.org/10.1016/j.fsigen.2011.07.009
⁵ Hashemi, H., Pakzad, R., Yekta, A., Shokrollahzadeh, F., Ostadimoghaddam, H., Mahboubipour, H., & Khabazkhoob, M. (2019). Distribution of iris color and its association with ocular diseases in a rural population of Iran. Journal of Current Ophthalmology, 31(3), 312–318. https://doi.org/10.1016/j.joco.2018.05.001
⁶ Lovering, C. (2023). Eye Spy: Worldwide Eye Color Percentages. Healthline. https://www.healthline.com/health/eye-health/eye-color-percentages#eye-color-percentages
⁷ Hashem, F., Menicacci, C., & Granet, D. B. (2025). Iris color distribution in the United States of America. PLOS One, 20(9), e0312913. https://doi.org/10.1371/journal.pone.0312913

Last updated on April 29, 2026
