Behind Blue Eyes - Part 1 : Establishing a Baseline

April 29, 2026·Ben

Like many people, I watch movies and TV shows. A significant portion of the content available in France comes from the US. For years, I’ve often caught myself thinking : “huh, the lead actor has blue eyes again.” Apparently, I’m not the only one wondering about this : people on Reddit are too.

Since this question has been nagging at me, I’d like to try to answer it as rigorously as possible. There’s some work involved, but it should be manageable. On paper, it’s fairly straightforward. The broad approach :

  1. Establish the distribution of eye color in the US.
  2. Establish the distribution of eye color among American productions.
  3. Compare these distributions using a statistical test.
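
To make step 3 concrete, here is a minimal sketch of one possible test, a chi-square goodness-of-fit test. It is not necessarily the test used later in this series. The expected shares are the 29-state baseline this post arrives at further down; the observed counts are invented purely for illustration and are not real data.

```python
import math

# Expected shares: the 29-state baseline derived later in this post.
BASELINE = {"Blue/Grey": 0.273, "Brown/Hazel": 0.628, "Green/Other": 0.099}

# Observed counts: INVENTED numbers for illustration only, not real data.
observed = {"Blue/Grey": 70, "Brown/Hazel": 110, "Green/Other": 20}


def chi_square_gof(observed, expected_shares):
    """Chi-square goodness-of-fit statistic and p-value for 3 categories."""
    n = sum(observed.values())
    stat = sum(
        (observed[k] - n * expected_shares[k]) ** 2 / (n * expected_shares[k])
        for k in expected_shares
    )
    # Three categories leave 2 degrees of freedom, and the chi-square
    # survival function for 2 degrees of freedom is exactly exp(-x / 2).
    p_value = math.exp(-stat / 2)
    return stat, p_value


stat, p_value = chi_square_gof(observed, BASELINE)
print(f"chi2 = {stat:.2f}, p = {p_value:.4f}")
```

A small p-value would only say that the observed distribution differs from the baseline; it would say nothing about why.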

Of course, we won’t be able to go much further than “there is a difference” or “we cannot conclude.” If a difference does exist, we can only formulate hypotheses, and further analyses would be needed to attempt to validate them. Correlation is not causation!

It’s also worth noting that I live in France, so my observations may be biased.

Introduction

Since eye color can already be subjective, even more so when perceived through a screen, we’ll define some broad categories : Blue/Grey, Brown/Hazel and Green/Other. This will help to unify figures across studies. From what I could read, confusion often occurs between blue and grey, and between brown and hazel. When a fallback category (usually Other) is defined, it’s usually too small to be meaningful on its own. Thus, it’s merged with Green to get more meaningful numbers.
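
As a rough illustration of this merge, the sketch below folds raw labels into the three classes. The raw label spellings and the example counts are assumptions made up for the example; real studies use their own labels.

```python
from collections import Counter

# Raw labels are assumptions: spellings differ from one study to another.
CLASS_OF = {
    "blue": "Blue/Grey",
    "grey": "Blue/Grey",
    "gray": "Blue/Grey",
    "brown": "Brown/Hazel",
    "hazel": "Brown/Hazel",
    "green": "Green/Other",
}


def to_class(raw_label):
    """Map a raw label to a broad class; unknown labels fall back to Green/Other."""
    return CLASS_OF.get(raw_label.strip().lower(), "Green/Other")


def merge_counts(raw_counts):
    """Sum raw per-label counts into the three broad classes."""
    merged = Counter()
    for label, count in raw_counts.items():
        merged[to_class(label)] += count
    return dict(merged)


# Invented counts, only to show the merge.
example = {"Blue": 30, "Grey": 5, "Brown": 50, "Hazel": 8, "Green": 6, "Other": 1}
print(merge_counts(example))
```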

Finding data is hard

The scientific literature on the subject is fairly sparse. Some websites (such as Neoris 1) provide figures, but none of them are sourced, making them essentially worthless.

A few interesting studies do exist, but they are either limited in scope or do not focus on the US, which would introduce significant bias. Furthermore, the color categories used are not always consistent across studies. Notably :

  • Katsara et al. 2 (2019): Relatively recent, but this study focuses primarily on Europe. It does, however, confirm one thing: eye color varies considerably from country to country.
  • Grant et al. 3 (2002): A somewhat dated study relying on 1970s-1980s US data, covering only Caucasian populations, which makes it unrepresentative. Demographic dynamics have likely shifted since then. Not necessarily a good baseline.
  • Walsh et al. 4 (2012): More data on Europe.
  • Hashemi et al. 5 (2019): More recent, but limited to rural populations in Iran. Unfortunate.

In 2014, a survey 6 of 2,000 Americans yielded the following figures (using our classes) : Blue/Grey (27%), Brown/Hazel (63%) and Green/Other (10%). It gives a rough idea, but the cohort is probably too small to be truly reliable.

A more serious study

Fortunately, a much more comprehensive recent study exists (Hashem et al. 7 (2025)). It is based on US driver’s license data from 2016, which is relatively recent and the best available source. The study is fairly exhaustive, aggregating over 230 million data points, and is published in a reputable journal. That said, a few limitations are worth noting:

  • Only 31 of the 50 US states are represented in the study, for various reasons. This could introduce bias depending on the population distribution across states.
  • Eye color is self-reported, which may introduce additional bias. Likely sources of confusion include : hazel/green, hazel/brown, and blue/grey. Cultural influences may also lead people to self-report less than objectively.
  • It only covers drivers, which inherently excludes certain populations : minors under 16, urban residents who don’t drive, etc. There may therefore be a slight discrepancy with the general population, but this is likely the best proxy available.

The study distinguishes several categories : grey, blue, green, hazel, brown/dark, and other. Looking at the data, two states immediately stand out for the other category : Arizona at 29.5% and California at 8.6% (see paper). All other states range between 0 and 1%. The gap is large enough that other almost certainly encodes a different concept in those two states (likely a catch-all administrative category rather than a true eye color) so including those records at face value would corrupt the distribution.

That said, removing them is a significant move : California is the most populous state in the study, and together the two states account for roughly 62 million records (~26% of the original dataset). Two additional concerns are worth flagging explicitly.

Sensitivity analysis

A quick look at those two states reveals the problem immediately. Scripts used to compute the numbers can be found on GitHub : 01_sensitivity_analysis.ipynb.

| State | Blue/Grey | Brown/Hazel | Green/Other | Total |
|---|---|---|---|---|
| Arizona | 17.4% | 46.0% | 36.6% | 11,129,395 |
| California | 15.9% | 68.5% | 15.6% | 51,198,859 |

Arizona’s Green/Other at 36.6% is roughly three to four times the level seen in the other states (typically 8–13%). California’s is also elevated at 15.6%. These figures confirm that other does not mean the same thing in those two states. It is absorbing a large share of records that would be distributed across the real eye-color classes elsewhere.

To quantify the impact, two scenarios are evaluated using both datasets :

  • Scenario A - include as-is : naively add California and Arizona with their other records folded into Green/Other (which is what baseline_31a.csv does). This is clearly wrong, but it shows the worst-case distortion.
  • Scenario B - include, drop other : set the raw other count to zero for California and Arizona, treating those records as missing. This removes 4,410,395 records from California (51,198,859 to 46,788,464) and 3,285,198 from Arizona (11,129,395 to 7,844,197). The corrected dataset is baseline_31b.csv.

In Scenario B, we now have the following figures.

| State | Blue/Grey | Brown/Hazel | Green/Other | Total |
|---|---|---|---|---|
| Arizona | 24.8% | 65.2% | 10.0% | 7,844,197 |
| California | 17.4% | 75.0% | 7.6% | 46,788,464 |
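
The Scenario B percentages can be roughly cross-checked from numbers already quoted in this post, without the raw data. The sketch below is not the notebook's code: it rebuilds approximate counts from the rounded as-is shares, subtracts the raw other count from Green/Other, and renormalizes. The function name `drop_other` is made up for the example.

```python
# Totals and raw `other` counts are quoted in the text; the shares (in %)
# are the rounded as-is figures for the two states.
STATES = {
    "Arizona": {
        "total": 11_129_395,
        "other": 3_285_198,
        "shares": {"Blue/Grey": 17.4, "Brown/Hazel": 46.0, "Green/Other": 36.6},
    },
    "California": {
        "total": 51_198_859,
        "other": 4_410_395,
        "shares": {"Blue/Grey": 15.9, "Brown/Hazel": 68.5, "Green/Other": 15.6},
    },
}


def drop_other(total, other, shares):
    """Treat raw `other` records as missing and renormalize the shares (in %)."""
    remaining = total - other
    # Approximate counts, since the input shares are rounded to one decimal.
    counts = {cls: pct / 100 * total for cls, pct in shares.items()}
    counts["Green/Other"] -= other  # only the green records stay in this class
    return remaining, {cls: 100 * c / remaining for cls, c in counts.items()}


for name, row in STATES.items():
    remaining, shares_b = drop_other(**row)
    print(name, remaining, {cls: round(v, 1) for cls, v in shares_b.items()})
```

Because the inputs are rounded to one decimal, the recomputed shares can differ from the table by about 0.1 pp.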

Scenario A’s jump in Green/Other (+2.5 pp) confirms the contamination. Scenario B corrects for it : after dropping raw other, Arizona’s Green/Other falls to 10.0% and California’s to 7.6%, both inside the 6.9–14.5% range spanned by the other 29 states. This supports the correction. With both states included under Scenario B, Blue/Grey falls from 27.3% to 25.2%, a drop of ~2 percentage points. The direction of the bias is conservative : by excluding California and Arizona entirely, our 29-state baseline overstates the share of light eyes. Any finding of overrepresentation of blue-eyed actors measured against that baseline is therefore an underestimate, not a false positive.

| Scenario | Blue/Grey | Brown/Hazel | Green/Other | Total |
|---|---|---|---|---|
| 29 states (baseline) | 27.3% | 62.8% | 9.9% | 173,094,831 |
| A - 31 states, other kept | 24.4% | 63.3% | 12.3% | 235,423,085 |
| B - 31 states, other dropped | 25.2% | 65.4% | 9.4% | 227,727,492 |
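
The 31-state rows are population-weighted pools, which can be approximately reproduced from the figures in this post. The sketch below adds the two states' approximate counts (rounded share times total) to the 29-state counts; it is a consistency check under that rounding assumption, not the notebook's code.

```python
# 29-state counts, taken from the Total row of the final baseline table.
BASE_29 = {"Blue/Grey": 47_332_721, "Brown/Hazel": 108_750_677, "Green/Other": 17_011_433}

# (state total, rounded shares in %) for Arizona and California.
SCENARIO_A = [
    (11_129_395, {"Blue/Grey": 17.4, "Brown/Hazel": 46.0, "Green/Other": 36.6}),
    (51_198_859, {"Blue/Grey": 15.9, "Brown/Hazel": 68.5, "Green/Other": 15.6}),
]
SCENARIO_B = [
    (7_844_197, {"Blue/Grey": 24.8, "Brown/Hazel": 65.2, "Green/Other": 10.0}),
    (46_788_464, {"Blue/Grey": 17.4, "Brown/Hazel": 75.0, "Green/Other": 7.6}),
]


def pooled_shares(extra_states):
    """Pool the 29-state counts with extra states; return shares in %."""
    counts = dict(BASE_29)
    for total, shares in extra_states:
        for cls, pct in shares.items():
            counts[cls] += pct / 100 * total  # approximate count from a rounded share
    grand_total = sum(counts.values())
    return {cls: 100 * c / grand_total for cls, c in counts.items()}


for label, scenario in (("A", SCENARIO_A), ("B", SCENARIO_B)):
    print(label, {cls: round(v, 1) for cls, v in pooled_shares(scenario).items()})
```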

Scenario B corrects the obvious contamination in the other category, but if a state’s recording practices diverge enough to produce a 30% other rate, there’s no guarantee the remaining categories are consistently defined either. Rather than trust a partial correction, we exclude both states entirely and accept the demographic coverage trade-off.

Demographic confound

The 2020 US Census shows that both California and Arizona have more diverse populations than the other states. This possible bias should be acknowledged explicitly in the confounder analysis later.

Final baseline

This leaves us with a total of 173,094,831 individuals (compared to 235,423,085 in the original study), broken down as follows. You can find the data (filtered and consolidated from the tables in the study) here.

| State | Blue/Grey | Blue/Grey % | Brown/Hazel | Brown/Hazel % | Green/Other | Green/Other % | Total |
|---|---|---|---|---|---|---|---|
| Alaska | 171,305 | 31.1% | 318,925 | 58.0% | 59,948 | 10.9% | 550,178 |
| Connecticut | 408,657 | 23.4% | 1,193,115 | 68.2% | 146,947 | 8.4% | 1,748,719 |
| Delaware | 179,891 | 25.6% | 467,372 | 66.6% | 54,977 | 7.8% | 702,240 |
| Georgia | 1,641,800 | 22.2% | 4,994,101 | 67.4% | 769,892 | 10.4% | 7,405,793 |
| Idaho | 745,842 | 36.6% | 1,059,420 | 52.0% | 232,892 | 11.4% | 2,038,154 |
| Illinois | 3,917,952 | 23.3% | 11,442,425 | 68.0% | 1,469,440 | 8.7% | 16,829,817 |
| Indiana | 2,572,846 | 32.9% | 4,407,732 | 56.4% | 832,163 | 10.7% | 7,812,741 |
| Kansas | 980,851 | 32.3% | 1,728,867 | 57.0% | 322,978 | 10.6% | 3,032,696 |
| Kentucky | 1,552,694 | 33.8% | 2,379,739 | 51.8% | 665,621 | 14.5% | 4,598,054 |
| Maine | 752,911 | 38.3% | 1,050,629 | 53.4% | 164,756 | 8.4% | 1,968,296 |
| Michigan | 2,706,298 | 30.0% | 5,457,684 | 60.6% | 842,956 | 9.4% | 9,006,938 |
| Minnesota | 2,491,602 | 37.5% | 3,414,006 | 51.4% | 737,658 | 11.1% | 6,643,266 |
| Missouri | 1,558,508 | 33.1% | 2,550,401 | 54.1% | 606,478 | 12.9% | 4,715,387 |
| Nebraska | 472,500 | 35.8% | 660,197 | 50.1% | 185,348 | 14.1% | 1,318,045 |
| New Hampshire | 489,654 | 34.6% | 795,883 | 56.3% | 128,588 | 9.1% | 1,414,125 |
| New York | 3,106,752 | 20.3% | 11,121,049 | 72.8% | 1,054,923 | 6.9% | 15,282,724 |
| North Carolina | 663,883 | 23.9% | 1,851,215 | 66.6% | 264,814 | 9.5% | 2,779,912 |
| Ohio | 4,032,169 | 30.6% | 7,641,671 | 58.0% | 1,497,369 | 11.4% | 13,171,209 |
| Oregon | 1,721,367 | 32.1% | 3,075,731 | 57.3% | 566,450 | 10.6% | 5,363,548 |
| Pennsylvania | 2,578,482 | 29.2% | 5,359,141 | 60.7% | 894,152 | 10.1% | 8,831,775 |
| South Dakota | 276,061 | 38.6% | 352,985 | 49.3% | 86,232 | 12.1% | 715,278 |
| Texas | 4,260,320 | 16.8% | 19,030,813 | 75.1% | 2,057,737 | 8.1% | 25,348,870 |
| Utah | 682,774 | 34.9% | 1,024,876 | 52.4% | 246,714 | 12.6% | 1,954,364 |
| Vermont | 293,650 | 37.3% | 422,835 | 53.6% | 71,761 | 9.1% | 788,246 |
| Virginia | 2,032,147 | 24.3% | 5,542,765 | 66.2% | 793,401 | 9.5% | 8,368,313 |
| Washington | 3,009,059 | 30.9% | 5,823,182 | 59.7% | 913,725 | 9.4% | 9,745,966 |
| West Virginia | 845,251 | 38.8% | 1,053,838 | 48.4% | 280,181 | 12.9% | 2,179,270 |
| Wisconsin | 2,852,416 | 36.3% | 4,058,535 | 51.7% | 946,041 | 12.0% | 7,856,992 |
| Wyoming | 335,079 | 36.3% | 471,545 | 51.0% | 117,291 | 12.7% | 923,915 |
| Total | 47,332,721 | 27.3% | 108,750,677 | 62.8% | 17,011,433 | 9.9% | 173,094,831 |

The DMV figures are consistent with the 2014 survey 6, though that comparison carries little weight : a 2,000-person survey has a higher margin of error (around 1–2 percentage points per category : $1.96 \sqrt{\frac{p(1 - p)}{2000}}$), wide enough that almost any plausible distribution would appear “fairly close”. The agreement is reassuring but not particularly informative.

Uncertainty in the baseline

Considering the number of individuals in the study, a confidence interval on the individuals would be ridiculously small. We can summarize this uncertainty with a simple question : What if we had surveyed different people in these same 29 states ? With 173 million observations, a standard binomial confidence interval on the Blue/Grey proportion is essentially a rounding error (fractions of a basis point). The statistical precision is not the problem.
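
To put numbers on this, the sketch below applies the usual normal-approximation margin of error to both sample sizes: the 2,000-person survey and the 29-state DMV baseline. The function name is made up for the example, and treating the DMV records as a simple random sample is an assumption made only to illustrate the orders of magnitude.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Half-width of the normal-approximation 95% interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)


# 2014 survey: Blue/Grey at 27% among 2,000 respondents.
survey = margin_of_error(0.27, 2_000)

# DMV baseline: Blue/Grey count over the 29-state total.
n_dmv = 173_094_831
dmv = margin_of_error(47_332_721 / n_dmv, n_dmv)

print(f"survey: +/- {survey * 100:.2f} percentage points")
print(f"DMV:    +/- {dmv * 10_000:.2f} basis points")
```

The survey margin comes out close to 2 percentage points, while the DMV margin stays below one basis point (0.01 percentage point).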

However, the state coverage is incomplete (only 29 of 50 states). This raises another question : What if we had a different mix of states ? One tractable approach is to bootstrap over states : treat each state as a sampling unit, resample with replacement, and compute a population-weighted mean on each resample. This yields a 95% confidence interval that reflects geographic variance : how much the national estimate shifts depending on which states are in the sample. It does not capture self-report misclassification, but it gives an honest lower bound on baseline uncertainty. An example of such an approach is provided in 02_bootstrap_baseline.ipynb, and the estimated intervals are listed below. Keep in mind that resampling only 29 states is not optimal, so the intervals should be interpreted as approximate.

| Color | Estimate | Lower Bound | Upper Bound |
|---|---|---|---|
| Blue/Grey | 27.3% | 24.0% | 31.4% |
| Brown/Hazel | 62.8% | 57.9% | 66.8% |
| Green/Other | 9.9% | 9.1% | 10.8% |
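
For readers who want to see the idea without opening the notebook, here is a minimal sketch of a state-level percentile bootstrap for Blue/Grey. It is not the notebook's code: the number of resamples, the seed and the percentile method are assumptions, so the bounds it prints will be close to, but not necessarily equal to, the ones in the table.

```python
import random

# (state, Blue/Grey count, total) copied from the final baseline table.
STATES = [
    ("Alaska", 171_305, 550_178), ("Connecticut", 408_657, 1_748_719),
    ("Delaware", 179_891, 702_240), ("Georgia", 1_641_800, 7_405_793),
    ("Idaho", 745_842, 2_038_154), ("Illinois", 3_917_952, 16_829_817),
    ("Indiana", 2_572_846, 7_812_741), ("Kansas", 980_851, 3_032_696),
    ("Kentucky", 1_552_694, 4_598_054), ("Maine", 752_911, 1_968_296),
    ("Michigan", 2_706_298, 9_006_938), ("Minnesota", 2_491_602, 6_643_266),
    ("Missouri", 1_558_508, 4_715_387), ("Nebraska", 472_500, 1_318_045),
    ("New Hampshire", 489_654, 1_414_125), ("New York", 3_106_752, 15_282_724),
    ("North Carolina", 663_883, 2_779_912), ("Ohio", 4_032_169, 13_171_209),
    ("Oregon", 1_721_367, 5_363_548), ("Pennsylvania", 2_578_482, 8_831_775),
    ("South Dakota", 276_061, 715_278), ("Texas", 4_260_320, 25_348_870),
    ("Utah", 682_774, 1_954_364), ("Vermont", 293_650, 788_246),
    ("Virginia", 2_032_147, 8_368_313), ("Washington", 3_009_059, 9_745_966),
    ("West Virginia", 845_251, 2_179_270), ("Wisconsin", 2_852_416, 7_856_992),
    ("Wyoming", 335_079, 923_915),
]


def weighted_share(sample):
    """Population-weighted Blue/Grey share over a list of states."""
    return sum(blue for _, blue, _ in sample) / sum(total for _, _, total in sample)


def bootstrap_ci(states, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling whole states with replacement."""
    rng = random.Random(seed)
    estimates = sorted(
        weighted_share(rng.choices(states, k=len(states)))
        for _ in range(n_resamples)
    )
    low = estimates[int(alpha / 2 * n_resamples)]
    high = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return low, high


point = weighted_share(STATES)
low, high = bootstrap_ci(STATES)
print(f"Blue/Grey: {point:.1%} (95% bootstrap interval: {low:.1%} to {high:.1%})")
```

Resampling states rather than individuals is what makes the interval wide: a resample that happens to draw Texas or New York several times pulls the weighted mean down noticeably.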

Computing a confidence interval that honestly reflects all of these sources of uncertainty is non-trivial. For the purposes of this analysis, the 29-state figure of 27.3% Blue/Grey is used as the baseline, with the understanding that systematic biases (state coverage, self-report) likely push the true national proportion slightly lower. Any overrepresentation finding in movies is therefore conservative.

Conclusion

We now have something that appears to be a strong baseline : Blue/Grey (27.3%), Brown/Hazel (62.8%) and Green/Other (9.9%). We’ll use this for future work. Next time, we’ll work on the second part, which will aim to collect data about movies and TV shows.


  1. Eye colour, distribution and origin. Neoris. https://neoris-eyes.com/en/eye-colour-distribution-and-origin/

  2. Katsara, M.-A., & Nothnagel, M. (2019). True colors: A literature review on the spatial distribution of eye and hair pigmentation. Forensic Science International: Genetics, 39, 109–118. https://doi.org/10.1016/j.fsigen.2019.01.001

  3. Grant, M. D., & Lauderdale, D. S. (2002). Cohort effects in a genetically determined trait: eye colour among US whites. Annals of Human Biology, 29(6), 657–666. https://doi.org/10.1080/03014460210157394

  4. Walsh, S., Wollstein, A., Liu, F., Chakravarthy, U., Rahu, M., Seland, J. H., Soubrane, G., Tomazzoli, L., Topouzis, F., Vingerling, J. R., Vioque, J., Fletcher, A. E., Ballantyne, K. N., & Kayser, M. (2012). DNA-based eye colour prediction across Europe with the IrisPlex system. Forensic Science International: Genetics, 6(3), 330–340. https://doi.org/10.1016/j.fsigen.2011.07.009

  5. Hashemi, H., Pakzad, R., Yekta, A., Shokrollahzadeh, F., Ostadimoghaddam, H., Mahboubipour, H., & Khabazkhoob, M. (2019). Distribution of iris color and its association with ocular diseases in a rural population of Iran. Journal of Current Ophthalmology, 31(3), 312–318. https://doi.org/10.1016/j.joco.2018.05.001

  6. Lovering, C. (2023). Eye Spy: Worldwide Eye Color Percentages. Healthline. https://www.healthline.com/health/eye-health/eye-color-percentages#eye-color-percentages

  7. Hashem, F., Menicacci, C., & Granet, D. B. (2025). Iris color distribution in the United States of America. PLOS One, 20(9), e0312913. https://doi.org/10.1371/journal.pone.0312913
