Blog

Behind Blue Eyes - Part 3 : Comparing the Distributions

Now that we have our data, we can finally compare the eye color distribution of actors to the baseline established in Part 1. As a reminder, the baseline from 29 US states is: Blue/Grey (27.3%), Brown/Hazel (62.8%) and Green/Other (9.9%).

Full Cast

We start by aggregating appearance counts by year and eye color. For each group, we compute an appearance rate. Before running any test, it is worth looking at the data visually.
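As a rough illustration of the aggregation described above, here is a minimal sketch in plain Python. The records are made up for the example, the helper name `appearance_rates` is hypothetical, and it assumes that an "appearance rate" is the share of a year's appearances held by each eye color group, which may differ from the definition used in the full post.

```python
from collections import Counter

# Baseline shares quoted above (Part 1, 29 US states).
BASELINE = {"Blue/Grey": 0.273, "Brown/Hazel": 0.628, "Green/Other": 0.099}

# Hypothetical (year, eye color) appearance records, made up for illustration.
appearances = [
    (2020, "Blue/Grey"),
    (2020, "Brown/Hazel"),
    (2020, "Brown/Hazel"),
    (2020, "Green/Other"),
    (2021, "Blue/Grey"),
    (2021, "Blue/Grey"),
    (2021, "Brown/Hazel"),
    (2021, "Brown/Hazel"),
]


def appearance_rates(records):
    """Return {year: {eye_color: share of that year's appearances}}."""
    counts = Counter(records)                      # (year, color) -> count
    totals = Counter(year for year, _ in records)  # year -> total appearances
    rates = {}
    for (year, color), n in counts.items():
        rates.setdefault(year, {})[color] = n / totals[year]
    return rates


rates = appearance_rates(appearances)

# Print each yearly rate next to the baseline share for the same group.
for year in sorted(rates):
    for color, baseline in BASELINE.items():
        rate = rates[year].get(color, 0.0)
        print(f"{year} {color}: {rate:.1%} (baseline {baseline:.1%})")
```

Printing the rates side by side with the baseline is only a first look; it says nothing yet about whether a gap is statistically meaningful.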

May 3, 2026

Behind Blue Eyes - Part 2 : Collecting Data

#uselessAI

To answer the question raised in Part 1, we need data on the eye color of lead actors. To my knowledge, no database exists with this information. Wikidata sometimes contains it, but the coverage is very patchy. The simplest and most reliable approach is probably to build an actor dataset, collect their photos, and annotate them. Luckily, that’s exactly what we’re going to do today. The analysis will then follow in a third post.

May 1, 2026

Behind Blue Eyes - Part 1 : Establishing a Baseline

#uselessAI

Like many people, I watch movies and TV shows. A significant portion of the content available in France comes from the US. For years, I’ve often caught myself thinking: “huh, the lead actor has blue eyes again.” Apparently, I’m not the only one asking that question: people on Reddit are asking it too. Since this question has been nagging at me, I’d like to try to answer it as rigorously as possible. There’s some work involved, but it should be manageable. On paper, it’s fairly straightforward. The broad approach:

April 29, 2026

Fixing Kubernetes Load Balancing with HTTP/3

#kubernetes

This work is actually a few months old. I recently dug it back up and thought it would be worth sharing. Versions used at the time: FastAPI 0.115.2, Hypercorn 0.17.3, Pydantic 2.9.2, Locust 2.32.0, Niquests 3.9.1 and Trio 0.27.0. Kubernetes is a great technology. I use it at work, but also personally, through a lightweight distribution called k3s. It really simplifies application deployment and scaling.

April 10, 2026

Following Up on Pydantic & Polymorphism

#python #pydantic #softwareEngineering

In my previous blog post, I discussed polymorphism with Pydantic and the issues it caused. I suggested an elegant solution (at least I think so), but one that was perhaps not the most plug-and-play option. That said, we can try to simplify things and rely solely on Pydantic’s features to achieve the same goal, but this will require a few compromises. In particular, we will need to leverage Pydantic’s core schema generation API and use a specific annotation. It’s not ideal, but there is not much choice, unless we switch back to the previous solution.

February 1, 2026

Pydantic & Polymorphism

#python #pydantic #softwareEngineering

For several years now, I have been using the Pydantic library quite a lot at work as well as in my personal projects. It is handy for validation, easy to pick up, a real game changer for managing an application’s settings via Pydantic Settings, and for simple use cases, I have nothing to complain about. On the other hand, as soon as you start doing things that are a bit more complex, it gets trickier.

January 24, 2026

Hardening Home Infrastructure

#kubernetes #nixos

For the past two years, I’ve been proudly running a “cluster” based on k3s (a lightweight Kubernetes distribution). The setup was minimalistic, yet almost elegant: one control plane with 8 cores, 64 GB of RAM and a casual 1 TB NVMe SSD; and one worker node with 8 cores (and the performance of a sleepy cow 🐄), 16 GB of RAM and a 1 TB HDD for that authentic “retro datacenter” feel. Up until now, it worked. Which is to say: nothing was on fire most of the time, but “working” and “being a good idea” are, as it turns out, two very different concepts.

January 18, 2026

Exploring CUDA, Threading and Async Python - Part 3

#python #cuda #softwareEngineering

Previously, we discussed the impact of the GIL on CPU utilization, particularly relevant for pre-processing. We also looked at how batch size affects GPU utilization (and consequently FPS) in an ideal scenario. However, that was far from a real-world case. In practice, there’s usually a pre-processing phase handled by the CPU, with some parts potentially offloaded to the GPU (like normalization, if it makes sense). Regardless, the CPU needs to send data to the GPU while also providing it with instructions to execute.

November 5, 2024

Exploring CUDA, Threading and Async Python - Part 2

#python #cuda #softwareEngineering

In my previous blog post, we discussed pre-processing, particularly through multithreading. Now, let’s try to understand what it means to put the GPU under pressure. First, we’ll focus on “native” PyTorch. We might dive into model-level optimizations later (and what that means for execution).

CUDA

CUDA is a parallel computing platform and programming model developed by NVIDIA, for NVIDIA GPUs (breaking news). Basically, it allows developers to harness the massive computing power of NVIDIA GPUs for tasks far beyond just graphics rendering. While CPUs (processors) are optimized for handling sequential tasks, GPUs are built for parallelism and can process thousands of operations simultaneously.

September 29, 2024

Exploring CUDA, Threading and Async Python - Part 1

#python #cuda #softwareEngineering

I’ve been working with Python for quite some time now. When it comes to AI application development, I particularly enjoy using PyTorch. This has given me the opportunity to tinker a bit with CUDA. As models continue to grow in size, optimizing compute resources becomes critical, both for training and inference. Given the cost of GPUs (👋 NVIDIA), it’s essential to make the most out of the hardware. Put simply: the GPU needs to be running at full throttle. All the time. In my experience, that’s easier said than done. The Python x CUDA combo can often be a real headache when it comes to maximizing performance. Not that it’s impossible, but it’s definitely not straightforward.

September 14, 2024
