A Little Safari Through Patterns and AI

Last week I was at an outdoor photo shoot. The model wore sunglasses, which looked cool and added variety to the session. But my camera was confused: eye autofocus was impossible. This got me thinking: how do machines actually recognize faces?

Reading time: 2 min.

In the past, it was pure pattern recognition. Two dark dots on top, a line in the middle, a line at the bottom — boom, face found! This is how the legendary Viola-Jones method from 2001 worked, which was the standard for years. Surprisingly simple, right? Unfortunately also surprisingly unreliable as soon as someone turns their head slightly.
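To make the "two dots plus line" idea concrete, here is a deliberately naive sketch in Python. It is not how Viola-Jones actually works (that method uses Haar-like features and a cascade of boosted classifiers); it just hand-codes the toy rule from the paragraph above on a tiny binary image, where 1 means a dark pixel.

```python
# Toy "pattern recognition": two dark dots in one row ("eyes") and a
# dark 3-pixel line two rows below ("mouth"). Purely illustrative;
# real detectors like Viola-Jones learn their features from data.

def looks_like_face(img):
    """Return True if any row has two separated dark dots and a
    dark 3-pixel horizontal line appears two rows below."""
    h, w = len(img), len(img[0])
    for y in range(h - 2):
        dots = [x for x in range(w) if img[y][x] == 1]
        if len(dots) == 2 and dots[1] - dots[0] >= 2:       # two "eyes"
            mouth_row = img[y + 2]
            for x in range(w - 2):
                if mouth_row[x:x + 3] == [1, 1, 1]:         # the "mouth"
                    return True
    return False

face = [
    [0, 1, 0, 1, 0],   # two dark dots: "eyes"
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],   # a dark line: "mouth"
]
print(looks_like_face(face))     # True

# Rotate the same face by 90 degrees and the rigid rule fails:
rotated = [list(r) for r in zip(*face[::-1])]
print(looks_like_face(rotated))  # False
```

The rotation example is exactly the weakness the article describes: the pattern is hard-wired to one orientation, so a slightly turned head breaks it.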

Modern face recognition is much more sophisticated. It uses deep learning: artificial neural networks trained on millions of faces. The fascinating part: these networks learn independently what constitutes a face. They develop their own "patterns" that are far more complex than our naive "two-dots-plus-line" logic.

Take Google's MediaPipe Face Mesh. It doesn't just roughly detect "there's a face" but maps 468 three-dimensional points onto it. Mouth, nose, eyes, cheekbones, everything is precisely captured. At least theoretically.
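To give a feel for what "468 three-dimensional points" means in practice, here is a small sketch using mock data instead of a running detector. MediaPipe returns each landmark as normalized (x, y, z) coordinates relative to the image; the indices used below (33 and 263 for the outer eye corners) follow MediaPipe's published mesh numbering, but since no real model runs here, treat the specific values and indices as illustrative assumptions.

```python
# Working with Face Mesh-style landmarks: normalized (x, y, z) values
# that you scale to pixel coordinates yourself. The two entries below
# are mock stand-ins for a real 468-entry landmark list.

IMG_W, IMG_H = 640, 480

landmarks = {
    33:  (0.35, 0.40, -0.02),   # right eye, outer corner (assumed index)
    263: (0.65, 0.41, -0.02),   # left eye, outer corner (assumed index)
}

def to_pixels(lm):
    """Convert a normalized landmark to pixel coordinates."""
    x, y, _z = lm
    return (round(x * IMG_W), round(y * IMG_H))

right_eye = to_pixels(landmarks[33])
left_eye = to_pixels(landmarks[263])
print(right_eye, left_eye)      # (224, 192) (416, 197)

# Inter-eye distance in pixels, e.g. to size an autofocus box:
dist = ((left_eye[0] - right_eye[0]) ** 2 +
        (left_eye[1] - right_eye[1]) ** 2) ** 0.5
print(round(dist))              # 192
```

This is the kind of geometry a camera's eye-autofocus builds on once the mesh is available, which is also why everything collapses when the mesh itself is wrong.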

[Images: frontal portrait ("No problem") and upside-down face ("Upside down … problem!")]

In practice … well, I tested it myself using a browser demo of this exact network, built on TensorFlow.js. Sometimes the AI fails spectacularly. A frontal portrait in good light? No problem!

But once conditions aren't optimal, it gets tricky: in backlight, for example, or when the face is upside down. This isn't because the AI is "stupid"; it's just highly specialized. Like a Formula 1 car: unbeatable on the racetrack, but completely overwhelmed on a dirt road.

And then there are the "phantom faces." A phenomenon that regularly irritates me during nude shoots. Recently, the camera confidently detected an eye and focused precisely on a nipple. From the AI's perspective, it makes sense: a dark point with a circular structure around it? Must be an eye! Now I finally know why some men constantly stare at women's chests. They've simply recognized a face there. AI-trained, so to speak.

These "false positives" offer small glimpses into the AI's "thinking." They show how far machine perception still is from our human understanding of context. While we humans immediately recognize the difference between an eye and other central body parts, the AI only sees patterns and geometric shapes.

[Image: "Nipple face discovered"]

We humans have a decisive advantage: we understand context. When we see a photo of someone standing in the forest, we immediately know "there must be a face." Even if we only see the silhouette. The AI, on the other hand, rigidly checks its learned patterns. No wonder my camera gives up with sunglasses.
