Description

Fairness metrics are primitives that quantify algorithmic bias across demographic groups through standardized mathematical measures. Rather than claiming systems are "fair" without evidence, fairness metrics provide concrete numerical measures of disparities: statistical parity (equal rates of positive outcomes across groups), equalized odds (equal true positive and false positive rates across groups), calibration (predicted scores correspond to the same observed outcome rates in every group), or individual fairness (similar individuals receive similar treatment). By quantifying fairness, researchers, developers, and regulators can identify discrimination, track bias-reduction efforts, and establish acceptable fairness baselines.
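
As a minimal sketch of how these disparities are computed (the arrays y_true, y_pred, and group are hypothetical names for binary labels, binary predictions, and a demographic indicator), statistical parity and equalized odds reduce to comparisons of group-wise rates:

```python
import numpy as np

def statistical_parity_difference(y_pred, group):
    """Largest gap in positive-prediction rates between any two groups."""
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return max(rates) - min(rates)

def equalized_odds_gaps(y_true, y_pred, group):
    """Largest gaps in true positive and false positive rates across groups."""
    tprs, fprs = [], []
    for g in np.unique(group):
        mask = group == g
        yt, yp = y_true[mask], y_pred[mask]
        tprs.append(yp[yt == 1].mean())  # true positive rate for group g
        fprs.append(yp[yt == 0].mean())  # false positive rate for group g
    return max(tprs) - min(tprs), max(fprs) - min(fprs)

# Toy example with two groups, "a" and "b" (invented data for illustration)
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 1, 0, 1, 1])
group  = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])

print(statistical_parity_difference(y_pred, group))  # 0.25
print(equalized_odds_gaps(y_true, y_pred, group))    # (0.33..., 0.66...)
```

A statistical parity difference of 0 would mean both groups receive positive predictions at the same rate; the equalized odds gaps play the same role for error rates.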

The primitive emerged from the Gender Shades research (Buolamwini and Gebru, 2018), which documented that commercial facial analysis systems misclassified gender with error rates of at most 0.8% for lighter-skinned men but up to 34.7% for darker-skinned women, demonstrating that apparently working systems could systematically harm marginalized communities. Fairness metrics formalize what Gender Shades revealed: aggregate accuracy masks disparate impact. By measuring performance separately for each demographic group, discrimination becomes visible and measurable.
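
A brief sketch of the disaggregated evaluation that Gender Shades popularized (the arrays and subgroup labels here are invented for illustration): reporting the error rate per subgroup rather than a single aggregate number.

```python
import numpy as np

# Hypothetical predictions and intersectional subgroup labels
y_true   = np.array([1, 1, 0, 1, 0, 1, 1, 0])
y_pred   = np.array([1, 1, 0, 0, 1, 0, 1, 0])
subgroup = np.array(["lighter_male", "lighter_male",
                     "lighter_female", "lighter_female",
                     "darker_male", "darker_male",
                     "darker_female", "darker_female"])

# One aggregate error rate can hide subgroup disparities...
print("overall error:", (y_true != y_pred).mean())

# ...while disaggregated error rates make them visible
for g in np.unique(subgroup):
    mask = subgroup == g
    print(g, "error:", (y_true[mask] != y_pred[mask]).mean())
```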

Fairness metrics face inherent tensions: when groups differ in base rates, several fairness definitions (notably calibration and equalized odds) are mathematically impossible to satisfy simultaneously, so judgment is required about which fairness concept to prioritize. This makes fairness metrics tools for transparency and deliberation rather than purely technical solutions, requiring community input into which fairness definitions matter.
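
One way to make the tension concrete is an identity due to Chouldechova (2017) relating, within each group, the base rate of the positive class, the classifier's positive predictive value, and its error rates; the rendering below is a standard statement of that result:

```latex
% Within each group, for any binary classifier:
%   p   = base rate (prevalence) of the positive class
%   PPV = positive predictive value; FPR, FNR = false positive / false negative rates
\mathrm{FPR} \;=\; \frac{p}{1-p}\cdot\frac{1-\mathrm{PPV}}{\mathrm{PPV}}\cdot\bigl(1-\mathrm{FNR}\bigr)
```

If two groups have different base rates, equalizing PPV across them (a calibration-style criterion) forces their FPR or FNR apart, and equalizing the error rates forces PPV apart, unless the classifier is perfect; some definition must give.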


Technical Specifications


Civic Applications