Algorithmic Design: Fairness Versus Accuracy


Algorithms increasingly support consequential decisions, for example in hiring, medical care, and loan approval.
Because these decisions matter so much, it is important to ensure that an algorithm's error rates do not vary significantly across different groups of the population.

We present a framework in which an algorithm recommends a decision about an individual (for example, whether to hire) based on observed covariates about that individual (for example, the opinions of previous employers). The algorithm's error for an individual is measured by a general loss function that depends on the recommended decision and on the individual's unobserved type (for example, whether the person would later be promoted or dismissed from the new position).
Finally, errors are aggregated over pre-defined groups of the population (for example, by gender): the group error is the expected error over all members of the group.

Given two pairs of group errors, A and B, we say that A Pareto-dominates B if both of the following hold, with at least one comparison strict:

For each group, the group error in A is no higher than the group error in B, i.e. A is at least as accurate as B.
The difference between the two group errors in A is no larger than the difference in B, i.e. A is at least as fair as B.

Importantly, we do not commit to any particular weighting of fairness against accuracy. We instead look for algorithms that cannot be improved on both criteria at once, i.e. those on the Pareto frontier.

This article is based on the paper "Algorithmic Design: Fairness Versus Accuracy".[1]

Example of Fairness-Accuracy tradeoff

File:Fairness vs Accuracy - example1.png

No information → treat everyone. Equal error rate of 7/15 for both populations.

Access to covariate $x_{1}$ :

File:Fairness vs Accuracy - example2.png

More accuracy, less fairness.

File:Fairness vs Accuracy - example3.png

More fairness, less accuracy.

Access to covariates $x_{1}$ and $x_{2}$ :

File:Fairness vs Accuracy - example4.png

Improves both fairness and accuracy – no tradeoff.

Framework

Consider a population of subjects. Each subject has the following parameters:

type $Y\in {\mathcal {Y}}$
group $G\in \{r,b\}$
covariate (feature) vector $X\in {\mathcal {X}}$ , where ${\mathcal {X}}$ is a finite set.

X is observed by the algorithm's designer. Y and G are not directly observed (but may be revealed by X).

Y, G and X are random variables, and their joint distribution $(Y,G,X)\sim \mathbb {P}$ is known to the designer.

The designer chooses an algorithm $f:{\mathcal {X}}\rightarrow \Delta ({\mathcal {A}})$ that assigns to each covariate value a (possibly randomized) outcome $a\in {\mathcal {A}}=\{0,1\}$ for the individual.

The loss function of the algorithm is $\ell :{\mathcal {A}}\times {\mathcal {Y}}\rightarrow \mathbb {R}$ .

$\ell (a,y)$ can be interpreted in either of the following ways:

The social cost of assigning outcome a to an individual of type y.
The negative of the subject's payoff (ignoring the true type).

Definition
The error for group $g\in \{r,b\}$ given an algorithm $f$ is $e_{g}(f):=\mathbb {E} [\ell (f(X),Y)\mid G=g]$ , i.e., the average loss for subjects in group g.

$e(f)=(e_{r}(f),e_{b}(f))$ is an error pair.
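
As a minimal sketch of how an error pair could be computed, the snippet below evaluates $e_{g}(f)$ on a small joint distribution. The distribution `P`, the misclassification loss, and the rule `f` are all hypothetical choices made for illustration; none of them come from the article.

```python
from fractions import Fraction as Fr

# Hypothetical joint distribution P(Y=y, G=g, X=x); keys are (y, g, x).
# These numbers are illustrative and do not come from the article.
P = {
    (1, "r", 0): Fr(1, 8),  (0, "r", 0): Fr(1, 8),
    (1, "r", 1): Fr(3, 16), (0, "r", 1): Fr(1, 16),
    (1, "b", 0): Fr(1, 16), (0, "b", 0): Fr(3, 16),
    (1, "b", 1): Fr(3, 16), (0, "b", 1): Fr(1, 16),
}

def loss(a, y):
    """Assumed misclassification loss: 1 if the outcome differs from the type."""
    return int(a != y)

def group_error(P, f, g):
    """e_g(f) = E[loss(f(X), Y) | G = g], where f[x] = Pr(outcome 1 | X = x)."""
    mass = sum(p for (y, gg, x), p in P.items() if gg == g)
    total = sum(
        p * (f[x] * loss(1, y) + (1 - f[x]) * loss(0, y))
        for (y, gg, x), p in P.items()
        if gg == g
    )
    return total / mass

f = {0: Fr(0), 1: Fr(1)}  # deterministic rule: outcome equals x
error_pair = (group_error(P, f, "r"), group_error(P, f, "b"))
```

Representing `f` by the probability of outcome 1 at each covariate value covers both deterministic and randomized algorithms, since the outcome set is binary.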

Fairness-Accuracy Pareto Frontier

File:Pareto dominance.png

e Pareto-dominates e'

Definition
An error pair e Pareto-dominates an error pair e' , written $e\succ e'$ , if $e_{r}\leqslant e_{r}'$ and $e_{b}\leqslant e_{b}'$ (higher accuracy) and $|e_{r}-e_{b}|\leqslant |e_{r}'-e_{b}'|$ (higher fairness), with at least one inequality strict.

Let ${\mathcal {F}}$ denote the set of all algorithms $f:{\mathcal {X}}\rightarrow \Delta ({\mathcal {A}})$ .
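
The dominance relation can be checked mechanically. The following is a small sketch; the function name `pareto_dominates` and the sample error pairs are hypothetical.

```python
from fractions import Fraction as Fr

def pareto_dominates(e, e_prime):
    """True if e weakly improves both group errors and the gap, one strictly."""
    e_r, e_b = e
    p_r, p_b = e_prime
    gap, p_gap = abs(e_r - e_b), abs(p_r - p_b)
    weakly_better = e_r <= p_r and e_b <= p_b and gap <= p_gap
    strictly_better = e_r < p_r or e_b < p_b or gap < p_gap
    return weakly_better and strictly_better

# Lower errors for both groups and a smaller gap: dominance holds.
first = pareto_dominates((Fr(1, 5), Fr(3, 10)), (Fr(3, 10), Fr(1, 2)))
# Lower errors for both groups but a larger gap: dominance fails.
second = pareto_dominates((Fr(1, 10), Fr(2, 5)), (Fr(3, 10), Fr(1, 2)))
```

Exact fractions are used so that the weak and strict comparisons are not affected by floating-point rounding.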

File:Important points.png

In this example, $e_{r}<e_{b}$ at $B_{X}$ (the optimal point for group b).

Definition
The feasible set given X is ${\mathcal {E}}(X):=\{e(f):f\in {\mathcal {F}}\}$ . Assuming X is finite-valued, ${\mathcal {E}}(X)$ is a convex polygon.
The Pareto set given X is ${\mathcal {P}}(X):=\{e\in {\mathcal {E}}(X):{\text{no }}e'\in {\mathcal {E}}(X){\text{ s.t. }}e'\succ e\}$ , i.e. the set of points that are Pareto-undominated in the feasible set.
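
Because $e(f)$ is linear in the randomization probabilities of f, every feasible error pair is a convex combination of the error pairs of deterministic algorithms. The sketch below enumerates those candidate vertices for a hypothetical joint distribution `P` with a binary covariate and an assumed misclassification loss; it does not compute the Pareto set itself.

```python
from fractions import Fraction as Fr
from itertools import product

# Hypothetical joint distribution P(Y=y, G=g, X=x); keys are (y, g, x).
P = {
    (1, "r", 0): Fr(1, 8),  (0, "r", 0): Fr(1, 8),
    (1, "r", 1): Fr(3, 16), (0, "r", 1): Fr(1, 16),
    (1, "b", 0): Fr(1, 16), (0, "b", 0): Fr(3, 16),
    (1, "b", 1): Fr(3, 16), (0, "b", 1): Fr(1, 16),
}

def loss(a, y):
    """Assumed misclassification loss."""
    return int(a != y)

def error_pair(rule):
    """(e_r, e_b) for a deterministic rule mapping each x to an outcome."""
    pair = []
    for g in ("r", "b"):
        mass = sum(p for (y, gg, x), p in P.items() if gg == g)
        total = sum(p * loss(rule[x], y) for (y, gg, x), p in P.items() if gg == g)
        pair.append(total / mass)
    return tuple(pair)

xs = sorted({x for (_, _, x) in P})
# One error pair per deterministic rule; E(X) is the convex hull of this set.
candidates = {
    error_pair(dict(zip(xs, outcomes)))
    for outcomes in product((0, 1), repeat=len(xs))
}
```

The number of deterministic rules grows as $2^{|{\mathcal {X}}|}$ , so this brute-force enumeration is only practical for very small covariate sets.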


Fix a covariate X.

An optimal point (with minimal error) for group $g\in \{r,b\}$ is $G_{X}:={\underset {e\in {\mathcal {E}}(X)}{\arg \min }}\ e_{g}$ ; we write $R_{X}$ for $g=r$ and $B_{X}$ for $g=b$ .
An optimal point for fairness is $F_{X}:={\underset {e\in {\mathcal {E}}(X)}{\arg \min }}\ |e_{r}-e_{b}|$ .
In both cases, ties are broken in favor of accuracy.
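
One way these three points could be located, given the vertices of the feasible polygon, is sketched below. The vertex list is hypothetical, and the tie-breaking rules in the code are one reading of "in favor of accuracy" rather than a definition taken from the source.

```python
from fractions import Fraction as Fr
from itertools import combinations

# Hypothetical vertices of a feasible polygon E(X); illustrative numbers only.
vertices = [(Fr(3, 8), Fr(1, 4)), (Fr(5, 8), Fr(1, 2)),
            (Fr(5, 8), Fr(3, 4)), (Fr(3, 8), Fr(1, 2))]

# Group-optimal points: minimize own error, break ties by the other group's error.
R_X = min(vertices, key=lambda e: (e[0], e[1]))
B_X = min(vertices, key=lambda e: (e[1], e[0]))

def fairness_optimal(vertices):
    """Minimize |e_r - e_b| over the convex hull of `vertices`."""
    on_diagonal = []
    for u, v in combinations(vertices, 2):
        du, dv = u[0] - u[1], v[0] - v[1]
        if du * dv <= 0 and du != dv:
            t = du / (du - dv)  # where the segment from u to v meets e_r = e_b
            on_diagonal.append((u[0] + t * (v[0] - u[0]),
                                u[1] + t * (v[1] - u[1])))
    if on_diagonal:
        # A zero gap is attainable: take the point with the lowest common error.
        return min(on_diagonal)
    # Otherwise the gap is minimized at a vertex; prefer lower total error on ties.
    return min(vertices, key=lambda e: (abs(e[0] - e[1]), e[0] + e[1]))

F_X = fairness_optimal(vertices)
```

Segments between any two vertices lie inside the convex polygon, so checking all vertex pairs finds the lowest point of the polygon on the line $e_{r}=e_{b}$ without first ordering the vertices into edges.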

File:P(X) for g-skewed.png

P(X) for g-skewed

File:P(X) for group-balanced.png

P(X) for group-balanced

Definition
Covariate X is
r-skewed if $e_{r}<e_{b}$ at $R_{X}$ and $e_{r}\leqslant e_{b}$ at $B_{X}$ ,
b-skewed if $e_{b}<e_{r}$ at $B_{X}$ and $e_{b}\leqslant e_{r}$ at $R_{X}$ ,
group-balanced otherwise.
We say that X is g-skewed (or group-skewed) if it is r-skewed or b-skewed.
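
The three cases translate directly into a small classifier. This is a sketch only; the function name `classify_covariate` and the sample points are hypothetical.

```python
from fractions import Fraction as Fr

def classify_covariate(R_X, B_X):
    """Classify X from its group-optimal points, each given as (e_r, e_b)."""
    r_at_R, b_at_R = R_X
    r_at_B, b_at_B = B_X
    if r_at_R < b_at_R and r_at_B <= b_at_B:
        return "r-skewed"
    if b_at_B < r_at_B and b_at_R <= r_at_R:
        return "b-skewed"
    return "group-balanced"

# Group r has the lower error at both optimal points.
case_1 = classify_covariate((Fr(1, 5), Fr(2, 5)), (Fr(1, 4), Fr(3, 10)))
# Each group has the lower error at its own optimal point.
case_2 = classify_covariate((Fr(1, 5), Fr(2, 5)), (Fr(3, 10), Fr(1, 4)))
```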

Theorem
P(X) is the lower boundary of ${\mathcal {E}}(X)$ between
$G_{X}$ and $F_{X}$ if X is g-skewed,
$R_{X}$ and $B_{X}$ if X is group-balanced.

Definition
X exhibits a strong accuracy-fairness conflict if there are two points $e,e'\in P(X)$ satisfying $e_{r}\leqslant e_{r}'$ and $e_{b}\leqslant e_{b}'$ but $|e_{r}-e_{b}|>|e_{r}'-e_{b}'|$ .

Example of strong accuracy-fairness conflict:

Compare $e'=(1/2,1/2)$ with $e=(1/3,1/4)$ .
e' involves higher errors for both groups but is fairer: the gap between the group errors is 0 under e' and 1/12 under e.
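
The same check can be run over any list of points. In the sketch below, `has_strong_conflict` is a hypothetical helper; it assumes the caller passes points that already lie in P(X) and does not verify this.

```python
from fractions import Fraction as Fr
from itertools import permutations

def has_strong_conflict(points):
    """True if some ordered pair (e, e') has e weakly more accurate for both
    groups while e' has a strictly smaller gap between group errors."""
    for e, e_prime in permutations(points, 2):
        more_accurate = e[0] <= e_prime[0] and e[1] <= e_prime[1]
        less_fair = abs(e[0] - e[1]) > abs(e_prime[0] - e_prime[1])
        if more_accurate and less_fair:
            return True
    return False

# The pair from the example above.
example = has_strong_conflict([(Fr(1, 3), Fr(1, 4)), (Fr(1, 2), Fr(1, 2))])
```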

Corollary
Suppose $F_{X}$ is distinct from $R_{X}$ and $B_{X}$ . Then X exhibits a strong accuracy-fairness conflict if and only if it is group-skewed.

Bayes Design

The designer can only control the inputs to the algorithm.

Definition
A garbling $T:{\mathcal {X}}\rightarrow \Delta ({\mathcal {T}})$ is any (possibly random) mapping from original covariate values $x\in {\mathcal {X}}$ to new values $t\in {\mathcal {T}}$ .

Examples of garblings:

No change: $T(x)=x$ w.p. 1
Drop an input: $x=(x_{1},x_{2},x_{3})$ and $T(x)=(x_{1},x_{2})$ w.p. 1
Add noise: $T(x)=x+\varepsilon$ where $\varepsilon \perp \!\!\!\perp (X,G,Y)$
No information: $T(x)=x_{0}$ w.p. 1 for all x ( $x_{0}$ is constant)

For each garbling T, let $f_{T}$ be the algorithm that maps each realization of T(X) into the Bayes-optimal action:

$f_{T}(t)\in {\underset {a\in {\mathcal {A}}}{\arg \min }}\ \mathbb {E} [\ell (a,Y)\mid T(X)=t]$ , i.e., the choice of a utilitarian agent who observes only the garbled input.
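
To make the construction of $f_{T}$ concrete, the sketch below represents a garbling as a function from x to a distribution over signals and picks a loss-minimizing action for each signal. The joint distribution `P`, the misclassification loss, and the helper names are hypothetical; ties are broken toward the first listed action, which is an arbitrary choice.

```python
from collections import defaultdict
from fractions import Fraction as Fr

# Hypothetical joint distribution P(Y=y, G=g, X=x); keys are (y, g, x).
P = {
    (1, "r", 0): Fr(1, 8),  (0, "r", 0): Fr(1, 8),
    (1, "r", 1): Fr(3, 16), (0, "r", 1): Fr(1, 16),
    (1, "b", 0): Fr(1, 16), (0, "b", 0): Fr(3, 16),
    (1, "b", 1): Fr(3, 16), (0, "b", 1): Fr(1, 16),
}

def loss(a, y):
    """Assumed misclassification loss."""
    return int(a != y)

def bayes_rule(P, T, actions=(0, 1)):
    """For each signal t, pick an action minimizing E[loss(a, Y) | T(X) = t]."""
    # weight[t][y] = Pr(Y = y, T(X) = t); dividing by Pr(T(X) = t) would not
    # change the minimizing action, so the normalization is skipped.
    weight = defaultdict(lambda: defaultdict(Fr))
    for (y, g, x), p in P.items():
        for t, q in T(x).items():
            weight[t][y] += p * q
    return {
        t: min(actions, key=lambda a: sum(w * loss(a, y) for y, w in by_y.items()))
        for t, by_y in weight.items()
    }

def identity(x):
    """No change: T(x) = x with probability 1."""
    return {x: Fr(1)}

def no_information(x):
    """No information: every x maps to the same constant signal."""
    return {"x0": Fr(1)}

f_identity = bayes_rule(P, identity)
f_no_info = bayes_rule(P, no_information)
```

Note that the group label g is ignored inside `bayes_rule`: the Bayes-optimal action minimizes aggregate expected loss, not the loss of either group.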

Definition
The feasible set under Bayes design given X is ${\mathcal {E}}^{*}(X):=\{e(f_{T}):T{\text{ is a garbling of }}X\}$ .
The Pareto set under Bayes design given X is $P^{*}(X):=\{e\in {\mathcal {E}}^{*}(X):{\text{no }}e'\in {\mathcal {E}}^{*}(X){\text{ s.t. }}e'\succ e\}$ , i.e. the set of points that are Pareto-undominated in ${\mathcal {E}}^{*}(X)$ .

Definition
Let $e_{0}:={\underset {a\in {\mathcal {A}}}{\min }}\ \mathbb {E} [\ell (a,Y)]$ be the lowest aggregate error achievable with no information.

File:P under bayes design.png

Proposition
If X is g-skewed, then $P(X)=P^{*}(X)$ iff the aggregate error at both $G_{X}$ and $F_{X}$ is weakly less than $e_{0}$ . If X is group-balanced, then $P(X)=P^{*}(X)$ iff the aggregate error at both $R_{X}$ and $B_{X}$ is weakly less than $e_{0}$ .

Definition
Let $H:=\{e\in \mathbb {R} ^{2}:P_{r}e_{r}+P_{b}e_{b}\leq e_{0}\}$ , where $P_{g}$ is the population share of group g, so that $P_{r}e_{r}+P_{b}e_{b}$ is the aggregate error in the population.

Lemma
${\mathcal {E}}^{*}(X)={\mathcal {E}}(X)\cap H$ and $P^{*}(X)=P(X)\cap H$ .
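
Membership in H is a single linear inequality. The sketch below computes the no-information benchmark and the group shares from a hypothetical joint distribution `P` with an assumed misclassification loss, then tests two error pairs against the half-plane.

```python
from fractions import Fraction as Fr

# Hypothetical joint distribution P(Y=y, G=g, X=x); keys are (y, g, x).
P = {
    (1, "r", 0): Fr(1, 8),  (0, "r", 0): Fr(1, 8),
    (1, "r", 1): Fr(3, 16), (0, "r", 1): Fr(1, 16),
    (1, "b", 0): Fr(1, 16), (0, "b", 0): Fr(3, 16),
    (1, "b", 1): Fr(3, 16), (0, "b", 1): Fr(1, 16),
}

def loss(a, y):
    """Assumed misclassification loss."""
    return int(a != y)

def aggregate_loss(a):
    """E[loss(a, Y)] when the same action a is assigned to everyone."""
    return sum(p * loss(a, y) for (y, g, x), p in P.items())

# Best aggregate error achievable without any information.
e0 = min(aggregate_loss(a) for a in (0, 1))

# Population share of each group.
share = {g: sum(p for (y, gg, x), p in P.items() if gg == g) for g in ("r", "b")}

def in_H(e):
    """True if the aggregate error of the pair e = (e_r, e_b) is at most e0."""
    e_r, e_b = e
    return share["r"] * e_r + share["b"] * e_b <= e0

inside = in_H((Fr(3, 8), Fr(1, 4)))
outside = in_H((Fr(5, 8), Fr(3, 4)))
```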

Special Cases

X Reveals G: The Frontier is Rawlsian

File:X reveals G.png

X reveals G

Definition
G is revealed by X if the conditional distribution $G\mid X=x$ is degenerate for every $x\in {\mathcal {X}}$ .
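
On a finite joint distribution this condition can be tested directly: every covariate value must occur with only one group. The sketch below uses a hypothetical distribution `P` and a hypothetical helper `reveals_group`, and also builds the augmented covariate (X, G), which reveals G by construction.

```python
from collections import defaultdict
from fractions import Fraction as Fr

# Hypothetical joint distribution P(Y=y, G=g, X=x); keys are (y, g, x).
P = {
    (1, "r", 0): Fr(1, 8),  (0, "r", 0): Fr(1, 8),
    (1, "r", 1): Fr(3, 16), (0, "r", 1): Fr(1, 16),
    (1, "b", 0): Fr(1, 16), (0, "b", 0): Fr(3, 16),
    (1, "b", 1): Fr(3, 16), (0, "b", 1): Fr(1, 16),
}

def reveals_group(P):
    """True if each covariate value has positive probability for one group only."""
    groups_by_x = defaultdict(set)
    for (y, g, x), p in P.items():
        if p > 0:
            groups_by_x[x].add(g)
    return all(len(groups) == 1 for groups in groups_by_x.values())

# Augment the covariate with the group label: the new covariate is (x, g).
P_augmented = {(y, g, (x, g)): p for (y, g, x), p in P.items()}

x_alone = reveals_group(P)
x_with_g = reveals_group(P_augmented)
```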

Proposition
If G is revealed by X, then ${\mathcal {E}}(X)$ is a rectangle whose sides are parallel to the axes, and P(X) is the line segment from $R_{X}=B_{X}$ to $F_{X}$ .

$\Rightarrow$ The frontier is Rawlsian.

The disadvantaged group gets its minimal feasible error, so a Rawlsian designer is indifferent between all Pareto points.
The advantaged group's error depends on the designer's preferences:
- A utilitarian designer prefers the point $R_{X}=B_{X}$ .
- An egalitarian designer prefers the point $F_{X}$ .
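
The three designers can be compared on a sample frontier. The sketch below assumes the usual formalizations (a utilitarian minimizes the population-weighted error, an egalitarian minimizes the gap, a Rawlsian minimizes the larger group error), which the text above does not spell out; the frontier points and equal group shares are hypothetical.

```python
from fractions import Fraction as Fr

# Hypothetical points (e_r, e_b) on a frontier where X reveals G.
# Group b is disadvantaged: its error stays at its minimum of 1/2 throughout.
frontier = [(Fr(1, 4), Fr(1, 2)), (Fr(3, 8), Fr(1, 2)), (Fr(1, 2), Fr(1, 2))]
share_r = share_b = Fr(1, 2)  # assumed equal group shares

# Utilitarian: minimize the aggregate (population-weighted) error.
utilitarian = min(frontier, key=lambda e: share_r * e[0] + share_b * e[1])

# Egalitarian: minimize the gap between the two group errors.
egalitarian = min(frontier, key=lambda e: abs(e[0] - e[1]))

# Rawlsian: the objective is the larger of the two group errors.
rawlsian_values = {max(e) for e in frontier}
```

Here `rawlsian_values` contains a single value, which is what indifference between all Pareto points means for the Rawlsian designer.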

Comparing P(X) and P(X,G)

X is group-balanced

File:Comparing P(X) and P(X, G) - X is group-balanced.png

X is group-balanced

(X,G) reveals G, so the new feasible set is a rectangle.

Lemma
The left and bottom boundaries of the new feasible set pass through $R_{X}$ and $B_{X}$ , respectively.

The new frontier does not overlap the old frontier $\Rightarrow$ a uniform Pareto improvement.
This means that, regardless of the designer's fairness-accuracy preferences, using G in some way can improve the designer's payoff.
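
The rectangle can be sketched numerically. When G is observed alongside X, a separate rule can be applied to each group, so each group's error ranges over its own interval independently of the other group. The snippet below computes those intervals for a hypothetical joint distribution `P` with an assumed misclassification loss.

```python
from fractions import Fraction as Fr
from itertools import product

# Hypothetical joint distribution P(Y=y, G=g, X=x); keys are (y, g, x).
P = {
    (1, "r", 0): Fr(1, 8),  (0, "r", 0): Fr(1, 8),
    (1, "r", 1): Fr(3, 16), (0, "r", 1): Fr(1, 16),
    (1, "b", 0): Fr(1, 16), (0, "b", 0): Fr(3, 16),
    (1, "b", 1): Fr(3, 16), (0, "b", 1): Fr(1, 16),
}
XS = (0, 1)

def loss(a, y):
    """Assumed misclassification loss."""
    return int(a != y)

def group_error(rule, g):
    """Error of group g when `rule` maps each covariate value x to an outcome."""
    mass = sum(p for (y, gg, x), p in P.items() if gg == g)
    total = sum(p * loss(rule[x], y) for (y, gg, x), p in P.items() if gg == g)
    return total / mass

# All deterministic rules on X; with G observed, each group gets its own rule.
rules = [dict(zip(XS, outcomes)) for outcomes in product((0, 1), repeat=len(XS))]

def error_range(g):
    """Smallest and largest error achievable for group g."""
    errors = [group_error(rule, g) for rule in rules]
    return min(errors), max(errors)

r_lo, r_hi = error_range("r")
b_lo, b_hi = error_range("b")
# Corners of the feasible rectangle for the covariate (X, G).
rectangle = [(r_lo, b_lo), (r_hi, b_lo), (r_hi, b_hi), (r_lo, b_hi)]
```

The left edge sits at the smallest error achievable for group r and the bottom edge at the smallest error achievable for group b. Those minima are the same whether or not G is observed, which is why the edges pass through $R_{X}$ and $B_{X}$ .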

X is group-skewed

File:Comparing P(X) and P(X, G) - X is group-skewed.png

X is group-skewed

The new feasible set is a rectangle whose left and bottom boundaries pass through $R_{X}$ and $B_{X}$ , respectively.
The new frontier intersects the old frontier $\Rightarrow$ not a uniform Pareto improvement.

References

[1] Annie Liang, Jay Lu, Xiaosheng Mu (2021). Algorithmic Design: Fairness Versus Accuracy.

This article "Algorithmic Design: Fairness Versus Accuracy" is from Wikipedia. The list of its authors can be seen in its historical and/or the page Edithistory:Algorithmic Design: Fairness Versus Accuracy. Articles copied from Draft Namespace on Wikipedia could be seen on the Draft Namespace of Wikipedia and not main one.
