A paper called

The Dunning-Kruger effect is (mostly) a statistical artefact: Valid approaches to testing the hypothesis with individual differences data.

has been doing the Twitter rounds recently. I read the paper. It provides evidence for the Dunning-Kruger effect.

## What’s the what

The Dunning-Kruger hypothesis states that the degree to which people can estimate their ability accurately depends, in part, upon possessing the ability in question.

To test this hypothesis, the paper's authors asked people to guess their IQ and then gave them an actual IQ test. Analyzing this data, they concluded that the Dunning-Kruger effect is not present in it.

The core of the article comes down to the following diagram. Essentially, the traditional argument in favor of the Dunning-Kruger effect is graphical, and comes in the form of the plots in panels A and B. However, these plots dichotomize objective IQ and turn out to be misleading, especially in the presence of measurement error. The arguments against this methodology are worthwhile but I will skip over them to consider evidence for the Dunning-Kruger effect itself.

What we care about is whether or not overconfidence varies as a function of skill. For this particular dataset, we operationalize

\[ \mathrm{overconfidence}_i = \mathrm{SAIQ}_i - \mathrm{IQ}_i \]

and ask if overconfidence varies with IQ. This turns out to be equivalent to regressing self-assessed IQ on IQ and checking whether the slope on the IQ term differs from one.
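To see the equivalence: if \( \mathbb{E}[\mathrm{SAIQ} \mid \mathrm{IQ}] = \alpha + \beta \, \mathrm{IQ} \), then

\[ \mathbb{E}[\mathrm{overconfidence} \mid \mathrm{IQ}] = \alpha + \beta \, \mathrm{IQ} - \mathrm{IQ} = \alpha + (\beta - 1) \, \mathrm{IQ}, \]

so average overconfidence is constant in IQ exactly when \( \beta = 1 \), and decreases with IQ when \( \beta < 1 \).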

Gignac and Zajenkowski (2020) do not report the regression `SAIQ ~ IQ`, nor did they publish their data, but they do report the sample means and standard deviations of `SAIQ` and `IQ`, as well as the sample size and the correlation between `SAIQ` and `IQ`. This lets us back out the results of a simple linear regression (thanks, Twitter, for the lazy math assist).
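Backing a simple regression out of summary statistics takes only the textbook formulas. Here is a minimal sketch in Python; the summary values below are illustrative placeholders, not the numbers reported in the paper:

```python
import math

# Illustrative summary statistics (placeholders, NOT the paper's reported values)
n = 900                            # sample size
mean_iq, sd_iq = 100.0, 15.0       # predictor: measured IQ
mean_saiq, sd_saiq = 123.0, 14.0   # response: self-assessed IQ
r = 0.35                           # correlation between IQ and SAIQ

# Simple linear regression SAIQ ~ IQ recovered from the summary statistics
slope = r * sd_saiq / sd_iq
intercept = mean_saiq - slope * mean_iq

# Standard error of the slope estimate
se_slope = (sd_saiq / sd_iq) * math.sqrt((1 - r**2) / (n - 2))

print(f"intercept = {intercept:.2f}")
print(f"slope     = {slope:.3f} (SE {se_slope:.4f})")
```

A confidence interval for the slope then follows from `slope ± t * se_slope` with `n - 2` degrees of freedom.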

Anyway, I did the calculations, and we get the following regression table:

```
## # A tibble: 2 x 5
##   term      estimate std.error conf.low conf.high
##   <chr>        <dbl>     <dbl>    <dbl>     <dbl>
## 1 intercept   89.0      3.94     81.3      96.8
## 2 slope        0.342    0.0385    0.266     0.417
```

Note that the slope is less than one for all compatible models. On average, for every one-unit increase in `IQ`, someone's self-assessed IQ goes up by 0.34. To see how this is evidence of Dunning-Kruger, we can plot the regression.

Note that the average person with an IQ of 80 self-assesses their IQ to be 116; that is, they are overconfident by 36 IQ points. The average person with an IQ of 100 self-assesses their IQ to be 123; they are only off by 23 IQ points. The average person with an IQ of 120 self-assesses their IQ to be 130; they are only off by 10 IQ points. Everyone is overconfident, but, in this dataset, people with higher IQs are less overconfident. This becomes even clearer if we look at the fitted self-assessed IQ minus actual IQ, which I plot in the right panel.
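The arithmetic behind those fitted values is just the regression line from the table above:

```python
# Intercept and slope from the regression table above
intercept, slope = 89.0, 0.342

for iq in (80, 100, 120):
    fitted_saiq = intercept + slope * iq  # fitted self-assessed IQ
    overconfidence = fitted_saiq - iq     # fitted SAIQ minus actual IQ
    print(f"IQ {iq}: fitted SAIQ {fitted_saiq:.0f}, overconfident by {overconfidence:.0f}")
```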

Should we trust this regression? I'm cautiously optimistic. Eyeballing panel C from the figure in the paper above, things look pretty much ideal. There could still be measurement error issues that make OLS unreliable here, but it's certainly going to be far better than the weird quartile plots from before.
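To make the measurement-error worry concrete: classical noise in the predictor attenuates the OLS slope toward zero, so an estimated slope below one could partly reflect an unreliable IQ measure rather than a genuine skill–overconfidence relationship. A quick simulation on entirely made-up data (the ability, noise, and slope values here are arbitrary, chosen only to show the mechanism):

```python
import random

random.seed(0)
n = 100_000
alpha, beta = 75.0, 0.5  # true intercept and slope (made up)

# Latent ability, a noisy IQ measurement of it, and self-assessed IQ
ability = [random.gauss(100, 15) for _ in range(n)]
measured_iq = [a + random.gauss(0, 10) for a in ability]
saiq = [alpha + beta * a + random.gauss(0, 8) for a in ability]

def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

# The slope on the noisy measurement is shrunk by the reliability
# ratio 15**2 / (15**2 + 10**2), roughly 0.69
print(ols_slope(ability, saiq))      # close to the true 0.5
print(ols_slope(measured_iq, saiq))  # attenuated below 0.5
```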

How do Gignac and Zajenkowski (2020) come to the conclusion that there is no evidence of a Dunning-Kruger effect in this data? Well, they claim the Dunning-Kruger effect should show up as a non-linearity in the regression function, and then fail to find evidence of a non-linear conditional expectation. This reasoning doesn't quite work because a linear regression function can be consistent with the Dunning-Kruger hypothesis, as I pointed out above.

## TL;DR

The previous quartile-based approach to demonstrating the presence of Dunning-Kruger has problems. However, a simple linear regression on the summary data reported in Gignac and Zajenkowski (2020) is still strongly suggestive of a Dunning-Kruger effect.

If you’d like to double check my code, it is available here.

**Update**: Thanks to James Pustejovsky for catching a dumb code error, which is now corrected.

## References

Gignac, Gilles E., and Marcin Zajenkowski. 2020. "The Dunning-Kruger Effect Is (Mostly) a Statistical Artefact: Valid Approaches to Testing the Hypothesis with Individual Differences Data." *Intelligence* 80 (May): 101449. https://doi.org/10.1016/j.intell.2020.101449.