The Forecasting Proficiency Test
And what it teaches us about forecasting
A recent paper from the Forecasting Research Institute develops an easy-to-administer test for forecasting ability. While the best indicator of forecasting ability is performance on forecasting questions, performance on certain cognitive tasks involved in forecasting may be a decent preliminary indicator of forecasting ability.
The best way to judge someone's forecasting skill is to see how well they do in practice; the best predictor of how accurate their forecasts will be is how accurate their forecasts have been. The original Good Judgment Project, which I participated in, was essentially a massive forecasting test: participants who were exceptionally accurate were identified as superforecasters. But it takes on the order of 100 unique questions to reliably distinguish consistently accurate forecasters from people who just happened to make some lucky guesses. We had to spend nine months in a formal forecasting setting to establish a track record substantial enough to demonstrate forecasting skill.
We can potentially identify capable forecasters more quickly by using “intersubjective measures” like proxy scoring. Rather than waiting to see how forecasting questions resolve, we can score forecasters against one another. One way is to score individual forecasts against the aggregate forecast produced by a large group of forecasters. Crowd forecasts are generally fairly accurate since the individual forecasters’ errors tend to cancel one another out, so we can use the crowd aggregate as a proxy for actual outcomes. Another way is to ask people to make meta-predictions about what other forecasters will forecast. Forecasters who can accurately estimate what the crowd aggregate will be demonstrate that they can reproduce a fairly accurate forecast. These intersubjective scoring techniques have been shown to be almost as effective at evaluating forecasters as scoring them against actual outcomes. While intersubjective scoring has the advantage that it can be done in real time, without waiting for forecasting questions to resolve, you’d still normally need access to a large sample of formal forecasts to assess forecasters with these techniques.
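
To make the two intersubjective measures concrete, here is a minimal sketch in Python. It is an illustration under assumptions of my own, not the method used in the paper: the crowd aggregate is taken to be a simple mean of probabilities, the penalty is squared distance from that aggregate, and the function names (`crowd_aggregate`, `proxy_scores`, `meta_prediction_scores`) are hypothetical.

```python
from statistics import mean


def crowd_aggregate(forecasts):
    """Average the probabilities given for each question across forecasters.

    forecasts maps a forecaster's name to a list of probabilities, one per
    question. A simple mean is an assumption; other aggregates are possible.
    """
    n_questions = len(next(iter(forecasts.values())))
    return [mean(probs[q] for probs in forecasts.values())
            for q in range(n_questions)]


def proxy_scores(forecasts):
    """Score each forecaster by mean squared distance from the crowd aggregate.

    Lower is better. Note that each forecaster's own forecast is included
    in the aggregate they are scored against.
    """
    aggregate = crowd_aggregate(forecasts)
    return {
        name: mean((p - a) ** 2 for p, a in zip(probs, aggregate))
        for name, probs in forecasts.items()
    }


def meta_prediction_scores(meta_predictions, forecasts):
    """Score guesses about the crowd aggregate against the realized aggregate.

    meta_predictions maps a name to that person's guess of the crowd
    aggregate for each question. Lower is better.
    """
    aggregate = crowd_aggregate(forecasts)
    return {
        name: mean((g - a) ** 2 for g, a in zip(guesses, aggregate))
        for name, guesses in meta_predictions.items()
    }


# Made-up numbers for three forecasters on two questions.
example = {
    "ana": [0.5, 0.5],
    "ben": [0.5, 0.5],
    "cy": [0.8, 0.2],
}
scores = proxy_scores(example)
meta = meta_prediction_scores({"ana": [0.6, 0.4], "cy": [0.9, 0.1]}, example)
```

In this toy example the forecaster furthest from the crowd gets the worst proxy score, and the person whose guess of the aggregate is closest gets the best meta-prediction score. Neither score requires any question to have resolved, which is the point of the approach.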


