A fascinating piece of investigation by the Microsoft Research team into the way a tiny number of people can massively distort a survey’s results (my emphasis):
Much of the information we have on cyber-crime losses is derived from surveys. We examine some of the difficulties of forming an accurate estimate by survey. First, losses are extremely concentrated, so that representative sampling of the population does not give representative sampling of the losses.
Second, losses are based on unverified self-reported numbers. Not only is it possible for a single outlier to distort the result, we find evidence that most surveys are dominated by a minority of responses in the upper tail (i.e., a majority of the estimate is coming from as few as one or two responses).
Finally, the fact that losses are confined to a small segment of the population magnifies the difficulties of refusal rate and small sample sizes. Far from being broadly-based estimates of losses across the population, the cyber-crime estimates that we have appear to be largely the answers of a handful of people extrapolated to the whole population.
A single individual who claims $50,000 losses, in an N = 1000 person survey, is all it takes to generate a $10 billion loss over the population. One unverified claim of $7,500 in phishing losses translates into $1.5 billion.
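To see how one answer balloons like that, here is a rough sketch of the arithmetic. The population of 200 million is my assumption, not a number given in the excerpt: it is simply the figure at which both quoted examples work out. The function name `extrapolate` is also just illustrative.

```python
def extrapolate(claimed_loss, sample_size, population):
    """Scale one respondent's claimed loss up to a population-wide estimate."""
    # One claim spread across the whole sample gives the per-person average...
    per_capita = claimed_loss / sample_size
    # ...which is then multiplied up to the full population.
    return per_capita * population


POPULATION = 200_000_000  # assumption: inferred from the quoted figures
SAMPLE_SIZE = 1000        # N = 1000, as in the quote

for claim in (50_000, 7_500):
    total = extrapolate(claim, SAMPLE_SIZE, POPULATION)
    print(f"${claim:,} from one respondent -> ${total / 1e9:.1f} billion")
```

Every dollar a single respondent reports gets multiplied by population / N, which is 200,000 under these assumptions, so one unverified figure can outweigh everything else in the sample.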
You can read the full report here.