Political scientists are limited by their reliance on existing data sets, and there is not enough emphasis on creating new data

I first wrote this post in 2012 and it’s still extremely relevant, so here it is again with a lick of new paint.

Shouldn’t there be more to political science than running regression analysis on other people’s datasets? That question has often occurred to me when sat in a university room somewhere around the country at one of the academic political science conferences as yet another presenter looks at British electoral politics through the eyes of a statistician.

It is a tribute to the profession that it produces data sets – most notably the British Election Study series – which are then so heavily used by so many people. A tribute, but also a weakness, because the easy availability of credible data sets makes them deeply seductive, as if the only research that needs doing is locating the data set and exporting its details. Then you can live in a world of mathematical tools and statistical analysis to your heart’s content.

That may sound harsh, but I am struck by the number of times the answer to one of my questions at a session has been, “we’ve not looked at that because the data doesn’t let us”. Not “we’ve not looked at that because we think it is pointless / obvious / stupid / wrong / silly” but instead the answer is that the only things being studied are those the data is available for, conveniently off the shelf.

Consider the case of party funding, which Peter John has used as an example in his defence of the state of British political science:

The record of the study of politics has been very good, with long-term impacts of political science in subjects such as the study of elections, the reform of electoral systems, party funding, decentralisation, devolution, constitutional reform, public management reform, the work of the House of Commons and Lords, and in the conduct of foreign policy.

Yet huge areas of basic information about party funding are a mystery. Much good work has been done with analysis of data reported to and by the Electoral Commission (back to running statistical analysis on other people’s data). Get beyond the limitations of that data set and ignorance descends.

What was the total income of the Liberal Democrats last year? Or the year before? Or the year before that? No-one knows. The vast majority of Lib Dem local parties fall below the threshold for publishing accounts and the vast majority of their income does not come in chunks that require declaration. The money does not feature in the Electoral Commission’s records and so does not feature in political science’s knowledge of British political finance. Look at the number of local parties and the thresholds involved (several hundred local parties, £25,000 threshold each), and we could be talking of millions of pounds missing from the picture. That is not a trivial matter of detail.

Most likely the total is well below the theoretical maximum, but how far below and how have the figures changed over the years? No-one knows, because too much emphasis goes on analysing existing data and not enough on creating new data.
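
To put rough numbers on that theoretical maximum, here is a minimal sketch of the arithmetic. The £25,000 threshold is the figure given above; the party counts of 200, 400 and 600 are purely hypothetical stand-ins for "several hundred", not real data, and the function name is my own.

```python
# Illustrative ceiling only. The party counts are hypothetical stand-ins
# for "several hundred"; the 25,000 pound threshold is the figure in the post.
THRESHOLD_GBP = 25_000


def max_unreported_income(local_parties: int, threshold: int = THRESHOLD_GBP) -> int:
    """Ceiling on total income if every local party sat just under the threshold."""
    return local_parties * threshold


for n in (200, 400, 600):
    print(f"{n} local parties: just under £{max_unreported_income(n):,}")
```

Even the lowest of those assumed counts gives a ceiling in the millions, which is the point: the real figure is almost certainly lower, but nothing in the published data says by how much.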

Nor is this one isolated example. Another big evidential hole is money donated directly to election campaigns rather than via a party, and so declared only on the election expense return. Who gives such money, how much do they give, what is the balance between individuals, companies and unions? All those sorts of questions go not only unanswered but unasked, because political scientists concentrate on the donation records conveniently held and published by the Electoral Commission, which exclude this information.

The more you look into quite what the much-used datasets do and don’t include, the more the questions multiply. The examples go on but the lesson is the same: political scientists do not know nearly enough about what is really going on with British political finances to be able to describe, explain or advise safely and well on points that require an understanding of the whole picture.

Which makes using party funding as an example of political science’s success decidedly odd. There are reasons both good and bad why there are so many unanswered questions. That makes party funding an example of the problems political science faces, not a stand-out advert for its successes.

The wealth of detail in the statistical analysis far too frequently obscures the fragility of the evidence on which it stands. Statistical prowess deployed on such flawed evidence is not good academic study; it’s a diversion from good academic study.

3 responses to “Political scientists are limited by their reliance on existing data sets, and there is not enough emphasis on creating new data”

  1. At a slight angle there's the issue of party membership. There doesn't seem to be any general obligation for parties to publish regular membership figures themselves, so they only do it when they have a good story, as Labour did in 2010.
    The Electoral Commission obliges Labour & the Lib Dems to produce annual membership figures but doesn't publish them till 7 months later. The Tories are let off altogether on the grounds that they don't actually have a national membership.
    I think we should start publishing quarterly figures for the Lib Dems & challenge other parties to do the same, what do you think?

  2. And then the weasel words "no evidence.." start to be trotted out as an objection to any reform. The problem is that the political parties aren't obliged to cooperate with researchers. Lawyer types high up in party hierarchies can be trusted to say no to anything that might conceivably damage a party's interests, public good or not.

    • "There is no evidence" is much favoured by companies that don't fund research in case it produces evidence adverse to their products.
