
Friday, January 19, 2018

Democracy Data, Updated

(Of interest mostly to political scientists or other users of country-year democracy data)

Quick announcement: I’ve just updated my democracyData and QuickUDS R packages (described at more length in this post) to incorporate the latest data from Freedom House (Freedom in the World 2018) and the most recent update of the voice and accountability index from the Worldwide Governance Indicators. The democracyData package (https://xmarquez.github.io/democracyData/) allows you to download, tidy, and use a wide variety of datasets with regime and democracy indicators, while the QuickUDS package (https://xmarquez.github.io/QuickUDS/) facilitates the construction of Unified Democracy Scores-style latent variable indexes of democracy.
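If you want to reproduce the averages plotted below, here’s a minimal sketch using democracyData. It assumes, per the package documentation, a download_fh() function and a tidied score column named fh_total_reversed; treat both as assumptions rather than a verified API:

```r
# Minimal sketch, assuming democracyData exposes download_fh() and that
# the tidied Freedom House data has `year` and `fh_total_reversed`
# (higher = more free) columns, as in the package documentation.
library(democracyData)
library(dplyr)
library(ggplot2)

fh <- download_fh(verbose = FALSE)  # Freedom in the World, tidied

fh %>%
  group_by(year) %>%
  summarise(mean_freedom = mean(fh_total_reversed, na.rm = TRUE)) %>%  # unweighted country mean
  ggplot(aes(x = year, y = mean_freedom)) +
  geom_line() +
  labs(x = "", y = "Average freedom (FH combined score, reversed)")
```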

Here’s what Freedom House’s latest data (use with care!) says about the average level of freedom in the world (all countries equally weighted):


Or aggregated by status (free, partly free, not free):


Not so much evidence of a democratic recession, but some evidence of stagnation.

And here it is for some selected countries:



For contrast, here’s what my version of the Unified Democracy Scores (which incorporate the Freedom House scores as one of their inputs) says about the average level of democracy in the world:


This measure shows a bit more evidence of a decline in the average level of democracy in the world over the past few years, at least according to the indices commonly used by political scientists. But only REIGN and Freedom House have released data for 2017 so far, so it’s best not to take that final dip too seriously.
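The Unified Democracy Scores are estimates from a latent-variable model fitted to many ordinal democracy ratings. The snippet below is not the QuickUDS pipeline, just a sketch of the underlying idea using the mirt package; `ratings` is a hypothetical data frame with one row per country-year and one ordinal democracy index per column:

```r
# Sketch of a UDS-style latent variable model: a one-dimensional graded
# response model over several ordinal democracy ratings. This is an
# illustration with mirt, NOT the QuickUDS pipeline; `ratings` is a
# hypothetical country-year data frame of ordinal index codes.
library(mirt)

model  <- mirt(ratings, model = 1, itemtype = "graded")
scores <- fscores(model, full.scores.SE = TRUE)  # latent score and its standard error
head(scores)
```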

And again the extended UDS for selected countries:



Finally, here’s what the Varieties of Democracy dataset, which I consider to have the best and most flexible set of measures, says:



Here there is only a hint of a downturn in the average level of democracy in the world (but note V-Dem has not yet been updated with 2017 data).

And here is what this data looks like for selected countries:


Enjoy!

Sunday, May 29, 2011

Crowdsourcing a Democracy Index: An Update

(Part 1 of possibly several, depending on time and mood)

A couple of months ago, I set up a democracy ranking website using the Allourideas software as part of a class project to crowdsource a democracy index (which has now been completed; more on that project in an upcoming post). The site works by presenting the user with a random comparison between two countries, and asking them to vote on which of these countries was more democratic in 2010 (click here if you can't see the widget below):



The 100 or so students in my class started the ball rolling, and their responses generated an initial democracy index that had a correlation of about 0.62 with the Freedom in the World index produced by Freedom House: respectable but not great. The post describing the initial results got some links from Mark Belinsky, the Allourideas blog, and Jonathan Bernstein, which increased the number of votes substantially. In fact, as of this writing, the website has registered 4,402 (valid) votes, from about 203 different IP addresses, mostly in the USA, New Zealand, and Australia:


4,402 valid votes means at most 4,402 distinct comparisons out of the 36,672 possible comparisons of 192 countries (most pairings have appeared only once, but a few have appeared a couple of times), or about 12% of all possible comparisons. How has the increase in the number of voters changed the generated index? And how does it compare to the current Freedom House index for 2010? As we shall see, the extra votes appear to have improved the crowdsourced index considerably.

Here is a map of the scores generated by the "crowd" - i.e., voters in the exercise (darker is more democratic, all data here):




And here's a scatterplot comparing the generated scores to Freedom House's scores for 2010 (click here for a proper large interactive version):


The Y axis represents the score generated by the Allourideas software: basically, the probability that the country would prevail in a comparison with a randomly selected country. For example, the Allourideas software predicts that Denmark (the highest ranked country) has a 96% chance, given previous votes, of prevailing in a “more democratic” comparison with another randomly selected country for 2010, whereas North Korea (the lowest ranked country) has only a 5% chance of prevailing in this comparison. The X axis represents the sum of the Freedom House Political Rights and Civil Liberties scores for last year (from the “Freedom in the World 2011” report), reversed and shifted so that 0 is least democratic and 12 is most democratic (i.e., 14 − (PR + CL)). The correlation between Freedom House and the crowdsourced index is a fairly high 0.84 (which is about as high as the correlation between the combined Freedom House score and the Polity2 score for 2008: 0.87). But how good is this, really? What do these scores really represent?
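In R, the rescaling and the reported correlation look like this; the data frame `df` and its column names (`pr`, `cl`, `crowd`) are hypothetical placeholders:

```r
# Reverse and shift the combined Freedom House score so that 0 is least
# and 12 is most democratic. `df` is a hypothetical data frame with
# pr and cl (each 1-7, 1 = most free) and the crowdsourced score `crowd`.
df$fh_combined <- 14 - (df$pr + df$cl)

cor(df$fh_combined, df$crowd)  # reported above as roughly 0.84
```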

At the extremes, judgments of democracy appear to be “easy”: Freedom House and the crowd converge. For example, among countries that Freedom House classifies as “Free,” only five countries (Benin, Israel, Mongolia, Sao Tome and Principe, and Suriname) receive a score of 40 or below from the “crowd,” which is the highest score that any country Freedom House classifies as “Not Free” receives (Russia). But in the middle there is a fair amount of overlap (just as with expert-coded indexes, whose high levels of correlation are driven by the “extreme” cases – clear democracies or clear dictatorships). Some of these disagreements could be attributed to the relative obscurity of some of the countries involved, given the location of the voters in this exercise (few people know much about Benin, and anyway the index got no votes from Africa), but some of the disagreements seem to have more to do with the average conceptual model used by the crowd (e.g., the case of Israel). The crowd would seem to weigh the treatment of Palestinians more heavily than Freedom House does in its (implicit) judgment of Israel’s democracy. This is unsurprising, since the website does not ask participants to stick to a particular “model” of democracy; the average model or concept of democracy to which the crowd appears to be converging seems to be slightly different from the model used by Freedom House.

We can try to figure out where the crowd differs the most from Freedom House by running a simple regression of Freedom House’s score on the score produced by the crowd, and looking at the residuals from the model as a measure of “lack of fit.” This extremely simple model can account for about 69% of the variance in the Freedom House scores on the basis of the crowdsourced score (all data available here); we can improve the fit (to 72%) by adding a measure of “uncertainty” as a control (the number of times a country appeared in an “I don’t know” event, divided by the total number of times it appeared in any comparison). What (I think) we’re doing here is basically trying to predict Freedom House’s index on the basis of the crowdsourced judgment plus a measure of the subjective uncertainty of the participants. The results are of some interest: for example, participants in the exercise appear to think Venezuela, Honduras, and Papua New Guinea have higher levels of democracy than Freedom House thinks, and they also appear to think that Sierra Leone, Lithuania, Israel, Mongolia, Kuwait, Kiribati, Benin, and Mauritius have lower levels of democracy than Freedom House thinks.
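Sketched in R, with the same hypothetical `df` as above plus an `uncertainty` column (“I don’t know” events over total appearances):

```r
# Predict the Freedom House score from the crowdsourced score, with and
# without the crowd's subjective uncertainty as a control; residuals
# show where the crowd diverges most from Freedom House. `df` and its
# columns are hypothetical placeholders.
m1 <- lm(fh_combined ~ crowd, data = df)
m2 <- lm(fh_combined ~ crowd + uncertainty, data = df)
summary(m1)$r.squared  # about 0.69 in the post
summary(m2)$r.squared  # about 0.72 with the uncertainty control

df$lack_of_fit <- resid(m1)
head(df[order(df$lack_of_fit), ])   # FH lower than the crowd's judgment predicts
head(df[order(-df$lack_of_fit), ])  # FH higher than the crowd's judgment predicts
```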

A more interesting test, however, would be to do what Pemstein, Meserve, and Melton do here with existing measures of democracy. Their work takes existing indexes of democracy as (noisy) measurements of the true level of democracy and attempts to estimate their error bounds by aggregating their information in a specific way. I might try to do this later (I need to learn to use their software, and might only have time in a few weeks), though it is worth noting that a simple correlation of the crowdsourced score for 2010 with the “Unified Democracy Scores” Pemstein et al. produce for 2008 by aggregating the information from all available indexes is an amazing 0.87, and a simple regression of one on the other has an R² of 0.76. So the crowdsourced index seems to be doing something much like what the Unified Democracy Scores are doing: averaging different models of democracy and different "perspectives" on each country.

This all assumes, however, that there is something to be measured – a true level of democracy, which is only loosely captured by existing models. On this view, existing indexes of democracy reflect different interpretations of the concept of democracy, plus some noise due to imperfect information and the vagaries of judgment; they each involve a “fixed” bias due to potential misinterpretation of the concept, plus the uncertainty involved in trying to apply the concept to a messy reality whose features are not always easy to discern (try figuring out the level of civil rights violations in the Central African Republic compared with Peru in 2010, quick!). The crowdsourced index actually goes further and averages the different interpretations of democracy of every participant, just as the Unified Democracy Scores aggregate the different “models” of democracy used by different existing indexes. To the extent that the crowd’s models converge to the true model of democracy, the crowdsourced index should also eliminate that “bias” due to misinterpretation. But it is not clear that there is a true model, or that the crowd would converge to it even if one existed: the crowdsourced index may have a higher bias (total amount of misinterpretation of the concept) than the indexes created by professional organizations. (And this conceptual bias might shift if more people from other countries voted; I’d really love to get more votes from Africa and Asia.)

Even if there is no true model of democracy, it would be interesting to “reverse-engineer” the crowd’s implicit model by trying to figure out its components. (What do people weigh most when thinking about democracy? Violations of civil liberties? Elections? Opportunities for participation? Economic opportunities?) One could do this, I suppose, by trying to predict the crowdsourced scores from linear combinations of independently gathered measures of elections, civil liberties, etc.; some form of factor analysis might help here. My feeling is that the crowd weighs economic “outcomes” more than experts do (so that crowdsourced assessments of democracy will be correlated with perceptions of how well a country is doing, like GDP growth), but I haven’t tried to investigate that possibility.
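A sketch of what that reverse-engineering might look like in R; all the component measures here are hypothetical placeholders for independently gathered data:

```r
# Regress the crowdsourced score on hypothetical component measures to
# see which ones carry the most weight in the crowd's implicit model.
m <- lm(crowd ~ elections + civil_liberties + participation + gdp_growth,
        data = df)
summary(m)

# Alternatively, look for shared latent structure with a factor analysis.
factanal(df[, c("crowd", "elections", "civil_liberties",
                "participation", "gdp_growth")],
         factors = 1)
```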

It would also be interesting to repeat the exercise by asking people to stick to a particular model of democracy (e.g., Freedom House’s checklist, or the checklist developed by my students – more on that later). It would also be great if the Allourideas software had an option that allowed a voter to indicate that two countries are equal in their level of democracy (I think one could do this, but then I would have to modify the client; right now, the only way of signalling this is to click on the “I don’t know” button). Perhaps next year I will try some of these possibilities. All in all, it seems that crowdsourcing a democracy index produces reasonable results, and it might produce even better results if the crowdsourcing is done with slightly more controls (e.g., one could imagine using Amazon’s Mechanical Turk and a specific model of democracy for generating data on particular years). I would nevertheless be interested in thoughts/further analysis from my more statistically sophisticated readers.

In an upcoming post I will explain how my students produced an index of democracy for 2010, 1995, and 1980, and how that crowdsourced effort compares with other existing indexes. (Short version: pretty well).

[Update 8:40pm: Made some minor changes in wording, added a couple of links]

Wednesday, March 09, 2011

Crowdsourcing a Democracy Index

(Sorry for the recent neglect of the blog. I just started teaching again, and that tends to absorb all my energy. So here’s a teaching-related post on something I’ve been doing in one of my classes).

One of the things my students are doing in my “Dictatorships and Revolutions” class this term is constructing a democracy index/regime classification like those produced by Freedom House, the Polity project, or the DD dataset of political regimes I’ve used in this blog in the past (see, e.g., here and here).[1] We are looking at examples of how different regime classifications can be constructed, discussing some of their problems, and then collectively constructing a set of criteria for classification, which we will ultimately use to code all 192 or so countries in the world at intervals of about five years for a couple of decades. (If you are interested in the actual details of how the exercise is organized, e-mail me; this whole thing is still quite experimental, so I would not mind some feedback. It’s turning out to be a bit complex.) Since there are over 100 students in the class (around 120, in fact), we can achieve full coverage (and even some overlap) if each student codes just 2 countries (at various points in time); I am planning to assign 4-5 countries to each student, so each country gets at least 2 coders. We will then examine how our crowdsourced index or regime classification compares to some of the other indexes and regime classifications.

As a warm-up exercise, I set up a democracy ranking website using allourideas.org, which I learned about some time ago via the good orgtheory people. Basically, this is a webpage where you are presented with a comparison between two countries, and asked which one is more democratic (you can answer “I don’t know,” and give a reason). The results of the pairwise comparisons can be used to generate a ranking, which represents something like the probability that a given country would be more democratic than a randomly selected country. (But rather than read this explanation, why not go play with it? It can be addictive, and it’s basically self-explanatory once you see it). I asked the students to go to this website in the first class of the term, and to vote; a lot of them voted (an average of about 14 times, i.e., 14 comparisons). I didn’t know exactly what to expect, but I was sort of hoping for a “wisdom of crowds” effect. And there is, indeed, something like that, but the effect is small. Here’s a graph (link for full screen):


The y axis represents the sum of Freedom House’s political rights and civil liberties scores: 2 is most free, 14 least free. The x axis represents the “ranking” of the countries as calculated by the Allourideas software, ranging from 4 (North Korea has only a 4% chance of prevailing in a “more democratic” comparison against a randomly selected country) to 93 (Australia; New Zealand scored 92, and was for a time in first position, which is to be expected of a class based in New Zealand; see the complete ranking here). Note that these numbers do not reflect the judgments of “individual” students, but the calculated probability of prevailing in a comparison against a randomly selected country, given the information available from previous pairwise comparisons. (No student or set of students actually “ranked” North Korea last or Australia first.) The size of the bubbles is proportional to the class’s subjective “uncertainty”: basically, the number of times a country was involved in an “I don’t know” answer divided by the total number of times the country appeared in any comparison. There were 1,250 votes submitted, but since there are 192 countries, the number of possible comparisons is 36,672, which means that a relatively large number of potential comparisons never appeared. (Which is part of the reason I am posting this here – I want to see what happens if lots of people engage in this informal ranking exercise.)
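A crude version of this score can be computed directly from the raw votes. Allourideas itself uses a more sophisticated (Bayesian) estimate; the sketch below is only the naive empirical win rate, and the `votes` data frame (columns `winner` and `loser`, one row per vote) is hypothetical:

```r
# Naive version of the Allourideas score: the share of pairwise
# contests each country has won so far, on a 0-100 scale. Unlike the
# real estimate, this ignores the strength of the opponents faced.
countries <- union(votes$winner, votes$loser)
w <- sapply(countries, function(cc) sum(votes$winner == cc))
l <- sapply(countries, function(cc) sum(votes$loser  == cc))
score <- 100 * w / (w + l)
head(sort(score, decreasing = TRUE), 10)  # top ten countries
```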

There’s clearly a correlation between the rating by Freedom House and the informal rankings generated by the pairwise comparisons produced by the students – about -0.62, which is pretty respectable. (Some of the correlations between Freedom House and other measures of democracy are not much higher than this). A simple regression of the Freedom House ratings on the rankings generated by the students gives a coefficient of -0.11 (highly significant, not that that matters much in this context), which means that an increase of 10 points in the student-generated ranking is associated with a decrease of about 1 point in the combined Freedom House PR+CL score. (A more thorough analysis could be undertaken, but I don’t feel qualified to do it; I’ve put up the data here for anyone who is interested in doing some more exploration, and will update it later if enough other people participate in the ranking exercise).


Most of the “obvious” cases appear at the extremes – developed, well-known democracies get a high ranking, while obvious dictatorships mostly get a low ranking. Many of the countries that seem to be misplaced, however, appear to be either small and little talked about in the news or not especially well known to students; see, for example, Ghana (which is ranked lower than it should be, if Freedom House is right) and Armenia (which is ranked higher than it should be, if Freedom House is right). Would this change if more people contributed to the ranking, especially people from a variety of countries around the world? (I know this blog gets a small readership from a number of unlikely countries – could my kind readers send this link around to people who might be interested, e.g., students?) Here's a heatmap of the student-generated rankings (darker is more democratic):


The map seems reasonable enough to the naked eye. It seems that even a simple informal ranking exercise can be a reasonable approximation to a professional ranking (like that generated by Freedom House) if the people doing the ranking have some knowledge of the countries being compared, so I would expect that more participants would probably move the informal ranking closer to Freedom House’s measure. (Maybe this is a more cost-effective method of generating a democracy index – “the people’s democracy index,” as it were.) But it could also be the case that the ranking would diverge more from the Freedom House ranking as people from diverse countries participated, with different understandings of democracy. Perhaps global opinion about which countries count as most democratic would diverge sharply from the opinions of Freedom House’s expert coders. Or perhaps it would be affected by national biases – people from particular countries might tend to rank their own country higher or lower than a more “objective” ranking would. It would be interesting to know – so it would be great if you could spread the word by sending this link around!

(I have also wondered whether this method would work for generating “historic” data on democracy. But the obvious way of doing this would introduce many very unlikely or difficult comparisons – e.g., could we meaningfully compare democracy levels in 1964 Gambia vs. 1980 Angola using this method? – and the less obvious way would require one to set up a website for each distinct year.)


[1] Technically, an index of democracy and a regime classification are two different things. The Economist and Freedom House produce indexes of democracy/freedom – aggregated numerical measures of the degree of democracy in a given country at a particular point in time. A regime classification instead takes regimes as types, and attempts to determine whether a given country should be categorized as one kind or another.