Piedmont municipalities: population density vs voter turnout

Voter turnout, i.e. the percentage of eligible voters who cast a ballot in an election, is compared with the population density of the 1206 municipalities of the Italian region of Piedmont.

The spots on the scatter plot above represent the 1206 municipalities. The x and y coordinates are, respectively, the logarithm of the population density (people/km^2) and the voter turnout in the 2013 Italian general election for the Chamber of Deputies. The area of each spot reflects the population size of the corresponding municipality.

The analysis can be extended by switching the x-axis from the logarithm of the population density to the logarithm of the population size, and the y-axis to the 2008 and 2006 general elections.

The green straight line shows a linear regression on the set of points; the 95% confidence intervals on the y-intercept and slope are drawn on demand.

A slope greater than zero suggests a direct relationship between the logarithm of the population density (or size) and the voter turnout.

The voter uncertainty is calculated as the Shannon entropy of the partition defined by the votes assigned to every delegate, and it reflects the heterogeneity of the electoral preferences in a given municipality.

Municipalities with a larger population size have a higher uncertainty in electoral preferences.

Further Details

Data Source

Data on general elections: elezionistorico.interno.it, © Ministero dell'Interno. All rights reserved.
Population density and size of Piedmont municipalities: istat.it, Creative Commons Attribution 3.0 Italy (CC BY 3.0 IT)

Mathematical Methods

Linear Regression

The following text is a brief summary of lectures 29, 30 and 31 given by Dmitry Panchenko at MIT. Suppose we are given a set of observations \(\{(X_1,Y_1),\dots,(X_n,Y_n)\}\), where \(X_i,Y_i \in \mathbb{R}\). We model the random variable \(Y\) as a linear function of the random variable \(X\) plus random noise, i.e. we assume

$$ Y_i = b_0 + b_1 X_i + \epsilon_i $$

where the noise terms \(\epsilon_i\sim N(0,\sigma^2)\) are independent and normally distributed, and \(b_0,b_1\in\mathbb{R}\) and \(\sigma^2\) are unknown parameters. In the following we estimate the unknown parameters and their confidence intervals, given the observations.

Let us think of the points \(X_i\) as fixed and non-random and deal only with the randomness that comes from the noise variables \(\epsilon_i\). In other words, each \(Y_i\) has the normal distribution \(N(b_0 + b_1 X_i,\sigma^2)\), because its randomness comes only from the normal variable \(\epsilon_i\). We want to find the estimates \(\hat{b_0}, \hat{b_1}\) and \(\hat{\sigma}^2\) that fit the observations best; the quality of the fit can be measured in different ways. Here we use the maximum likelihood estimates, which maximize the joint probability density \(P(Y_1,Y_2,\dots,Y_n)=P(Y_1)P(Y_2)\dots P(Y_n)\) of observing \(Y_1 \text{ AND } Y_2 \text{ AND } \dots \text{ AND } Y_n\) (the product form holds because the \(Y_i\) are independent). The maximum likelihood estimates are

$$ \hat{b_1} = \frac{\bar{XY}-\bar{X}\bar{Y}}{\bar{X^2}-\bar{X}^2},\qquad \hat{b_0} = \bar{Y} - \hat{b_1}\bar{X},\qquad \hat{\sigma}^2 = \frac{1}{n}\sum^n_{i=1}(Y_i - \hat{b_0} - \hat{b_1}X_i)^2 $$

where a bar denotes a sample mean, e.g. \(\bar{X}=\frac{1}{n}\sum_{i=1}^n X_i\), \(\bar{XY}=\frac{1}{n}\sum_{i=1}^n X_iY_i\) and \(\bar{X^2}=\frac{1}{n}\sum_{i=1}^n X_i^2\) (note that \(\hat{b_0}\) and \(\hat{b_1}\) coincide with the estimates found with the least-squares method). Knowing the estimates is enough to draw the line that fits the observations, but an important further piece of information (the confidence) comes from the distribution of the estimates.
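
As a minimal sketch of the estimates above, the following Python function computes \(\hat{b_0}\), \(\hat{b_1}\) and \(\hat{\sigma}^2\) from the sample means. The function name `fit_line` and the toy data are illustrative assumptions; they are not taken from the visualization's own code or data.

```python
from statistics import fmean


def fit_line(xs, ys):
    """Return (b0_hat, b1_hat, sigma2_hat) for the model Y = b0 + b1*X + noise.

    Hypothetical helper: it applies the maximum likelihood formulas
    written in terms of the sample means of X, Y, XY and X^2.
    """
    n = len(xs)
    x_bar = fmean(xs)
    y_bar = fmean(ys)
    xy_bar = fmean([x * y for x, y in zip(xs, ys)])
    x2_bar = fmean([x * x for x in xs])

    b1 = (xy_bar - x_bar * y_bar) / (x2_bar - x_bar ** 2)
    b0 = y_bar - b1 * x_bar
    # Maximum likelihood variance estimate: divides by n, not n - 2.
    sigma2 = sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys)) / n
    return b0, b1, sigma2


# Made-up points lying exactly on the line y = 1 + 2x.
b0, b1, sigma2 = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
```

On these noiseless toy points the fitted intercept and slope recover the generating line and the residual variance vanishes (up to floating-point rounding).
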
The estimates have a probability distribution because they are functions of the \(Y_i\), which are themselves random. What are the confidence intervals? An example makes this clear. It can be proved that the random variable \(\hat{\sigma}^2\) is independent of \(\hat{b_0}\) and \(\hat{b_1}\) and that, suitably rescaled, it has a chi-squared distribution with \(n-2\) degrees of freedom. In formulas

$$ \frac{n\hat{\sigma}^2}{\sigma^2}\sim\chi^2_{n-2}. $$

Note that \(\chi^2_{n-2}\) is a well-known distribution, i.e. we can calculate probabilities with it. For example, let \(\alpha=0.05\) and let \(\chi^2_{n-2}(x)\) denote the density of the distribution. If we find the values \(c_1\) and \(c_2\) such that \(\int_0^{c_1}\chi^2_{n-2}(x)\,dx=\alpha/2\) and \(\int_{c_2}^{\infty}\chi^2_{n-2}(x)\,dx=\alpha/2\), then the probability of the remaining interval is \(\int_{c_1}^{c_2}\chi^2_{n-2}(x)\,dx=1-\alpha = 0.95\), which means that

$$ P\left(c_1\leq\frac{n\hat{\sigma}^2}{\sigma^2}\leq c_2\right) = 0.95. $$

Solving for \(\sigma^2\) we find the \(1-\alpha\) confidence interval

$$ \frac{n\hat{\sigma}^2}{c_2}\leq\sigma^2\leq\frac{n\hat{\sigma}^2}{c_1}. $$

Note that the confidence interval can be computed from the observations alone. In the same sense, the \(1-\alpha\) confidence intervals of \(b_1\) and \(b_0\) are

$$ \hat{b_1} - c \sqrt{\frac{\hat{\sigma}^2}{(n-2)(\bar{X^2}-\bar{X}^2)}} \leq b_1 \leq \hat{b_1} + c \sqrt{\frac{\hat{\sigma}^2}{(n-2)(\bar{X^2}-\bar{X}^2)}} $$

$$ \hat{b_0} - c \sqrt{\frac{\hat{\sigma}^2}{n-2}\left(1+\frac{\bar{X}^2}{\bar{X^2}-\bar{X}^2}\right)} \leq b_0 \leq \hat{b_0} + c \sqrt{\frac{\hat{\sigma}^2}{n-2}\left(1+\frac{\bar{X}^2}{\bar{X^2}-\bar{X}^2}\right)} $$

where the value \(c\) comes from the Student distribution with \(n-2\) degrees of freedom: \(t_{n-2}(-c,c)=1-\alpha\), i.e. a variable with that distribution falls in \((-c,c)\) with probability \(1-\alpha\).
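
The confidence intervals for \(b_0\) and \(b_1\) can be sketched in Python as follows. This is an illustration under stated assumptions, not the code behind the plot. The function name `confidence_intervals` and the toy data are hypothetical. By default the critical value \(c\) is approximated with the standard normal quantile, which is close to the Student quantile only when \(n\) is large (as with 1206 municipalities). For small samples the exact \(t_{n-2}\) quantile should be passed explicitly through the `c` argument.

```python
from math import sqrt
from statistics import NormalDist, fmean


def confidence_intervals(xs, ys, alpha=0.05, c=None):
    """Return ((b0_low, b0_high), (b1_low, b1_high)) at level 1 - alpha.

    Hypothetical helper. If c is None, the Student t_{n-2} critical value
    is APPROXIMATED by the standard normal quantile (large-n assumption).
    """
    n = len(xs)
    x_bar = fmean(xs)
    y_bar = fmean(ys)
    xy_bar = fmean([x * y for x, y in zip(xs, ys)])
    x2_bar = fmean([x * x for x in xs])
    var_x = x2_bar - x_bar ** 2

    # Maximum likelihood estimates of slope, intercept and noise variance.
    b1 = (xy_bar - x_bar * y_bar) / var_x
    b0 = y_bar - b1 * x_bar
    sigma2 = sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys)) / n

    if c is None:
        # Assumption: normal quantile stands in for the t_{n-2} quantile.
        c = NormalDist().inv_cdf(1 - alpha / 2)

    half_b1 = c * sqrt(sigma2 / ((n - 2) * var_x))
    half_b0 = c * sqrt(sigma2 / (n - 2) * (1 + x_bar ** 2 / var_x))
    return (b0 - half_b0, b0 + half_b0), (b1 - half_b1, b1 + half_b1)


# Made-up noisy points scattered around the line y = 2x.
ci_b0, ci_b1 = confidence_intervals([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
```

Each interval is symmetric around the corresponding estimate, and it widens as the residual variance grows or as the spread of the \(X_i\) shrinks.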

Entropy Analysis

The uncertainty is calculated as the Shannon entropy of the vote shares obtained by the delegates. For example, given the municipality \(x\), suppose that the three delegates \(a\), \(b\) and \(c\) obtained \(v_a^x\), \(v_b^x\) and \(v_c^x\) votes (the total number of votes is \(v^x = v_a^x + v_b^x + v_c^x\)). Then the uncertainty \(\mathcal{H}(x)\) is calculated as the Shannon entropy $$ \mathcal{H}(x) = -\sum_{p\in\{a,b,c\}} \frac{v_p^x}{v^x}\ln\frac{v_p^x}{v^x} = - \frac{v_a^x}{v^x}\ln\frac{v_a^x}{v^x} - \frac{v_b^x}{v^x}\ln\frac{v_b^x}{v^x} - \frac{v_c^x}{v^x}\ln\frac{v_c^x}{v^x} $$ Note that the entropy is not normalized to the maximal entropy (\(\ln v^x\)), because the number of delegates is nearly the same for every municipality (such a normalization would make the entropy depend directly on the population size).
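
A minimal Python sketch of this calculation is given below. The function name `shannon_entropy` and the vote counts are illustrative assumptions. Delegates with zero votes are skipped, following the usual convention \(0\ln 0 = 0\), which the text above does not state explicitly.

```python
from math import log


def shannon_entropy(votes):
    """Shannon entropy (natural log) of the vote shares in one municipality.

    Hypothetical helper: `votes` is a list of vote counts, one per delegate.
    Zero counts are skipped (convention 0 * ln 0 = 0, an assumption here).
    """
    total = sum(votes)
    return -sum(v / total * log(v / total) for v in votes if v > 0)


# Made-up counts for three delegates a, b and c.
h_even = shannon_entropy([100, 100, 100])   # votes split evenly
h_skewed = shannon_entropy([280, 10, 10])   # one delegate dominates
```

An even split among three delegates gives the largest value attainable with three delegates, \(\ln 3\), while a municipality dominated by a single delegate gives a value closer to zero.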


The analysis above is not exhaustive, and the observations it highlights need further research before they can be established with higher statistical confidence.