Quantum Arrow's Theorem
Contents
- 1 Polymath: Does Arrow's Theorem hold in quantum voting?
- 2 Information-Theoretic Formulation of the Model
- 3 Miscellaneous Thoughts
- 3.1 Formal statement of the theorem
- 3.2 What is a good model for quantum preference?
- 3.3 Non-Quantum Superpositions of Preference Orderings
- 3.4 A model for the single-voter spinor (wavefunction)
- 3.5 Election Day and Stern-Gerlach-style Voting
- 3.6 What are the quantum analogues of Arrow's axioms?
- 3.7 What is a Dictator in the quantum setting?
- 3.8 Quantum Arrow's Theorem
- 3.9 Proof of quantum Arrow's theorem
- 3.10 Kalai's Theorem
Polymath: Does Arrow's Theorem hold in quantum voting?
Arrow's theorem is a classical result in political game theory. One wishes to construct an aggregation rule (or social welfare function) which takes as input the individual preference rankings members of the society have, and output an aggregated preference ranking.
Arrow's theorem only holds in the case that there are more than 2 choices to compare.
One naturally imposes some restrictions on the aggregation rule, which we take as the classical axioms: Pareto efficiency and Independence of Irrelevant Alternatives (see below for a description. Arrow's theorem states that if an aggregation rule satisfies these axioms, then one member of society necessarily acts as a Dictator, i.e., there exists one person whose personal preference ranking determines the aggregated ranking.
This Polymath Project has two purposes:
- Identify what a system of ``quantum preferences is, and
- Determine whether Arrow's Theorem holds or not in that setting. i.e., is there a Dictator whose preferences determine the aggregated preferences?
Before doing a quantum formulation of the model, we should simply set up a probabilistic model. i.e., individuals have superpositions of preference ratings.
Information-Theoretic Formulation of the Model
We should solve this model first.
Probability weights on the space of preference orders [math]\Sigma = L(A)^N[/math].
The space of probability weights on [math]\Sigma[/math] is called [math]\Delta(\Sigma)[/math]. This space forms a compact, convex simplex. The probability weights are coordinates on the simplex. There is a nice topology.
The preference aggregation rule [math]F[/math] originally is only defined on the extremal points of the simplex [math]Sigma[/math]. We impose the condition of symmetry: the preference aggregation rule is symmetric in its orderings. i.e., if [math]\vec R = (R_1, \cdots, R_N)[/math] is a preference ranking of society, and [math]\pi\vec R[/math] is a permutation, then [math]F(\vec R) = F(\pi\vec R)[/math]. Suppose there are [math]k[/math] alternatives; we then define [math]K = k![/math] for the number of extremal points of the simplex.
We extend the preference aggregation rule on the whole space [math]\Delta(\Sigma)[/math]. In particular, we impose the condition that if any two superpositioned rankings [math]\vec p, \vec q \in \Delta(\Sigma)[/math] have the same Shannon entropy (i.e., [math]\sum p_i \log p_i = \sum q_i \log q_i[/math]), then [math]F(\vec p) = F(\vec q)[/math]. **Argument: The entropy of two microstates somehow conveys all the information content. Therefore the aggregation rule applied to these different microstates (i.e., the macrostate) should be the same. We define [math]\operatorname{Ent} \vec p = \sum_{i=1}^K p_i \log p_i[/math].
We also want to impose a majorization condition. Put a partial ordering [math]\prec[/math] on the space [math]\Delta(\Sigma)[/math]. The partial ordering should be majorizing: this means that [math](p_1,\cdots,p_K) \prec (q_1, \cdots, q_K)[/math] if and only if [math]p_1 \le q_1[/math], [math]p_1 + p_2 \le q_1 + q_2[/math], [math]p_1 + p_2 + p_3 \le q_1 + q_2 + q_3[/math], etc.
Majorization is a sufficient condition for the entropy reversal property to hold. If [math]\vec p \prec \vec q[/math], then [math]\operatorname{Ent} \vec q \le \operatorname{Ent} \vec p[/math]. )(General result: Banerjee and Lawson have proved that there exists a complete partial ordering which satisfies only entropy reversal, but nobody has been able to characterize it yet.)
In addition to this general, abstract assumption, we also impose Pareto Efficiency and the Independence of Irrelevant Alternatives axioms on [math]F[/math]:
- Pareto Efficiency
- If alternative a is ranked above b for all orderings [math] R_1 , \ldots, R_N [/math], then a is ranked higher than b by [math] F(R_1, R_2, \ldots, R_N) [/math]. (Note that unanimity implies non-imposition).
- Independence of Irrelevant Alternatives (IIA)
- For two preference profiles [math] (R_1, \ldots, R_N) [/math] and [math] (S_1, \ldots, S_N) [/math] such that for all individuals i, alternatives a and b have the same order in [math]R_i[/math] as in [math]S_i[/math], alternatives a and b have the same order in [math] F(R_1,R_2, \ldots, R_N)[/math] as in [math] F(S_1,S_2, \ldots, S_N) [/math].
Arrow's Theorem. Let [math]F : \Delta(\Sigma) \to L(A)[/math] be a preference aggregation rule on the simplex of probability distributions on [math]\Sigma[/math]. Suppose that [math]F[/math] satisfies the majorization property, Pareto efficiency, and the IIA criterion. Then there is a dictator: how should we express the dictator property in this setting?
Miscellaneous Thoughts
Formal statement of the theorem
(taken from Wikipedia)
Let [math] \mathrm{A} [/math] be a set of outcomes, [math] \mathrm{N} [/math] a number of voters or decision criteria. We shall denote the set of all full linear orderings of [math] \mathrm{A} [/math] by [math] \mathrm{L(A)} [/math].
A (strict) social welfare function (preference aggregation rule) is a function [math] F : \mathrm{L(A)}^N \to \mathrm{L(A)} [/math] which aggregates voters' preferences into a single preference order on [math] \mathrm{A} [/math].<ref>Note that by definition, a social welfare function as defined here satisfies the Unrestricted domain condition. Restricting the range to the social preferences that are never indifferent between distinct outcomes is probably a very restrictive assumption, but the goal here is to give a simple statement of the theorem. Even if the restriction is relaxed, the impossibility result will persist.</ref> The [math] \mathrm{N} [/math]-tuple [math] (R_1, \ldots, R_N) [/math] of voters' preferences is called a preference profile. In its strongest and most simple form, Arrow's impossibility theorem states that whenever the set [math] \mathrm{A} [/math] of possible alternatives has more than 2 elements, then the following three conditions become incompatible:
- unanimity, or Pareto efficiency
- If alternative a is ranked above b for all orderings [math] R_1 , \ldots, R_N [/math], then a is ranked higher than b by [math] F(R_1, R_2, \ldots, R_N) [/math]. (Note that unanimity implies non-imposition).
- non-dictatorship
- There is no individual i whose preferences always prevail. That is, there is no [math] i \in \{1, \ldots,N\} [/math] such that Template:Nobreak
- independence of irrelevant alternatives
- For two preference profiles [math] (R_1, \ldots, R_N) [/math] and [math] (S_1, \ldots, S_N) [/math] such that for all individuals i, alternatives a and b have the same order in [math]R_i[/math] as in [math]S_i[/math], alternatives a and b have the same order in [math] F(R_1,R_2, \ldots, R_N)[/math] as in [math] F(S_1,S_2, \ldots, S_N) [/math].
What is a good model for quantum preference?
Here, we take [math] R_1, \ldots, R_N [/math] to be basis vectors in [math]N[/math]-dimensional Hilbert space [math]\mathbb C^N[/math].
The whole point of quantum preference is that people's true preferences can exist in superposition: e.g., they may be .8 liberal and .6i conservative ([math].8^2 + .6^2 = 1[/math]).
Non-Quantum Superpositions of Preference Orderings
Here is a non-quantum perspective of superpositions using classical probability. Take the space [math]L(A)[/math] of preference orderings. In Classic Arrow's Theorem, a voter chooses an preference order [math]R[/math] selected from [math]L(A)[/math].
We could also consider a situation with mixed preference rankings. Let [math]Σ[/math] be the space of probability measures on [math]L(A)[/math]. Now, a voter chooses a mixed preference rating [math]P \in Σ[/math]. Choosing a particular preference ordering [math]R[/math] corresponds to choosing the Dirac delta-measure [math]δ\ltsub\gtR\lt/sub\gt[/math]
Does an analogue of Arrow's Theorem still hold in this setting?
A model for the single-voter spinor (wavefunction)
Let [math] k\in\mathbb N[/math] be the cardinality of the set [math] \mathrm{A} [/math], the set of outcomes (e.g., the candidates for president). In this bizarre country, a voter votes for every candidate on the ballot; however, he gives each of them a distinct rank from 1 to [math] \mathrm{k} [/math]. There are thus [math] \mathrm{k!} [/math] distinct orderings he could conceivable have. In general, then, the quantum voter will be in a superposition of all these states:
[math]|\psi\rangle = \sum_{j=1}^{k!} a_j P_j [/math], where [math]a_j\in\mathbb C[/math] and [math] \sum_{j=1}^{k!} |a_j|^2 =1 [/math], and [math]P_1, \ldots, P_{k!}[/math] are all the preference states classical voters choose among. To be clear, [math] P_j: A \to \{1, \ldots, k \}[/math] injectively.
Thus the state space for a single quantum voter is [math]\mathbb C^{k!}[/math]. The need for complex (as opposed to real) coefficients in the basis expansion is not clear. An evolution equation for the quantum state is needed. The nature of this equation will determine whether we need a real or complex vector space to describe the set of all possible quantum voter states. Thus far, we have only defined one observable: the presidential ballot. The k!-component spinor described above is represented in the eigenbasis of the presidential ballot. We could introduce other observables. To get truly quantum phenomena, we necessarily need at least one pair of incompatible observables. Two observables are incompatible if and only if their corresponding linear operators (on the state space) don't commute.
The toppings I want on my pizza and the vegetables I want in my salad cannot be simultaneously observed, because I can't have salad if I'm having pizza, and I can't have pizza if I'm having salad. The pizza operator does not commute with the salad operator.
Election Day and Stern-Gerlach-style Voting
Suppose the voters are casting ballots for president and for governor, and suppose further that the respective observables are incompatible. That is, the presidential ballot operator doesn't commute with the gubernatorial ballot operator. Thus, voting for both offices cannot take place simultaneously. So let's say voters cast for president first, then for governor. This is highly reminiscent of the Stern-Gerlach experiment in quantum mechanics. What the secretary of state reports as the results of the election is essentially the output of the social welfare function. What should the honorable secretary report? When a quantum voter casts for, say, president, his spinor/wavefunction (which is, in general, a linear combination of all k! eigenvectors of the presidential ballot operator) collapses onto a single eigenvector of the presidential ballot. The collapse mechanism is not known and, to be in perfect analogy with quantum mechanics, cannot be known: it is purely random. To be clear, the coefficient of the eigenvector to which the spinor collapsed may have had a relatively small square modulus in the voter's state's eigenbasis expansion. Thus, the fact that he collapsed onto a particular eigenvector after voting for president tells us nothing about his actual state. To get information about the sizes of the [square moduli of the] coefficients his state assigns to each eigenvector, we need to have the voter cast a ballot for president multiple times. Moreover, we need him to cast a ballot for governor in between casting for president, so that we shake him out of a presidential eigenstate and into a gubernatorial eigenstate (which is a superposition of presidential eiegenstates, since the gubernatorial operator is not simultaneously diagonalizable with the presidential operator). If the voter casts, say, a thousand votes for president in a single day, then we have a histogram for the outcomes he produced. This histogram gives us some approximation of the coefficients in the voter's spinor, up to some error (determined by the fact that we have a finite data set). In general, the voter could cast [math]\omega\in\mathbb N[/math] votes for president; and (using elementary statistics) we can calculate an upper bound on the error, as we try to say what his spinor's components are based on the histogram generated with [math]\omega[/math] data points. If we decide we want our error to be less than epsilon, then we can invert the calculation to determine what [math]\omega[/math] should be. So the secretary of state collects the results of, say, a thousand voting iterations per voter per office, and produces his social welfare function's results as a function of this data. In short: for each voter, the voting lasts all day.
Doesn't this assume that the voter returns to his initial state upon casting for governor? What basis would you have for thinking that? None. This is all garbage. Maybe we have to come up with an evolution equation to be sure about any of this.
Correction: what I said above doesn't necessitate an evolution equation for this. If we have explicit forms for the operators, then we can write the eigenvectors of each in the eigenbasis of the other. So if I measure the governor ballot, then I'm in an eignenstate of it; which means I'm in a (precisely known) linear combination of eigenstates of the presidential ballot. Thus, the next measurement is collapsing this state, which means I'm not getting any information about the original state('s coefficients). Thus this is all pointless.
I am having ice cream. How much chocolate sauce would I like on my ice cream? Well, my options are indexed by a set of cardinality continuum. But ultimately my preference wavefunction is going to collapse to a single definite finite value for how much chocolate sauce I apply to my ice cream.
What are the quantum analogues of Arrow's axioms?
What is a Dictator in the quantum setting?
It is not clear what it means for one person to be a Dictator in this setting.
Quantum Arrow's Theorem
Statement of theorem here.
Proof of quantum Arrow's theorem
The proof should be straightforward once we determine the statement of the theorem.
Kalai's Theorem
Here is an article by Gil Kalai on this subject: http://www.ma.huji.ac.il/~kalai/arr.pdf
He assumes that the social choice function [math]f[/math] is rational: this means that the outcome is always a preference order. We assumed this already when we said that its codomain is the space [math]\Sigma[/math] of preference orders.
Kalai assumes the function is neutral: this means that it is invariant under permutations of choices to be ranked. He also assumes it is symmetric: this means it is constant under permutations of its inputs. e.g., if the population size is [math]N = 2[/math], [math]f(R_1, R_2) = f(R_2, R_1)[/math]
== The Structure of the Space [math]\Delta(\Sigma)[/math]
Can do barycentric subdivision. Every simplex has an inherent partial order. So there's partial ordering on Sigma.
Focus on the part of the simplex (a barycentric subdivision) where there is a total ordering.
We want to impose an order on the whole s
Nice orders have the entropy reversal property.
Examples of orders: Bayesian order (not complete), stochastic order
Take a basis in terms of the barycentric ordering. Defines a chain: a strictly increasing sequence of probability vectors.
We want to have limits in order to prove the existence of a dictator.
Majorization order: We obtain by taking isomorphic copies of the barycentric ordering, and piece them together.
Let [math]\Sigma = L(A)[/math], the space of individual preference rankings. Let [math]\Omega = \Sigma^N[/math] be the space of society's preference rankings.
We define a probability measure [math]\mathbb P[/math] on the space [math]\Omega = \Sigma^N[/math]. (what assumptions should this measure satisfy?) In the case of truly independent voters, this would be a product measure on the space [math]\Omega[/math]. In a more realistic model, there will be correlation between voters' preference rankings. Let [math]K[/math] be the rank of the correlation matrix of the measure [math]\mathbb P[/math]; for now, let's assume that [math]K = N[/math].
expect that any realistic model has correlation between different voters' actions.
Let [math]f : \Sigma^N \to \Sigma[/math] be any function. This represents the aggregation rule. The function [math]f[/math] should satisfy some axioms. In analogy with Classical Arrow's Theorem, it should satisfy Pareto Efficiency (PE) and Independence of Irrelevant Alternatives (IIA). In analogy with the mathematics of the concentration-of-measure phenomenon in mathematical probability, the function [math]f[/math] should be uniformly Lipschitz in the population size.
Let [math]X \in \Sigma^N[/math] be a random variable with probability law [math]\mathbb P[/math]. We should think of [math]X[/math] as simultaneously sampling the full preference ranking of all of society. Of course, we can never truly observe the outcome of [math]X[/math]; it exists solely as a mathematical model.
Now, the random variable [math]f(X) \in \Sigma[/math] represents the aggregated preference of the society.
Let [math]m_f[/math] be a median of the random variable [math]f(X)[/math]. The concentration of measure phenomenon suggests that there exist constants [math]C[/math] and [math]\lambda[/math] (depending on the aggregation rule [math]f[/math] and the distribution [math]\mathbb P[/math]) such that [math]\mathbb P( |f(X) - m_f| \gt u ) \le \mathrm C e^{-\lambda N u}[/math]. This means that the random outcome [math]f(X)[/math] is very close to its median [math]m_f[/math] with extremely high probability.
Question: Is there a dictator? We know that [math]f(X)[/math] is concentrated on its median. We say that an actor [math]i[/math] is a dictator when the median [math]m_f[/math] is very concentrated on the preference of person [math]i[/math]. i.e., [math]m_f \approx \pi_i(X)[/math], where [math]\pi_i[/math] is the projection onto the [math]i[/math] coordinate of the random vector [math]X[/math]