Lecture notes for COMS 6998 - Social Network Economics
Fall 2013, Columbia University
by A. Chaintreau
All the models we consider so far assume that, when presented with various set of evidence, a user weights new information in a linear manner, while taking into account a personal signal that remains private. It is now time to introduce a subtle effect that we generally refer as an information “selection”. This generally accounts for various effects that have been sometimes be defined and considered separately:
- Biased assimilation: Empirical results have shown that a person, when presented with subjective inconclusive evidence, will typically accept for a fact any confirmation a prior opinion, while critically examine (and hence, weight less) any statement contradicting a current belief. Bounded confidence: As a sort of extreme version of biased assimilation, under this assumption, a person only takes into account the new information she receives if that information lies within a bounded confidence interval around the current belief she holds.
- State dependent connectivity: A generalization of the two situation described above is that either by consequence of the above or due to another phenomena, the subset of people you are connected to is not anymore an exogenous factor describing your immediate social environment, but depends in an endogenous manner from the opinion you currently holds.
- Homophily: A very generic term used in a variety of contexts to denote that people who share common traits are more likely to meet, develop social relationship, or trust each other’s opinions. Accounting for homophily may take various forms. Some mild variants of homophily (that we generally call
passive" or
latent”) assume that the network connecting individuals together presents some natural communities (as oppposed to random graphs). These clusters may be explained by other traits (geography, social background) and initially contain individual with different opinions but otherwise do not affect the interaction any further. More “active” variants of homophily end up being equivalent to the above phenomena.
Under any of the forms of these effects (except for “passive” homophily) it is not hard to find simple examples where a population fragments into multiple groups that held different opinions, typically distant, and do not interact. Under most extreme forms of selection, one can observe a polarization either temporarily or persistently. This means that the divergence of opinions increases as time goes, a striking contrast with the prediction of a consensus algorithm.
Homophily
B. Golub and M. O. Jackson, “How homophily affects the speed of learning and best-response dynamics,” The Quarterly Journal of Economics, vol. 127, no. 3, pp. 1287–1338, 2012.
Analyzes iterated averaging on a multi-type uniform random graph (where the probability of an undirected edge (u,v) is Pk,l when u is of type k and v of type l). Iterated averaging in this case converges to average of initial estimator weighted by degree, the question is how fast?
The main contributions are:
- Define a spectral homophily factor as the second largest eigenvalue denoted h of F(P,n) where
Fk,l(P,n)=nlPk,l/∑mnmPk,m denotes the fraction of edges leaving from k that are directed towards nodes in l. As an example, for m types internally connected with proba p and externally connected with proba q<p, we can show this eigenvalue is (p−q)/(m∗avg link proba), which was previously proposed.
- Shows (THM1) that the convergence time satisfies for any constant γ CT(γn)∼log(n)/log(1/h) where ∼ indicates "asymptotically within a factor 2 (so something stronger than Θ)" and CT(ε) is
supbmin{t|∑idi(A)D(A)|(T(A)tb)i−(T(A)∞b)i)|2}≤ε so this is the time for which the square error as compared to the limit and weighted by the weight of influence is less than ε.
The important assumptions are that (1) minimum expected degree grows infinitely faster than log2(n), (2) all groups keep are within the same order in size (non-vanishing fractions) and probability in the matrix P also are all within a constant factor, (3) homophily (as defined above) stays away from zero or one asymptotically.
- This result also indicates that the density of the matrix is not important when it comes to worst case consensus, a result that can be tightened by showing that renormalization of matrix asymptotically leads to exactly the same value of convergence time.
- By showing that this multi-type model has a network diameter proportional to log(n)log(secdeg(P) where the denominator is a reweighted estimator of the squares of degrees. The main lessons learned is that homophily do not affect this value a lot.
NB:
- Proof techniques relies on an argument (THM2) establishing that the second eigenvalue of the matrix under the above conditions is the same as the eigenvalue of the matrix F(P,n) through a law of large number argument.
Bounded confidence
Motivation and model
One of the simplest form of information selection comes from extending the general weighted averaging models we have encountered to only operate on opinions that are within a given distance to the current one, corresponding to the rules:
- As a nodal update, xi←∑j∈Ni|||xj−xi||≤1xj∑j∈Ni|||xj−xi||≤11
- As a pairwise update, when i and j interacts:
xi,xj←xi+xj2 if ||xi−xj||≤1 and no effective update otherwise.
The synchronous nodal updates is sometimes referred to as the Hegselmann-Krause dynamics, and the pairwise update Deffuant-Weisbuch named after the papers that first introduced them.
R. Hegselmann and U. Krause, “Opinion dynamics and bounded confidence models, analysis, and simulation,” Journal of Artificial Societies and Social Simulation, vol. 5, no. 3, 2002.
G. Deffuant, D. Neau, F. Amblard, and G. Weisbuch, “Mixing beliefs among interacting agents,” Advs. Complex Syst., vol. 3, no. 1, pp. 87–98, Jan. 2000.
Most of the analyses of these models have primarily been through numerical simulations.
A good example is the study of a critical threshold for the size of the confidence interval (related to the original diameter) to obtain a consensus. This result has not be demonstrated rigorously.
S. Fortunato, “On the consensus threshold for the opinion dynamics of Krause–Hegselmann,” International Journal of Modern Physics C, vol. 16, no. 2, pp. 259–270, 2005.
J. Lorenz, “Continuous opinion dynamics under bounded confidence: A survey,” International Journal of Modern Physics C, vol. 18, no. 12, pp. 1819–1838, 2007.
Proof of convergence
Most of the analytical results deal with characterizing the convergent nature of the process. A relatively classical and general result exploits properties of the product of row-stochastic matrices to prove it in quite generality.
J. Lorenz, “A stabilization theorem for dynamics of continuous opinions,” Physica A: Statistical Mechanics and its Applications, vol. 355, no. 1, pp. 217–223, 2005.
This paper proves that a sequence of row stochastic matrices A(x(t),t) defining a sequence as x(t+1)=A(x(t),t)x(t) eventually converges to classes achieving consensus as long as the following conditions are satisfied:
- (self-confidence) Ai,i(t)>0 for any t.
- (mutual confidence) Ai,j(t)>0⟺Aj,i>0 for any i,j and t.
- (minimum confidence transfer) There exists δ>0 such that Ai,j(t)>0⟹Ai,j>δ.
The proof works as follows:
Starting from x(0), we can define the sequence A(t), as A(1)=A(x(0),1) and A(t+1)=A(A(t)x(0),t+1), and we denote for any t1<t2, the product matrix A(t1,t2)=∏t1≤t<t2A(t).
Step 1: Let us first observe that, from self-confidence, we can build a sequence t0<…<tk<… such that all product matrix A(tk,tk+1) for any k have the same zero and non-zero entries. If we replace all non zero entries by 1 we obtain the adjacency matrix of a undirected graph, whose connected components we denote by I1,…,Ip. Moreover, by renumbering columns and rows to group connected components together, any product matrix A(tk,tk+1) have a block structure, where every diagonal blocks only contains positive entries.
How to prove it? We first observe that if A and B have positive diagonal then Ai,j>0orBi,j>0⟹ABi,j>0.
We deduce that, starting from t=0, the matrix product A(0,t) above have more and more positive entries, until a time t∗0 where this amount stabilizes. Let us denote A0=A(0,t∗0) and by J0⊆n×n the subset of indices with positive entries. No other matrix A(t) with t>t∗0 may possess a positive entries where A0 is null, otherwise, by the properties above, it would contradict the maximality of A0. For the same reason, it's true of any other product matrix A(t∗0,t) for t>t∗0. We reiterate the construction and deduce that A(t∗0,t∗1) will have maximal subset of positive entries, whose associated subset J1 is contained in J0. If we consider the sequence (A(t∗k,t∗k+1))k≥0, for some value k0 the subset of positive entries Jk becomes minimum and stop evolving. Hence choosing tk=t∗k0+k provides this property. The fact that the matrix is null outside the block structure comes from the definition of communicating classes. The fact that all block within a communication class is positive is slightly more subtle. The matrix A(tk,tk+1) restricted to Il is primitive (i.e. one of its power only contains positive coefficients) since its diagonal is positive and Il is a communication class. This property only depends on the subset of indices that are positive (not their value), which is the same for all product matrix we consider. This implies that one product A(tk′,tk′+1)…A(tk,tk+1)=A(tk,tk′+1) restricted to this block is positive. But since tk was maximum in the number of positive entries, and no products afterward can have more positive entries, it implies that all those positive entries were already present.
Step 2: We know observe that under self,mutual and minimum transfer confidence, if A(t0,t1)i,j>0, then A(t0,t1)i,j≥δn2−n+2 independently of t0 and t1.
How to prove it? We first define μ(A) to be the minimum positive entries of A. Note that if t1−t0≤n2−n+2 the result is obvious by product. We will prove that
μ(A(t0,t1))<δr implies that A(t0) have r more zero entries than A(t0,t1). Hence, if t1−t0≥n2−n+2, A(t0,t1) is a produce of at least n2−n+1 matrices. We can't have μ(A(t0,t1))<δr as it would imply that
A(t0) have at most n−1 positive entries.
This comes from the following: if A,B have self and mutual confidence μ(AB)<μ(B)⟹∃(i,j)such that(AB)i,j>0, and Bi,j=0. We prove the converse: Assuming that AB and B have exactly the same positive entries, we have for any i,j such that (AB)i,j>0:
ABi,j>0=∑k|Bk,j>0Ai,kBk,j≥μ(B)∑k|Bk,j>0Ai,k.
Since B and AB have same positive entries Bk,j=0⟹(AB)k,j=0. The latter can be rewritten as ∑lAk,lBl,j=0.
This implies Ak,i=0 since Bi,j is not null, which also implies Ai,k=0. Hence ∑k|Bk,j>0Ai,k=1 and μ(AB)≥μ(B).
Step 3: Finally, if we introduce d(A) the maximum Euclidean distance between two rows of A, one can show that for any row stochastic matrix A and any matrix B:
d(AB)≤⎛⎝⎜⎜1−mini,j∑1≤k≤nmin(A(t)i,k,A(t)j,k)⎞⎠⎟⎟⋅d(B)
Without loss of generality we can assume that we restrict to a communication classes in Step 1, and that it containing m nodes. The product matrices A(tk,tk+1) contains all positive entries, at least equal to δm2−m+1, hence
mini,j∑1≤k≤mmin(A(t)i,k,A(t)j,k)≥mδm2−m+1 which is independent of t, which proves that limt→∞d(A(0,t))=0, which proves that the limit is a matrix with identical rows.
Related work on convergence
V. D. Blondel, J. M. Hendrickx, and J. N. Tsitsiklis, “On Krause’s multi-agent consensus model with state-dependent connectivity,” IEEE Transactions on Automatic Control, Jul. 2008.
V. D. Blondel, J. M. Hendrickx, and J. N. Tsitsiklis, “Continuous-Time Average-Preserving Opinion Dynamics with Opinion-Dependent Communications,” SIAM J. Control Optim., vol. 48, no. 8, pp. 5214–5240, Jan. 2010.
In the first paper, the same convergence is obtained using a more specific and more compact set of arguments about the evolution of the system. This paper also obtains lower bounds on the distance between various clusters. The paper also analyzes the limit of the system as the number of agents gets larger, and behave collectively as a continuum.
The second paper considers a differential equation equivalent of the Bounded confidence model (with synchronous nodal udpate). Convergence and some stability properties are established.
B. Chazelle, “Natural algorithms and influence systems,” Communications of the ACM, vol. 55, no. 12, Dec. 2012.
B. Chazelle, “The total s-energy of a multiagent system,” SIAM J. Control Optim., vol. 49, no. 4, pp. 1680–1706, 2011.
B. Chazelle, “Natural algorithms,” presented at the SODA ‘09: Proceedings of the twentieth Annual ACM-SIAM Symposium on Discrete Algorithms, 2009.
B. Chazelle, “The Dynamics of Influence Systems,” Foundations of Computer Science (FOCS), 2012 IEEE 53rd Annual Symposium on, pp. 311–320, 2012.
The first paper is a review of various works by this author that introduces this area. It remains in general very challenging due to the technicality of the contributions.
The second paper is by far the most readable (and actionable). It proves an extremely general results on the evolution of “bidirectional diffusive systems” which can be generally thought as any system that mixes opinion among arbitrarily formed neighborhood.
- Model: real valued states of nodes evolve according to interaction on a sequence of undirected graphs G(t) defined as a function of the system up to time t−1, and where the evolution at anytime satisfies
(1−ρ)minj∈Ni(t){xj}+ρmaxj∈Ni(t){xj}≤xi(t+1)≤ρminj∈Ni(t){xj}+(1−ρ)maxj∈Ni(t){xj}
xi(t+1)∈[mi+ρ(Mi−mi);Mi−ρ(Mi−mi)]
wheremi=min{xj|j∈Nt(i)},Mi=max{xj|j∈Nt(i)}
for a given independent constant 0<ρ<1/2.
Equivalently, if we write x(t+1)=P(t)x(t) where P(t) is any sequence of stochastic matrix, this requires that
pi,j>0⟺pj,i>0 (confidence is mutual) and max(pi,l(i)(t),pi,r(i)(t))≤1−ρ (diffused influence) where l(i) and r(i) denotes the neighbor of i with minimum and maximum state, whenever they are not the same.
- The total s-energy of this system over time is defined for any s as:
E(s)=∑t≥0∑(i,j)∈Gt||xj−xi||s2.
and the main results of the paper is that for any ``bidirectional diffusive'' system starting from a unit difference, we have:
For s=1, there exists a>0 such that E(1)≤ρ−anFor 0<s<1, there exists b>0 such that E(s)≤s1−nρ−n2−b
which supersedes a lot of results on the convergence of such systems.
Proof of polynomial time convergence
A. Bhattacharyya, M. Braverman, B. Chazelle, and H. L. Nguyen, “On the convergence of the Hegselmann-Krause system,” presented at the ITCS ‘13: Proceedings of the 4th conference on Innovations in Theoretical Computer Science, 2013.
This paper provides a simpler proof of the convergence of the “bounded confidence” dynamics, that is also providing a better polynomial bounds.
- If the dynamics is unidimensional, at most O(n3) steps are required
- Otherwise, more generally, a polynomial number O(n10) bound the number of steps
- The general proof is the most interesting since it illustrates an interesting potential function argument that exploits property of the dynamics.
In particular, if we introduce
F(t)(x,y)=∑i,j|j∈Ni(t)||xi−yi||2=∑i,j|||xi(t)−xj(t)||<1||xi−yi||2
we have
- One nodal update corresponds to a minimization of this cost x(t+1)=argminyF(t)(x(t),y)
- F(t)(x,y) can be rewritten, by symmetry as
xTA(t)x+yTA(t)y−2xTB(t)y
where A(t) and B(t) are symmetric and A is positive semi-definite.
This implies in particular that the only minimizer of F(t)(x(t),.) is x(t+1)=(A(t))−1Bx(t).
Indeed, to prove this, we can cite one of the fundamental results of linear algebra: To minimize a quadratic form of the type
f:y↦12yTAy−bty+c,
(where
A is
positive semi-definite,
b is any vector, and
c is a constant) it is sufficient to satisfy
Ay=b. To see why, write the gradient of the function in each coordinate and observe that the gradient
∇f is equal to
Ax−b. This proves (1)
f is convex, by another derivation, (2) canceling its derivative is sufficient to find a minimizer, which is also unique when
A is positive definite. In the later case, you also have since
A is invertible that
y=A−1b.
Note that we also have:
F(x,x)=2xT(A−B)x and F(x,−x)=2xT(A+B)x.
We deduce the following relation
F(t)(x(t),x(t))−F(t)(x(t+1),x(t+1))=F(t)(x(t+1)−x(t),x(t)−x(t+1))
Let us denote M=(A)−1B for simplicity, we have MTA=AM=B. Note that we have in particular A−B=(I−MT)A and (A+B)=A(I+M), so we can rewrite F(x,x)=2xT(I−MT)Ax. The LHS above can hence be rewritten F(x,x)−F(Mx,Mx) and by replacing x(t+1)=Mx(t) and it is equal to
2xT[(I−MT)A−MT(I−MT)AM]x. The middle matrix can be rewritten as
(I−MT)A−MT(I−MT)AM=(I−MT)A−(I−MT)MTAM=(I−MT)A−(I−MT)AM2
by using the property of commutativity of M and A. This further simplifies by observing that both contains the same multiplicative factor, and by using the well known equality a2−b2=(a+b)(a−b)
(I−MT)A(I−M2)=(I−MT)A(I+M)(I−M)
This proves that the LHS is
2xT[(I−MT)A(I+M)(I−M)]x=F((I−M)x,−(I−M)x)=F(x(t)−x(t+1),−(x(t)−x(t+1)))
Note that the RHS is always non-negative, which implies that the value of F never steps decreasing as t grows. In fact the RHS contains in particular the terms corresponding to j=i in the definition of F, which is at least equal to ∑i||xi(t+1)−xi(t)−(xi(t)−xi(t+1))||2=4∑i||xi(t+1)−xi(t)||2. So we know this function decreases at least by this amount.
Let us introduce G(t)(x,y)=F(t)(x,y)+∑i,j|j∉N(i)1. This function simply adds a constant term for any pair of nodes that are not close enough. We further note that G(t+1)(.,.)≤G(t)(.,.) when applied to the vector x(t+1). This true for any term i,j in this sum since:
- If ||xi−xj||<1 for both at time t and t+1, then the terms are equal
- Identically, the same holds if both time satisfies that i and j are neighbors.
- If j∈Ni(t) and j∉Ni(t+1) the term (i,j) appears as the real difference in G(t), and applied to x(t+1) we know that this term is larger than 1 since they are not neighbors. G(t+1) will now replace this term with a 1 which is smaller.
- If j∉Ni(t) and j∈Ni(t+1) the term (i,j) appears as the real difference in G(t+1), but that’s when this difference is less than 1, hence it’s smaller than the coefficient 1 appearing in G(t).
We have just proved that
G(t+1)(x(t+1),x(t+1))≤G(t)(x(t+1),x(t+1))=G(t)(x(t),x(t))−[F(x(t),x(t))−F(x(t+1),x(t+1))]
the last equality simply comes from the fact that all other terms in the sum of G are fixed and do not depend on x. Together with the result above it proves that as time grows:
G(0)(x(0),x(0))−G(T)(x(T),x(T))≥∑0≤t≤T4∑i||xi(t+1)−xi(t)||2
- We can now finalize the proof by observing the following property. If at time t the system did not converge, then either (i) Two agents who occupied different position will move to the same, or (ii) one agents moves at least by a cst/(dn4) where cst is an independent constant. We first note that this is sufficient to conclude since it implies that starting from G(0)(x(0),x(0))≤n2, we need at most O(n+n2(dn4)2) steps by counting the maximum that can occur in each type before G(t)(x(t),x(t)) would be negative which would be absurd.
Going further:
Biased Assimilation and Polarization
C. G. Lord, L. Ross, and M. R. Lepper, “Biased assimilation and attitude polarization: The effects of prior theories on subsequently considered evidence.,” Journal of Personality and Social Psychology, vol. 37, no. 11, p. 2098, Nov. 1979.
P. Dandekar, A. Goel, and D. T. Lee, “Biased assimilation, homophily, and the dynamics of polarization.,” Proceedings of the National Academy of Sciences, vol. 110, no. 15, pp. 5791–5796, Apr. 2013.
D. M. Fleder, K. Hosanagar, and A. Buja, “Will the Global Village Fracture into Tribes: Recommender Systems and their Effects on Consumers,” SSRN Journal, 2008. (also mentioned as Hosanagar, K., D. Fleder, D. Lee, and A. Buja. 2013. Recommender Systems and their Effects on Consumers: The Fragmentation Debate. Conditionally Accepted at Management Science.)
Networks with conflicting edges
One reason that may explain continued disagreement and fragmentation or polarization is the presence of active antagonistic edges, corresponding to edges with negative weight. The theory of graphs with positive and negative edges as source of conflicts lead to the theory of structural balance:
D. Cartwright and F. Harary, “Structural balance: a generalization of Heider’s theory.,” Psychological Review, vol. 63, no. 5, pp. 277–293, 1956.
According to this theory, networks containing triangles with an odd number of enemies are probably instable.
S. Marvel, S. Strogatz, and J. M. Kleinberg, “Energy landscape of social balance,” Physical review letters, vol. 103, no. 19, p. 198701, 2009.
S. A. Marvel, J. M. Kleinberg, R. Kleinberg, and S. H. Strogatz, “Continuous-time model of structural balance,” Proceedings of the National Academy of Sciences of the United States of America, vol. 108, no. 5, pp. 1771–1776, 2011.
Proposes an evolution of the graph and show that it reaches structural balance at stability. Study various attractors of these dynamics.
One difficulty of understanding opinion on network with conflicting edges is how to set the transitions between enemies. Various dynamics can be defined
C. Altafini, “Dynamics of opinion forming in structurally balanced social networks,” presented at the Decision and Control (CDC), 2012 IEEE 51st Annual Conference on, 2012, pp. 5876–5881.
C. Altafini, “Consensus Problems on Networks With Antagonistic Interactions,” IEEE Transactions on Automatic Control, vol. 58, no. 4, pp. 935–946, Apr. 2013.
Show that for a normalized dynamic the system either converges to a null opinion for every nodes unless the network is strongly balanced into two camps. In the later case the opinion evolution reaches two antagonistic clusters.
G. Shi, M. Johansson, and K. H. Johansson, “How Agreement and Disagreement Evolve over Random Dynamic Networks,” Selected Areas in Communications, IEEE Journal on, vol. 31, no. 6, pp. 1061–1071, 2013.
Characterizes a different evolution based on attraction/repulsion. Show which ones leads to consensus or divergence in a general network?
G. Shi, A. Proutiere, M. Johansson, J. S. Baras, and K. H. Johansson, “The Evolution of Beliefs over Signed Social Networks,” arXiv.org, Jul. 2013.
Look at an alternative “expansive” interaction between enemies that diverges
M. H. Afrasiabi, R. Guérin, and S. S. Venkatesh, “Opinion formation in Ising networks,” Information Theory and Applications Workshop (ITA), 2013, pp. 1–10, 2013.
Selection in discrete space
R. Axelrod, “The Dissemination of Culture: A Model with Local Convergence and Global Polarization,” Journal of Conflict Resolution, vol. 41, no. 2, pp. 203–226, Apr. 1997.
D. Kempe, J. M. Kleinberg, S. Oren, and A. Slivkins, “Selection and influence in cultural dynamics,” presented at the EC ‘13: Proceedings of the fourteenth ACM conference on Electronic commerce, 2013.
This paper proposes a new dynamics to study selection and influence over a discrete set of states. The elementary action of influence is to attract one node to your own state, a process that is only doable under some compatibility. In addition to that effect, the process accounts for the fact that nodes in different states may spend their attention span only over a selected subset of other states.
Recent works on consensus
U. Krause, “Opinion dynamics–local and global,” Proceedings of the International Workshop Future Directions in Difference Equations., Oct. 2011.
C. Budak, D. Agrawal, and A. El Abbadi, “Limiting the spread of misinformation in social networks,” presented at the WWW ‘11: Proceedings of the 20th international conference on World wide web, 2011.
S. Hong and D. Nadler, “Does the early bird move the polls?: the use of the social media tool ‘Twitter’ by U.S. politicians and its impact on public opinion,” presented at the dg.o ‘11: Proceedings of the 12th Annual International Digital Government Research Conference: Digital Government Innovation in Challenging Times, 2011.
C. J. Kuhlman, V. S. A. Kumar, and S. S. Ravi, “Controlling opinion bias in online social networks,” presented at the WebSci ‘12: Proceedings of the 3rd Annual ACM Web Science Conference, 2012.
A. Mirtabatabaei and F. Bullo, “Opinion Dynamics in Heterogeneous Networks: Convergence Conjectures and Theorems,” SIAM J. Control Optim., vol. 50, no. 5, pp. 2763–2785, Jan. 2012.
P. H. C. Guerra, W. Meira Jr, C. Cardie, and R. Kleinberg, “A Measure of Polarization on Social Media Networks Based on Community Boundaries,” presented at the Proceedings of the International Conference Weblogs and Social Media (ICWSM), 2013.
Written with StackEdit.