  • Bayesian learning

    • repeated actions, observe each other
  • DeGroot model

    • repeated communication, "naive" updating

Bayesian Learning

  • Will society converge?

  • Will they aggregate information properly?

Bala and Goyal (1998)

  • n players in an undirected component g

  • Choose action A or B each period

  • A pays 1 for sure, B pays 2 with probability p and 0 with probability 1-p


  • Each period get a payoff based on choice

  • Also observe neighbors' choices

  • Maximize discounted stream of payoffs $E[\sum_t \delta^t \pi_{it}]$

  • p is unknown and takes on a finite set of values
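
As a minimal sketch of the belief side of this model, the code below updates a belief over a finite set of candidate values of p after one observed outcome of B, and computes the expected one-period payoff of B. The function names and the dictionary representation are hypothetical, chosen for illustration. The comparison with A's sure payoff of 1 is a myopic one and ignores the experimentation motive that comes with discounting.

```python
from fractions import Fraction

def update_belief(belief, success):
    """Bayes update after observing one outcome of action B.

    belief: dict mapping each candidate value of p to its probability
            (hypothetical representation, assumed for this sketch).
    success: True if B paid 2, False if it paid 0.
    """
    # Likelihood of the observed outcome under each candidate p.
    likelihood = {p: (p if success else 1 - p) for p in belief}
    total = sum(belief[p] * likelihood[p] for p in belief)
    return {p: belief[p] * likelihood[p] / total for p in belief}

def expected_payoff_b(belief):
    """Expected one-period payoff of B: it pays 2 with probability p."""
    return sum(2 * p * weight for p, weight in belief.items())

# Usage: a uniform prior over p in {1/4, 3/4}.
prior = {Fraction(1, 4): Fraction(1, 2), Fraction(3, 4): Fraction(1, 2)}
posterior = update_belief(prior, success=True)
# expected_payoff_b(prior) is 1, so the agent is indifferent between A and B;
# after one success it is 5/4, so B looks better than A's sure payoff of 1.
```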

Challenges Bayesian learning


  • If p is not exactly 1/2, then with probability 1 there is a time such that all agents in a given component play just one action (and all play the same action) from that time onward

Sketch of Proof

  • Suppose the contrary

  • Some agent in some component plays B infinitely often

  • That agent's belief about p will converge to the true value by the law of large numbers

  • So the belief must converge to a value p>1/2; otherwise that agent would stop playing B

Play the right action?

  • If B is the right action, then agents play the right action if they converge to B, but they might not converge to it

  • If A is the right action, then agents must converge to the right action
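
The asymmetry between the two cases can be illustrated with a small simulation. This is a sketch under simplifying assumptions that go beyond the notes: agents are myopic (they play B only when its expected payoff exceeds 1), p is either 1/4 or 3/4, and each agent sees the outcomes of B drawn by itself and its neighbors, not just their choices. The function `simulate` and its parameters are hypothetical names.

```python
import random

def simulate(neighbors, priors, true_p, periods, seed=0):
    """Myopic learning when p is either 1/4 or 3/4 (illustrative sketch).

    neighbors: dict mapping agent index -> list of neighboring agent indices.
    priors: list of prior probabilities that p = 3/4, one per agent.
    Returns the list of action profiles, one per period.
    """
    rng = random.Random(seed)
    # Work with odds mu / (1 - mu); B has expected payoff above 1 iff odds > 1.
    odds = [mu / (1 - mu) for mu in priors]
    history = []
    for _ in range(periods):
        actions = ["B" if o > 1 else "A" for o in odds]
        # Only agents who play B generate information about p.
        outcomes = {i: rng.random() < true_p
                    for i, a in enumerate(actions) if a == "B"}
        new_odds = list(odds)
        for i in range(len(odds)):
            for j in [i] + list(neighbors[i]):
                if j in outcomes:
                    # Likelihood ratio is 3 for a success and 1/3 for a failure.
                    new_odds[i] *= 3.0 if outcomes[j] else 1.0 / 3.0
        odds = new_odds
        history.append(actions)
    return history

# Usage: a ring of four agents, all initially pessimistic, although B is better.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
history = simulate(ring, priors=[0.4, 0.4, 0.4, 0.4], true_p=0.75, periods=20)
# Nobody ever tries B, so no information arrives and everyone stays on A.
```

In this sketch, a society that starts out pessimistic never samples B and so never corrects its belief, while an agent who keeps playing B keeps generating outcomes that move beliefs toward the truth.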

Consider the model of observational Bayesian learning on a network discussed above, in which action A pays 1 for sure and action B pays 2 with an initially unknown probability p and 0 with probability 1-p. Suppose that the society forms a connected network, that all agents start with the same beliefs over the possible values of p, and that they believe p to be either 1/4 or 3/4.

According to the result we discussed, the following statement(s) are correct:

  • If p<0.5, then with probability 1 all agents will play action A from some time onwards.
  • With probability 1, there is some time after which all agents will play the same action.

    Notice that all agents could end up playing A even though B is actually the higher-return action. This requires sufficiently pessimistic beliefs about the return to B, which could come from a sufficiently pessimistic prior or from bad luck in the initial outcomes from playing B.

    However, they cannot end up playing B forever when it is the lower-return action, since they would then eventually learn that it has a lower payoff. So in that case they eventually play A, which is why the statement for p<0.5 is also correct.
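
For the two-point case in this example, the Bayesian update can be written in odds form, which makes the argument concrete. The symbols below are introduced here for illustration: \mu_0 is the prior probability that p = 3/4, \mu is the posterior, and s and f count the observed successes and failures of B. The comparison with A assumes a myopic agent.

```latex
% \mu_0: prior probability that p = 3/4; s, f: observed successes and failures of B
\[
\frac{\mu}{1-\mu}
  = \frac{\mu_0}{1-\mu_0}
    \left(\frac{3/4}{1/4}\right)^{s}
    \left(\frac{1/4}{3/4}\right)^{f}
  = \frac{\mu_0}{1-\mu_0}\, 3^{\,s-f}
\]
\[
E[\text{payoff of } B]
  = 2\left(\tfrac{3}{4}\mu + \tfrac{1}{4}(1-\mu)\right)
  = \tfrac{1}{2} + \mu > 1
\iff \mu > \tfrac{1}{2}
\iff \frac{\mu}{1-\mu} > 1
\]
```

With a uniform prior, such an agent prefers B exactly when observed successes outnumber failures. Once everyone has switched to A, no new outcomes of B arrive and beliefs stop moving, which is how A can persist even when p = 3/4.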

Probability of Converging to "correct" action

  • Arbitrarily high if, for each action, some agent initially has an arbitrarily high prior that the action is the best one


  • Consensus action chosen

  • Not necessarily consensus belief

  • Speed of convergence?


  • Homogeneity of actions and payoffs across players

  • What if heterogeneity?

  • Repeated actions over time

  • Stationarity

  • Networks are not playing a role here!