Outline¶

Bayesian learning
- repeated actions, observe each other
DeGroot model
- repeated communication, "naive" updating

Bayesian Learning¶

Will society converge?
Will they aggregate information properly?

Bala and Goyal (1998)¶

n players in an undirected component g
Choose action A or B each period
A pays 1 for sure, B pays 2 with probability p and 0 with probability 1-p
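
As a quick worked check of which action is better for a given p, assuming agents care only about expected payoffs (the symbols $\pi_A$ and $\pi_B$ for the per-period payoffs are notation introduced here):

$$E[\pi_B] = 2 \cdot p + 0 \cdot (1-p) = 2p, \qquad E[\pi_A] = 1, \qquad E[\pi_B] > E[\pi_A] \iff p > \tfrac{1}{2}$$

So B is the better action exactly when p > 1/2, and p = 1/2 is the knife-edge case in which agents are indifferent.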

learning¶

Each period get a payoff based on choice
Also observe neighbors' choices
Maximize discounted stream of payoffs $E[\sum_t \delta^t \pi_{it}]$
p is unknown and takes on a finite set of values
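
Because p takes on finitely many values, an agent's belief is a probability vector over those values, and each observed outcome of B updates it by Bayes' rule. A minimal sketch follows. The function names, the candidate values 1/4 and 3/4, and the even prior are illustrative assumptions, not part of the model's statement.

```python
from fractions import Fraction


def update_beliefs(prior, successes, failures):
    """Bayes-update a finite prior over p after observing outcomes of B.

    prior: dict mapping each candidate value of p to its prior probability.
    successes / failures: number of times B paid 2 / paid 0.
    """
    # Prior weight times the likelihood of the observed outcomes under each p.
    unnormalized = {
        p: weight * p**successes * (1 - p)**failures
        for p, weight in prior.items()
    }
    total = sum(unnormalized.values())
    return {p: w / total for p, w in unnormalized.items()}


def expected_b_payoff(beliefs):
    """Expected one-period payoff of B (pays 2 with probability p)."""
    return sum(weight * 2 * p for p, weight in beliefs.items())


# Hypothetical example: two candidate values, even prior, two failures observed.
prior = {Fraction(1, 4): Fraction(1, 2), Fraction(3, 4): Fraction(1, 2)}
posterior = update_beliefs(prior, successes=0, failures=2)
```

In this example two failures move the weight on p = 1/4 from 1/2 to 9/10, and the expected payoff of B falls from 1 to 3/5, below the sure payoff of A.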

Challenges Bayesian learning¶

Proposition¶

If p is not exactly 1/2, then with probability 1 there is a time such that all agents in a given component play just one action (and all play the same action) from that time onward

Sketch of Proof¶

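Suppose the contrary
Then some agent in some component plays B infinitely often
That agent's belief converges to the true p by the law of large numbers
It must be that the belief converges to p>1/2, or that agent would stop playing B
With probability 1, every agent who sees B played infinitely often also comes to believe that p>1/2
So the neighbors of that agent play B from some time on, then their neighbors, and so forth
All agents in the component play B from some time on, a contradiction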
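
The law-of-large-numbers step can be illustrated numerically. The sketch below forces an agent to keep playing B, as in the supposition that B is played infinitely often, and tracks the posterior. The two candidate values 1/4 and 3/4, the even prior, the seed, and the function names are assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function (turns log odds into a probability)."""
    if x >= 0:
        return 1 / (1 + math.exp(-x))
    z = math.exp(x)
    return z / (1 + z)


def belief_in_high(true_p, draws, p_high=0.75, p_low=0.25, seed=0):
    """Posterior probability of p_high after `draws` plays of B, from an even prior."""
    rng = random.Random(seed)
    successes = sum(rng.random() < true_p for _ in range(draws))
    failures = draws - successes
    # Work in log odds to avoid underflow when multiplying many likelihoods.
    log_odds = (successes * math.log(p_high / p_low)
                + failures * math.log((1 - p_high) / (1 - p_low)))
    return sigmoid(log_odds)
```

With a few thousand draws the posterior concentrates almost entirely on whichever candidate value is the true one, which is why an agent who never stops playing B must end up believing p>1/2.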

Play the right action?¶

If B is the right action, agents play the right action if they converge to B, but they might converge to A instead
If A is the right action, agents must converge to the right action

Consider the model of observational Bayesian learning on a network that we have discussed in which action A pays 1 for sure and action B pays 2 with an initially unknown probability p, and 0 with probability 1-p. Suppose that the society is in a network that is connected and all agents start with the same beliefs over which possible values p could have, and think p to be either 1/4 or 3/4.

According to the result we discussed, the following statements are correct:

If p<0.5, then with probability 1 all agents will play action A from some time onwards.
With probability 1, there is some time after which all agents will play the same action.

Notice that all agents could eventually play A even though B is actually the higher-return action, provided they hold sufficiently pessimistic beliefs about the return to playing B. Such beliefs could come from a sufficiently pessimistic prior or from bad luck in the initial outcomes from playing B.

However, they cannot end up playing B forever when it is the lower-return action, since they would then eventually learn that it has a lower payoff. So in that case they eventually play A, and the statement about p<0.5 is also correct.
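
The asymmetry between the two cases can be explored with a small simulation. This is a simplified sketch, not the model as stated. It assumes myopic agents (no experimentation motive from discounting), that agents see their neighbors' payoffs from B as well as their choices, ties broken in favor of B, an even common prior, and a line network of four agents. All function names are hypothetical.

```python
import random

P_LOW, P_HIGH = 0.25, 0.75  # the two candidate values of p from the example


def expected_b(belief_high):
    """Expected payoff of B when belief_high is the probability that p = P_HIGH."""
    return 2 * (belief_high * P_HIGH + (1 - belief_high) * P_LOW)


def bayes_step(belief_high, paid):
    """Update the belief after one observed B outcome (paid=True means B paid 2)."""
    like_high = P_HIGH if paid else 1 - P_HIGH
    like_low = P_LOW if paid else 1 - P_LOW
    numerator = belief_high * like_high
    return numerator / (numerator + (1 - belief_high) * like_low)


def simulate(neighbors, true_p, periods=300, prior_high=0.5, seed=0):
    """Return (last-period actions, final beliefs) for myopic agents on a network."""
    rng = random.Random(seed)
    beliefs = [prior_high] * len(neighbors)
    actions = []
    for _ in range(periods):
        # Assumption: myopic choice, ties broken in favor of B.
        actions = ["B" if expected_b(b) >= 1 else "A" for b in beliefs]
        # Only agents playing B generate information about p.
        outcomes = {i: rng.random() < true_p
                    for i, a in enumerate(actions) if a == "B"}
        updated = list(beliefs)
        for i, nbrs in enumerate(neighbors):
            # Each agent learns from its own outcome and its neighbors' outcomes.
            for j in [i, *nbrs]:
                if j in outcomes:
                    updated[i] = bayes_step(updated[i], outcomes[j])
        beliefs = updated
    return actions, beliefs


# Line network 0 - 1 - 2 - 3, given as adjacency lists.
line = [[1], [0, 2], [1, 3], [2]]
actions_low, _ = simulate(line, true_p=0.25)
actions_high, _ = simulate(line, true_p=0.75)
```

When the true p is 1/4, B cannot survive: anyone who keeps playing it generates evidence against it. When the true p is 1/3 higher or lower does not matter for consensus, but with p = 3/4 the agreed action can be either A or B depending on the early draws, so varying `seed` is the natural experiment here.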

Probability of Converging to "correct" action¶

Arbitrarily high if, for each action, some agent initially places an arbitrarily high prior probability on that action being the best one

Conclusions¶

Consensus action chosen
Not necessarily consensus belief
Speed of convergence?

Limitations¶

Homogeneity of actions and payoffs across players
What if heterogeneity?
Repeated actions over time
Stationarity
Networks are not playing a role here!