# Statistical Exponential Random Graph Model(SERGMs)¶

## Random Network Models:¶

• Erdos-Renyi
• Useful for understanding thresholds and how networks come to exhibit certain features
• Miss many real-world features: e.g., clustering
• Watts and Strogatz, Barabasi and Albert, Jackson and Rogers...
• Capture other features: clustering, degree distribution, correlation...
• Stochastic Block Models
• Enrich Erdos-Renyi to allow for probabilities to depend on node characteristics, attributes(or on latent - unobserved characteristics)
• Popular set of models: ERGMs and new ones: SERGMs/SUGMs
• flexible way to introduce various local features and dependencies
• estimated statistically

So what we've been through is we had sets of models that are linked based, other models which are still link by link based which can begin to capture different things like clustering and so forth. We brought in attributes in the Stochastic block models. And then we said okay, now there's the class of models which allow us to capture richer things where there's two dependencies and trying to estimate those things statistically. Exponential Random Graph Models, but as we saw there is difficulties in estimating those. So what I want to talk about now is a class of new ones called Statistical Exponential Random Graph Models. That'll allow us to keep the track of these local features and dependencies and actually do some accurate and fast and easy estimation.

## SERGMs: Introduction¶

• ERGMs not accurately estimable in many cases...
• Too many alternative networks to consider...
• Way out: Many networks lead to the same statistics
• Probabilities only depend on statistics
• So, newworks with same statistics are "equivalent"(equally likely)
• Collapse all equivalent networks

the Exponential Random Graph Models are not accurately estimable in, in many cases. There's basically just too many alternative networks to consider. So what's the idea here? What's the way out? The way out is going to be that many networks lead to the same statistics. So for instance in the, in the simulations we did in the last video what we had was a series of you know, 45 links, 10 triangles and 20 isolated nodes. And in fact, under the statistical, under an Exponential Random Graph formulation, any network that had exactly those characteristics would be equally likely. They'd all lead to the same function, they'd lead to the same probability. So what we can do is we can just say, okay look, even though there's many different networks to search over, In fact, what we really need to keep track of is just the number of different types of statistics, because all the networks that have the same statistic have the same probability. And so we can look at those as equivalent ones. Collapse all those equivalent networks. And that makes the summation and the area that we have to be searching over much smaller than what it was before.

## Sufficient Statistics¶

• $Pr(g) = \frac{exp(\beta S(g))}{\sum_{g'}exp(\beta S(g'))}$

Let N of s prime be the number of networks that would have statistics equal to s prime.

$N(s')$ = number networks

s.t. $S(g') = s'$

So how many different networks are there that have 20 isolates? 10 triangles, and 45 links. So that's a set of statistics. That's our S of g.

So how many different networks would have exactly this, that will let N of s be this. Now that's not necessarily an easy number to, to calculate. And in some cases it can actually be a complicated number to even estimate.

• $Pr(g) = \frac{exp(\beta S(g))}{\sum_{s'}N(s')exp(\beta s')}$

the probability that a certain set of statistics

## Statistical Form¶

• $Pr(g) = \frac{exp(\beta S(g))}{\sum_{s'}N(s')exp(\beta s')}$

• $Pr(s) = \frac{N(s)exp(\beta s)}{\sum_{s'}N(s')exp(\beta s')}$

So ultimately, it's not so much that we're interested in the particular network we saw, as much as it had 20 isolates, 10 triangles and 45 links. So that was the really important aspect, was that it had certain attributes. Those are the kinds of things we're going to want to test as scientists. You know, does the network have certain characteristics? That was a very a very particular network realized, as opposed to some other one. Whatever the characteristics we're really interested in, we can put into those statistics.

• And so this is what we're ultimately interested in.
• What's the probability that we see a network that has certain characteristics.

So now we're going to call this the statistical form here. Because we've gotten rid of the networks and we are just talking about the properties of the network. And all estimations are now in statistics based which can be a much smaller space than we are working in than the original space in, in that ideas as the idea behind the Statistical Exponential Random Graph Models.

And of course, anywhere one of these representations of an Exponential Random Graph Model then has a corresponding representation in this format. And if we can estimate the parameters here, then that tells us what the parameters would be in the original model. So we can work in this space, if we can do the estimation and, and track back and we get the Exponential Random Graph spot.

Ask: What is the probaility of observing a network that has density of links .1, clustering .3, and average path length of 2.7, etc.?

## SERGMs Include:¶

• $Pr(s) = \frac{K(s)exp(\beta s)}{\sum_{s'}K(s')exp(\beta s')}$ SERGMs

s can encode many things: