# centrality

Measure node importance

## Syntax

``C = centrality(G,type)``
``C = centrality(___,Name,Value)``

## Description

`C = centrality(G,type)` computes the node centrality specified by `type` for each node in the graph.

`C = centrality(___,Name,Value)` uses additional options specified by one or more name-value pair arguments. For example, `centrality(G,'closeness','Cost',c)` specifies the cost of traversing each edge.

## Examples


Create and plot a graph containing six fictitious websites.

```
s = [1 1 2 2 3 3 3 4 5];
t = [2 5 3 4 4 5 6 1 1];
names = {'http://www.example.com/alpha', 'http://www.example.com/beta', ...
         'http://www.example.com/gamma', 'http://www.example.com/delta', ...
         'http://www.example.com/epsilon', 'http://www.example.com/zeta'};
G = digraph(s,t,[],names);
plot(G,'NodeLabel',{'alpha','beta','gamma','delta','epsilon','zeta'})
```

Calculate the page rank of each website using the `centrality` function. Append this information to the `Nodes` table of the graph as an attribute of the graph nodes.

```
pg_ranks = centrality(G,'pagerank')
```

```
pg_ranks = 6×1

    0.3210
    0.1706
    0.1066
    0.1368
    0.2008
    0.0643
```

```
G.Nodes.PageRank = pg_ranks;
G.Nodes
```

```
ans=6×2 table
                   Name                   PageRank
    __________________________________    ________

    {'http://www.example.com/alpha'  }    0.32098
    {'http://www.example.com/beta'   }    0.17057
    {'http://www.example.com/gamma'  }    0.10657
    {'http://www.example.com/delta'  }    0.13678
    {'http://www.example.com/epsilon'}    0.20078
    {'http://www.example.com/zeta'   }    0.06432
```

Also determine which nodes are hubs and authorities using `centrality` and append the scores to the `Nodes` table.

```
hub_ranks = centrality(G,'hubs');
auth_ranks = centrality(G,'authorities');
G.Nodes.Hubs = hub_ranks;
G.Nodes.Authorities = auth_ranks;
G.Nodes
```

```
ans=6×4 table
                   Name                   PageRank       Hubs       Authorities
    __________________________________    ________    __________    ___________

    {'http://www.example.com/alpha'  }    0.32098        0.24995    7.3237e-05
    {'http://www.example.com/beta'   }    0.17057        0.24995      0.099993
    {'http://www.example.com/gamma'  }    0.10657        0.49991      0.099993
    {'http://www.example.com/delta'  }    0.13678     9.1536e-05       0.29998
    {'http://www.example.com/epsilon'}    0.20078     9.1536e-05       0.29998
    {'http://www.example.com/zeta'   }    0.06432              0       0.19999
```

Create and plot a weighted graph using a random sparse adjacency matrix. Since there are a lot of edges, use a very small value for `EdgeAlpha` to make the edges mostly transparent.

```
A = sprand(1000,1000,0.15);
A = A + A';
G = graph(A,'omitselfloops');
p = plot(G,'Layout','force','EdgeAlpha',0.005,'NodeColor','r');
```

Calculate the degree centrality of each node. Specify the importance of each edge using the edge weights.

`deg_ranks = centrality(G,'degree','Importance',G.Edges.Weight);`

Use `discretize` to place the nodes into 6 equally spaced bins, delimited by 7 bin edges, based on their centrality scores.

```
edges = linspace(min(deg_ranks),max(deg_ranks),7);
bins = discretize(deg_ranks,edges);
```

Make the size of each node in the plot proportional to its centrality score. The marker size of each node is equal to its bin number (1-6).

`p.MarkerSize = bins;`

Load the data in `minnesota.mat`, which contains a graph object `G` representing the network of roads in Minnesota. The graph nodes have xy coordinates contained in the `XCoord` and `YCoord` variables of the `G.Nodes` table.

```
load minnesota.mat
xy = [G.Nodes.XCoord G.Nodes.YCoord];
```

Add edge weights to the graph that roughly correspond to the length of the roads, calculated using the Euclidean distance between the xy coordinates of the end nodes of each edge.

```
[s,t] = findedge(G);
G.Edges.Weight = hypot(xy(s,1)-xy(t,1), xy(s,2)-xy(t,2));
```

Plot the graph using the xy coordinates for the nodes.

```
p = plot(G,'XData',xy(:,1),'YData',xy(:,2),'MarkerSize',5);
title('Minnesota Road Network')
```

Compute the closeness centrality of each node. Scale the node color `NodeCData` to be proportional to the centrality score.

```
ucc = centrality(G,'closeness');
p.NodeCData = ucc;
colormap jet
colorbar
title('Closeness Centrality Scores - Unweighted')
```

Also compute the weighted closeness centrality score, using the edge weights as the cost of traversing each edge. Using the road lengths as edge weights improves the score quality, since distances are now measured as the sum of the lengths of all traveled edges, rather than the number of edges traveled.

```
wcc = centrality(G,'closeness','Cost',G.Edges.Weight);
p.NodeCData = wcc;
title('Closeness Centrality Scores - Weighted')
```

Compute the weighted betweenness centrality scores for the graph to determine the roads most often found on the shortest path between two nodes. Normalize the centrality scores with the factor $\frac{(n-2)(n-1)}{2}$ so that the score represents the probability that a traveler along a shortest path between two random nodes will travel through a given node. The plot indicates that there are a few very important roads leading into and out of the city.

```
wbc = centrality(G,'betweenness','Cost',G.Edges.Weight);
n = numnodes(G);
p.NodeCData = 2*wbc./((n-2)*(n-1));
colormap(flip(autumn,1));
title('Betweenness Centrality Scores - Weighted')
```

## Input Arguments


Input graph, specified as either a `graph` or `digraph` object. Use `graph` to create an undirected graph or `digraph` to create a directed graph.

Example: `G = graph(1,2)`

Example: `G = digraph([1 2],[2 3])`

Type of node centrality, specified as one of the options in the table. The table also lists the compatible name-value arguments that work with each type. Each variety of node centrality offers a different measure of node importance in a graph.

Each entry lists the option, the graph type it applies to, a description, and the compatible name-value arguments.

`'degree'` (undirected graphs), `'indegree'` and `'outdegree'` (directed graphs). Name-value arguments: `'Importance'`.

The `'degree'`, `'outdegree'`, and `'indegree'` centrality types are based on the number of edges connecting to each node:

• `'degree'`: Number of edges connecting to each node. A self-loop counts as two edges connecting to the node.

• `'indegree'`: Number of incoming edges to each node. A self-loop counts as one incoming edge.

• `'outdegree'`: Number of outgoing edges from each node. A self-loop counts as one outgoing edge.

If you specify `'Importance'` edge weights, then the algorithm uses the sum of the edge weights rather than the number of connecting edges.
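As a small check of the degree definitions, the following sketch compares `centrality(G,'degree')` with `degree(G)`, and the `'Importance'`-weighted score with a manual sum of edge weights. The four-node graph and its weights are made up for illustration.

```matlab
% Hypothetical 4-node undirected graph; the weights are illustrative only.
G = graph([1 1 2 3],[2 3 3 4],[5 2 1 4]);

% Unweighted degree centrality is the number of incident edges.
c = centrality(G,'degree');

% With 'Importance', each edge contributes its weight instead of 1.
cw = centrality(G,'degree','Importance',G.Edges.Weight);

% Manual weighted degree: add each edge weight to both of its end nodes.
[s,t] = findedge(G);
w = G.Edges.Weight;
manual = accumarray([s;t],[w;w],[numnodes(G) 1]);

disp([c cw manual])
```

The first column should match `degree(G)`, and the last two columns should agree with each other.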

`'closeness'` (undirected graphs), `'incloseness'` and `'outcloseness'` (directed graphs). Name-value arguments: `'Cost'`.

The `'closeness'`, `'incloseness'`, and `'outcloseness'` centrality types use the inverse sum of the distance from a node to all other nodes in the graph. If not all nodes are reachable, then the centrality of node `i` is:

$c(i)=\left(\frac{A_i}{N-1}\right)^{2}\frac{1}{C_i}.$

$A_i$ is the number of reachable nodes from node `i` (not counting `i`), $N$ is the number of nodes in `G`, and $C_i$ is the sum of distances from node `i` to all reachable nodes.

• If no nodes are reachable from node `i`, then `c(i)` is zero.

• For `'incloseness'`, the distance measure is from all nodes to node `i`.

• `'Cost'` edge weights specify the length of the edges.
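The closeness formula can be evaluated by hand from the shortest-path distance matrix. The sketch below does so for a hypothetical graph with one unreachable node and prints the result next to the output of `centrality`. It illustrates the formula as stated in the description and is not the function's internal implementation.

```matlab
% Hypothetical graph: path 1-2-3 plus an isolated node 4.
G = graph([1 2],[2 3],[],4);
N = numnodes(G);

d = distances(G);                 % Inf marks unreachable pairs
reach = isfinite(d) & ~eye(N);    % reachable nodes, excluding the node itself
A = sum(reach,2);                 % A_i: number of reachable nodes
d(~reach) = 0;
C = sum(d,2);                     % C_i: sum of distances to reachable nodes

c = (A./(N-1)).^2 ./ C;           % formula from the description
c(A == 0) = 0;                    % nodes that reach nothing score zero

disp([c centrality(G,'closeness')])
```

For node 2, $A_2 = 2$, $N = 4$, and $C_2 = 2$, so the formula gives $(2/3)^2 \cdot \frac{1}{2} = 2/9$.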

`'betweenness'` (undirected or directed graphs). Name-value arguments: `'Cost'`.

The `'betweenness'` centrality type measures how often each graph node appears on a shortest path between two nodes in the graph. Since there can be several shortest paths between two graph nodes `s` and `t`, the centrality of node `u` is:

$c(u)=\sum_{s,t\ne u}\frac{n_{st}(u)}{N_{st}}.$

$n_{st}(u)$ is the number of shortest paths from `s` to `t` that pass through node `u`, and $N_{st}$ is the total number of shortest paths from `s` to `t`.

• If the graph is undirected, then the paths from `s` to `t` and from `t` to `s` count only as one path (divide the formula by two).

• `'Cost'` edge weights specify the length of the edges and help determine the shortest paths between nodes `s` and `t`.
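A star graph makes the betweenness count easy to verify by hand. The sketch below uses a hypothetical four-node star and also applies the normalization $\frac{(n-2)(n-1)}{2}$ used in the road-network example.

```matlab
% Hypothetical star graph: node 1 is connected to nodes 2, 3, and 4.
G = graph([1 1 1],[2 3 4]);
n = numnodes(G);

b = centrality(G,'betweenness');

% Every shortest path between two leaves passes through node 1, and an
% undirected graph counts each pair once, so node 1 scores nchoosek(3,2) = 3.
% Dividing by (n-2)(n-1)/2, the number of node pairs that exclude a given
% node, rescales the scores to the range [0,1].
bn = 2*b./((n-2)*(n-1));

disp([b bn])
```

The leaves lie on no shortest path between other nodes, so their scores are zero, while the normalized score of the center is 1.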

`'pagerank'` (undirected or directed graphs). Name-value arguments: `'Importance'`, `'FollowProbability'`, `'Tolerance'`, `'MaxIterations'`.

The `'pagerank'` centrality type results from a random walk of the network. At each node in the graph, the next node is chosen with probability `'FollowProbability'` from the set of successors of the current node (neighbors for the undirected case). Otherwise, or when a node has no successors, the next node is chosen from all nodes. The centrality score is the average time spent at each node during the random walk.

• If a node has a self-loop, then there is a chance that the algorithm traverses it. Therefore self-loops increase the pagerank centrality score of the node they attach to.

• In multigraphs with multiple edges between the same two nodes, nodes with multiple edges are more likely to be chosen.

• `'Importance'` edge weights affect how the algorithm chooses successors. Nodes with higher importance are more likely to be chosen.
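The random-walk description can be sketched as a plain power iteration. The code below is an illustration under stated assumptions, not the function's actual implementation: it uses the six-website graph from the examples, an assumed follow probability of 0.85 (this page does not state the default), and an arbitrary stopping threshold.

```matlab
s = [1 1 2 2 3 3 3 4 5];
t = [2 5 3 4 4 5 6 1 1];
G = digraph(s,t);
n = numnodes(G);

A = adjacency(G);                 % A(i,j) = 1 for an edge i -> j
outdeg = full(sum(A,2));
has = outdeg > 0;                 % nodes with at least one successor
d = 0.85;                         % assumed follow probability

r = ones(n,1)/n;                  % start from the uniform distribution
for k = 1:200
    % Share of each node's score passed along each of its out-edges.
    w = zeros(n,1);
    w(has) = r(has)./outdeg(has);
    % Follow a link with probability d; otherwise, or from a node with
    % no successors, jump to a node chosen uniformly from all nodes.
    rnew = d*(A'*w) + (d*sum(r(~has)) + (1-d))/n;
    done = norm(rnew - r,1) < 1e-10;
    r = rnew;
    if done, break; end
end
disp(r')
```

Under these assumptions the result agrees with the `pg_ranks` values shown in the examples to about three decimal places.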

`'eigenvector'` (undirected graphs). Name-value arguments: `'Importance'`, `'Tolerance'`, `'MaxIterations'`.

The `'eigenvector'` centrality type uses the eigenvector corresponding to the largest eigenvalue of the graph adjacency matrix. The scores are normalized such that the sum of all centrality scores is 1.

• If there are several disconnected components, then the algorithm computes the eigenvector centrality individually for each component, then scales the scores according to the percentage of graph nodes in that component.

• The centrality score of disconnected nodes is `1/numnodes(G)`.

• Specify `'Importance'` edge weights to use a weighted adjacency matrix in the calculation.
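For a small connected graph, the eigenvector definition can be checked with a direct eigendecomposition. This sketch assumes a hypothetical four-node graph (a triangle with one pendant node) and uses `eig` on the dense adjacency matrix, which is a different technique from the iterative solver implied by `'Tolerance'` and `'MaxIterations'`, so the two results should agree only up to the solver tolerance.

```matlab
% Hypothetical connected graph: triangle 1-2-3 with node 4 attached to node 3.
G = graph([1 2 1 3],[2 3 3 4]);

[V,D] = eig(full(adjacency(G)));  % symmetric matrix, so eigenvalues are real
[~,k] = max(diag(D));             % index of the largest eigenvalue
v = abs(V(:,k));                  % fix the arbitrary sign of the eigenvector
v = v/sum(v);                     % normalize so the scores sum to 1

c = centrality(G,'eigenvector');
disp([v c])
```

Node 3 has the most neighbors and receives the largest score in both columns.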

`'hubs'`, `'authorities'` (directed graphs). Name-value arguments: `'Importance'`, `'Tolerance'`, `'MaxIterations'`.

The `'hubs'` and `'authorities'` centrality scores are two linked centrality measures that are recursive. The hubs score of a node is the sum of the authorities scores of all its successors. Similarly, the authorities score is the sum of the hubs scores of all its predecessors. The sum of all hubs scores is 1 and the sum of all authorities scores is 1.

• These scores can be interpreted as the left (hubs) and right (authorities) singular vectors corresponding to the largest singular value of the adjacency matrix.

• The centrality score of disconnected nodes is `1/numnodes(G)`.

• Specify `'Importance'` edge weights to use a weighted sum, rather than the simple sum of all successor/predecessor scores. This is equivalent to using the singular vectors of the weighted adjacency matrix.

• If there are several disconnected components (in the weakly connected sense), then the algorithm computes the hubs and authorities scores individually for each component. The scores are then rescaled according to the percentage of graph nodes in that component so that the overall sum is still 1.
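The singular-vector interpretation can be illustrated with a dense SVD of the adjacency matrix. The sketch below uses the six-website graph from the examples, which forms a single weakly connected component. It is a direct computation for illustration, not the iterative method the function uses.

```matlab
s = [1 1 2 2 3 3 3 4 5];
t = [2 5 3 4 4 5 6 1 1];
G = digraph(s,t);
A = full(adjacency(G));           % A(i,j) = 1 for an edge i -> j

[U,~,V] = svd(A);                 % singular values are sorted in descending order

% Left singular vector -> hubs, right singular vector -> authorities.
% abs() removes the arbitrary sign; dividing by the sum makes the scores sum to 1.
hubs = abs(U(:,1));  hubs = hubs/sum(hubs);
auth = abs(V(:,1));  auth = auth/sum(auth);

disp([hubs auth])
```

The exact vectors are `[0.25 0.25 0.5 0 0 0]` for hubs and `[0 0.1 0.1 0.3 0.3 0.2]` for authorities. The iteratively computed scores in the examples match these to within about 1e-4.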

Note

The `centrality` function assumes all edge weights are equal to 1. To change this, specify edge weights for use with the `'Cost'` or `'Importance'` name-value pairs.

Example: `centrality(G,'degree')`

Example: `centrality(G,'hubs','Tolerance',tol)`

### Name-Value Arguments

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

Example: `C = centrality(G,'closeness','Cost',edgeCosts)` computes the closeness centrality using `edgeCosts` as the cost (weight) of traversing each edge in the graph.

Cost of edge traversal, specified as the comma-separated pair consisting of `'Cost'` and a vector of edge weights. The ith edge weight specifies the cost associated with traversing the edge `findedge(G,i)`.

• For the `'closeness'`, `'outcloseness'`, and `'incloseness'` centrality types, edge costs must be nonnegative.

• For the `'betweenness'` centrality type, edge costs must be positive.

`'Cost'` edge weights are smaller when the connection is shorter, or faster, or cheaper. Some examples of `'Cost'` edge weights are:

• Length of a path

• Travel time

• Cost of a ticket

Note

`'Cost'` only applies to the `'closeness'`, `'outcloseness'`, `'incloseness'`, and `'betweenness'` centrality types.

Example: `centrality(G,'closeness','Cost',c)`

Probability of selecting a successor node, specified as the comma-separated pair consisting of `'FollowProbability'` and a scalar between 0 and 1. The follow probability is the probability that the next node selected in the traversal by the pagerank algorithm is chosen among the successors of the current node, and not at random from all nodes. For websites, this probability corresponds to clicking a link on the current web page instead of surfing to another random web page.

Note

`'FollowProbability'` only applies to the `'pagerank'` centrality type.

Example: `centrality(G,'pagerank','FollowProbability',0.5)`

Edge importance, specified as the comma-separated pair consisting of `'Importance'` and a vector of nonnegative edge weights. The ith edge weight specifies the importance of the edge `findedge(G,i)`. An edge weight of zero is equivalent to removing that edge from the graph.

For multigraphs with multiple edges between two nodes, `centrality` adds the multiple edges together and treats them as a single edge with the combined weight.

`'Importance'` edge weights are larger when the connection is stronger. Some examples of `'Importance'` edge weights are:

• Number of travellers per day

• Number of clicks on a link

• Number of papers published together

Note

`'Importance'` only applies to the `'degree'`, `'outdegree'`, `'indegree'`, `'pagerank'`, `'eigenvector'`, `'hubs'`, and `'authorities'` centrality types.

Example: `centrality(G,'degree','Importance',x)`

Maximum number of iterations, specified as the comma-separated pair consisting of `'MaxIterations'` and a scalar. The `centrality` algorithm runs until the tolerance is met or the maximum number of iterations is reached, whichever comes first.

Note

`'MaxIterations'` only applies to the `'pagerank'`, `'eigenvector'`, `'hubs'`, and `'authorities'` centrality types.

Example: `centrality(G,'pagerank','MaxIterations',250)`

Stopping criterion for iterative solvers, specified as the comma-separated pair consisting of `'Tolerance'` and a scalar. The `centrality` algorithm runs until the tolerance is met or the maximum number of iterations is reached, whichever comes first.

Note

`'Tolerance'` only applies to the `'pagerank'`, `'eigenvector'`, `'hubs'`, and `'authorities'` centrality types.

Example: `centrality(G,'pagerank','Tolerance',1e-5)`

## Output Arguments


Node centrality scores, returned as a column vector. `C(i)` is the centrality score of node `i`. The interpretation of the node centrality score depends on the type of centrality computation selected. The more central a node is, the larger its centrality score.