Data Structures for Graphs

Transitioning to practicality, we now learn about how data can be structured so that graphs are correctly formed for application.

9.2 Data Structures for Graphs

In this section, we will describe data structures that are commonly used to represent graphs. In addition we will introduce the basic syntax for graphs in Sage.

9.2.1 Basic Data Structures

List 9.2.1 Data Structures for Graphs

Assume that we have a graph with n vertices that can be indexed by the integers 1, 2, . . . , n. Here are three different data structures that can be employed to represent graphs.

(a) Adjacency Matrix: As we saw in Chapter 6, the information about edges in a graph can be summarized with an adjacency matrix, G, where Gij = 1 if and only if vertex i is connected to vertex j in the graph. Note that this is the same as the adjacency matrix for a relation.

(b) Edge Dictionary: For each vertex in our graph, we maintain a list of edges that initiate at that vertex. If G represents the graph's edge information, then we denote by Gthe list of vertices that are terminal vertices of edges initiating at vertex i. The exact syntax that would be used can vary. We will use Sage/Python syntax in our examples.

(c) Edge List: Note that in creating either of the first two data structures, we would presume that a list of edges for the graph exists. A simple way to represent the edges is to maintain this list of ordered pairs, or two-element sets, depending on whether the graph is intended to be directed or undirected. We will not work with this data structure here, other than in the first example.

Example 9.2.2 A Very Small Example. We consider the representation of the following graph:

Figure 9.2.3 Graph for a Very Small Example

The adjacency matrix that represents the graph would be

$G = \begin{pmatrix} 0 & 1 & 0 & 1\\ 0 & 0 & 1 & 1\\ 0 & 0 & 1 & 0\\ 1 & 0 & 0 & 0 \end{pmatrix}$

The same graph could be represented with the edge dictionary

{1 :[2,4],2:[3,4],3:[3],4:[1]}.

Notice the general form of each item in the dictionary: vertex:[list of vertices].

Finally, a list of edges [(1,2), (1,4), (2,3), (2,4), (3,3), (4,1)] also describes the same graph.

A natural question to ask is: Which data structure should be used in a given situation? For small graphs, it really doesn't make much difference. For larger matrices the edge count would be a consideration. If n is large and the number of edges is relatively small, it might use less memory to maintain an edge dictionary or list of edges instead of building an n × n matrix. Some software for working with graphs will make the decision for you.

Example 9.2.4 NCAA Basketball. Consider the tournament graph representing a NCAA Division 1 men's (or women's) college basketball season in the United States. There are approximately 350 teams in Division 1. Suppose we constructed the graph with an edge from team A to team B if A beat B at least once in the season; and we label the edge with the number of wins. Since the average team plays around 30 games in a season, most of which will be against other Division I teams, we could expect around $\frac{30 \cdot 350}{2} = 5,250$ edges in the graph. This would be somewhat reduced by games with lower division teams and cases where two or more wins over the same team produces one edge. Since 5,250 is much smaller than 3502 = 122,500 entries in an adjacency matrix, an edge dictionary or edge list would be more compact than an adjacency matrix.

Even if we were to use software to create an adjacency matrix, many programs will identify the fact that a matrix such as the one in this example would be "sparse" and would leave data in list form and use sparse array methods to work with it.

9.2.2 Sage Graphs

The most common way to define a graph in Sage is to use an edge dictionary. Here is how the graph in Example 9.2.2 is generated and then displayed. Notice that we simply wrap the function DiGraph() around the same dictionary expression we identified earlier.

G1 = DiGraph( {1 : [4, 2], 2 : [3, 4], 3 : [3], 4 : [1]})
G1.show()

You can get the adjacency matrix of a graph with the adjacency_matrix method.

G1.adjacency_matrix()
[0 1 0 1]
[0 0 1 1]
[0 0 1 0]
[1 0 0 0]

You can also define a graph based on its adjacency matrix.

M = Matrix([[0,1,0,0,0],[0,0,1,0,0],[0,0,0,1,0],[0,0,0,0,1],[1,0,0,0,0]])
DiGraph(M).show()
[0 1 0 1]
[0 0 1 1]
[0 0 1 0]
[1 0 0 0]

The edge list of any directed graph can be easily retrieved. If you replace edges with edge_iterator, you can iterate through the edge list. The third coordinate of the items in the edge is the label of the edge, which is None in this case.

DiGraph(M).edges()[(0, 1, None), (1, 2, None), (2, 3, None), (3, 4, None), (4, 0, None)]

Replacing the wrapper DiGraph() with Graph() creates an undirected graph.

G2 = Graph({1 : [4, 2], 2 : [3, 4], 3 : [3], 4 : [1]})
G2.show()

There are many special graphs and graph families that are available in Sage through the graphs module. They are referenced with the prefix graphs. followed by the name and zero or more parameters inside parentheses. Here are a couple of them, first a complete graph with five vertices.

g r a p h s . C o m p l e t e G r a p h ( 5 ) . s h o w ( )

Here is a wheel graph, named for an obvious pattern of vertices and edges. We assign a name to it first and then show the graph without labeling the vertices.

w=graphs.WheelGraph(20)
w.show(vertex_labels=false)

There are dozens of graph methods, one of which determines the degree sequence of a graph. In this case, it's the wheel graph above.

w.degree_sequence()
[19, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3]

The degree sequence method is defined within the graphs module, but the prefix graphs. is not needed because the value of w inherits the graphs methods.