Section 3 - Graph Data Structures

10A.3.1 Adjacency Lists

We will use the following graph for our conversation:

Each vertex label, v₁, v₂, etc., represents the underlying object data. If we are creating a graph of computer routers, then we would have a class ComputerRouter, and inside each vertex, we would have an individual ComputerRouter reference (perhaps denver.co.ibone.comcast, wrapped in an object). For our discussion and development, we will use simple String data as the underlying object. Therefore, v₁ would, in our program, be the String "v1", and v₂, the String "v2".
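As a rough illustration of a vertex that wraps underlying object data, here is a minimal sketch. The class name Vertex and its members are assumptions made for this example, not the course's actual classes. The data type is left generic, so the same wrapper could hold a String or a ComputerRouter.

```java
import java.util.Objects;

// Hypothetical wrapper: one vertex holding one underlying data object.
final class Vertex<E> {
    private final E data;   // e.g. the String "v1", or a ComputerRouter

    Vertex(E data) {
        this.data = Objects.requireNonNull(data);
    }

    E getData() {
        return data;
    }

    // Two vertices count as the same vertex exactly when their data are equal,
    // so vertices can be stored in a HashSet or used as HashMap keys.
    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Vertex)) {
            return false;
        }
        return data.equals(((Vertex<?>) other).data);
    }

    @Override
    public int hashCode() {
        return data.hashCode();
    }

    @Override
    public String toString() {
        return data.toString();
    }
}

class VertexDemo {
    public static void main(String[] args) {
        Vertex<String> v1 = new Vertex<String>("v1");
        Vertex<String> v2 = new Vertex<String>("v2");
        System.out.println(v1 + " " + v2);                          // v1 v2
        System.out.println(v1.equals(new Vertex<String>("v1")));    // true
    }
}
```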

The decision before us is how to represent this graph internally. Surely, we will need to store the vertices, so we'll want some collection (a hash set, an array list, or something similar) to hold them all. We also need to represent the edges, which carry both directional information and cost information. Here is the general picture, which is just a conceptual organization at this point, not an actual data structure:

Vertices are easy to discuss. If I want to talk about the vertex v₁, I simply write v₁. We put all the vertices in a bag and this will be the vertex list for our graph. This is shown in the left oval in the picture above.

The edges will be put into a separate bag, with each edge represented by an ordered pair. For instance, we describe the edge from v₂ to v₅ using the ordered pair (v₂, v₅). Note that the edge (v₂, v₅) is a member of the above graph, but the edge (v₅, v₂) is not. The ordering tells us which direction the arrow points. If we have an undirected graph, there are no arrows, so we have a choice, depending on the needs of our algorithm. Either we store both (v₂, v₅) and (v₅, v₂) in the list of edges, or we store only one and do not make use of the distinction between source and destination in the algorithm.
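The "bag of ordered pairs" idea can be sketched in a few lines. This is only an illustration under assumed names (OrderedPairEdge, EdgeBagDemo, addDirected, addUndirected), with vertex labels as plain Strings. It shows that (v2, v5) and (v5, v2) are different pairs, and it shows the first of the two options for an undirected graph: storing both orderings.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical edge type: an ordered pair (source, dest) of vertex labels.
final class OrderedPairEdge {
    final String source;
    final String dest;

    OrderedPairEdge(String source, String dest) {
        this.source = source;
        this.dest = dest;
    }

    // Order matters: (v2, v5) equals (v2, v5) but not (v5, v2).
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof OrderedPairEdge)) {
            return false;
        }
        OrderedPairEdge e = (OrderedPairEdge) o;
        return source.equals(e.source) && dest.equals(e.dest);
    }

    @Override
    public int hashCode() {
        return Objects.hash(source, dest);
    }
}

class EdgeBagDemo {
    // Directed graph: store only the pair as given.
    static void addDirected(Set<OrderedPairEdge> edges, String s, String d) {
        edges.add(new OrderedPairEdge(s, d));
    }

    // Undirected graph, first option from the text: store both orderings.
    static void addUndirected(Set<OrderedPairEdge> edges, String a, String b) {
        edges.add(new OrderedPairEdge(a, b));
        edges.add(new OrderedPairEdge(b, a));
    }

    public static void main(String[] args) {
        Set<OrderedPairEdge> directed = new HashSet<>();
        addDirected(directed, "v2", "v5");
        System.out.println(directed.contains(new OrderedPairEdge("v2", "v5"))); // true
        System.out.println(directed.contains(new OrderedPairEdge("v5", "v2"))); // false

        Set<OrderedPairEdge> undirected = new HashSet<>();
        addUndirected(undirected, "v2", "v5");
        System.out.println(undirected.contains(new OrderedPairEdge("v5", "v2"))); // true
    }
}
```

The second option (storing one ordering and ignoring direction in the algorithm) would use addDirected unchanged and push the symmetry into whatever code queries the set.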

Finally, the cost is added to our informal language when we need it, and omitted when we don't. Usually, we will just say (v₂, v₅) to describe the edge, but if I need to talk about the cost or distance of that edge, I'll append it after a colon. For an edge of cost 10, that gives (v₂, v₅) : 10.
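One way to picture the notation (v2, v5) : 10 in code is as a lookup from an ordered pair to a number. The sketch below is an assumption for illustration only: the names EdgeCostDemo, edge, and buildCosts are invented here, and an immutable two-element List stands in for the ordered pair.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class EdgeCostDemo {
    // Hypothetical helper: an ordered pair of labels as an immutable list.
    // List equality is element by element and in order, so (v2, v5) != (v5, v2).
    static List<String> edge(String source, String dest) {
        return List.of(source, dest);
    }

    // Holds only the one cost the text actually states.
    static Map<List<String>, Integer> buildCosts() {
        Map<List<String>, Integer> cost = new HashMap<>();
        cost.put(edge("v2", "v5"), 10);   // (v2, v5) : 10
        return cost;
    }

    public static void main(String[] args) {
        Map<List<String>, Integer> cost = buildCosts();
        System.out.println(cost.get(edge("v2", "v5")));   // 10
        System.out.println(cost.get(edge("v5", "v2")));   // null, no such directed edge
    }
}
```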

Terminology

For a directed edge (v₂, v₅), I will call v₂ the source vertex and v₅ the destination vertex.

One way to describe all this (without the costs) is by using an adjacency table. All the vertices are listed along the left, and to the right of each vertex we list all the destination vertices that have that left vertex as a source. Let's demonstrate this with the vertex v₁. It is the source of two edges, (v₁, v₄) and (v₁, v₂). We would then begin constructing the adjacency table by writing out v₁'s entry in the table, which would look like this:

This would be called v₁'s adjacency list. It tells us that there are two edges with v₁ as a source: one has v₄ as a destination and the other has v₂ as a destination. In other words, this is an alternate way to say the graph contains the edges (v₁, v₄) and (v₁, v₂). If we want to include the cost information, we write the cost after each destination vertex, like so:

Don't misunderstand the meaning of the list to the right of v₁. It does not mean, for instance, that there is an edge from v₄ to v₁. The vertices in v₁'s adjacency list are not in any particular order, and they are not linked up in the graph according to how they appear in the adjacency list. They only represent all the right vertices of edges which have v₁ as a left vertex.
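To make the reading of an adjacency list concrete, the following sketch expands one list back into the ordered pairs it stands for. The class and method names (AdjacencyListReading, edgesFrom) are hypothetical, and costs are left out because this passage does not state them.

```java
import java.util.ArrayList;
import java.util.List;

class AdjacencyListReading {
    // Every entry in the list is a destination; the source is always the
    // vertex that owns the list. Entries say nothing about each other.
    static List<String> edgesFrom(String source, List<String> adjacencyList) {
        List<String> edges = new ArrayList<>();
        for (String dest : adjacencyList) {
            edges.add("(" + source + ", " + dest + ")");
        }
        return edges;
    }

    public static void main(String[] args) {
        List<String> v1List = List.of("v4", "v2");   // v1's adjacency list
        System.out.println(edgesFrom("v1", v1List)); // [(v1, v4), (v1, v2)]
    }
}
```

Note that the output never contains a pair starting with v4 or v2: the list only generates edges whose left vertex is v1.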

If we do this for each of the seven vertices in the above graph we will have a full adjacency table -- a complete description of the graph:

There are a few things to mention here.

Vertex v₆ has an empty adjacency list. This expresses the fact that there are no directed edges emanating from v₆. It is a destination, but never a source. This doesn't always occur in graphs, but it happens with this particular one.
There are no duplicate vertices in any adjacency list. This expresses the fact that if edge (v, w) is in the list once, it cannot appear a second time, even if you were to try to distinguish the two appearances using differing costs. You cannot have two edges starting at v and ending at w in the same graph -- at least not in this course.
The order of the vertices in the various adjacency lists is not important. v₄'s adjacency list is v₅, v₃, v₆, v₇, but it could have been v₆, v₅, v₇, v₃.
Edges are not wrapped up in a single entity in this table. The edge (v₄, v₃) does not appear there as such. You can't circle anything in the table and say "here is edge (v₄, v₃)." The edge's existence is implied by the fact that v₃ appears somewhere in v₄'s adjacency list.
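The four observations above can be checked against a small cost-free sketch of an adjacency table. Everything named here (AdjacencyTable, addVertex, addEdge, hasEdge, adjacencyList) is an assumption for illustration, not the course's implementation. Only the edges explicitly named in the text are loaded, so this is a partial table, not the full seven-vertex graph.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical adjacency table: each vertex label maps to a set of destinations.
class AdjacencyTable {
    private final Map<String, Set<String>> table = new LinkedHashMap<>();

    // Every vertex gets an entry, even if its list stays empty.
    void addVertex(String v) {
        table.computeIfAbsent(v, k -> new LinkedHashSet<>());
    }

    // Returns false if (source, dest) was already present: no duplicate edges.
    boolean addEdge(String source, String dest) {
        addVertex(source);
        addVertex(dest);
        return table.get(source).add(dest);
    }

    // No edge object is stored; the edge is implied by membership.
    boolean hasEdge(String source, String dest) {
        Set<String> adj = table.get(source);
        return adj != null && adj.contains(dest);
    }

    Set<String> adjacencyList(String v) {
        return table.getOrDefault(v, Set.of());
    }

    public static void main(String[] args) {
        AdjacencyTable g = new AdjacencyTable();
        g.addEdge("v1", "v4");
        g.addEdge("v1", "v2");
        g.addEdge("v2", "v5");
        for (String d : new String[] {"v5", "v3", "v6", "v7"}) {
            g.addEdge("v4", d);
        }

        System.out.println(g.adjacencyList("v6").isEmpty());   // true: a destination only
        System.out.println(g.addEdge("v4", "v3"));             // false: duplicate rejected
        System.out.println(g.adjacencyList("v4")
                .equals(Set.of("v6", "v5", "v7", "v3")));      // true: order is irrelevant
        System.out.println(g.hasEdge("v4", "v3"));             // true: v3 is in v4's list
        System.out.println(g.hasEdge("v3", "v4"));             // false in this partial table
    }
}
```

Using a Set for each adjacency list is one possible design choice; it makes the "no duplicates" and "order does not matter" properties fall out of the collection type instead of needing extra checks.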

Comparing the original pictorial circle/arrow representation with the new adjacency table model, we see it is easier for the human brain to answer questions about the graph using the picture. However, the computer program must work with the adjacency table. That's why human programmers typically use pictures of graphs to study them and to design algorithms, and then program these algorithms using an adjacency table or a data structure very much like one.