Chapter 8 - Graphs

Graph Algorithms

General

  • A graph

    • is a data structure consists of nodes(vertex) and edges
    • can be undirected or directed
  • Graph $G = (V,E)$ - V = set of vetices - E = set of edges

    • If only one edge to each direction between nodes
      • $E\subseteq V^2$
      • for a directed $ E $ can alter between 0,….$ V ^2$
  • Two way to represent a graph in a computer

  • adjacency list - sum of the sizes of all adjacency lists of the graph is
    • $ E $ for directed graph, $2 E $ for undirected graph
      • Memory consumption $O(max(V,E)) = O(V+E)$
  • adjacency matrix - is a $|V |\times |V| $ matrix A , where the element $a_{ij}$ is
    • 0, there is no edge from vertex i to j
    • 1, there is an edge from i to j
      • Memory consumption is always $\Theta(V^2)$

img

Terminology

  • Step : moving from one vertex to another vertex along an edge
    • follow the direction in directed graph
  • the distance of the vertex $v_{2}$ from the vertex $v_{1}$ is the length of the shrotest path from $v_1$ to $v_2$
    • distance from itself is 0
    • $\delta({v_1,v_2})$ denoted
    • $\delta({v_1,v_2}) \ne\delta({v_2,v_1}) $ is common in directed graph
    • if no path, then $\delta({v_1,v_2}) = \infin$
  • Color
    • White = discovered
    • grey = discovered but not complelty processed
    • black = discovered and processed
    • white —> grey —> black

img

v->d // the distance of vertex from x
v->π  // a pointer to the vertex through which v was found the first time, NIL for undiscovered vertices (predecessor)
v->color // the color white, grey ,black
v->Adj //the set of the neighbours of v


BFS(s) // s is the source
// set all the vertices  color = WHITE, d = ∞ π = NIL

s->grey // set it as discovered
s->d := 0  // source distance is 0
PUSH(Q,s)	// pish the source into the queue

While !Q.empty() do{
	u:= pop(Q)		// take the next vertex from the queue
	for each v ∈  u→Adj  // go trhough each neightobur 
		if v->color = WHITE then
			v-> color := GREY
			v->d: u->d + 1// incrmeent the distance 
			v-> π := u  // vertex v was reached through u
			PUSH(Q,v)
	u->color := BLACK // make it processed 
} 
  • before calling the algorithm the nodes need to be initialised and it takes $ O(V) $
  • Scans the out Adj edges of the vertex,
  • each queue operation is a constant time
  • amount of WHILE loop takes $ O(V) $ because each vertex can be pushed into the queue once
  • amount of FOR loop is exected at most $O(E)$ goes through each edge once into both direction
  • The running time of BFS is $O(V+E)$
PRINT-PATH(G,s,v)
if v=s then  //base case 
	print s
else if (v->π = NIL) then // the search didnt reach the vertex v at all 
	print "no path"
else 
	PRINT-PATH(G,s,v->π) //recursive call 
	print v					//print afterwards
	

e.g PRINT-PATH(G,a,h) —> RETURN a,b,c,h

File:Depth-First-Search.gif

  • only vertices that havent been seen before are accepted into the path
  • once the algorihm cannot go further , it backtracks only as much as is needed in order to find a new route forwaeds
  • the algotirhm stops once it backtracks back to the source and there are no unexplored edges left there

DIfference to BFS

  • Instead of a queue the vertes waiting to be processed are put into a stack
  • the algorithm doesnt find the shortest paths but path
v->d // the distance of vertex from x
v->π  // a pointer to the vertex through which v was found the first time, NIL for undiscovered vertices (predecessor)
v->color // the color white, grey ,black
v->Adj //the set of the neighbours of v

DFS(s)  // s is the source
// set all the vertices  color = WHITE, d = ∞ π = NIL
PUSH(S,s)
while !S.empty do
	u:= POP(S) 			// pop the latest vertex added to the stack
	if u->color = WHITE then // if it is unprocessed
		u->colour := GREY	//mark it discovered 
		PUSH(S,u)			// Push it into the stack
		for each v ∈ u→Adj do 
			if v->color = WHITE then
				push (S,v)
			else if v->color = GREY then // Grey node, CYCLE IS FOUND
				//cycle handling
				
	else 
		u->color := BLACK 		//all neighbours handled 
				

Efficiency

  • Before calling the algorithm the nodes need to be initlaized so it takes $O(V)$
  • stack operation is $\Theta(1)$
  • The while loop makes atmost $O(V)$ rounds
  • The algorithm goes through each edge once into both directions ,it takes $O(V+E)$
Recuisive DFS

DFS(u)   
u->colour := GRAY
for each v ∈ u→Adj do
	if v->colour = WHITE then  // if v hasn't been visited
		DFS(v)						//contiue to search recursively from v
	else if v->colour = GRAY then	// if has been visited but not fully processed
		//	a loop is found
v-> colour := BLACK				//Mark node as fully processed

Running time

  • a vertex is colored grey at the begining of the function
    • DFS is called atmost $O(V)$
  • There are atmost $O(E)$ rounds of the for loop
  • other operations are constant time
  • The running time is $O(V+E)$

BFS vs DFS

  • BFS can be used for searching the shortest path
  • if the stae space of the graph is larget, the BFS takes significantly larger amount of memory
    • e.g arificial intelligence application
  • DFS used for searching for loops
    • the grey nodes form a path from the source to the current vetex
    • only black or grey vertices can be accessed through a black vertex
      • if a grey vertex can be reached from the current vertex, there is a loop in the graph

Dijkstra’s Algorithm

Dijkstra's algorithm

  • weight can be added to the edges of the graph

    • can present the length of the route or the cost of the state transition
    • The shortest route from the source to vertex
      • is the one whose combined weight of the edges **as small as possible**
  • The weight of each edge is 1, the problem can be solved with scanning through the part of graph reachable from the source with BFS

  • if the weights can be < 0, it is possible that there is no solution to the problem

    • although there are possible paths in the graph
  • The algorithm is used for shortest weighted paths
    • find the shortest paths from the source $s$ to all vertices reachale from $s$
      • by weighting the lengths of the edges with $w$
    • choose in each situation that the shortest path that hasn’t investigated yet
      • Greedy algorithm
  • The algorithm uses a data Structure $Q$, which means $\text{Priority Queue} $

  • $w$ contains the weights of all edges

  • $Relax$ algorithm
    • relaxation of the edge($v,u$)
      • tests could the shortest path found to vertex $v$ be improved
      • by routing its end through $u$ and does so if necessary
    v->d // the distance of vertex from x
    v->π  // a pointer to the vertex through which v was found the first time, NIL for undiscovered vertices (predecessor)
    v->color // the color white, grey ,black
    v->Adj //the set of the neighbours of v
     Q is the Priority Queue
        
    DIJKSTRA(s,w)   // s as the source
    // set all the vertices  color = WHITE, d = ∞ π = NIL
    s->colour := GRAY  //mark it discorvered
    s->d  := 0			// the distance from the source to itself is 0
    PUSH(Q,s) 			// coutinue while there are vetices left
    While !Q.empty   //Where there are vetices left
    	u:= EXTRACT-MIN(Q) take the next vertex from the priority queue
    	for each v ∈ u→Adj do //go through the neighbours of u
    		if v->colour:=WHITE then // if the node hasnt been visited 
    			v->colour := GRAY 		//mark it found
    			PUSH(Q,v)				//push the vertex into the queue
    		RELAX(u,v,w)
    	u->colour :=BLACK 			//mark u as processed
       	
    RELAX(u,v,w)
    if(v->d    >   u->d + w(u,v) )		// if a new shorter route tov was found
    	v-> d := u->d + w(u,v)		//update the distance
    	v→π := u 						//update the parent 
       	
    

Running-Time

  • While loop takes $O(V)$ rounds and the for loop takes atmost $O(E)$ rounds

  • The priority queue can be implemented efficiently with a heap or less efficiently with a list

    •   HEAP LIST
      empty data structure insertion $\Theta(1)$ $\Theta(1)$
      check the priority queue empty $\Theta(1)$ $\Theta(1)$
      Extract Min $O(logV)$ $O(V)$
      Push the data into the queue $\Theta(1)$ $\Theta(1)$
      Relaxation can change the priority of the vertex $O(logV)$ $\Theta(1)$
  • $O((V+E)logV)$

A* Algorithm

  • Dijkstra’s algorithm finds the shortest weighted route by accessing the nodes in the order of shortest route

  • A algorithm* enhances this by adding a heuristic about the shortest possible distance to the goal (The direct distance in a road network)
    • find the shortest weighted route from starting node $s$ to given goal node $g$.
    • Does not find the shortest route to all nodes
    • assumes that the minimum possible distance from any node to the goal can be calculated
    • chooses in each situation a new node, whose is smallest.
  • Just changes the Relaxation

hihi

RELAX-A* (u,v,w)
if v->d > u->d +w(u,v) then 	//if a shorter route to node v is found
	v-> d := u->d + w(u,v)		//new route this far
	v->de := v->d + min_est(v,goal) //minimum estimate for the whole route
	v→π := u 

Author | Billy Chan

Currently studying Information Engineering at City University of Hong Kong.