Graph Algorithms

General

A graph
- is a data structure consists of nodes(vertex) and edges
- can be undirected or directed

Graph $G = (V,E)$ - V = set of vetices - E = set of edges

If only one edge to each direction between nodes

$E\subseteq V^2$

for a directed $

$ can alter between 0,….$

^2$

Two way to represent a graph in a computer
adjacency list - sum of the sizes of all adjacency lists of the graph is
- $ E $ for directed graph, $2 E $ for undirected graph
  - Memory consumption $O(max(V,E)) = O(V+E)$
adjacency matrix - is a $|V |\times |V| $ matrix A , where the element $a_{ij}$ is
- 0, there is no edge from vertex i to j
- 1, there is an edge from i to j
  - Memory consumption is always $\Theta(V^2)$

Terminology

Step : moving from one vertex to another vertex along an edge
- follow the direction in directed graph
the distance of the vertex $v_{2}$ from the vertex $v_{1}$ is the length of the shrotest path from $v_1$ to $v_2$
- distance from itself is 0
- $\delta({v_1,v_2})$ denoted
- $\delta({v_1,v_2}) \ne\delta({v_2,v_1}) $ is common in directed graph
- if no path, then $\delta({v_1,v_2}) = \infin$
Color
- White = discovered
- grey = discovered but not complelty processed
- black = discovered and processed
- white —> grey —> black

Breadth-first Search

v->d // the distance of vertex from x
v->π  // a pointer to the vertex through which v was found the first time, NIL for undiscovered vertices (predecessor)
v->color // the color white, grey ,black
v->Adj //the set of the neighbours of v


BFS(s) // s is the source
// set all the vertices  color = WHITE, d = ∞ π = NIL

s->grey // set it as discovered
s->d := 0  // source distance is 0
PUSH(Q,s)	// pish the source into the queue

While !Q.empty() do{
	u:= pop(Q)		// take the next vertex from the queue
	for each v ∈  u→Adj  // go trhough each neightobur 
		if v->color = WHITE then
			v-> color := GREY
			v->d: u->d + 1// incrmeent the distance 
			v-> π := u  // vertex v was reached through u
			PUSH(Q,v)
	u->color := BLACK // make it processed 
}

before calling the algorithm the nodes need to be initialised and it takes $ O(V) $
Scans the out Adj edges of the vertex,
each queue operation is a constant time
amount of WHILE loop takes $ O(V) $ because each vertex can be pushed into the queue once
amount of FOR loop is exected at most $O(E)$ goes through each edge once into both direction
The running time of BFS is $O(V+E)$

PRINT-PATH(G,s,v)
if v=s then  //base case 
	print s
else if (v->π = NIL) then // the search didnt reach the vertex v at all 
	print "no path"
else 
	PRINT-PATH(G,s,v->π) //recursive call 
	print v					//print afterwards

e.g PRINT-PATH(G,a,h) —> RETURN a,b,c,h

Depth-first Search

only vertices that havent been seen before are accepted into the path
once the algorihm cannot go further , it backtracks only as much as is needed in order to find a new route forwaeds
the algotirhm stops once it backtracks back to the source and there are no unexplored edges left there

DIfference to BFS

Instead of a queue the vertes waiting to be processed are put into a stack
the algorithm doesnt find the shortest paths but path

v->d // the distance of vertex from x
v->π  // a pointer to the vertex through which v was found the first time, NIL for undiscovered vertices (predecessor)
v->color // the color white, grey ,black
v->Adj //the set of the neighbours of v

DFS(s)  // s is the source
// set all the vertices  color = WHITE, d = ∞ π = NIL
PUSH(S,s)
while !S.empty do
	u:= POP(S) 			// pop the latest vertex added to the stack
	if u->color = WHITE then // if it is unprocessed
		u->colour := GREY	//mark it discovered 
		PUSH(S,u)			// Push it into the stack
		for each v ∈ u→Adj do 
			if v->color = WHITE then
				push (S,v)
			else if v->color = GREY then // Grey node, CYCLE IS FOUND
				//cycle handling
				
	else 
		u->color := BLACK 		//all neighbours handled

Efficiency

Before calling the algorithm the nodes need to be initlaized so it takes $O(V)$
stack operation is $\Theta(1)$
The while loop makes atmost $O(V)$ rounds
The algorithm goes through each edge once into both directions ,it takes $O(V+E)$

Recuisive DFS

DFS(u)   
u->colour := GRAY
for each v ∈ u→Adj do
	if v->colour = WHITE then  // if v hasn't been visited
		DFS(v)						//contiue to search recursively from v
	else if v->colour = GRAY then	// if has been visited but not fully processed
		//	a loop is found
v-> colour := BLACK				//Mark node as fully processed

Running time

a vertex is colored grey at the begining of the function
- DFS is called atmost $O(V)$
There are atmost $O(E)$ rounds of the for loop
other operations are constant time
The running time is $O(V+E)$

BFS vs DFS

BFS can be used for searching the shortest path
if the stae space of the graph is larget, the BFS takes significantly larger amount of memory
- e.g arificial intelligence application
DFS used for searching for loops
- the grey nodes form a path from the source to the current vetex
- only black or grey vertices can be accessed through a black vertex
  - if a grey vertex can be reached from the current vertex, there is a loop in the graph

Dijkstra’s Algorithm

Dijkstra's algorithm

weight can be added to the edges of the graph
- can present the length of the route or the cost of the state transition
- The shortest route from the source to vertex
  - is the one whose combined weight of the edges **as small as possible**
The weight of each edge is 1, the problem can be solved with scanning through the part of graph reachable from the source with BFS
if the weights can be < 0, it is possible that there is no solution to the problem
- although there are possible paths in the graph
The algorithm is used for shortest weighted paths
- find the shortest paths from the source $s$ to all vertices reachale from $s$
  - by weighting the lengths of the edges with $w$
- choose in each situation that the shortest path that hasn’t investigated yet
  - Greedy algorithm
The algorithm uses a data Structure $Q$, which means $\text{Priority Queue} $
$w$ contains the weights of all edges

$Relax$ algorithm

relaxation of the edge($v,u$)
- tests could the shortest path found to vertex $v$ be improved
- by routing its end through $u$ and does so if necessary

v->d // the distance of vertex from x
v->π  // a pointer to the vertex through which v was found the first time, NIL for undiscovered vertices (predecessor)
v->color // the color white, grey ,black
v->Adj //the set of the neighbours of v
 Q is the Priority Queue
    
DIJKSTRA(s,w)   // s as the source
// set all the vertices  color = WHITE, d = ∞ π = NIL
s->colour := GRAY  //mark it discorvered
s->d  := 0			// the distance from the source to itself is 0
PUSH(Q,s) 			// coutinue while there are vetices left
While !Q.empty   //Where there are vetices left
	u:= EXTRACT-MIN(Q) take the next vertex from the priority queue
	for each v ∈ u→Adj do //go through the neighbours of u
		if v->colour:=WHITE then // if the node hasnt been visited 
			v->colour := GRAY 		//mark it found
			PUSH(Q,v)				//push the vertex into the queue
		RELAX(u,v,w)
	u->colour :=BLACK 			//mark u as processed
   	
RELAX(u,v,w)
if(v->d    >   u->d + w(u,v) )		// if a new shorter route tov was found
	v-> d := u->d + w(u,v)		//update the distance
	v→π := u 						//update the parent

Running-Time

While loop takes $O(V)$ rounds and the for loop takes atmost $O(E)$ rounds

The priority queue can be implemented efficiently with a heap or less efficiently with a list

	HEAP	LIST
empty data structure insertion	$\Theta(1)$	$\Theta(1)$
check the priority queue empty	$\Theta(1)$	$\Theta(1)$
Extract Min	$O(logV)$	$O(V)$
Push the data into the queue	$\Theta(1)$	$\Theta(1)$
Relaxation can change the priority of the vertex	$O(logV)$	$\Theta(1)$

$O((V+E)logV)$

A* Algorithm

Dijkstra’s algorithm finds the shortest weighted route by accessing the nodes in the order of shortest route
A algorithm* enhances this by adding a heuristic about the shortest possible distance to the goal (The direct distance in a road network)
- find the shortest weighted route from starting node $s$ to given goal node $g$.
- Does not find the shortest route to all nodes
- assumes that the minimum possible distance from any node to the goal can be calculated
- chooses in each situation a new node, whose is smallest.
Just changes the Relaxation

RELAX-A* (u,v,w)
if v->d > u->d +w(u,v) then 	//if a shorter route to node v is found
	v-> d := u->d + w(u,v)		//new route this far
	v->de := v->d + min_est(v,goal) //minimum estimate for the whole route
	v→π := u

Billy Chan

Chapter 8 - Graphs