Trees

  • A set of connected nodes (or vertices). We use both terms interchangeably.
  • A set of edges that connect those nodes. No edges can form a cycle.
    • Constraint: There is exactly one path between any two nodes.
      function: speeding up searching for items

Tree Traversal

what are some natural ways to ‘traverse’ (iterate) through a tree?

  1. Level order traversal. Breadth First Search (BFS)
  2. Depth-First traversals –– of which there are three: pre-order, in-order and post-order. (DFS)

pre-order

1
2
3
4
5
6
preOrder(BSTNode x) {
if (x == null) return;
print(x.key)
preOrder(x.left)
preOrder(x.right)
}

in-order

instead of visiting (aka printing) first, we’ll first visit the left branch. Then we’ll print. Then we’ll visit the right branch.

1
2
3
4
5
6
inOrder(BSTNode x) {
if (x == null) return;
inOrder(x.left)
print(x.key)
inOrder(x.right)
}

post-order

1
2
3
4
5
6
postOrder(BSTNode x) {
if (x == null) return;
postOrder(x.left)
postOrder(x.right)
print(x.key)
}

Graphs

definition

A graph consists of:

  • A set of nodes (or vertices)
  • A set of zero of more edges, each of which connects two nodes.
    image (1266×456) (gitbook.io)
    note that all trees are also graphs, but not all graphs are trees.

simple graphs & multigraphs

Simple graphs: do not have loops & multiple ways from one to another
image (1472×476) (gitbook.io)

more categories

image (1850×892) (gitbook.io)

problems

  • s-t Path: Is there a path between vertices s and t?
  • Connectivity: Is the graph connected, i.e. is there a path between all vertices?
  • Biconnectivity: Is there a vertex whose removal disconnects the graph?
  • Shortest s-t Path: What is the shortest path between vertices s and t?
  • Cycle Detection: Does the graph contain any cycles?
  • Euler Tour: Is there a cycle that uses every edge exactly once?
  • Hamilton Tour: Is there a cycle that uses every vertex exactly once?
  • Planarity: Can you draw the graph on paper with no crossing edges?
  • Isomorphism: Are two graphs isomorphic (the same graph in disguise)? // a million-dollar problem!

    let’s solve the first problem: if s and t connected?

triversal

BFS

1
2
3
4
5
6
7
8
9
10
11
Initialize the fringe, an empty queue 
add the starting vertex to the fringe
mark the starting vertex
while fringe is not empty:
remove vertex v from the fringe
for each neighbor n of vertex v:
if n is not marked:
add n to fringe
mark n
set edgeTo[n] = v
set distTo[n] = distTo[v] + 1

fringe: the data structure we are using to store the nodes on the frontier of our traversal’s discovery process
edgeTo[...]: a map that helps us track how we got to node n; we got to it by following the edge from v to to n.
distTo[...]: helps us track how far n is from the starting vertex

DFS

1
2
3
4
5
6
7
8
9
mark s  // i.e., remember that you visited s already
if (s == t):
return true;

for child in unmarked_neighbors(s): // if a neighbor is marked, ignore!
if isconnected(child, t):
return true;

return false;
1
2
3
4
5
6
7
8
9
10
Initialize the fringe, an empty stack
push the starting vertex on the fringe
while fringe is not empty:
pop a vertex off the fringe
if vertex is not marked:
mark the vertex
visit vertex
for each neighbor of vertex:
if neighbor not marked:
push neighbor to fringe

graphs API

1
2
3
4
5
6
7
public class Graph {
public Graph(int V): // Create empty graph with v vertices
public void addEdge(int v, int w): // add an edge v-w
Iterable<Integer> adj(int v): // vertices adjacent to v
int V(): // number of vertices
int E(): // number of edges
...

we want to get quick answers to these questions:

  • Are given vertices u and v adjacent?
  • Is vertex v incident to a particular edge e?
  • What vertices are adjacent to v?
  • What edges are incident to v?
    so we use adjacency list for the data structure.

Adjacency List

![[Pasted image 20240514103306.png]]
An array is created with each vertice in graph, each of them points to a list that including all of the edges of the vertice

Shortest path

BFS or DFS will never have the ability to get the shortest path in the least time always, so some algs are developed.

Dijkstra’s Algorithm

You are on campus and playing a game with your friends around Hearst Memorial Mining Building. You start at the location s and you want to go to the location t.
![[Pasted image 20240514110735.png]]
BFS will yield a route of length 330m.
![[Pasted image 20240514110800.png]]
Goal: minimize the sum of the weights of the edges on the selected path.
the shortest paths tree from a source s can be created in the following way:

  • For every vertex v (which is not s) in the graph, find the shortest path from s to v.
  • “Combine”/“Union” all the edges that you found above.
    the “Shortest Path Tree” will always be a tree. Why? For every node, there was exactly one “parent” in the edgeToedgeTo array that we have maintained in our original solution.

working

How does it work?

  1. Create a priority queue.
  2. Add s to the priority queue with priority 00. Add all other vertices to the priority queue with priority ∞∞.
  3. While the priority queue is not empty: pop a vertex out of the priority queue, and relax all of the edges going out from the vertex.

relax

vertex: v
Look at your current best distance to w from the source, call it curBestDistToW. Now, look at your curBestDistToV+weight(v,w) (let’s call it potentialDistToWUsing).
Is potentialDistToWUsing better, i.e., smaller than curBestDistToW? In that case, set curBestDistToW=potentialDistToWUsingV, and update the edgeToW to be v

implementation

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
public class Dijkstra {
public static int[] dijkstra(int[][] graph,int startVertex){
//初始化 以求出最短路径的点 result[]
int length = graph.length;
int[] result = new int[length];
for (int i = 0; i < length; i++) {
result[i] = -1;
}
result[startVertex] = 0 ;
// 初始化 未求出最短路径的点 notFound[]
int[] notFound = new int[length];
for (int i = 0; i < length; i++) {
notFound[i] = graph[startVertex][i];
}
notFound[startVertex] = -1;
// 开始 Dijkstra 算法
for (int i = 1; i < length; i++) {
//1. 从「未求出最短路径的点」notFound 中取出 最短路径的点
//1.1 找到最短距离的点
int min = Integer.MAX_VALUE;
int minIndex = 0;
for (int j = 0; j < length; j++) {
if (notFound[j] > 0 && notFound[j] < min){
min = notFound[j];
minIndex = j;
}
}
//1.2 将最短距离的点 取出 放入结果中
result[minIndex] = min;
notFound[minIndex] = -1;
//2. 刷新 「未求出最短距离的点」 notFound[] 中的距离
//2.1 遍历刚刚找到最短距离的点 (B) 的出度 (BA、BB、BC、BD)
for (int j = 0; j < length; j++) {
// 出度可通行(例如 BD:graph[1][3] > 0)
// 出度点不能已经在结果集 result中(例如 D: result[3] == -1)
if (graph[minIndex][j] > 0
&& result[j] == -1){
int newDistance = result[minIndex] + graph[minIndex][j];
//通过 B 为桥梁,刷新距离
//(比如`AD = 6 < AB + BD = 4` 就刷新距离)( -1 代表无限大)
if (newDistance < notFound[j] || notFound[j]==-1){
notFound[j] = newDistance;
}
}
}

}
return result;
}
/** 测试案例 */
public static void main(String[] args) {
char[] vertices = new char[]{'A','B','C','D'};
int[][] graph = new int[][]{
{0, 2, -1, 6}
, {2, 0, 3, 2}
, {-1, 3, 0, 2}
, {6, 2, 2, 0}};
int[] dijkstra = dijkstra(graph, 0);
for (int i : dijkstra) {
System.out.println(i);
}
}
}

disadvantages

![[Pasted image 20240514113547.png]]
Suppose you’re at that vertex labeled 34. Now you’re going to try to relax all your edges. You have only one outgoing edge from yourself to 33 with weight −67. Ah, but note: vertex 33 is already visited (it’s marked with white.) So… we don’t relax it. (Recall the pseudocode for the relax method.)
Now we go home thinking that the shortest distance to 33 is 82 (marked in pink.) But really, we should have taken the path through 34 because that would have given us a distance of 101−67=34.
Dijkstra’s algorithm is not guaranteed to be correct for negative edges. It might work… but it isn’t guaranteed to work.

short-circuit

if a vertice is popped out from the queue, the algs is stopped (short-circuited).

A* algs

let’s consider the drawbacks of Dijkstra’s algorithm: if you want to seek a road from the middle of the US to the east, you will must have to make a circle first, lowering the effeciency. How can we let computer know that it is better to move towards east then west?

The A-star

![[Pasted image 20240514221630.png]]
f(n)=g(n)+h(n)
f(n): the priority of node n
g(n): the cost of node n from the start
h(n): the estimated cost of n to final ones

implementation

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
* 初始化open_set和close_set;
* 将起点加入open_set中,并设置优先级为0(优先级最高);
* 如果open_set不为空,则从open_set中选取优先级最高的节点n:
* 如果节点n为终点,则:
* 从终点开始逐步追踪parent节点,一直达到起点;
* 返回找到的结果路径,算法结束;
* 如果节点n不是终点,则:
* 将节点n从open_set中删除,并加入close_set中;
* 遍历节点n所有的邻近节点:
* 如果邻近节点m在close_set中,则:
* 跳过,选取下一个邻近节点
* 如果邻近节点m也不在open_set中,则:
* 设置节点m的parent为节点n
* 计算节点m的优先级
* 将节点m加入open_set中

The priority of nodes are usually computed by the specific algs for specific cases.

minimum spanning trees

how to find a circle in a graph?

spanning trees

an undirected graph, a spanning tree T is a subgraph of G, where T :

  • is connected
  • is acyclic (not circled)
  • includes all of the vertices

MST

the spanning trees that its sum of priority is the least ones.

cut property

cut: an assignment of a graph’s node to 2 sets
corssing edge: an edge which connects the 2 node from 2 different sets
![[Pasted image 20240515103058.png]]
given any cut, minimum weight crossing edge is in the MST. (can be easily proved.)

Prim’s algs

  • Start from one node
  • add shortest edge that has one node inside the MST under construction and repeat the process

implementation

using priority heap queues, storing vertices in order of distance from tree
repeat(remove closest vertex v from PQ, and relax all edges pointing from v)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
public class PrimMST {
  public PrimMST(EdgeWeightedGraph G) {
    edgeTo = new Edge[G.V()];
    distTo = new double[G.V()];
    marked = new boolean[G.V()];
    fringe = new SpecialPQ<Double>(G.V());
    distTo[s] = 0.0;
    setDistancesToInfinityExceptS(s);
    insertAllVertices(fringe);
    /* Get vertices in order of distance from tree. */
    while (!fringe.isEmpty()) {
      int v = fringe.delMin();
      scan(G, v);
    } 
  }

VS Dijkstra’s algs

the same except:

  • visiting order: Prim’s in order of distance from the MST, while D’s in order of distance from source
    relaxation:
  • Dijkstra’s considers an edge better based on distance to source
  • Prim’s a dege better based on distance to tree

complexity

O(ElogV)

Kruskal’s algs

ordering the edges by their priorities, and add the edge from smallest to biggest without circle (using union-find datastructure).

implementation

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
public class KruskalMST {
  private List<Edge> mst = new ArrayList<Edge>();
  public KruskalMST(EdgeWeightedGraph G) {
    MinPQ<Edge> pq = new MinPQ<Edge>();
    for (Edge e : G.edges()) {
      pq.insert(e);
    }
    WeightedQuickUnionPC uf = 
             new WeightedQuickUnionPC(G.V());
    while (!pq.isEmpty() && mst.size() < G.V() - 1) {
      Edge e = pq.delMin();
      int v = e.from();
      int w = e.to();
      if (!uf.connected(v, w)) {
        uf.union(v, w);
        mst.add(e); 
} } } }