CS341: Algorithms

A continuation of CS240. Everything can be found on Learn. Taught by Armin Jamshidpey in W2024.

Lecture notes from another prof: https://cs.uwaterloo.ca/~lapchi/cs341/notes.html

Textbook CLRS is very good. Textbook answers: https://github.com/wojtask/clrs4e-solutions.

Midterm: Midterm CS341

Final - Tuesday April 16 2024

Some information about the final exam:

  • It covers Lec8-Lec18.  The topics are mainly about greedy algorithms, dynamic programming, and NP-completeness.  We won’t specifically ask a direct question about Lec1-Lec7 (which were tested for the midterm), but we assume knowledge from these lectures (e.g. we can use DFS to check whether a directed graph is acyclic, use BFS to compute shortest paths, etc) and you may need to use them.

The proof of the Cook-Levin Theorem will not be asked. From Lec17 and Lec18, you don’t need proofs.

  • No notes and books allowed.  No electronic devices (e.g. calculators) allowed. There is no reference sheet.

  • You can assume the material in the notes without providing details.  You can also assume the content in CS 240 and Math 239 without providing details.  No other assumptions can be made without providing details, not from homework, tutorials, reference books, etc.  For example, if we ask you a question that is from homework/tutorials, we expect you to answer this question from scratch (instead of saying that we have done it in homework/tutorials).

  • You will be given partial marks even though you do not know how to solve the problem completely (e.g. having a slower algorithm, having some rough ideas about the reductions, etc).

  • For preparation, go through the lecture notes, the homework, the tutorial notes, and the good problems in textbooks (CLRS, KT, DPV).

  • Make sure you read each question on the exam carefully and completely

Topics:

  • Dijkstra’s algorithm
  • Greedy algorithms
  • MST
  • Kruskal’s algorithm
  • Dynamic Programming
  • NP

Focus on the later part of the course. Apparently Lec17 and Lec18 don't need to be known in depth; just know the NP-complete problems: https://piazza.com/class/lr45czd8jab7cz/post/1200

To review:

  • Recursion tree
  • Review midterm
  • Do final practice problems
  • Do all tutorials

Concepts

Lecture 2

Lecture 3 - 4

Lecture 5

Lecture 6

Lecture 7

Lecture 8

Lecture 9

Lecture 10

Lecture 11

Lecture 12

Lecture 13

  • Edit Distance
  • Optimal Binary Search Trees
  • Independent Sets in Trees

Lecture 15

Lecture 16

Lecture 17 and Lecture 18 (don’t worry about them)


Lecture 3 - Divide and Conquer

Example - Counting inversions

Collaborative filtering:

  • matches users’ preferences (movies, music, …)
  • determine users with similar tastes
  • recommends new things to users based on preferences of similar users

Goal: given an unsorted array A[1..n], find the number of inversions in it.

Def

A pair (i, j) is an inversion if i < j and A[i] > A[j]

Example: with A=[1,5,2,6,3,8,7,4], we get 8 inversions

Remark: we show the indices where inversions occur

Remark: easy algorithm (two nested loops) in O(n^2)
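The O(n log n) divide-and-conquer solution counts inversions while merge-sorting. A minimal Python sketch (my own reconstruction, assuming 0-indexed lists; not taken from the slides):

```python
def count_inversions(A):
    """Count inversions of A while merge sorting it: O(n log n)."""
    if len(A) <= 1:
        return A, 0
    mid = len(A) // 2
    left, inv_left = count_inversions(A[:mid])
    right, inv_right = count_inversions(A[mid:])
    merged, splits = [], 0
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            # left[i] > right[j]: right[j] forms an inversion with every
            # remaining element of the (sorted) left half
            splits += len(left) - i
            merged.append(right[j]); j += 1
    merged += left[i:] + right[j:]
    return merged, inv_left + inv_right + splits

print(count_inversions([1, 5, 2, 6, 3, 8, 7, 4])[1])  # 8
```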

Lecture 4 - Divide and Conquer

How to find a pivot such that both i and n-i-1 are not too large?

For each group, we spend O(1) time to find its median. Then we have to find the median of those medians. Why?

Why is the runtime of median of medians O(n)?

T(n) = T(n/5) + T(7n/10) + O(n) is the expression given in the slides. The T(n/5) and O(n) terms are from finding the median of medians in the groups of 5; this is from selectPivot(). The current array is broken up into groups of 5 (takes O(1) time per group), and there are n/5 groups of 5. Calling quickselect on an array of size 5 is a fixed-time operation (takes O(1)), but there are n/5 groups, so finding the median of every group of 5 takes O(n) time; finding the median of the n/5 group medians is a recursive call of size n/5. Thus the terms T(n/5) + O(n).

T(7n/10) is the worst case for the size of the next array This worst case is bounded by the pivot selection from selectPivot(), proof of this in slides (the maximal size of the next recursive call’s array is 7n/10)

Its an induction proof to show why

Note, in quicksort, you have to sort both sides of the array after the pivot step. In median of medians, after the pivot step, you only have to consider one part of the array, the one that has the median.

i and n-i-1 are the sizes of the left and right subarrays, where i is the index of the pivot after partitioning.

He’s not sorting each group. After finding the median of medians, you can reorder the element (the columns) in the picture.

After calling the partition algorithm, we get an array where the median of medians is at some index i. The values in the green area are smaller than the median of medians; all these values go before the index. The same argument can be made for the bottom-right corner: in that green box, the elements are larger and will appear on the right in the array.

  • T(n/5) + O(n) is the runtime to find the median of medians. We ensure that the pivot chosen is at least in the middle 30%.
    • Because in the median of medians algorithm, the key idea is to ensure that the pivot chosen is not too small or too large compared to the rest of the elements in the array. Specifically, the pivot selected by the median of medians algorithm is guaranteed to be within the middle 30% of the sorted order of the array.
  • Worst case is the recursive call T(7n/10), because it’s quickselect and not quicksort. Why only 7/10, and we ignore the other 3/10? Since quickselect does not recurse into both parts of the array, it focuses on only one side of the pivot (where the desired element should be located). So in the worst case, only one side of the array, of size at most 7n/10, is considered.
  • O(n) is the extra work to be done, in this case partitioning. We can prove the total runtime to be O(n).

Why are we grouping them in groups of 5?

Might not work with other grouping numbers. We will see later.
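A compact Python sketch of quickselect with the median-of-medians pivot rule (my own simplified version, using list comprehensions instead of in-place partitioning; the function name select is mine):

```python
def select(A, k):
    """Return the k-th smallest element of A (0-indexed), worst-case O(n)."""
    if len(A) <= 5:
        return sorted(A)[k]
    # Split into groups of 5; each group's median costs O(1), O(n) total
    medians = [sorted(A[i:i+5])[len(A[i:i+5]) // 2]
               for i in range(0, len(A), 5)]
    # Median of the n/5 medians, found recursively: the T(n/5) term
    pivot = select(medians, len(medians) // 2)
    smaller = [x for x in A if x < pivot]
    equal   = [x for x in A if x == pivot]
    larger  = [x for x in A if x > pivot]
    if k < len(smaller):
        return select(smaller, k)      # recurse on one side only:
    if k < len(smaller) + len(equal):  # at most 7n/10 elements
        return pivot
    return select(larger, k - len(smaller) - len(equal))

print(select([9, 1, 8, 2, 7, 3, 6, 4, 5], 4))  # median -> 5
```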

Module 3

not gonna do bfs and undirected graphs

Goals:

  • basics on undirected graphs

  • undirected BFS and applications (shortest paths, bipartite graphs, connected components)

  • undirected DFS and applications (cut vertices)

  • basics on directed graphs

  • directed DFS and applications (testing for cycles, topological sort, strongly connected components)

Undirected Graphs

Draw the representation of an undirected graph as an adjacency list and matrix for this graph

Some definitions

BFS

As long as the queue is not empty: take things out in queue order and look at all the neighbours. If a neighbour is not visited, we enqueue it.

Time Complexity

Analysis:

  • each vertex is enqueued at most once
  • so each vertex is dequeued at most once
  • so each adjacency list is read at most once (since we visit all the neighbours of a vertex only when it is dequeued)

For all v, write deg(v) = number of neighbours of v = length of the adjacency list of v = degree of v.

cf. the adjacency array has 2m cells in total (Handshaking Lemma: the sum of all degrees is 2m)

Total: O(n + m)

Breadth-first exploration idea: start from yourself, and expand.
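A minimal Python sketch of this BFS (my own, assuming an adjacency-list graph adj with vertices numbered 0..n-1, which is not necessarily the slides' representation):

```python
from collections import deque

def bfs(adj, s):
    """BFS from s on adjacency lists adj: O(n + m).
    Returns the visited, parent and level arrays used in the notes."""
    n = len(adj)
    visited = [False] * n
    parent = [None] * n
    level = [None] * n
    visited[s], level[s] = True, 0
    Q = deque([s])
    while Q:
        v = Q.popleft()                # dequeue in FIFO order
        for w in adj[v]:               # each adjacency list read once
            if not visited[w]:
                visited[w] = True      # each vertex enqueued at most once
                parent[w], level[w] = v, level[v] + 1
                Q.append(w)
    return visited, parent, level
```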

Thu Jan 25, 2024

When you dequeue a vertex, you visit all of its neighbours.

Use induction: assume the statement holds for vi; we want to show that vi+1 is visited. If vi is visited, it means it was enqueued. Everything enqueued is eventually dequeued, and when vi is dequeued we visit all of its neighbours. Since vi+1 is a neighbour of vi, vi+1 is visited.

Lemma

For all vertices v, there is a path s → v in G if and only if visited[v] is true at the end.

Applications

  • testing if there is a path s → v
  • testing if G is connected.

To check if a graph is connected, we can run BFS (or DFS) from any vertex and check that every vertex gets visited.

To test if the and : is the dominant term.

Exercise

For a connected graph, m ≥ n - 1.

Thu Jan 25, 2024

I visit some node , but why do we do that in BFS?

If I have a parent array, do we still need a visited array?

  • No, we can just check the parent array: a vertex has been visited iff its parent entry is non-NIL (with the start vertex as a special case).

In the textbook, they use predecessor to mean parent. It’s more general.

Based on this algorithm, we can form a tree!

parent[v] is the parent of v in the BFS tree.

Example for the algorithm (i’ve recorded the whole process):

BFS Tree

BFS Tree

Definition: the BFS Tree is the subgraph T made of

  • all v such that visited[v] is true
  • all edges {parent[v], v}, for v as above (except the start vertex s)

Claim

The BFS tree is a tree.

Proof: by induction on the vertices for which the parent entry is not NIL.

  • when we set visited[s]: only one vertex, no edge.
  • suppose T is a tree before we set parent[w] = v: v was in T before, w was not, so we add one vertex and one edge to T, so T remains a tree.

Remark: we make it a rooted tree by choosing s as the root.

Shortest paths from the BFS tree

Sub-claim 1

The levels in the queue are non-decreasing

Proof: Exercise

Sub-claim 2

For all vertices u, v: if there is an edge {u, v}, then |level[u] - level[v]| ≤ 1.

Proof.

  • if we dequeue u before v, then level[v] ≤ level[u] + 1 (sub-claim 1)
  • if we dequeue v before u, the parent of u is either v, or was dequeued before v; in any case level[u] ≤ level[v] + 1 (sub-claim 1), and level[v] ≤ level[u], so OK

gap-in-knowledge: I don’t really understand the proof of sub-claim 2. Does it mean either u is the parent of v, or someone else is the parent of v and got dequeued earlier, before dequeueing u? Therefore level[u] + 1 is definitely bigger or equal to level[v], since v is at most one level below u. Doesn’t make sense, did I get it wrong? If we dequeue u before, doesn’t it mean that u is a lower level than v? Omg, notice that there is an edge {u,v}. But still, why is level[u] + 1 greater or equal to level[v]? Is level 1 higher than level 2? I think I’m misunderstanding lower and higher. If level 1 (closer to the root) is the higher level, then I got it right…

What is T? The BFS tree, I assume.

I’m very confused: I have recordings.gap-in-knowledge

Bipartite Graph

A graph G = (V, E) is bipartite if there is a partition V = V1 ∪ V2 such that all edges have one end in V1 and one end in V2.

Using BFS to test bipartite-ness

Claim

Suppose G is connected, run BFS from any s, and set

  • V1 = vertices with odd level
  • V2 = vertices with even level

Then G is bipartite if and only if all edges have one end in V1 and one end in V2 (testable in O(n + m); a sketch follows the proof below)

Proof. (⇐) is obvious. For (⇒), let (W1, W2) be a bipartition. Because paths from s alternate between W1 and W2:

  • V1 (vertices with odd level) is included in W1 (say)
  • V2 (vertices with even level) is included in W2. So V1 = W1 and V2 = W2
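A Python sketch of the test (my own, same adjacency-list assumption as the BFS sketch above; the graph must be connected, as in the claim):

```python
from collections import deque

def is_bipartite(adj, s):
    """Bipartiteness test for a connected graph via BFS levels: O(n + m)."""
    level = [None] * len(adj)
    level[s] = 0
    Q = deque([s])
    while Q:
        v = Q.popleft()
        for w in adj[v]:
            if level[w] is None:
                level[w] = level[v] + 1
                Q.append(w)
    # bipartite iff every edge joins an odd-level and an even-level vertex
    return all(level[v] % 2 != level[w] % 2
               for v in range(len(adj)) for w in adj[v])
```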

Computing the connected components (to-understand)

Idea: add an outer loop that runs BFS on successive vertices

Exercise

Fill in the details.

Complexity:

  • each pass of BFS is O(ni + mi), if Ci is the i-th connected component, with ni vertices and mi edges
  • total O(n + m)

Slide explained

It’s just giving the idea of how to find the connected components of an undirected graph. You run a BFS, and then you’re done you’ve visited all the vertices in that component. So you just need to find an unvisited vertex (which means it will be in another component) and then run BFS on that vertex again. You keep doing this until you visit every vertex.

Chatgpt: The approach described is a common method for computing the connected components of an undirected graph using BFS (Breadth-First Search). Let’s break down the steps:

  1. Initialize: Start with an empty list to store the connected components.
  2. Outer Loop: Begin an outer loop that iterates over all vertices of the graph. For each vertex, if it has not been visited yet, it means it belongs to a new connected component.
  3. BFS: Perform a BFS traversal starting from the current unvisited vertex. This BFS traversal will visit all vertices reachable from the current vertex, which form one connected component of the graph.
  4. Store Component: After completing the BFS traversal, all vertices visited during this traversal belong to the same connected component. Add these vertices to the list representing the current connected component.
  5. Repeat: Continue the outer loop to find the next unvisited vertex in the graph. This vertex will belong to a new connected component. Repeat steps 3-4 for this vertex.
  6. Termination: Once all vertices have been visited and assigned to connected components, the process ends.

Let’s analyze the complexity:

  • BFS Complexity: Each pass of BFS on a connected component takes O(ni + mi) time, where ni is the number of vertices and mi is the number of edges in the i-th connected component.
  • Total Complexity: Since we perform BFS on each connected component separately, the total complexity is the sum of the complexities of all BFS traversals, which is O(n + m), where n is the total number of vertices and m is the total number of edges in the graph.

This approach efficiently computes the connected components of an undirected graph by systematically exploring each component using BFS. It’s a fundamental technique in graph theory and is often used in various applications such as network analysis, social network analysis, and image processing.
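A Python sketch of the outer loop (my own; component ids starting at 1 are my convention):

```python
from collections import deque

def connected_components(adj):
    """Label each vertex with a component id by repeated BFS: O(n + m)."""
    n = len(adj)
    comp = [None] * n
    c = 0
    for s in range(n):
        if comp[s] is None:        # unvisited vertex: a new component
            c += 1
            comp[s] = c
            Q = deque([s])
            while Q:
                v = Q.popleft()
                for w in adj[v]:
                    if comp[w] is None:
                        comp[w] = c
                        Q.append(w)
    return comp
```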

Going depth-first

The idea:

  • travel as deep as possible, as long as you can
  • when you can’t go further, backtrack

DFS implementations are based on stacks, either implicitly (recursion) or explicitly (as with queues for BFS).

Recursive Algorithm:

Remark: can add parent array as in BFS. What does that mean?:

  • Adding a parent array in DFS, similar to BFS, means keeping track of the parent of each vertex encountered during the traversal. In BFS, the parent array is used to reconstruct the shortest paths from the source vertex to all other vertices in the graph. However, in DFS, the parent array serves a slightly different purpose.
  • In DFS, the parent array can be used to represent the DFS tree or forest. Each time a vertex v is visited and explored, the parent of its newly discovered neighbouring vertices becomes v, since v is the vertex from which they were discovered. Therefore, by maintaining this information in the parent array, we can construct the DFS tree rooted at any vertex. Here’s how it works:
  1. Initialize a parent array of size n, where n is the number of vertices in the graph. Initially, all entries in the parent array are set to a special value (e.g., NIL) to indicate that the vertices have no parent.
  2. During the DFS traversal, when exploring a vertex v, mark v as visited (i.e., visited[v] = true) and iterate through all its neighbours w.
  3. For each neighbour w, if w has not been visited yet, set its parent in the parent array to v (i.e., parent[w] = v). By the end of the DFS traversal, the parent array will represent the DFS tree or forest, where each vertex (except the root(s)) has a parent that discovered it during the traversal. This information can be useful for various purposes, such as finding paths in the DFS tree, detecting cycles, or performing other graph-related algorithms.

Notice, it is recursively calling explore().
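A minimal Python version of the recursive algorithm with the parent array (the names visited, parent and explore follow the notes; the adjacency-list representation is my assumption):

```python
def dfs(adj):
    """DFS over the whole graph, recording parents (DFS forest): O(n + m)."""
    n = len(adj)
    visited = [False] * n
    parent = [None] * n

    def explore(v):
        visited[v] = True
        for w in adj[v]:
            if not visited[w]:
                parent[w] = v      # w discovered from v: tree edge {v, w}
                explore(w)

    for v in range(n):
        if not visited[v]:         # each unvisited vertex roots a new tree
            explore(v)
    return parent
```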

Claim ("white path lemma")

When we start exploring a vertex v, any w that can be connected to v by an unvisited path will be visited during explore(v).

Proof: Let’s prove this lemma by contradiction. Suppose there exists a vertex w that can be reached from v by an unvisited path but is not visited during the exploration of v.

  1. Initialization: We begin the DFS traversal with all vertices marked as unvisited.
  2. Exploration of v: When the DFS algorithm starts exploring vertex v, it marks v as visited. Then, for each neighbour u of v, if u has not been visited yet, DFS explores u.
  3. Contradiction: However, by the nature of DFS, when exploring v, all its unvisited neighbours are visited recursively. Therefore, if w is reachable from v by an unvisited path, it must be a neighbour of v, or a neighbour of a neighbour of v, and so on. Since DFS explores all neighbours of v and their neighbours, w should have been visited during the exploration of v. But our assumption states otherwise, leading to a contradiction.

Hence, the assumption that there exists a vertex w reachable from v via an unvisited path but not visited during the exploration of v is false. Therefore, any vertex that can be connected to v by an unvisited path will indeed be visited during the exploration of v. This completes the proof of the “white path lemma”.

Basic property:

Claim

If w is visited during explore(v), there is a path v → w.

Proof: Same as BFS.

The proof is straightforward and relies on the nature of DFS exploration. During the DFS traversal, when exploring vertex v, DFS recursively visits all vertices reachable from v in the graph.

  1. Initialization: The DFS traversal begins with the starting vertex v marked as visited.
  2. Recursive Exploration: For each neighbour u of v, if u has not been visited yet, DFS explores u. This process continues recursively until all reachable vertices from v have been visited.
  3. Vertex w: If vertex w is visited during the exploration of v, it means that w is reachable from v in the graph.
  4. Path from v to w: Since w is visited during the exploration of v, there must be a path from v to w in the graph. This path is formed by the sequence of tree edges traversed during the DFS traversal.

Therefore, if vertex w is visited during the exploration of vertex v in DFS, then there exists a path from v to w in the graph. This completes the proof of the claim.

Consequences:

  • Previous properties: after we call explore at v in DFS, we have visited exactly the connected component containing v.
    • Since it recursively explores all the vertices in the connected component
  • Shortest paths: no
    • unlike BFS, it does not get you the shortest paths
  • Runtime: still O(n + m)

Trees, forest, ancestors and descendants Previous observation:

  • DFS(G) gives a partition of V into vertex-disjoint rooted trees (the DFS forest)
    • What does this mean?

Definition. Suppose the DFS forest is T1, …, Tk and let u, v be two vertices

  • u is an ancestor of v if they are in the same tree of the DFS forest and u is on the path root → v
  • equivalently: v is a descendant of u if
    • v belongs to the subtree rooted at u in the DFS forest
    • equivalent to saying u is the ancestor of v

In other words, in a DFS forest:

  • An ancestor of a vertex is any vertex that lies on the path from the root of the subtree containing the vertex to the vertex itself.
  • A descendant of a vertex is any vertex within the subtree rooted at the vertex.

Key Property

Claim

All edges in G connect a vertex to one of its descendants or ancestors.

So, there is no edge from subtree to subtree within the DFS forest:

Proof. Let {u, v} be an edge, and suppose we visit u first. Then when we visit u, the edge {u, v} is an unvisited path between u and v, so v will become a descendant of u (white path lemma).

White Path Lemma: According to the white path lemma, any vertex that can be connected to u by an unvisited path will be visited during the exploration of u (and hence becomes a descendant of u).

Back edges

A back edge is an edge in G connecting an ancestor to a descendant, which is not a tree edge.

Observation: All edges are either tree edge or back edges (key property).

Start and finish times

Set a variable t to 1 initially, create two arrays start and finish, and change explore:

Example:

Observation:

  • these intervals [start[v], finish[v]] are either contained in one another, or disjoint
  • if start[u] < start[v], then either finish[v] < finish[u] or finish[u] < start[v].
    • Basically saying that they can’t cross each other: either u finishes before v starts, or v starts and finishes inside u’s interval.

Proof: if start[u] < start[v] < finish[u], we push v on the stack while u is still there, so we pop v before we pop u.

Cut vertices

Cut vertices

For G connected, a vertex v in G is a cut vertex if removing v (and all edges that contain it) makes G disconnected. Also called articulation points.

Finding the cut vertices (G connected)

Setup: we start from a rooted DFS tree T, knowing parent and level.

Warm-up

The root s is a cut vertex if and only if it has more than one child.

Proof.

  • if s has one child, removing s leaves the rest of T connected. So s is not a cut vertex.

  • suppose s has subtrees T1, …, Tk with k ≥ 2.

    Key property: no edge connecting Ti to Tj for i ≠ j. So removing s creates k ≥ 2 connected components.

Now, we want to investigate the problem of finding cut vertices which ARE NOT the root.

Finding the cut vertices (G connected)

Definition: for a vertex v, let

  • a(v) = min{ level(w) : {v, w} edge }

  • m(v) = min{ a(w) : w descendant of v (including v itself) }

  • a(v) is the lowest level (highest in the tree) of all nodes directly connected to v.
  • m(v) is the minimum a(w) value for all w, where w is a descendant of v.
    • ooo it considers all the descendants of v, so basically any descendant of v counts
  • I interpret a(v) as the “highest link” that v has, while m(v) is the highest link that v’s subtree has.

In simple words:

  • a(v) is the smallest level of all of v’s neighbours
  • m(v) is the smallest level of any neighbour of a descendant of v.
  • Try calculating a(v) and m(v) and see if your results match the answers shown to the right of the diagram above!

Using the values

Claim

For any v (except the root), v is a cut vertex if and only if it has a child w with m(w) ≥ level(v).

Proof.

  • Take a child w of v, let Tw be the subtree at w.
  • If m(w) < level(v), then there is an edge from Tw to a vertex above v. After removing v, Tw remains connected to the root.
  • If m(w) ≥ level(v), then all edges originating from Tw end in Tw ∪ {v}. Proof: any edge originating from a vertex in Tw ends at a level at least level(v), and connects a vertex to one of its ancestors or descendants (key property).

Computing the values Observation:

  • if v has children w1, …, wk, then m(v) = min{a(v), m(w1), …, m(wk)}

Conclusion:

  • computing a(v) is O(deg(v))
  • knowing all m(wi), we get m(v) in O(deg(v))
  • so all m values can be computed in O(m) (remember m ≥ n - 1 when G connected)
  • testing the cut-vertex condition at v is O(deg(v))
  • testing all v is O(m)

a(v) and m(v) are used for the children of a vertex which we are about to check is a cut vertex or not. The a(v) value gives the lowest level of all neighbours of v; this counts for checking whether the tree rooted at a child of a vertex would become disconnected or not.

Exercise: write the pseudo-code

1. Perform a DFS traversal on the graph G starting from any vertex s (the root).
2. During the DFS traversal, maintain the following information for each vertex v:
   - level[v]: the level (depth) of vertex v in the DFS tree
   - a[v]: the minimum level among the endpoints of edges leaving v
   - m[v]: the minimum of a[v] and the values m[w] for all children w of v
3. Update a[v] and m[v] as follows:
   - When exploring a vertex v:
     - Set level[v] = level[parent[v]] + 1 (with level[s] = 0)
     - Set a[v] = min(level[w] : {v, w} an edge)
     - Set m[v] = a[v]
     - For each unvisited neighbour w of v (a child of v):
       - Recursively explore w
       - Update m[v] = min(m[v], m[w])
4. After the DFS traversal completes, iterate through all vertices v:
   - If v is not the root and v has a child w with m[w] >= level[v], then v is a cut vertex
   - The root is a cut vertex iff it has more than one child
5. The overall time complexity of this approach is O(n + m), where m is the number of edges in the graph.
 

Some questions:

  • Running BFS also works exactly the same for finding cut vertices as running DFS right?
    • Consider the 4-cycle, in the BFS tree, the claim would tell you that there is a cut vertex, but there is no cut vertex.
    • I think this is because you can have edges to other subtrees in a BFS tree but this is impossible in a DFS tree.

Lecture 7 - Directed Graphs

Tue Feb 6 2024

Directed Graph

as in the undirected case, with the difference that edges are (directed) pairs (u, v)

  • edges are also called arcs
  • usually, we allow loops (v, v)
  • u is the source node, v is the target
  • a path is a sequence of vertices v1, …, vk, with (vi, vi+1) in E for all i. k = 1 is OK.
  • a cycle is a path with v1 = vk
  • a directed acyclic graph (DAG) is a directed graph with no cycle

Definition:

  • the in-degree of v is the number of edges of the form (u, v)
  • the out-degree of v is the number of edges of the form (v, u)

Data structures:

  • adjacency lists
  • adjacency matrix (not symmetric anymore)

BFS and DFS for directed graphs The algorithms work without any change. We will focus on DFS.

Still true:

  • we obtain a partition of V into vertex-disjoint trees
  • when we start exploring a vertex v, any w with an unvisited path v → w becomes a descendant of v (white path lemma)
  • properties of start and finish times

But there can exist edges connecting the trees T_i.

Classification of edges Suppose we have a DFS forest. Edges of are one of the following:

  • tree edges
    • A tree edge in the context of a DFS (Depth-First Search) forest of a directed graph is an edge that is part of the DFS tree itself. In other words, during the DFS traversal of the graph, when exploring a vertex u, if a directed edge (u, v) leads to a vertex v that has not been visited yet, this edge becomes a tree edge and contributes to the construction of the DFS tree.
  • back edges: from descendant to ancestor
  • forward edges: from ancestor to descendant (but not tree edge)
  • cross edges: all others

In this picture, if we were to start at node 1 (the node on the right) then the cross edge would become a tree edge correct?

So the classification of edges is not absolute but depends on where you start the DFS? yes

explore(v)
	visited[v] = true
	start[v] = t, t++
	for all w neighbour of v do
		if visited[w] = false
			explore(w)                      (v,w) tree edge
	finish[v] = t, t++

If w was visited:

  • if w is not finished, (v, w) is a back edge
    • Case 1: w is visited and not finished yet:
      • If vertex w has already been visited but not yet finished (i.e., the DFS traversal is still exploring its descendants), then w is an ancestor of v, and the edge (v, w) is considered a back edge.
      • This condition indicates that v is a descendant of w in the DFS tree, and (v, w) completes a cycle by connecting v back to one of its ancestors w.
  • else if start[v] < start[w], (v, w) is a forward edge
  • else, (v, w) is a cross edge
    • Case: start[w] < start[v]: In this scenario, vertex w was visited and finished before vertex v was started. However, w is not an ancestor of v, because w already finished. Therefore, the edge (v, w) forms a cross edge.

Testing Acyclicity

Claim

G has a cycle if and only if there is a back edge in the DFS forest

Proof: (⇐): Since there is a back edge, we have an edge from a descendant u to an ancestor v. So we have a cycle v → … → u → v. (⇒): Assume G has a cycle v1, …, vk, v1. Without loss of generality, we can assume we visit v1 first (in the cycle).

At the time we start v1, the path v1 → v2 → … → vk is unvisited. By the white path lemma, we will visit vk before we finish v1. Hence vk becomes a descendant of v1 and (vk, v1) is a back edge.

Topological Ordering

Topological ordering

Definition: Suppose G is a DAG. A topological order is an ordering < of V such that for any edge (u, v), we have u < v.

No such order if there are cycles.

We need to write a topological ordering in a line. If two nodes are not connected, we have a choice of which one to write first.

There is no uniqueness: the algorithm gives you one of the topological orderings. Prof gave the analogy of prerequisites of courses. You are in trouble if you have a cycle: no topological ordering exists.

A node with in-degree 0 is a good one to put first in your topological ordering. Same thing with a node of out-degree 0 being the last one.

Proposition: G is a DAG if and only if there is a topological ordering on V.

G is a DAG ⟹ there is a topological ordering on V. (It is an if-and-only-if theorem, but prof just showed one way.) (Proof sketch): Claim: G must have at least one vertex with in-degree 0.

Assume it is not true. Then all vertices of G have in-degree at least one. Following in-edges backwards, at some point we must reuse a node we have already seen, i.e. there is a cycle, a contradiction. Now take an in-degree-0 node as the first vertex in the order, remove it from G; the remaining part is a DAG, and repeat.

So we’ve found a contradiction. If we have a DAG, there must be a topological ordering.

From a DFS forest

Observation:

  • start times do not help
  • finish times do, but we have to reverse their order!

Left: is our topological order. we have 3,1 and Right: 2,1 Start time is not something we can rely on.

What about finish time? Left: 4,2 and Right: 4, 3 and

Claim

Suppose that V is ordered using the reverse of the finishing order: finish[v1] > finish[v2] > … > finish[vn].

This is a topological order.

So: Definition of Topological Order: In a topological order, for every directed edge (u, v), vertex u comes before vertex v.

  • Using Finish Times:
    • In a DFS traversal, finish times indicate when vertices are finished being explored and all their descendants have been visited.
    • We consider the reverse of the finishing order, so that vertices with later finish times come before vertices with earlier finish times. This makes sense, since if there is a topological order, there is gonna be a start, and we explore all the descendants starting from that node. It will finish last. Recursively go up.

Proof: Look at notes

  • In a directed acyclic graph (DAG), there cannot exist a path from vertex u to vertex v if u finishes before v during a DFS traversal. This is because if there were such a path, it would create a cycle in the graph, contradicting the acyclic property of the graph.

Topological order in O(n + m).
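A Python sketch of topological sort by reverse finish order (my own; it assumes the input really is a DAG and does no cycle detection):

```python
def topological_order(adj):
    """Order a DAG's vertices by decreasing DFS finish time: O(n + m)."""
    n = len(adj)
    visited = [False] * n
    order = []

    def explore(v):
        visited[v] = True
        for w in adj[v]:
            if not visited[w]:
                explore(w)
        order.append(v)        # v finishes here: appended in finish order

    for v in range(n):
        if not visited[v]:
            explore(v)
    return order[::-1]         # reverse of the finishing order
```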

Strong Connectivity

A directed graph is strongly connected if for all u, v in V, there is a path u → v (and thus a path v → u).

Observation

G is strongly connected ⟺ there exists s such that for all v, there are paths s → v and v → s.

Proof:

  • (⇒) is obvious
  • For (⇐), take vertices u, v. We have paths u → s and s → v, so u → v. Same thing with v → u.

Algorithm:

  1. Perform a DFS exploration starting from vertex s on the original graph G.
  2. Perform another DFS exploration starting from vertex s, but this time on the reverse graph G^T, where all edges are reversed.
  3. If both DFS explorations reach every vertex in the respective graphs, then the graph is strongly connected.

Correctness:

  • The first DFS exploration ensures that there is a path from s to every other vertex in the original graph G.
  • The second DFS exploration, conducted on the reverse graph G^T, ensures that there is a path from every vertex back to s in the original graph G, because reversing the edges effectively explores paths from every vertex back to s in G.
  • Therefore, if both DFS explorations are successful, it implies that there exists a path from s to every vertex and from every vertex back to s, establishing strong connectivity.

Consequences:

  • The algorithm tests for strong connectivity in O(n + m) time complexity, where n is the number of vertices and m is the number of edges in the graph.
  • This approach provides an efficient way to determine whether a directed graph is strongly connected, which is essential for various applications and algorithms that rely on strong connectivity information.
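A self-contained Python sketch of the two-DFS test (my own; it uses vertex 0 as s and an iterative DFS, both implementation choices of mine):

```python
def is_strongly_connected(adj):
    """Two-DFS test: every vertex reachable from 0 in G and in G^T; O(n + m)."""
    n = len(adj)

    def reaches_all(g):
        visited = [False] * n
        visited[0] = True
        stack = [0]
        while stack:
            v = stack.pop()
            for w in g[v]:
                if not visited[w]:
                    visited[w] = True
                    stack.append(w)
        return all(visited)

    rev = [[] for _ in range(n)]   # build G^T in O(n + m)
    for u in range(n):
        for v in adj[u]:
            rev[v].append(u)
    return reaches_all(adj) and reaches_all(rev)
```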

If you take a directed graph and reverse it, we get a reverse graph where each edge is reversed. First run DFS without reversing. Second, reverse and run DFS from the same node but on the reverse graph.

In G^T every edge is reversed, so if we reach a vertex v when running DFS on this graph from s, there is a path v → s in G.

Have we considered the runtime of reversing the graph?

Might be asked on the midterm?? Do it in linear time. Prof only showed that the DFS part is O(n + m), but didn’t show the runtime taken to reverse a graph. Look:

Approach to Reverse a Graph Efficiently:

  1. Original Graph Representation:
    • Let’s assume the graph is represented using an adjacency list for each vertex, where each list contains the vertices adjacent to the corresponding vertex.
  2. Reverse Graph Representation:
    • To reverse the graph, we need to reverse the direction of each edge. This essentially means swapping the source and target vertices for each edge.
    • We can achieve this by creating a new reversed graph and populating its adjacency lists accordingly.
  3. Efficient Reversal:
    • We can reverse the graph in linear time by iterating through each vertex and its adjacency list in the original graph.
    • For each edge (u, v) in the original graph, we add an edge (v, u) to the corresponding adjacency list in the reversed graph.
    • This process takes O(n + m) time, where n is the number of vertices and m is the number of edges in the original graph. Runtime Analysis:
  • The time complexity of reversing the graph is O(n + m), where n is the number of vertices and m is the number of edges in the original graph.
  • This linear time complexity ensures that the graph reversal process is efficient and suitable for practical applications, including the strong connectivity algorithm discussed earlier.

Structure of directed graphs

Given a directed graph, if we contract each strongly connected component to a single vertex, we form a DAG.

Kosaraju’s Algorithm

Kosaraju's Algorithm

Definition: for a directed graph G, the reverse (or transpose) graph G^T is the graph with the same vertices, and reversed edges.

Complexity: O(n + m) (don’t forget the time to reverse G)

Exercise: check that the strongly connected components of G and G^T are the same.

Proof: u and v are in the same SCC of G ⟺ u and v are in the same tree in the DFS forest of G^T (with respect to the ordering by finish times).

Run DFS on G^T starting from the node that you finished last in the first DFS on G.
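Putting the two passes together, a Python sketch of Kosaraju’s algorithm (my own; recursive DFS, so for big graphs you’d want an iterative version to avoid recursion-depth limits):

```python
def kosaraju_scc(adj):
    """Kosaraju's algorithm: SCCs via two DFS passes; O(n + m)."""
    n = len(adj)
    visited, finish_order = [False] * n, []

    def explore(g, v, out):
        visited[v] = True
        for w in g[v]:
            if not visited[w]:
                explore(g, w, out)
        out.append(v)                  # record v at its finish time

    # Pass 1: DFS on G, collecting vertices in increasing finish time
    for v in range(n):
        if not visited[v]:
            explore(adj, v, finish_order)
    # Reverse the graph (O(n + m), as discussed above)
    rev = [[] for _ in range(n)]
    for u in range(n):
        for v in adj[u]:
            rev[v].append(u)
    # Pass 2: DFS on G^T in decreasing finish time; each tree is one SCC
    visited = [False] * n
    sccs = []
    for v in reversed(finish_order):
        if not visited[v]:
            comp = []
            explore(rev, v, comp)
            sccs.append(comp)
    return sccs
```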

Thu Feb 8 2024

(⇐) Show that for every vertex v in the tree T (including u and v), u and v are in the same SCC of G. Since there is a path from u to any v which is in T, there exists a path from v to u in G. Now we show that v is a descendant of u in the DFS forest of G, which gives a path from u to v in G.

Induction argument (write the base case): assume it is true for some vertex v in T and show it is true for the children of v. Suppose w is a child of v.

Induction hypothesis: is a descendant of in , so (insert image prof drew)

Based on the ordering in the second round of DFS, and the fact that we visited before , we can calculate: …todo

Two cases: (insert picture)

Bad case not possible: todo

Basically this slide:

True/False: For a given DAG G, we have a linear-time algorithm for deciding whether G contains a Hamiltonian path.

  • True: we can run DFS on G and find a topological ordering.
  • G has a Hamiltonian path iff consecutive vertices in the topological ordering are always joined by an edge (equivalently, iff the topological ordering is unique).

Module 4

Lecture 8 - Greedy Algorithms

Goals: This module: the greedy paradigm through examples

  • interval scheduling
  • interval colouring
  • minimizing total completion time
  • Dijkstra’s algorithm
  • minimum spanning trees

Greedy Algorithms

Context: we are trying to solve a combinatorial optimization problem:

  • have a large, but finite, domain D

  • want to find an element in D that minimizes/maximizes a cost function

    Greedy strategy:

    • build step-by-step
    • don’t think ahead, just try to improve as much as you can at every step
    • simple algorithms
    • but usually, no guarantee to get the optimal
    • it is often hard to prove correctness, and easy to prove incorrectness.

Was not paying attention since rankings came out and ended talking to Chester for the whole time…

Greedy strategy: we build the tree bottom up.

  • create many single-letter trees
  • define the frequency of a tree as the sum of the frequencies of the letters in it
  • build the final tree by putting together smaller trees: join the two trees with the least frequencies

Claim: this minimizes the sum over all letters c of freq(c) × {length of encoding of c}

We did not look at the Huffman tree in CS240.

Interval Scheduling

Interval Scheduling Problem

Input: n intervals. Output: a maximum-size subset of disjoint intervals

What does it mean?

A maximal subset of disjoint intervals refers to the largest possible collection of intervals from the given set where none of the intervals overlap with each other. In other words, you’re trying to find the largest number of intervals that can be scheduled without any conflicts in terms of time.

Example: A car rental company has the following requests for a given day

  • R1: 2pm to 8pm
  • R2: 3pm to 4pm
  • R3: 5pm to 6pm. Answer is {R2, R3}

Greedy Strategies

There are different greedy strategies for solving the Interval Scheduling Problem:

  • Consider earliest starting time (choose the interval with minimal s_i).
    • Earliest Starting Time: This strategy selects intervals based on their starting times, choosing the interval that starts earliest among the remaining intervals at each step. It does not always work: one long interval that starts early can block many shorter, disjoint intervals.

  • Consider shortest interval (choose the interval with minimal f_i - s_i).
    • Shortest Interval: This strategy focuses on selecting the shortest interval among the remaining intervals at each step. By choosing shorter intervals first, it may allow for more flexibility in scheduling longer intervals later.

  • Consider minimum conflicts (choose the interval that overlaps with the minimum number of other intervals).
    • Minimum Conflicts: This strategy aims to minimize conflicts by selecting intervals that overlap with the fewest number of other intervals. It prioritizes intervals that have the least overlap with the currently selected intervals.

  • Consider earliest finishing time (choose the interval with minimal f_i).
    • Earliest Finishing Time: This strategy selects intervals based on their finishing times. It chooses the interval that finishes earliest among the remaining intervals at each step. This can lead to scheduling intervals that free up resources quickly and potentially allow for more intervals to be scheduled overall.

The algorithm provided outlines a generic approach to solving the Interval Scheduling Problem using a greedy strategy based on the earliest finishing time. It sorts the intervals based on their finishing times and then iterates through them, adding intervals to the schedule if they do not conflict with previously selected intervals. Finally, it returns the selected intervals.

Algorithm: Interval Scheduling

  1. Sort the intervals such that f_1 ≤ f_2 ≤ … ≤ f_n
  2. For i from 1 to n do: if interval i has no conflicts with the intervals in A, add i to A
  3. return A
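A Python sketch of this greedy algorithm (my own; it treats intervals as half-open [s, f) pairs, my convention, so that touching intervals don’t conflict):

```python
def interval_scheduling(intervals):
    """Greedy by earliest finish time: O(n log n).
    intervals: list of (start, finish) pairs."""
    A = []
    last_finish = float("-inf")
    for s, f in sorted(intervals, key=lambda iv: iv[1]):  # sort by finish
        if s >= last_finish:          # no conflict with chosen intervals
            A.append((s, f))
            last_finish = f
    return A

# Car rental example (2pm-8pm, 3pm-4pm, 5pm-6pm): picks the two short ones
print(interval_scheduling([(14, 20), (15, 16), (17, 18)]))
```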

Tues Feb 12 2024 (Missed class I was sleeping)

  • Ask matthew for notes

Correctness: The Greedy Algorithm Stays Ahead

Note

  • A is the output of the greedy algorithm, O is the output from an optimal algorithm
    • If |A| = |O|, then it must be that A is optimal
  • We then sort and by start/finish times
  • The proof then follows:

Prof notes / proof:

  • By ind. hyp.: f(i_{r-1}) ≤ f(j_{r-1})  (1)
  • Compare j_{r-1} and j_r: f(j_{r-1}) ≤ s(j_r)  (2)
  • (1) and (2) give f(i_{r-1}) ≤ s(j_r)
  • At the time the greedy algorithm had to choose the r-th interval, j_r was an option (it had no intersection with i_{r-1} and the others). The greedy algorithm chose i_r and it means: f(i_r) ≤ f(j_r)

Lemma

For all indices r ≤ k we have f(i_r) ≤ f(j_r).

Proof: We use induction

  • For r = 1 the statement is true.
  • Suppose r > 1 and the statement is true for r - 1. We will show that the statement is true for r.
  • By induction hypothesis we have f(i_{r-1}) ≤ f(j_{r-1}).
  • By the order on O we have f(j_{r-1}) ≤ s(j_r)
  • Hence we have f(i_{r-1}) ≤ s(j_r)
  • Thus at the time the greedy algorithm chose i_r, the interval j_r was a possible choice.
  • The greedy algorithm chooses an interval with the smallest finish time. So, f(i_r) ≤ f(j_r).

This lemma establishes an important property:

For any index r up to k, where k is the number of intervals in the greedy solution, the finish time of the interval chosen by the greedy algorithm is less than or equal to the finish time of the corresponding interval chosen by the optimal solution.

The lemma states: “For all indices r, we have f(i_r) ≤ f(j_r).”

Here, f(i_r) denotes the finish time of the r-th interval chosen by the greedy algorithm, and f(j_r) denotes the finish time of the r-th interval chosen by the optimal solution.

Note

  • Recall that i_r is the r-th index from the sorted (by start/end position) Greedy output, and j_r is the r-th index from the sorted (by start/end position) Optimal output.
  • For r = 1 (base case), we know that f(i_1) ≤ f(j_1), since for the greedy approach, we always start by taking the one that finishes earliest.
  • For r > 1, we use induction, and assume that f(i_{r-1}) ≤ f(j_{r-1}), and try proving it for r.
  • So, by inductive hypothesis, we have f(i_{r-1}) ≤ f(j_{r-1}), and by comparing j_{r-1} and j_r, we have that f(j_{r-1}) ≤ s(j_r). Using both of those, we get that f(i_{r-1}) ≤ s(j_r). So, when the greedy algorithm had to choose the r-th interval, j_r was an option (it had no intersection with i_{r-1} and the others), and thus it would have chosen that, or something better. So, f(i_r) ≤ f(j_r).

Theorem

The greedy algorithm returns an optimal solution.

Proof:

  • Prove by contradiction.
  • If the output is not optimal, then k < m (where k = |A| and m = |O|).
  • i_k is the last interval in A, and O must have an interval j_{k+1}.
  • Apply the previous lemma with r = k, and we get f(i_k) ≤ f(j_k).
  • We have f(j_k) ≤ s(j_{k+1}).
  • So, j_{k+1} was a possible choice to add to A by the greedy algorithm. This is a contradiction.

Prof notes:

  • Show |A| = |O|
  • Assume |A| < |O|. There exists at least j_{k+1} in O, as it has at least one more element in comparison to A.

  • By the previous lemma, we know that: f(i_k) ≤ f(j_k) ≤ s(j_{k+1})
  • So the greedy algorithm could add j_{k+1} to A, but it did not, and it is a contradiction.

Interval Colouring

Interval Colouring Problem

Input: n intervals. Output: use the minimum number of colours to colour the intervals, so that each interval gets one colour and two overlapping intervals get two different colours.

Algorithm: Interval Colouring

  1. Sort the intervals by starting time: s_1 ≤ s_2 ≤ … ≤ s_n
  2. For i from 1 to n do: use the minimum available colour to colour interval i. (i.e. use the minimum colour number so that it doesn’t conflict with the colours of the intervals that are already coloured.)

Note

He gives a simple algorithm, but he said that an O(n log n) one exists with a min-heap and stuff, but he doesn’t care. (A heap-based sketch follows below.)
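This is my own reconstruction of the kind of heap version he hinted at (treating intervals as half-open [s, f) pairs): one heap of active intervals keyed by finish time frees colours as intervals end, and a second heap hands back the minimum available colour.

```python
import heapq

def interval_colouring(intervals):
    """Greedy colouring after sorting by start time; O(n log n).
    Returns a list of ((start, finish), colour) pairs."""
    active = []        # min-heap of (finish, colour) for running intervals
    free = []          # min-heap of colours available for reuse
    next_colour = 1
    result = []
    for s, f in sorted(intervals):           # sort by starting time
        while active and active[0][0] <= s:  # those intervals have ended
            _, c = heapq.heappop(active)
            heapq.heappush(free, c)          # their colours become available
        if free:
            c = heapq.heappop(free)          # minimum available colour
        else:
            c = next_colour                  # all colours busy: open a new one
            next_colour += 1
        result.append(((s, f), c))
        heapq.heappush(active, (f, c))
    return result
```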

Prof’s proof:

  • If the greedy algorithm uses k colours, then k is the minimum number of colours needed. In other words, no other algorithm can solve the problem with fewer than k colours.

  • Idea: Show that there exists a time t such that it is contained in k intervals.

  • Assume interval i is the first interval which uses colour k.

  • Since we have an increasing order on start times: s_j ≤ s_i for the intervals j already coloured.

  • On the other hand, the k - 1 intervals that forced colour k all have overlap with interval i, hence f_j > s_i for those intervals j.

  • So, s_i is a time which is contained in k intervals. So, no colouring with fewer than k colours exists.

Sanity check: Recap

This algorithm is a greedy approach to solve the Interval Colouring Problem. It sorts the intervals by their starting times and then iterate through each interval, assigning it the minimum available colour that does not conflict with the colours already assigned to overlapping intervals.

Correctness: showing that if the greedy algorithm uses k colours, there is no other way to solve the problem using fewer than k colours.

Me trying to make sense of the proof

  • Suppose Interval i is the First to Use Colour k:
    • The proof starts by considering the interval i that is the first one requiring colour k (a new colour, dumbass). This means that all previous intervals have been coloured with colours 1, …, k - 1, and interval i is the first one requiring colour k because it overlaps with intervals already coloured with colours 1, …, k - 1.
  • Interval i Overlaps with Previously Coloured Intervals:
    • The k - 1 intervals that have been coloured with colours 1, …, k - 1 all overlap with interval i. This means that they share some time interval with interval i.
  • Interval i Contains a Time from k Intervals:
    • Because all the intervals coloured with colours 1, …, k - 1 overlap with interval i, the start time s_i of interval i is less than the finish time of these intervals (s_i < f_j). This means that s_i is a time point that is contained within k intervals.
  • Conclusion: No Better Colouring Exists:
    • Because s_i is contained within k intervals, colouring interval i with colour k is necessary to ensure that it does not have the same colour as any of the intervals it overlaps with. If we tried to colour interval i with colour k - 1 or less, it would conflict with one of the intervals it overlaps with. Thus, there is no way to colour the intervals using fewer than k colours while ensuring that overlapping intervals receive different colours.

Minimizing Total Completion Time

The problem

Input: n jobs, each requiring processing time p_i. Output: An ordering of the jobs such that the total completion time is minimized.

Note: The completion time of a job is defined as the time when it is finished.

Algorithm:

  • order the jobs in non-decreasing processing times

Switching the positions of jobs σ(j) and σ(j+1) in the permutation σ, we obtain a new permutation σ′. The difference in total completion time between σ′ and σ is p_{σ(j+1)} - p_{σ(j)}. This difference is negative because the processing time of σ(j+1) is less than the processing time of σ(j). We have reached a contradiction: by switching σ(j) and σ(j+1) we obtain a permutation with a lower total completion time than σ, contradicting the assumption that σ is an optimal solution. Therefore, the assumption that an optimal solution exists that is not in non-decreasing order of processing times leads to a contradiction, establishing the correctness of the algorithm that orders jobs in non-decreasing processing times.

Prof’s proof:

  • Assume there is an optimal solution with a different ordering in comparison to the solution given by the greedy algorithm. Suppose σ is an optimal solution and is not in non-decreasing order of processing times.
  • So there exists an index j such that the processing time of σ(j) is greater than the processing time of σ(j+1). If this does not happen, the optimal solution has the exact same order as greedy, so if we want them to be different, it must happen.
  • You can see that the cost is greater in such a case than if you swapped σ(j) and σ(j+1) into their proper non-decreasing order.
  • So, as long as we find one adjacent pair in reversed order, we can find something better. Thus, an optimal solution must have the jobs in non-decreasing order.
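A tiny Python sketch of the rule (sort by processing time) together with the cost it minimizes (my own illustration):

```python
def spt_order(p):
    """Order jobs in non-decreasing processing time.
    Returns the job indices and the resulting total completion time."""
    order = sorted(range(len(p)), key=lambda j: p[j])
    t, total = 0, 0
    for j in order:
        t += p[j]          # completion time of job j
        total += t         # total completion time = sum of completion times
    return order, total

print(spt_order([3, 1, 2]))  # ([1, 2, 0], 10): completions 1, 3, 6
```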

Thur Feb 15 2024 (Missed class because studying for cs341 midterm)

Lecture 9 - Dijkstra’s Algorithm

Preliminaries

  • G = (V, E) is a directed graph with a weight function w: E → R
    • Each edge e has an associated weight w(e).
  • The weight of a path P = v1, …, vk is w(P) = w(v1, v2) + w(v2, v3) + … + w(v(k-1), vk)
    • The weight of path P is the sum of the weights of the edges along that path, calculated as above.

Padlet (True/False)

Shortest path exists in any directed weighted graph. https://padlet.com/arminjamshidpey/CS341

  • First note the terminology in this class: what is usually called a “walk” is here called a “path”, and what is usually called a “path” is here called a “simple path”.
  • Even if there is a path between every 2 nodes, if edges can have negative weights, then it is possible that there is a cycle with negative total weight, and thus going through the cycle many times can lead to infinitely low weights. So, it is false.
    • Even if there is a path between every pair of nodes, negative-weight cycles can lead to infinitely low weights by traversing the cycle multiple times.
  • Negative Weights:
    • When negative weights are introduced into the graph, the situation becomes more complex.
    • Negative weights can lead to scenarios where paths that include negative-weight edges have shorter total weights than paths without negative-weight edges.
    • This can lead to unexpected behavior, such as negative-weight cycles, where starting and ending at the same node repeatedly can result in an infinitely negative total weight.
  • Impact on Shortest Path Existence:
    • In graphs with negative weights, the existence of shortest paths is not guaranteed.
    • Negative-weight cycles introduce ambiguity into the concept of shortest paths. If a negative-weight cycle exists, one could repeatedly traverse this cycle to achieve arbitrarily low total weights, leading to no well-defined shortest path.
    • Consequently, the presence of negative-weight cycles can invalidate the concept of shortest paths in a graph.

Why Dijkstra's algorithm doesn't work with negative weights

  • Dijkstra’s algorithm employs a greedy strategy, always selecting the vertex with the smallest known distance from the source vertex at each step.
  • This strategy assumes that adding a new vertex to the set of vertices with known shortest distances will never result in a shorter path to any vertex than the paths already considered.
  • However, in the presence of negative weights, this assumption is violated because the distance to a neighbouring vertex can decrease when traversing an edge with negative weight.
  • When negative weights are present, Dijkstra’s algorithm may select a vertex based on the current shortest path information, leading to incorrect shortest path calculations.
  • Negative-weight edges can create cycles where repeatedly traversing the cycle reduces the total path weight, violating the assumption made by Dijkstra’s algorithm.
  • Negative-weight cycles pose a particularly challenging problem for Dijkstra’s algorithm.
  • If a negative-weight cycle exists reachable from the source vertex, it can be traversed repeatedly to produce arbitrarily low total path weights, leading to no well-defined shortest path.
  • Dijkstra’s algorithm, which does not account for negative weights, cannot handle this scenario and may produce incorrect results or enter into an infinite loop.
  • Assumption: G has no negative-weight cycles
  • The shortest path weight from u to v: δ(u, v) = min{ w(P) : P a path from u to v } (∞ if there is no path)

Single-Source Shortest Path Problem

Input: G = (V, E) and a source s ∈ V. Output: A shortest path from s to each v ∈ V

  • Aims to find the shortest path from s to every other node in the graph.
  • The shortest path weight from node s to node v, denoted as δ(s, v), is defined as the minimum weight among all paths from s to node v. If there’s no path from s to v, δ(s, v) = ∞.

  • It is true. Can prove by contradiction easily. If there was a better way, then our longer path could be made shorter, which is a contradiction.
    • The proof by contradiction is straightforward: assume there exists a shorter path from s to v_i for some i, contradicting the assumption that the given path is the shortest path from s to v_i.

Dijkstra’s Algorithm: Explanation

Dijkstra's algorithm is a greedy algorithm.

Input: A weighted directed graph with non-negative edge weights

  • For all vertices, maintain quantities
    • d[v]: a shortest-path estimate from s to v
    • π[v]: predecessor in the path (a vertex or NIL)
      • the predecessor of v in the shortest path from the source vertex s to vertex v.

Note:

  • This is stricter than what we were saying before; not only can we not have negative weight cycles, but not even negative weight edges.
  • distance ≈ level
    • I think he said that d refers to level
  • We start by putting every parent as NIL (this is the π list), and set the shortest-path estimates to infinity

  • Good visualization: Dijkstra’s - Shortest Path Algorithm (SPT) Animation
  • I think C keeps track of vertices whose shortest paths have already been determined. So we start with C being empty.
  • Repeat the following steps until C includes all vertices:
    • Select a vertex u from V \ C (i.e., the set of vertices not yet included in C) with the smallest d value and add it to C. This vertex becomes the next vertex to consider in the algorithm.
    • Update the d values of the vertices adjacent to u, i.e., for each vertex v such that (u, v) is an edge in the graph: d[v] = min(d[v], d[u] + w(u, v))
    • If the d value of any vertex changes due to the update, also update the predecessor π[v] to reflect the new shortest path.
  • Termination:
    • Once all vertices are included in C, the algorithm terminates, and the arrays d and π contain the shortest-path estimates and predecessors, respectively, for each vertex from the source vertex s.

I don't understand why we update predecessor \pi[v]. I guess the question is what is the predecessor \pi[v] represent?

The predecessor π[v] of a vertex v in the context of Dijkstra’s algorithm represents the vertex that precedes v in the shortest path from the source vertex s. By updating π[v] when the d-value of v is changed, we ensure that π[v] always points to the vertex that is immediately before v in the currently known shortest path from the source vertex s. (We can use this array to reconstruct the shortest paths from the source vertex to all other vertices: starting from any v, following the predecessors allows us to do this.)

Which Abstract Data Type (ADT) should we use for vertices?

  • Values in this tree represent a distance from a source node!

  • Adjacency list is my first thought

    • he said no because it is a graph representation and not a vertex representation?
  • Prof said he would do a priority queue, which perhaps could be implemented by a min Heap.

Note:

  • There is the order property (left and right are larger), and the structure property (every node is full except maybe the last)
  • Insert is to add a new node
  • Extract-Min is to take 5 out, and fix the heap
  • Apparently all the vertices are put into the min-heap with values corresponding to their d-value from s?

  • We start by initializing everything
  • The heap Q kinda keeps track of the current d-values of everything.

Refer to the slides to be honest, it has a whole ass animation 20+ slides

Read this part in the textbook CLRS instead.

  • Initialize the d and π values, and an empty set C to keep track of vertices whose shortest paths have already been determined.

  • Initialize a priority queue Q containing all vertices in the graph

  • Each time through the while loop, extract a vertex u from Q (with minimum d-value in the priority queue Q) and add it to set C, indicating that its shortest path has been determined

  • The first time through this loop, u = s. Vertex s, thus, has the smallest shortest-path estimate of any vertex in Q.

  • For each vertex v adjacent to u:

    • If the distance d[v] is greater than the sum of the distance d[u] and the weight of the edge from u to v:
      • Update d[v] to d[u] + w(u, v), as this path is shorter
      • Update π[v] to u, indicating that u is the new predecessor of v in the shortest path from s to v.
  • Terminate:

    • Once all vertices have been added to the set C, indicating that their shortest paths have been determined, the algorithm terminates.
  • What does the orange part mean?

  • We don’t need Heapify to make our heap because it starts with everything at ∞ (except the source at 0)
  • Knowing which implementation to use depends on if the graph is sparse or not. If there are few edges (the graph is sparse), then the heap is better. If there are many edges (the graph is dense), then the array implementation might be better.
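A Python sketch using the standard-library heapq as the priority queue, with lazy deletion of stale entries instead of a Decrease-Key operation (an implementation choice of mine, not from the slides):

```python
import heapq

def dijkstra(adj, s):
    """Dijkstra with a binary heap: O((n + m) log n).
    adj[u] = list of (v, w) pairs with w >= 0. Returns d and pi arrays."""
    n = len(adj)
    d = [float("inf")] * n
    pi = [None] * n
    d[s] = 0
    done = [False] * n          # the set C from the notes
    Q = [(0, s)]                # heap of (d-value, vertex)
    while Q:
        dv, v = heapq.heappop(Q)  # Extract-Min
        if done[v]:
            continue              # stale entry: v already finalized
        done[v] = True
        for u, w in adj[v]:
            if d[v] + w < d[u]:   # relax edge (v, u)
                d[u] = d[v] + w
                pi[u] = v
                heapq.heappush(Q, (d[u], u))
    return d, pi
```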

Prof’s advice for the upcoming midterm tomorrow:

"It's not time for emotion, it is time for skill" - Armin

It is from Michael Jordan.

Reading week

Ask NOTES for feb 12, 15 and 27.

Bless mattloulou


Tues Feb 27 2024 (didn’t attend class, was studying for sci250)

Proof:

  • By contradiction: We begin by assuming that the claim is not correct, meaning there exists a vertex v for which d[v] ≠ δ(s, v) when it is added to set C.
    • The aim is to show that this assumption leads to a contradiction, implying that the claim must be true
  • Time t:
    • This refers to the beginning of the iteration in which vertex v is added to set C.
  • Shortest path P:
    • The existence of a shortest path P from the source vertex s to vertex v is assumed here. This is a key assumption in the proof and must be justified.

  • Finding vertex y:

    • On the shortest path P, we find vertex y as the first vertex encountered that is not already in set C. Its predecessor x on P is already in C. (makes sense)
  • δ(s, v) is the minimum weight between the 2 nodes among all paths

  • The prof said that this step uses the assumption of non-negative weights

  • We have that v ≠ s, since
    • s and v must be distinct
    • This statement ensures that the source vertex s and the vertex v are distinct, which is necessary because d[s] = δ(s, s) = 0 by definition of shortest path distances.
  • s ∈ C, so C is not empty
    • Since the source vertex s is added to set C initially, the set C is not empty!
  • If there is no path from s to v then the shortest path distance δ(s, v) = ∞ by definition, and so d[v] = δ(s, v).
    • This argument ensures that the claim holds trivially when there is no path from s to v.
  • There exists a path from s to v. Hence, there exists a shortest path. Name it P.

Don’t understand the last two points.

  • Since x is added to C before v, its shortest path estimate d[x] = δ(s, x)
  • When x was added to C, its edges were relaxed, including (x, y)
    • What does it mean to be relaxed??? I saw it mentioned in the textbook as well
    • ”Relaxation” refers to the process of updating the shortest path estimate and predecessor when a shorter path to a vertex is found during the algorithm’s execution
    • When vertex x is added to set C, its outgoing edges, including the edge to vertex y, are relaxed.
  • Since y is on a shortest path from s to v, the part of P from s to y is a shortest path. Hence, d[y] = δ(s, y).
  • For relaxation of edge (x, y):
    • During the relaxation of edge (x, y), the algorithm compares d[y] to d[x] + w(x, y), which is the correct value for d[y] if the path from s to y through x is indeed a shortest path.
    • Since the minimum value is chosen during relaxation, d[y] = δ(s, y) at the time of relaxation

In summary

these steps provide a rigorous proof of correctness for Dijkstra’s algorithm, demonstrating that the shortest-path estimates computed by the algorithm are equal to the true shortest-path distances at the time each vertex is added to set C.

Lecture 10 - Minimum Spanning Trees

Spanning Trees

Definition:

  • G = (V, E) is a connected graph
  • a spanning tree in G is a tree of the form T = (V, A), with A a subset of E
  • in other words: a tree with edges from G that covers all vertices of the graph
    • without creating any cycles
  • examples: BFS tree, DFS tree

Now, suppose the edges have weights w(e)

  • in many real-world scenarios, edges of a graph are associated with weights, representing costs, distances, or other measures
  • each edge in the graph is assigned a weight

Goal:

  • a spanning tree with minimal weight

Minimum Spanning Tree (MST):

  • goal of finding a minimum spanning tree is to identify a spanning tree that includes the minimum possible total weight among all possible spanning trees of the graph.
  • In other words, an MST is a spanning tree with the lowest possible total weight, considering the weights of its edges.

Note:

  • This concept is particularly important in network design, where one aims to minimize costs (or maximize efficiency) while ensuring connectivity.
    • This is like if you have a network and you want to minimize the cost of links, but still have everything connected, you want a minimal weight spanning tree.

Example

  • You start off by choosing the smallest-weight edges in this graph; in the next three slides, you can see that he marked the edges with weights 3, 4, 5, 6, 9 and 12. Then we don’t pick 14, since we would be creating a cycle (it wouldn’t be a spanning tree)! Thus, we pick 15.

Remember the goal: We have to connect all vertices without creating any cycles, by choosing smaller-weight edges.

Remark: In the example, we have 8 vertices, therefore we should end up picking 7 edges, and we did. And there is no cycle and it is a tree.

Kruskal’s Algorithm

  • Kruskal’s algorithm is a greedy approach to finding the minimum spanning tree (MST) of a connected, weighted graph.

GreedyMST(G)
1. A <- []
2. sort edges by increasing weight
3. for k = 1,...,m do
4.     if e_k does not create a cycle in A then
5.         append e_k to A
  • If there is a cycle, then it is not a tree, and so can’t be an optimal solution (assuming that edges have positive weights)
  • We want to show that the result will be a tree, and so equivalently we have to show that it is connected and that there are no cycles.

Augmenting sets without cycles

Claim

Let G = (V, E) be a connected graph on n vertices, and let A be a subset of the edges of G.

If A has no cycle and |A| < n − 1, then one can find an edge e not in A such that A ∪ {e} still has no cycle.

In the prof’s annotations, he proves that #vertices − #connected components ≤ #edges

  • Insert Prof’s proof:

  • In (V, A), we have |A| < n − 1
    • which means it has at least two connected components, which belong to the connected graph G. Take an edge (in E) which connects two different connected components in (V, A).

Need to understand the claim and the idea of proof.

I’m so tired right now (March 2, 2024)

Consider the connected components in the subgraph (V, A). Since |A| < n − 1, there are at least two connected components. Let’s call two of these connected components C1 and C2.

  • Each of these connected components is a subset of vertices in V, and they are disconnected from each other in the subgraph (V, A).
  • Since G is connected, there must be at least one edge in E that connects vertices from C1 to a different component. If there weren’t such an edge, G would have been disconnected, which contradicts the fact that G is connected.
  • Now, let’s choose an edge e from E that connects a vertex from C1 to a vertex from C2. Such an edge exists because G is connected. Adding e to A cannot create a cycle, since its two endpoints were in different components of (V, A).

Properties of the output

Claim

If the output is A = {e_{i_1}, …, e_{i_{n−1}}} (that is, n − 1 edges), then T = (V, A) is a spanning tree.

A spanning tree of a graph G is a subgraph that is a tree (a connected acyclic graph) and includes all the vertices of G.

The claim asserts that if the output of this process is a set of edges with n − 1 edges (i.e., the maximum number of edges for an acyclic subgraph in a graph with n vertices), then this set of edges indeed forms a spanning tree of the original graph G.

Note: the output’s edges are indexed in increasing order of weight, e_{i_1} up to e_{i_{n−1}}.

Proof:

  • of course, A has no cycle (the algorithm never creates one)
  • suppose (V, A) is not a spanning tree. Then |A| < n − 1, so by the previous claim there exists an edge e not in A such that A ∪ {e} still has no cycle. Compare w(e) with the weights of the selected edges e_{i_1}, …, e_{i_{n−1}}:
  • Case 1: w(e) is smaller than w(e_{i_1})
    • Impossible, since e_{i_1} is the selected element with the smallest weight.
    • Thus, if e had a smaller weight than e_{i_1}, it should have been selected instead of e_{i_1} (it creates no cycle with an even smaller set of edges). It contradicts the initial assumption.
  • Case 2: w(e) falls between w(e_{i_j}) and w(e_{i_{j+1}})
    • Impossible: at the moment we considered e, we decided not to include it. This means that e created a loop with the edges selected so far, e_{i_1}, …, e_{i_j}.
    • But those edges are all in A, so e would create a loop with A, which contradicts the choice of e (A ∪ {e} was assumed to be acyclic).
  • Case 3: w(e) is larger than w(e_{i_{n−1}})
    • Impossible: we would have included it in A, since there is no loop in A ∪ {e}.
    • If adding e does not create a loop in A, then it would have been included in A when it was considered, since at that point the selected edges were a subset of A. Adding e maintains the acyclic property of the subgraph, which contradicts the assumption that e was not initially included in A.
  • These cases show that no such edge e can exist, which contradicts the previous claim applied to A (that claim guarantees such an edge whenever |A| < n − 1). Thus, (V, A) must be a spanning tree.

Prof’s annotations:

Exchanging edges

Claim

Let T and T′ be two spanning trees, and let e be an edge in T′ but not in T.

Then there exists an edge f in T but not in T′ such that T′ ∪ {f} − {e} is still a spanning tree. Bonus: f is on the cycle that e creates in T.

Proof:

  • write e = {u, v}
  • T ∪ {e} contains a cycle C (adding a non-tree edge to a spanning tree closes a unique cycle)
  • removing e from T′ splits T′ into two connected components, V1 and V2
  • C starts in V1, crosses over to V2, so it contains another edge f between V1 and V2
  • f is in T, but not in T′
    • since C spans both V1 and V2, it must contain another edge f ≠ e connecting V1 and V2. This edge is in T (since it’s part of the cycle C, and C − e ⊆ T) but not in T′ (in T′, the only edge between V1 and V2 is e).
  • T′ ∪ {f} − {e} is a spanning tree
    • By replacing e in T′ with f, the resulting graph remains connected and acyclic, and thus forms a spanning tree.

This completes the proof that for any edge e in T′ but not in T, there exists an edge f in T but not in T′ such that replacing e with f still results in a spanning tree.

Prof’s annotations:

Correctness: exchange argument

  • Initial Assumption: Let T be the output of the algorithm and T′ be any spanning tree. If T is not equal to T′, then there exists an edge e in T′ but not in T.
  • Edge Exchange: Since e is in T′ but not in T, according to the claim, there must exist an edge f in T but not in T′ such that replacing e with f results in a spanning tree T′′.
  • Weight Comparison: The claim states that f is on the cycle C that e creates in T, and during the algorithm, e was considered but rejected because it created a cycle with the edges already selected. Moreover, all other edges on this cycle have smaller or equal weight compared to e (they were selected before e in the sorted order).
  • Weight Inequality: Therefore, w(f) ≤ w(e), indicating that the weight of f is less than or equal to the weight of e.
  • Spanning Tree Construction: By replacing e with f, the resulting spanning tree T′′ has weight less than or equal to w(T′) and one more common edge with T, thus maintaining the validity of the spanning tree while improving its similarity to T.
  • Iteration: This process can be iterated until T′ becomes equal to T, showing w(T) ≤ w(T′) for every spanning tree T′, and ensuring that the output of the algorithm is indeed a minimum spanning tree.

Merging connected sets of vertices

There is a slideshow of this in the notes:

Doesn’t choose 7 because it would form a cycle. Doesn’t choose 14 because all the vertices are already connected once we pick the edge with weight 12.

Data Structures

Operations on disjoint sets of vertices:

  • Find: identify which set contains a given vertex
  • Union: replace two sets by their union

This pseudocode outlines the implementation of Kruskal’s algorithm for finding the Minimum Spanning Tree (MST) of a graph using disjoint sets (also known as Union-Find data structure).

  1. Initialization: Initialize an empty set A to store the edges of the MST, and initialize a collection of disjoint sets, each containing a single vertex of the graph.
  2. Sort Edges: Sort the edges of the graph by increasing weight. This step ensures that we process edges in non-decreasing order of weight.
  3. Iterate Through Edges: Iterate through the sorted edges e_1, …, e_m, where m is the total number of edges in the graph.
  4. Check Connectivity: For each edge e_k = {u, v}, use Find to check whether the vertices u and v belong to different sets (i.e., they are not already connected in the MST).
  5. Union Operation: If the vertices of e_k are in different sets, perform the Union operation on the sets containing these vertices.
  6. Update MST: add the edge e_k to the MST edge set A.
  7. Output MST: After processing all edges, the set A contains the edges of the MST.

Another slideshow of an OK solution (look in notes):

Worst case:

  • Find is O(1)
    • finding the set that contains a given vertex is O(1) because it simply requires a constant-time lookup in the leader array.
  • Union traverses one of the linked lists, updates the corresponding entries of the leader array, and concatenates two linked lists. Worst case O(n).
    • this operation involves traversing one of the linked lists (which may contain up to n elements), updating the corresponding entries of the leader array, and concatenating two linked lists. In the worst case, a union takes O(n) time.

Kruskal’s Algorithm:

  • sorting edges: O(m log m)
    • using an efficient sorting algorithm
  • Find: O(m) total
    • performed for each edge to determine whether the vertices of the edge belong to different sets. Since a find operation is O(1), the total time complexity for find operations in Kruskal’s algorithm is O(m).
  • Union: O(n²) total
    • Union operations are performed whenever two vertices from different sets are found. In the provided implementation, the worst-case time complexity for a single union operation is O(n). In Kruskal’s algorithm, there can be at most n − 1 union operations (when all vertices are initially in separate sets and are merged one by one). Thus, the total time complexity for union operations in Kruskal’s algorithm is O(n²).

Worst case

For optimizing the union operation in the disjoint set data structure which improves the overall time complexity of Kruskal’s algorithm.

  • Size Tracking: Each set in now keeps track of its size. This allows us to determine which set is smaller and traverse only the smaller list during the union operation.
  • Efficient Concatenation: A pointer to the tail of the lists is added to facilitate concatenation in constant time O(1).

Analysis of Improved Version:

  • Key Observation: Although the worst-case time complexity for a single union operation remains O(n), the total time is significantly reduced due to the optimization.
  • Size Doubling: For any given vertex v, the size of the set containing v at least doubles whenever we update v’s leader entry during a union operation (because we only relabel the smaller set).
  • Frequency of Updates: Since a vertex’s entry is updated at most log n times (as the set size doubles each time), the total cost of unions per vertex is O(log n).
  • Total Time Complexity: With the improved union operation, the total complexity for all unions becomes O(n log n), considering n vertices.

Combining everything: O(n log n) for all unions and O(m log m) in total
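
Putting the pieces together, here is a minimal Python sketch of Kruskal with the array-based union-find and union-by-size described above (variable names are mine, not from the slides):

def kruskal(n, edges):
    # n vertices labelled 0..n-1; edges = list of (w, u, v)
    leader = list(range(n))                 # leader[v] = representative of v's set
    members = [[v] for v in range(n)]       # members[r] = vertices led by r
    size = [1] * n

    def union(r1, r2):
        if size[r1] < size[r2]:
            r1, r2 = r2, r1                 # relabel only the smaller set,
        for v in members[r2]:               # so each vertex is relabelled
            leader[v] = r1                  # at most log n times
        members[r1] += members[r2]
        size[r1] += size[r2]

    A = []
    for w, u, v in sorted(edges):           # O(m log m) sort
        if leader[u] != leader[v]:          # Find is an O(1) array lookup
            union(leader[u], leader[v])
            A.append((u, v, w))
    return A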

Thu Feb 29 2024

Lecture 11 - Dynamic Programming

Goals: This module: the dynamic programming paradigm through examples

  • interval scheduling, longest increasing subsequence, longest common subsequence, etc.

What about the name?

  • programming as in decision making
  • dynamic because it sounds cool.

  • Runtime that is exponential (base bigger than 1). He doesn’t look happy.

Insert picture:

Remember what we’ve computed with an array. This is still a recursive solution. Why start at the top of the tree when we can go from the bottom of the tree?

Runtime:

For the data structure we will use to store the computed values, we just need an array of size n + 1 (because we start at 0).

  • In this top-down approach, it uses the “memoization” technique, also known as “top-down dynamic programming”: you start by solving the larger problem by breaking it down into subproblems. Then, to avoid redundancy, you keep track of everything in a data structure, in this case an array of already-computed answers (to my subproblems).

How can we improve it? Not necessarily the complexity?

We don’t care about the whole array when we only need the previous two values to compute the next.

  • Therefore, we have solved the subproblems from bottom up? What does that actually mean?

From chatgpt "bottom up"

In the context of dynamic programming, “bottom-up” refers to an approach where you start solving the problem from the smallest subproblems and gradually build up solutions to larger subproblems until you reach the desired solution.

In summary

  • Bottom-up dynamic programming starts from the smallest subproblems and builds up solutions to larger subproblems iteratively.
  • Top-down dynamic programming starts from the largest problem and breaks it down into smaller subproblems, recursively solving them while storing solutions to avoid redundant computations.
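
A tiny Python sketch of the contrast, on what I’m fairly sure was the running example here (Fibonacci):

from functools import lru_cache

@lru_cache(maxsize=None)                    # top-down: recursion + memoization
def fib_top_down(n):
    return n if n < 2 else fib_top_down(n - 1) + fib_top_down(n - 2)

def fib_bottom_up(n):                       # bottom-up: fill an array of size n + 1
    F = [0] * (n + 1)
    if n >= 1:
        F[1] = 1
    for j in range(2, n + 1):
        F[j] = F[j - 1] + F[j - 2]          # the recurrence, smallest first
    return F[n]

def fib_two_variables(n):                   # the improvement: keep only two values
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a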

We just need an array to store stuff. No need for a hash map. What would be the size of the array? n + 1.

F(j) = F(j − 1) + F(j − 2): so to compute the value at position j we add up the values at j − 1 and j − 2. What kind of relation is this? A recurrence relation.

Came up with a plan, identified the subproblems. Need a place to keep the information. Use an array of size n + 1, and we know how to compute the base cases; then we tell it how to solve the rest.

Prof doesn’t like this slide:

Use your creativity instead of following this recipe. - Armin

Freedom comes with a cost. Any good things come with a cost - Armin

From Matthew’s notes:

  • The prof mentioned that you often might have the proof of correctness embedded into your description of the algorithm, since you often need to do a lot of justification to make the recurrence relation between your subproblems / for your algorithm to make sense.
  • The prof said that in this course, point #5 is always needed. He said that in the following examples, what he is talking about will become apparent / will make sense.
  • In this course, the prof said that if he asks you to do DP, you must do things iteratively, and not recursively.

Be precise on the exam, what are your subproblems, and HOW you can find the solution in the array. Then come up with a recurrence.

Recovering the solution is needed in this course for exams and assignments (although the slide says “if needed”).

In this course, when they ask for dynamic programming, do it iteratively, not recursively!

Dynamic Programming

Key features

  • solve problems through recursion
  • use a small (polynomial) number of nested subproblems
  • may have to store results for all subproblems
  • can often be turned into one (or more) loops

Dynamic programming vs. divide-and-conquer (PROF DOESN’T CARE ABOUT THE DIFFERENCES)

  • dynamic programming usually deals with all input sizes
  • divide-and-conquer does not go through all the subproblems
  • DAC may not solve “subproblems” (its recursive pieces need not be instances of the same problem)
  • DAC algorithms are not easy to rewrite iteratively

  • We want the sum of the weights of the chosen intervals to be maximized. Dynamic programming solves this problem.
  • We did the greedy algorithm for this before, but that only works when every interval has the same weight (the unweighted case).
  • What are the input and output? The input is a set of n weighted intervals, and the output is a set of intervals that do not overlap and that maximizes the total weight.

Insert picture: second is solution not containing

An optimal solution: with weight

  • The intervals in the brackets of the optimal solution are possible candidates contributing to the optimal solution
  • But we don’t know the indices of these items!

Plan to figure out the intervals that play a role in our optimal solution:

  • sort based on finishing time: the first interval to finish gets index 1, then 2, etc.

There are two options: either we choose interval n or we don’t choose it.

Define p_j for interval j to be the largest index i less than j such that intervals i and j are disjoint. (So the intervals strictly between p_j and j all overlap interval j.)to-understand

  • O(…) here has nothing to do with Big-O; it is just used to show “the optimal solution that involves them”. So, the things inside the brackets are things that could be in the optimal solution, but they don’t have to be. The solution will just be a subset of them.

The idea:

  • Sort the finishing time in non-decreasing order
  • Initialize empty set T to store the selected intervals

  • What exactly is in the brackets over here?
    • all the intervals that can’t conflict with interval j: those of the form 1, …, p_j
  • And the p_j’s? Oh, maybe some sort of numbering for the intervals: for each interval, it points at the last interval (by finishing time) that ends before it starts. If no interval finishes before interval j starts, then p_j = 0, since j is technically the first one.
    • You somehow use the finishing times to find the p_j’s
    • p_j helps in determining the index of the previous non-overlapping interval

p_j represents the largest index i less than j such that interval i is disjoint from interval j. This means that p_j identifies the last interval before j that doesn’t overlap with interval j. The underlying numbering is by finishing times, and A (below) is the sorting permutation of the start times; together they help in determining the index of the previous non-overlapping interval.

  • How can I find p_j?
    • p_j = the largest i with f_i ≤ s_j (interval i finishes before interval j starts)
    • p_j = 0 if there is no such interval
  • do this for all the s_j’s, but don’t sort the original arrays in place, because we don’t want to scramble the finish times we’ve sorted

A = sorting permutation for the array of s_j’s (the start times)

  • He said that finding the sorting permutation can be done in O(n log n) (same time as sorting)
  • Go through the smallest start time and move forward, checking for the next one.
  • I will never go back in the computation
  • A[i] refers to the indexes of the intervals, ordered by start time
  • cost: we sort once to get A, and the loop cost is linear. And we are always moving forward.

Runtime: O(n log n) for sorting and O(n) for the loops

The algorithm outlined is for computing the values of p_j, which represent the largest index less than j such that the interval at that index does not overlap with interval j. Here’s a breakdown of how the algorithm works:

  1. Begin by initializing i to 1 and k to 0.
  2. Iterate over each index k from 0 to n.
  3. While i is less than or equal to n, and f_k is less than or equal to s_{A[i]} (the starting time of the i-th interval in sorted-by-start order), and s_{A[i]} is strictly less than f_{k+1}, set p_{A[i]} to k and increment i.
  4. Continue this process until either all intervals are considered or the condition is not met.
  5. The resulting p values indicate, for each interval, the largest index k such that interval k finishes before it starts (and hence does not overlap with it).

The runtime of this algorithm is dominated by the sorting step, which is O(n log n); the subsequent loop is O(n). Therefore, the overall runtime complexity is O(n log n + n), which simplifies to O(n log n). This makes the algorithm efficient for computing the p_j values for the intervals.

Note: The while loop

The while loop condition in the algorithm ensures that the algorithm iterates through the intervals until it finds an interval that does not overlap with the current interval being considered. Let’s break down the condition:

while i ≤ n and fk ≤ sA[i] < fk+1

  • i ≤ n: This condition ensures that the algorithm does not attempt to access an index beyond the bounds of the array of intervals.

  • fk ≤ sA[i] < fk+1: This condition checks whether the starting time of the interval indexed by A[i] falls within the window [f_k, f_{k+1}). If it does, then interval k is the last interval to finish before interval A[i] starts, so p_{A[i]} = k. The loop continues until the start times move past this window.

    The while loop condition ensures that the algorithm sweeps through the intervals in order of start time, assigning to each one the appropriate value of p (relative to the current interval I_k), and then moves on to the next interval.

  • p_j is the index of the last interval that doesn’t intersect interval j
  • The solution M[n] is the maximum weight. Recovering the solution means finding the actual set.

Matthew’s notes

  • To find the optimal set, you have to first find the element in M that is maximum. Then, you have to figure out what elements actually contribute to that. To do that, you have to trace back the calculation for that index and see what contributed to it. You have to backtrack.
  • The prof then ran the algorithm on a sample dataset.
  • I was thinking about if it matters for this problem if we have to make start/end times distinct or not, and I think that we do not have to. There does not seem to be any part of the algorithm that will be messed up if we introduce duplicate end times or start times.

What about the recurrence? M[j] = max(M[j − 1], w_j + M[p_j]), with M[0] = 0.

To recover the optimum set of intervals after computing the maximal weight using dynamic programming, you can follow these steps:

  1. Start from the last interval n and backtrack through the computed values of M.
  2. At each step, check whether including interval j contributes to the optimal solution. This can be determined by comparing M[j] with M[j − 1]. If they are equal, it means interval j was not included in the optimal solution. If M[j] is greater, it means interval j was included.
  3. If interval j was included, add it to the set of optimal intervals.
  4. Move to the interval p_j (if j was included; otherwise move to j − 1) and repeat the process until you reach the first interval.
  5. The set of intervals you collected during this process will be the optimal set.
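
A minimal Python sketch of the whole pipeline, recovery included (I compute the p values with bisect instead of the in-class two-pointer sweep; all names are mine):

import bisect

def weighted_interval_scheduling(intervals):
    # intervals: list of (start, finish, weight)
    intervals = sorted(intervals, key=lambda t: t[1])   # sort by finishing time
    n = len(intervals)
    finishes = [t[1] for t in intervals]
    p = [0] * (n + 1)
    for j in range(1, n + 1):
        s_j = intervals[j - 1][0]
        # p[j] = number of intervals among 1..j-1 finishing no later than s_j
        p[j] = bisect.bisect_right(finishes, s_j, 0, j - 1)
    M = [0] * (n + 1)
    for j in range(1, n + 1):
        w_j = intervals[j - 1][2]
        M[j] = max(M[j - 1], w_j + M[p[j]])             # skip j, or take j
    chosen, j = [], n                                   # backtrack to recover the set
    while j > 0:
        if M[j] == M[j - 1]:
            j -= 1                                      # interval j was not taken
        else:
            chosen.append(intervals[j - 1])
            j = p[j]                                    # jump past the conflicts
    return M[n], chosen[::-1]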

Include picture of example: Understand it.

  • As you can see, we first sort the intervals by their finishing time (non-decreasing order)
  • Then we can find the permutation A of the s_j’s (But what exactly do the s_j’s represent, and A?)to-understand

Every time you have that arrow, the interval is in your solution. This is what you do to get the optimal solution. This is called backtracking: basically, how did we get there?

Tue Mar 5, 2024: I have recordings

Continue on slide 14/17

The 0/1 Knapsack Problem

  • We want to maximize the total value of the chosen items while respecting the weight constraint.
  • Does this have the constraint of only being able to pick an item once? (Yes: that’s the “0/1” part.)
  • In Fractional Knapsack, we can break items to maximize the total value of the knapsack.

  • The slide says that the optimal solution given a capacity w and items 1 to i, denoted O(w, i) (do not confuse it with big-Oh), is the max of two values:
    • v_i + O(w − w_i, i − 1), if we choose item i (and w_i ≤ w)
      • when it says choose item i, i think it basically means we go through the list of items and decide if we choose the item we are currently looking at?
      • therefore we subtract that chosen weight from the upper bound
      • I just don’t really understand the O(w − w_i, i − 1) yetto-understand
    • O(w, i − 1), if we don’t choose item i
      • so we don’t subtract w_i, and move on
  • he’s trying to focus on one object and get the maximal value for that object

The basic idea, as you correctly mentioned, is to consider whether to include the i-th item in the knapsack or not. Here’s how the recurrence relation is set up:

  1. If we choose to include item i in the knapsack, then the value of the knapsack would be the value of the i-th item plus the optimal value achievable with the remaining capacity and considering only the first i − 1 items. This is represented as v_i + O(w − w_i, i − 1), where w_i is the weight of the i-th item.
  2. If we choose not to include item i in the knapsack, then the value of the knapsack remains the same as the optimal value achievable using the same capacity but considering only the first i − 1 items. This is represented as O(w, i − 1).

The recurrence relation can be expressed as follows:

O(w, i) = max { v_i + O(w − w_i, i − 1), O(w, i − 1) }

This equation represents the optimal value function O(w, i). It calculates the maximum value achievable with a knapsack of capacity w considering items 1 through i, either by including the i-th item or by excluding it.

Algorithm

Prof’s notes:

  • Either we choose item i or not: the max value with items 1 to i.
  • Two options:
      1. If we choose item i
      2. If item i is not chosen
  • O(w − w_i, i − 1): max value that we can get with weight w − w_i and items 1 to i − 1
  • O(w, i − 1): max value that we can get with weight w and items 1 to i − 1

Insert graph: he finds items 1 at and item 2. with

Another graph (basically just the bottom-right graph of the above picture): Need to tell him precisely what the subproblems are: trying to solve the problem with items 1 to i with capacity w. You are given a w and an i, and that is it.

  • What else does he need to give us for dynamic programming to solve it?
  • base case
    • First row: can’t add anything, so all 0’s
    • First column too is all 0’s
  • Order of computation:
    • Take items 1 to i and capacity w; decide whether to take item i or not
    • If we don’t take item i, we have items 1 to i − 1 with capacity w, so we can copy the cell at (w, i − 1) to the cell at (w, i); if we do take item i, we look at the cell at (w − w_i, i − 1) and add v_i
    • Two options: either take item i or not; if we take it, we already have w_i in our pocket

One more thing we should be worried about. We take the best value out of the two. But is it good to take that option or not? First, ask the question: is it even possible to take item i? If item i’s weight is bigger than the remaining capacity already, we don’t even have to consider it.

  • if it’s possible, just compare the two options; if not, we don’t need to

Pseudocode: Algorithm

  • Doesn’t show how to recover the solution in this pseudocode!
  • Update maximum value: If the weight of the current item is less than or equal to the current capacity, we have two choices: either include the current item or exclude it. We choose the option that maximizes the value. The maximum value achievable considering the current item is calculated as the maximum of two values:
    • v_i + O[w − w_i, i − 1]: Value of the current item plus the maximum value achievable with the remaining capacity w − w_i and considering only the items up to i − 1.
      • v_i: current value of item i, which we are considering including or not.
      • O[w − w_i, i − 1]: represents the maximum value achievable with the remaining capacity and considering only the items up to the (i − 1)-th item. It’s the max value achievable with the remaining capacity w − w_i if we decide to include item i in the knapsack. We consider only the items from the first to the (i − 1)-th item because we have already made a decision about including item i. We cannot include the same item more than once in the 0/1 Knapsack Problem, so we only consider the items before i.
      • Putting these two terms together, v_i + O[w − w_i, i − 1] represents the total value achieved by including item i in the knapsack. It adds the value of item i to the maximum value achievable with the remaining capacity if item i is included.
      • We compare this value with the maximum value achievable without including item i, which is O[w, i − 1]. We choose the maximum of these two values, as it represents the optimal decision for including or excluding item i in the knapsack.
    • O[w, i − 1]: Maximum value achievable without including the current item.

Runtime: Θ(nW)

  • Something is hidden?
  • What is the actual input size? Roughly, what are the parameters of the input? Twice n numbers (a weight and a value for each item), and we also have W, which is given as an integer. How many bits do you need to store W in memory? log W. Is Θ(nW) written in terms of the input size? NO. Let’s rewrite it in a way where we can see the input size: Θ(nW) = Θ(n · 2^{log W}), which is exponential in the number of bits of W.

Prof’s notes:

  • 0/1 Knapsack example:

  • Base case is the first row and column being 0.
  • Order of computation: go column by column (make sure to write the order of computation and not just draw arrows on the final exam) since in the pseudocode the outer for loop is over the items i.
  • So check first if we can take the item; if so, then look to the left, and also at the cell of the row (capacity − the item’s weight) in the previous column, and take whatever is in that cell + the item’s value.
  • Flag each cell with whether item i is taken or not: if not taken, the value is taken from the left; if taken, then it comes from the diagonal.

From what I understand:

  1. Initialize the array: Create a 2D array O with dimensions (n+1) x (W+1) to store the maximum value achievable for different subproblems.
    • O[0..n, 0..W] is initialized with all O(0, j) = 0 and all O(w, 0) = 0 to set the base cases.
  2. Iterate over items and capacities: Loop through each item i from 1 to n and each capacity w from 1 to W.
  3. Check if the current item can be accommodated: If the weight of the current item wi is greater than the current capacity w, it means the item cannot be included in the knapsack. In this case, set O[w, i] equal to O[w, i-1], meaning the maximum value achievable for the current capacity without considering the current item.
  4. Consider including the current item: If the weight of the current item is less than or equal to the current capacity, we have two choices:
    • Include the current item: Add its value vi to the maximum value achievable for the remaining capacity (w-wi) and considering the previous items (i-1). This is represented by vi + O[w-wi, i-1].
    • Exclude the current item: Keep the maximum value achievable without considering the current item, which is O[w, i-1].
  5. Update the maximum value: Set O[w, i] to the maximum of the two choices mentioned above.
  6. Return the maximum value: Once the loop completes, the maximum value achievable with the given capacity and items will be stored in O[W, n].
  7. Construct the solution: You can backtrack through the array O to find which items were included in the optimal solution. This can be done by tracing back the choices made during the dynamic programming process.
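
A minimal Python sketch of those steps (I index the table O[i][w] item-first, the transpose of the slides’ O[w, i]; names are mine):

def knapsack(W, items):
    # items: list of (weight, value); capacity W; all integers
    n = len(items)
    O = [[0] * (W + 1) for _ in range(n + 1)]   # row 0 and column 0 stay 0 (base cases)
    for i in range(1, n + 1):
        wi, vi = items[i - 1]
        for w in range(W + 1):
            if wi > w:
                O[i][w] = O[i - 1][w]           # item i doesn't fit
            else:
                O[i][w] = max(O[i - 1][w],                  # exclude item i
                              vi + O[i - 1][w - wi])        # include item i
    taken, w = [], W                            # backtrack: which items were used?
    for i in range(n, 0, -1):
        if O[i][w] != O[i - 1][w]:              # value changed, so item i was taken
            taken.append(i)
            w -= items[i - 1][0]
    return O[n][W], taken[::-1]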

Make sure to compute the values by yourself later!

  • Watch a video on it (still a bit confused)

He’s keeping track of values in his array!!!

Discussion: This is called a pseudo-polynomial algorithm

  • in our word RAM model, we have been assuming all the w_i and v_i fit in a word
  • so the input size is O(n) words
  • but the runtime also depends on the values of the inputs

0/1 knapsack is NP-complete, so we don’t really expect to do much better.

Lecture 12 - Dynamic Programming - Part 2

The Longest Increasing Subsequence Problem

Input: An array A[1..n] of integers

Output: A longest increasing subsequence of A (or just its length) (does NOT need to be contiguous)

Example: gives (1, 3, 10, 11, 19) (in red)

Remark: there are 2^n subsequences (including an empty one, which doesn’t count)

Tentative subproblems

Attempt 1:

  • Subproblems: L[j] = the length of a longest increasing subsequence of A[1..j]
  • on the example,
  • so what? not enough to deduce L[j + 1]

Attempt 2:

  • Subproblems: the length of a longest increasing subsequence of A[1..j], together with its last entry
  • example: , with last element
  • OK if we can add A[j + 1], but what if not?

A more complicated recurrence

Attempt 3:

  • Define L[j] as the length of the longest increasing subsequence of A[1..j] that ends with A[j], for 1 ≤ j ≤ n.
  • Initialize L[1] = 1, as the LIS ending at the first element is simply that element itself.

Recurrence relation:

  • For each element A[j], consider all previous elements A[i] (where i < j).
  • If A[i] < A[j], it means A[j] can extend the LIS ending at A[i].
  • So, L[j] = 1 + max{L[i]}, where i < j and A[i] < A[j] (and L[j] = 1 if no such i exists).

Explanation:

  • L[j] represents the length of the LIS ending with A[j].
  • To compute L[j], we check all previous elements A[i] (where i < j).
  • If A[i] < A[j], it means we can extend the LIS ending at A[i] by appending A[j].
  • We maximize this by taking the maximum of L[i] + 1 over all such i.
  • The final answer will be the maximum value among all L[j].

Iterative Algorithm

When we are looking at A[j], we don’t know where its predecessor is in the array. So we have to look at all options of i before j and take the maximum one. So we want to keep track of all the L[i]. If there is no such i, then L[j] = 1: the subsequence starts there.

Runtime: O(n²)

Remark:

  • the algorithm does not return the sequence itself, but could be modified to do so
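
Here is a minimal Python sketch with that modification, keeping a prev array so the sequence itself can be recovered (names are mine):

def longest_increasing_subsequence(A):
    n = len(A)
    if n == 0:
        return []
    L = [1] * n        # L[j] = length of the longest increasing subsequence ending at A[j]
    prev = [-1] * n    # prev[j] = previous index in that subsequence
    for j in range(n):
        for i in range(j):                       # O(n^2) double loop
            if A[i] < A[j] and L[i] + 1 > L[j]:
                L[j] = L[i] + 1
                prev[j] = i
    j = max(range(n), key=lambda k: L[k])        # endpoint of a best subsequence
    seq = []
    while j != -1:                               # walk the prev pointers backwards
        seq.append(A[j])
        j = prev[j]
    return seq[::-1]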

To understand

Just follow through the code with the example.

Example: Insert prof’s picture

  • The solution lies in array L.
  • how did we arrive at 5? from 4, then 3, 2 and 0. But somehow since it’s 0, we don’t take 7??
    • Ok so it’s not 5, dumbass. You look at array L: the highest number is 4, which is at the position where 5 is in array A. And you can retrace back and see that 1, 2, 3, 4 lead to 4. So we take the values in A: 1, 3, 10, 11.

Thu Mar 7, 2024: Continue and finish Lecture 12, slide 6/7

Longest Common Subsequence Problem


  • Input: Arrays A[1..n] and B[1..m] of characters
  • Output: The maximum length of a common subsequence of A and B. (Subsequences do not need to be contiguous.)
  • Example: blurry, burger; the longest common subsequence is burr.
  • Remark: there are 2^n subsequences in A, 2^m subsequences in B

Prof’s notes:

  • Uses A to index his rows
  • Where do I find my solution? Bottom right corner.
  • Consider A[n] and B[m]. There are 3 cases that may happen:
      1. A[n] doesn’t appear in the LCS. Then this common subsequence is contained in A[1..n − 1] and B[1..m]
      2. B[m] doesn’t appear in the LCS. Then this common subsequence is contained in A[1..n] and B[1..m − 1]
      • The first two cases are based on A[n] and B[m] not being the same
      3. A[n] = B[m] appears in the sequence; the rest of the sequence is contained in A[1..n − 1] and B[1..m − 1]

Insert bottom right picture he drew:

  • Why add 1 in the diagonal arrow? Because they match. Dumbass.
  • In the two other cases, we haven’t added a matching character, so we don’t add one.

Approach:

  1. Input: Two sequences A and B of characters.
  2. Output: The maximum length of a common subsequence between A and B.
  3. Example: For example, if A = “blurry” and B = “burger”, the longest common subsequence is “burr”.
  4. Idea: The LCS problem can be solved using dynamic programming, typically using a two-dimensional array to store the lengths of the LCS of prefixes of the sequences.
  5. Recurrence: At each cell (i, j) of the 2D array (where i represents the index of A and j represents the index of B), you consider three cases:
    • If A[i] ≠ B[j], then the LCS does not include both A[i] and B[j], so the length of the LCS up to that point remains the same as it was without including one of these characters. You can look at the cell (i − 1, j) or (i, j − 1) to find the length of the LCS.
    • If A[i] = B[j], then you can extend the LCS by one. In this case, the length of the LCS would be one plus the length of the LCS up to the previous characters, i.e., 1 + LCS(i − 1, j − 1).
    • You take the maximum of these options.
  6. Adding 1 in the diagonal arrow: When characters match (case 3), we add one to the LCS length because we’re extending the LCS by one character. However, in cases where characters don’t match (cases 1 and 2), we don’t add anything because the current characters don’t contribute to the LCS together. So, we don’t add one in those cases.
  7. Finding the Solution: Once the dynamic programming table is filled, the solution lies in the bottom-right cell of the table, which represents the LCS length.
  8. Complexity: The time complexity of this approach is O(nm), where n and m are the lengths of sequences A and B respectively. This is because you fill a 2D array of size (n + 1) × (m + 1).

A bivariate recurrence

Definition: let LCS[i, j] be the length of the longest common subsequence of A[1..i] and B[1..j]

  • LCS[0, j] = 0 for all j
  • LCS[i, 0] = 0 for all i
  • LCS[i, j] is the max of up to three values
    • LCS[i − 1, j] (don’t use A[i])
    • LCS[i, j − 1] (don’t use B[j])
    • if A[i] = B[j], 1 + LCS[i − 1, j − 1]

Size of the array is (n + 1) × (m + 1), so the runtime is O(nm)

Prof gave an example: Insert picture

  • Base case is all 0’s for the first row and column
  • The rows are indexed by i, which is A. The columns are indexed by j, which is B.
  • Look if the first characters are equal. They are, so go diagonally. b = b
  • We compare l to u: l does not equal u, so don’t go diagonally; green arrow
  • Go through the whole process by yourself at home.
    • Walked through it.
    • Compare (i,j)
    • B, B (0,0)
    • L, U (1, 1)
    • U, U (2,1)
    • R, R (3,2)
    • R, G (4,3)
    • R, E (4,4) Here, instead of advancing one index, we advance the other. But why? When exactly do we advance i or j????
    • Y, S (5,5)
    • Nothing, Y (6,5) The end.

If you want to recover your solution: walk back from here, the bottom-right cell, which in the example is 4 (the solution) at (6,6); each red diagonal arrow adds a character to the output. The first one you take is the green one, vertical. Then take the diagonal; the index of A is 5. How did I get there? A diagonal red arrow. And so on.

  • diagonal, diagonal, diagonal, diagonal
  • Note that he is not showing up every step, but when you get to your answer, you would have noted all the arrows and keep track of them. At the end, you just go back and follow them.

Insert small picture: (red arrow) add

  • Can we do better? Yes, there are other heuristics, but we won’t talk about them. For us, the goal is to look at dynamic programming. So we don’t care. Not worried about efficiency.

When to advance i or j:

  • When comparing characters at indices (i, j):
    • If A[i] equals B[j], it means that the characters are part of the LCS. In this case, you move both indices diagonally to (i − 1, j − 1) to check the next characters in both sequences.
    • If A[i] does not equal B[j], the characters at (i, j) are not both part of the LCS. In this case, you choose the maximum of the two adjacent cells. However, you only move one of the indices, not both.
      • If you choose the maximum from the cell to the left, (i, j − 1), you move index j to check the next character in sequence B.
      • If you choose the maximum from the cell above, (i − 1, j), you move index i to check the next character in sequence A.

Should we fill up the table first then march through it?

Still not understanding the choose the max of the two adjacent cells…

I guess you need to fill up the table first. Then retrieve the answer at bottom right corner. Then backtrack from that cell to reconstruct LCS…

  • Try constructing the table (used all my brain power. But definitely easier and more intuitive than 01 knapsack problem)

Dynamic Approach:

  1. Definition: Define LCS[i, j] as the length of the longest common subsequence between A[1..i] and B[1..j].
  2. Base cases: Initialize all elements of the first row and first column of the matrix to 0.
  3. Recurrence relation: For each cell (i, j) of the matrix, compute the length of the longest common subsequence using the following rules:
    • If A[i] ≠ B[j], then LCS[i, j] is the maximum of LCS[i − 1, j] (not using A[i]) and LCS[i, j − 1] (not using B[j]).
    • If A[i] = B[j], then LCS[i, j] is LCS[i − 1, j − 1] plus 1 (the length of the LCS without the current characters, plus 1 for the match).
  4. Iterative computation: Use two nested loops to compute all values of LCS[i, j]. The runtime of this algorithm is O(nm), where n and m are the lengths of sequences A and B respectively.
  5. Reconstructing the solution: To recover the actual LCS, start from the bottom-right cell of the matrix and backtrack according to the arrows you’ve noted during computation. Follow the arrows:
    • If you move diagonally (matching characters), include the character A[i] (or B[j], since they are both equal).
    • If you move vertically, exclude the current character from sequence A.
    • If you move horizontally, exclude the current character from sequence B.
  6. Example: For example, if the LCS length is 4 (bottom-right cell), backtrack from that cell following the arrows you’ve noted to reconstruct the LCS.

This approach efficiently computes the length of the LCS and allows for easy reconstruction of the LCS itself.
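
A minimal Python sketch of the table plus the backtracking (names are mine):

def lcs(A, B):
    n, m = len(A), len(B)
    L = [[0] * (m + 1) for _ in range(n + 1)]   # first row and column are the 0 base cases
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if A[i - 1] == B[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1            # match: diagonal + 1
            else:
                L[i][j] = max(L[i - 1][j], L[i][j - 1])  # drop A[i] or B[j]
    out, i, j = [], n, m                        # backtrack from the bottom-right cell
    while i > 0 and j > 0:
        if A[i - 1] == B[j - 1]:
            out.append(A[i - 1])                # diagonal arrow adds a character
            i -= 1
            j -= 1
        elif L[i - 1][j] >= L[i][j - 1]:
            i -= 1                              # vertical move: exclude from A
        else:
            j -= 1                              # horizontal move: exclude from B
    return ''.join(reversed(out))

# lcs("blurry", "burger") == "burr" (length 4)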

Lecture 13 - Dynamic Programming - Part 3

Edit Distance

  • Now here, we want to use another type of measure (he is not claiming that one measure is better than the other).
  • When does misspelling happen?
      1. a character is typed wrong
      • How to fix? Replace (called C, for change, in his slides)
      2. a character is missed
      • (A, for add, to fix)
      3. An extra character is typed
      • (D, for delete, to fix)

How can I convert snowy to sunny? The distance we need for editing. ED = Edit Distance

First thing to do is to align them.

  • First thing is to align the first characters
  • Use a gap. Out of all the alignments, you want to figure out the one that is better:

  • from the slide, the first alignment needs 3 changes, so we have 3C
  • then it’s 1A, 1C and 1D
    • 1A being, let’s say, we missed typing a character, so we can add u
    • 1C being we change o to n
    • 1D being we typed an extra character w, so if we delete it, we will get sunny.
  • last one: 2A, 1C, 2D
    • 2A: we add s and n
    • 1C: we change s to u (he forgot to put it in the picture above)
    • 2D: we delete o and w. Which one is the minimum?

You must list all the possible alignments and give the best!

Prof’s notes:

  • We want to find the ED between A[1..n] and B[1..m]. Three cases may happen:
    1. Align the last character of A with the last character of B. So A[n] is aligned with B[m].
    • The last column will add 1 to the cost if A[n] ≠ B[m] and adds 0 if A[n] = B[m]. The ED of the remaining columns is the ED between A[1..n − 1] and B[1..m − 1]
    2. A[n] is not aligned with anything (it sits over a gap).
    • In this case, the ED is 1 plus the ED between A[1..n − 1] and B[1..m]
    3. B[m] is not aligned with anything at the end.
    • In this case, the ED is 1 plus the ED between A[1..n] and B[1..m − 1]

This can be generalized to other indices. Let’s define an array D of size (n + 1) × (m + 1), where D[i, j] is the ED between A[1..i] and B[1..j]:

  • What else does he need to tell us?
    • Base case: the ED between an empty word and B[1..j] is j. The ED between A[1..i] and an empty word is i.
    • Solution at the bottom right cell.

Prof does have an example, but he doesn’t have time to go through it. Go to DPV 6.3, which has an example of this, for it to make more sense!

The recurrence:

Definition

Let D[i, j] be the edit distance between A[1..i] and B[1..j]

  • D[0, j] = j for all j (add j characters)
  • D[i, 0] = i for all i (delete i characters)
  • D[i, j] is the minimum of three values
    • D[i − 1, j − 1] (if A[i] = B[j]) or D[i − 1, j − 1] + 1 (otherwise)
    • D[i − 1, j] + 1 (delete A[i] and match A[1..i − 1] with B[1..j])
    • D[i, j − 1] + 1 (add B[j] and match A[1..i] with B[1..j − 1]) The algorithm computes all D[i, j], using two nested loops, so the runtime is O(nm).

The recurrence relation provided defines the edit distance between two strings and in terms of smaller subproblems. Here’s a breakdown of the recurrence:

  • Let D[i, j] be the edit distance between the prefixes A[1..i] and B[1..j].
  • Base cases:
    • D[0, j] = j: The edit distance between an empty string and B[1..j] is j, because it requires adding j characters to transform the empty string into B[1..j].
    • D[i, 0] = i: The edit distance between A[1..i] and an empty string is i, because it requires deleting i characters from A[1..i] to make it empty.
  • Recurrence relation:
    • D[i, j] is the minimum of three values:
      1. D[i − 1, j − 1] + 1 if A[i] ≠ B[j] (and D[i − 1, j − 1] if they are equal): This corresponds to a substitution operation where A[i] is replaced by B[j].
      2. D[i − 1, j] + 1: This corresponds to a deletion operation where A[i] is deleted, and we find the edit distance between A[1..i − 1] and B[1..j].
      3. D[i, j − 1] + 1: This corresponds to an insertion operation where B[j] is added to A, and we find the edit distance between A[1..i] and B[1..j − 1].
  • The algorithm computes all D[i, j] values using two nested loops, iterating over all possible prefixes of the strings A and B, resulting in a runtime of O(nm), where n and m are the lengths of the strings A and B respectively.

Don't really understand why D[i-1,j]+1 is the deletion operation where A[i] is deleted and D[i,j-1]+1 is the insertion of B[j] ...??? (Thinking about it: if we delete A[i], what’s left is to match A[1..i−1] against all of B[1..j], which costs D[i−1, j], plus 1 for the delete. If we insert B[j] at the end of A, the inserted character takes care of B[j], leaving A[1..i] to match against B[1..j−1], which costs D[i, j−1], plus 1 for the insert.)
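
A minimal Python sketch of the recurrence (names are mine); tracing it on snowy → sunny gives 3, matching the alignment example above:

def edit_distance(A, B):
    n, m = len(A), len(B)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for j in range(m + 1):
        D[0][j] = j        # empty word vs B[1..j]: add j characters
    for i in range(n + 1):
        D[i][0] = i        # A[1..i] vs empty word: delete i characters
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            change = D[i - 1][j - 1] + (0 if A[i - 1] == B[j - 1] else 1)
            delete = D[i - 1][j] + 1   # delete A[i], match A[1..i-1] with B[1..j]
            add = D[i][j - 1] + 1      # add B[j], match A[1..i] with B[1..j-1]
            D[i][j] = min(change, delete, add)
    return D[n][m]

# edit_distance("snowy", "sunny") == 3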

Optimal Binary Search Trees

I skipped over Optimal Binary Search Trees when reviewing lectures!!!

Make sure to review for the finaltodo and take notes. Didn’t take notes either.

Keep the things you need close to yourself: MTF (move-to-front) works on linked lists. Seen in CS240.

There is something like this in the binary search trees.

  • where can we access easily? The root
  • the access cost must be multiplied by the probability

Example in this slide:

  • Insert drawn picture
  • For the first tree: compute the expected cost
  • Didn’t have time to note the right side. The left one is better!

Does greedy work here? No (which is why we are doing dynamic programming, duh).

Prof’s notes:

  • Greedy doesn’t work. Why?
  • Insert picture (it’s the left tree)
  • put the key with the highest probability in the root.
  • So we put 5 in the root
  • Then we pick 3 since it’s the largest in the rest
  • Then 4
  • After that, between 2 and 1 we choose 2.
  • Lastly we put in 1.

So he has left and right tree.

  • Left tree value is:
  • Right tree value is:

So greedy failed, since the left tree (in the picture) is the better one.

Define M[i, j] to be the minimal cost for items i to j.

Take an item k, between i and j, and put it in the root. (My subproblems are from i to j.)

Insert picture:

What are the subproblems?

  • if we look at the left, it’s a binary search tree on items i to k − 1; every single node has one extra unit of work, since we have pushed the whole tree down (look at the picture)
    • main cost M[i, k − 1] for items i to k − 1.
    • all items are shifted down by 1 level, so the cost of access goes up by 1 in each case.
  • Right side:
    • M[k + 1, j] for items k + 1 to j
    • also we have a term for the root.
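
Since I didn’t take notes here, this is my reconstruction of the standard recurrence (cf. CLRS 15.5): M[i, j] = min over roots k in [i, j] of (M[i, k − 1] + M[k + 1, j]) + Σ_{l = i..j} p_l, with M[i, i − 1] = 0; the Σ term is exactly the “everything shifts down one level” cost above. A minimal O(n³) Python sketch, assuming p[1..n] are the access probabilities:

def optimal_bst_cost(p):
    # p[1..n] = access probabilities (p[0] is a dummy); M[i][j] = min cost for keys i..j
    n = len(p) - 1
    M = [[0.0] * (n + 2) for _ in range(n + 2)]     # M[i][i-1] = 0 (empty range)
    prefix = [0.0] * (n + 1)                        # prefix sums of probabilities
    for i in range(1, n + 1):
        prefix[i] = prefix[i - 1] + p[i]
    for length in range(1, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            shift = prefix[j] - prefix[i - 1]       # every key in i..j drops a level
            M[i][j] = min(M[i][k - 1] + M[k + 1][j] for k in range(i, j + 1)) + shift
    return M[1][n]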

Stopped on slide 4/8

Prof talked about the midterm.

  • Question 2: easy divide-and-conquer, should just consider the intersection of the right and left side and that’s it
  • Question 3: he said it was gonna be in class
  • Question 4: True or false question
  • Then he proceeded to show a graph of Assignment Average vs. Midterm Score. Not normal. 100 on assignments but 30 on the midterm… that’s a lot of people, 200+ students
  • They decided that if you do better on the final, they will use your final exam grade, discounting the midterm.
  • He will fix the grades, don’t worry about the grades, worry about learning. What a chad.

Tues Mar 12, 2024

Prof redrew the graph: (reinsert the graph)

  • The above formula is for any k, and we have to choose k so that it gives the min value. So we try it for all k:

We also define … (to finish)

Algorithm: Pseudocode

  • : is a comment, we don’t overwrite the value of

He is not trying to solve the problem. Focusing on finding recurrence relation for DP.

  • S is a subset of V
  • Independent: any two vertices I pick in S are not connected immediately by an edge. They shouldn’t be directly connected. (Yes, you can still find a path between them.)
  • Cardinality: the set is not necessarily unique, but the size is indeed unique. The largest.
  • 2 can’t go inside S? 4 has the same property; then you can add 1, 2 or 3.
  • So the size is 2, and
  • If I give you a tree, can you solve the problem? Think about an approach.
    • DP has subproblems and trees have subtrees
    • What he thinks of it: step back and look at the objects from afar. Focus on one object and transfer the problem into subproblems.
    • We choose the root. If it’s not in the solution, where can I find the solution? In the subtrees. If the root is in the solution, where do we get the rest of the solution? (It should come from somewhere in the tree.) The children are definitely not in the solution.

Prof’s notes:

  • Given a tree T (insert graph)
  • Either the root r appears in a max ind. set S or not
      1. r ∈ S: none of the children of r can be in S, so the remaining part of S should come from subtrees rooted at grandchildren of r. (insert picture on the left)
      2. r ∉ S: In this case all elements of S are from subtrees rooted at children of r.

Write a formula for this recurrence relation:

  • How do you refer to a subtree? (insert example tree). Just refer to the root. Say: the subtree rooted at a. The node is unique.
  • Definition: f(v) is the size of the max independent set of the subtree rooted at v.
  • First sum (if v is in the max ind. set): f(v) = 1 + the sum of f(g) over the grandchildren g of v
  • Second sum (if v is not in the max ind. set): f(v) = the sum of f(c) over the children c of v; f(v) is the max of the two options.

  • Basically this slide^
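
A minimal Python sketch of that recurrence (written recursively with memoization for brevity, even though on the exam he wants DP done iteratively; names are mine):

def max_independent_set_size(children, root):
    # children[v] = list of children of v; every vertex appears as a key
    memo = {}

    def f(v):
        # f(v) = size of a max independent set in the subtree rooted at v
        if v not in memo:
            # case 1: v is in S -> skip the children, recurse on grandchildren
            with_v = 1 + sum(f(g) for c in children[v] for g in children[c])
            # case 2: v is not in S -> recurse on the children
            without_v = sum(f(c) for c in children[v])
            memo[v] = max(with_v, without_v)
        return memo[v]

    return f(root)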

Lecture 14 - Dynamic Programming - Part 4

The Bellman-Ford Algorithm

  • source is fixed
  • if no negative cycle, computes all distances δ(s, v)
  • can detect negative cycles
  • very simple pseudo-code, but slower than Dijkstra’s algorithm

Definition:

  • for 0 ≤ i ≤ n − 1, set d_i[v] as below
    • n is the number of vertices, and m is the number of edges
  • d_i[v] = the length of the shortest path from s to v with at most i edges; ∞ if no such path
    • I guess you initialize it to infinity at the beginning, so therefore, if you didn’t find a path (with at most i edges) from the starting point to this vertex, then you return ∞ for it.

Easy observations:

  • this gives d_0[s] = 0 and d_0[v] = ∞ for v ≠ s
    • If we want to use no edges, 0 is the length of the path from s to itself
  • if there is no negative cycle, d_{n−1}[v] = δ(s, v) (shortest paths are simple)
    • d_{n−1}[v] will be the value that you want, if there is no negative cycle
    • why is it n − 1?
  • in any case, δ(s, v) ≤ d_i[v] for all i and for all v
    • easy observations
    • δ(s, v) represents the length of the shortest path from the source vertex s to vertex v in the graph. This is the optimal shortest path distance.
    • d_i[v] represents the shortest path length from s to v that uses at most i edges. This is the length of the shortest path with at most i edges.
    • one represents the optimal shortest path distance, and the other is the length under the edge-count restriction

In this course, we don't allow self loops

Simple Path

In a simple path, no vertex can be repeated. Therefore, if there are n vertices in the graph, the longest simple path from the source vertex to any other vertex will visit at most n distinct vertices, and hence use at most n − 1 edges.

Recurrence:

Now, why do these observations hold?

  • Simple informal observation:
    • A path from s to another vertex has at most n − 1 edges unless it contains a cycle
      • (insert picture of path he drew)
      • if it has more than n − 1 edges, then one of the vertices appears more than once, which means that there exists a cycle
    • Assume there is no negative cycle!!
    • First observation (δ(s, v) ≤ d_i[v]):
      • If d_i[v] = ∞ then it is obvious
      • If d_i[v] is a finite number, it means that there exists a path from s to v. So, δ(s, v) is either the length of that path or something better. So, δ(s, v) ≤ d_i[v]
    • Second observation (d_{n−1}[v] = δ(s, v)):
      • If δ(s, v) = ∞ then it is obvious
        • I have no paths
      • If δ(s, v) is a finite number, then it is the weight of the shortest path P from s to v. We claim that this P has at most n − 1 edges.
      • If our claim is not true, then the path has at least n edges. This implies that P contains a cycle. The cycle cannot be negative, according to my assumption. So it is a non-negative cycle, and it is a contradiction (remove the cycle to get a path that is at least as good with fewer edges).
      • We need the recurrence relation to find the d_i[v]

Idea of DP recurrence:

  • Assume d_i[v] < ∞; this means that d_i[v] is the length of a path with at most i edges which has a finite weight.
  • The path either has at most i − 1 edges or exactly i edges.
  • If it has at most i − 1 edges then we use d_{i−1}[v]
  • Otherwise, we decompose the path into one edge (u, v) and a path with i − 1 edges from s to u
  • So we use d_{i−1}[u] + w(u, v)
  • However, for finding the shortest paths we need to consider all edges into v, as the path may pass through any of its neighbours.

d_i[v] = min( d_{i−1}[v], min over edges (u, v) in E of d_{i−1}[u] + w(u, v) ), where E represents the set of edges in the graph, and w(u, v) represents the weight of the edge (u, v).

  • Decomposition of Paths:
    • If the shortest path from s to v (with at most i edges) contains at most i − 1 edges, then d_i[v] = d_{i−1}[v].
    • If the shortest path from s to v contains exactly i edges, then we decompose it into two parts:
      • The last part is an edge (u, v).
      • The first part is a path from s to u using at most i − 1 edges.
  • Dynamic Programming Recurrence:
    • To compute d_i[v], we consider two cases:
      1. The shortest path from s to v using at most i edges contains at most i − 1 edges. In this case, d_i[v] = d_{i−1}[v].
      2. The shortest path from s to v using at most i edges ends with an edge (u, v), where u is an in-neighbour of v. We try all possible in-neighbours u of v and compute the shortest path length from s to u using at most i − 1 edges, then add the weight of edge (u, v).
        • We take the minimum of all these values to ensure we find the shortest path distance.

So basically, the two cases consider whether the path we already found is short enough or we need to add another edge; we take the shorter option and that becomes the solution. Something like that.

  • for each i we need to go through each of the nodes v (the vertices); then for all the in-neighbours u of v, we check d_{i−1}[u] + w(u, v)
  • go through every possible edge and check it (the if statement)
  • There are too many arrays: d_0, d_1, …; in line 6 we go over all the u going into v. Compute all the in-neighbours.

  • d is the single array that we have; this time, we don’t have d_1, d_2.
  • We don’t care in what order we do it: in line 4, we are just looking at all edges. Pick the edges in any order you want.
  • Lines 5-7: is there a shorter path? Seen in Dijkstra’s algorithm; this is the relaxation step.

Example: (Do it yourself)

  • change this slide, i think the prof updated it

Thu March 14 2024

Saving a bit of time and space

Idea: use a single array

Runtime: O(nm)
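
A minimal Python sketch of the single-array version, with the extra pass for the negative-cycle detection mentioned in the summary (names are mine):

def bellman_ford(n, edges, s):
    # directed graph on vertices 0..n-1; edges = list of (u, v, w)
    INF = float('inf')
    d = [INF] * n                   # one array d instead of d_0, d_1, ..., d_{n-1}
    d[s] = 0
    for _ in range(n - 1):          # n - 1 rounds suffice (shortest paths are simple)
        for u, v, w in edges:       # relax every edge, in any order
            if d[u] + w < d[v]:
                d[v] = d[u] + w
    for u, v, w in edges:           # one more pass: any improvement now
        if d[u] + w < d[v]:         # means a reachable negative cycle
            raise ValueError("negative cycle detected")
    return d                        # O(nm) total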

Insert prof’s notes:

Summary

The Floyd-Warshall algorithm

  • no fixed source: computes all distances
  • negative weight OK but no negative cycle
  • very simple pseudo-code, but slower than other algorithms
  • another application of dynamic programming

Remark: doing Bellman-Ford from all sources s takes O(n²m).

Looking at subsets of vertices

  • Bellman-Ford uses paths with fixed numbers of steps.
  • Floyd-Warshall restricts which vertices can be used as intermediate vertices.

Insert prof’s notes: Floyd-Warshall recurrence relation

Dude drew a matrix in class. Insert that. Then compute .

  • Examples are from CLRS.
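
My reconstruction of the recurrence (it is the standard CLRS one): D_k[u, v] = min(D_{k−1}[u, v], D_{k−1}[u, k] + D_{k−1}[k, v]), where D_k allows only vertices 1..k as intermediate vertices. A minimal Python sketch (names are mine):

def floyd_warshall(n, w):
    # w: dict mapping (u, v) -> weight of that edge; vertices are 0..n-1
    INF = float('inf')
    # start with direct edges only (0 on the diagonal)
    D = [[0 if u == v else w.get((u, v), INF) for v in range(n)] for u in range(n)]
    for k in range(n):              # now allow vertex k as an intermediate stop
        for u in range(n):
            for v in range(n):
                if D[u][k] + D[k][v] < D[u][v]:     # go through k, or don't
                    D[u][v] = D[u][k] + D[k][v]
    return D                        # D[u][v] = delta(u, v) if there is no negative cycle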

Tue March 19 2024

Lecture 15: Polynomial Time Reduction

Only focus on decision problems in this module.

Decision Problem: Given a problem instance I, answer a certain question “yes” or “no”.

Problem Instance: Input for the specified problem.

Problem Solution: Correct answer (“yes” or “no”) for the specified problem instance. I is a yes-instance if the correct answer for the instance I is “yes”. I is a no-instance if the correct answer for the instance I is “no”.

Size of a problem instance: Size(I) is the number of bits required to specify (or encode) the instance I.

Polynomial Reduction

x is a YES-instance of A if and only if f(x) is a YES-instance of B.

We write A ≤_P B if such a polynomial-time reduction exists. We also write A ≡_P B if A ≤_P B and B ≤_P A.

Polynomial Time Reduction

A decision problem A is said to be polynomial-time reducible to another decision problem B, denoted as A ≤_P B, if there exists a polynomial-time algorithm that can transform any instance x of A into an instance f(x) of B such that:

  • If x is a “yes-instance” of A, then f(x) is a “yes-instance” of B.
  • If x is a “no-instance” of A, then f(x) is a “no-instance” of B.

In essence, a polynomial-time reduction allows us to solve problem A by transforming its instances into instances of problem B, solving problem B, and mapping the solution back to the original problem.

Prof’s notes:

  • The goal is to find a poly-time algorithm to convert (transform) inputs of problem A to inputs for problem B.

Padlet Activity: Assume A and B are decision problems and we are given:

  • an algorithm M_B, which solves B in polynomial time,
  • a polynomial reduction F, which gives A ≤_P B. Design a polynomial-time algorithm to solve A.

Input: an instance x of A. Output: whether x is a yes-instance

  1. Use F to transform x into F(x), which is an instance of B.
  2. return M_B(F(x))

The cost of writing the output is already considered: if the cost of F is p(Size(x)), then Size(F(x)) ≤ p(Size(x)).

Polynomial time always refers to the size of the input. So here, it is with respect to Size(x).

Assume Size(x) = s. If the runtime of F is p(s) (p is a polynomial), then the output has the property Size(F(x)) ≤ p(s) (since the algorithm F has to write down F(x)).

The total runtime: If M_B has runtime q (polynomial), the total runtime is in O(p(s) + q(p(s))).

  • Here q is polynomial in terms of the input of B, but we want to relate it to the input of A; that is why we compose q with p.

Analysis

  1. Size of Input Instances: When we discuss polynomial time complexity, it’s indeed in terms of the size of the input. If Size(x) = s, it means x has s bits.
  2. Transformation by F: The polynomial-time reduction algorithm F transforms x into F(x). Since F runs in polynomial time, Size(F(x)) ≤ p(s), where p is a polynomial function.
  3. Runtime of M_B: The algorithm M_B can solve problem B in polynomial time. Let’s say its runtime is q(t), where t is the size of the input for B.
  4. Total Runtime: When we apply M_B to F(x), the size of the input is Size(F(x)), which is bounded by a polynomial in terms of s. Thus the total runtime is O(p(s) + q(p(s))). The approach is correct, that is, to solve problem A using the polynomial-time reduction F and the algorithm M_B for problem B. The total runtime is indeed polynomial, where q is the polynomial runtime of M_B, and p is the polynomial bounding the size of the transformed instance F(x).

Transitivity Of Polynomial Time Reductions

Lemma

A ≤_P B and B ≤_P C imply A ≤_P C

Proof:

  • There exists a poly-time algorithm F which maps instances of A to instances of B.
  • There exists a poly-time algorithm G which maps instances of B to instances of C.
  • What does G ∘ F (the composition of F and G) do?

Insert picture

  • G ∘ F maps instances of A to instances of C in poly-time. It also respects the property on yes/no instances.

Sanity Check

The composition of F and G, denoted as G ∘ F, essentially applies the transformation F first, followed by the transformation G. In other words, it maps instances of problem A to instances of problem C.

Proving Hardness Using Polynomial Reduction

Activity: Assume problem A is known to be impossible to solve in polynomial time, and A ≤_P B. True/False: B cannot be solved in polynomial time.

The answer is True. Assume that B can be solved in poly-time with an algorithm M_B. Use M_B for solving A by reducing A to B. But solving A in polynomial time is impossible; contradiction.

Now look at the notation: A ≤_P B. B is at least as hard as A; in other words, if I could solve B, I could solve A. B is the harder problem.

Summary:

In computational complexity theory, when we say that problem A is polynomial-time reducible to problem B (denoted as A ≤_P B), it means that we can efficiently transform instances of problem A into instances of problem B in polynomial time. In essence, this reduction implies that if we could solve problem B efficiently (in polynomial time), then we could also solve problem A efficiently by first transforming its instances into instances of B and then using the polynomial-time algorithm for B to solve them.

ahhhh

Assumption: Let’s assume that problem A is known to be impossible to solve in polynomial time. This implies that there is no polynomial-time algorithm capable of solving A.

Implication: Now, if we have a polynomial-time reduction from A to B (i.e., A ≤_P B), and we assume that B can be solved in polynomial time, it would imply that we could solve A in polynomial time as well. This is because we could transform instances of A into instances of B using the polynomial-time reduction and then solve them using the assumed polynomial-time algorithm for B.

Contradiction: However, this contradicts our initial assumption that problem A cannot be solved in polynomial time. If A cannot be solved in polynomial time, and we have a polynomial-time reduction from A to B, then B also cannot be solved in polynomial time. This is because if B were solvable in polynomial time, we would be able to solve A in polynomial time, which contradicts our assumption about the hardness of A.

Therefore, if problem A is polynomial-time reducible to problem B and problem A is known to be unsolvable in polynomial time, it follows that problem B cannot be solved in polynomial time either. This demonstrates how we can use polynomial reductions to infer the hardness of problems and establish relationships between their complexities.

Simple Reductions

The following three problems are equivalent in terms of polynomial-time solvability.

Maximum Clique

S ⊆ V is a clique if {u, v} ∈ E for all u ≠ v in S.

Input: G = (V, E) and an integer k

Output: (the answer to the question) is there a clique in G with at least k vertices?

We’ve seen this before with independent sets…

Maximum Independent Set (IS)

S ⊆ V is an independent set if {u, v} ∉ E for all u ≠ v in S.

Input: G = (V, E) and an integer k

Output: (the answer to the question) is there an independent set in G with at least k vertices?

Minimum Vertex Cover (VC)

S ⊆ V is a vertex cover if {u, v} ∩ S ≠ ∅ for all {u, v} ∈ E.

Input: G = (V, E) and an integer k

Output: (the answer to the question) is there a vertex cover in G with at most k vertices?

Clique ≤_P IS and IS ≤_P Clique

  • Clique: finding a set of vertices with all edges in between
  • IS: finding a set of vertices with no edges in between

Idea: change edges to no edges. (Find complement of the graph).

Complement graph: G^c = (V, E^c), where {u, v} ∈ E^c if and only if {u, v} ∉ E

F is the algorithm that constructs the complement for the input: (G, k) ↦ (G^c, k). Clearly it can be done in poly-time, and one can check that S is a clique in G if and only if S is an independent set in G^c.

Hence (G, k) is a yes-instance for Clique if and only if (G^c, k) is a yes-instance for IS.

Summary key points

  1. Clique (Maximum Clique): Involves finding a subset of vertices where every pair of vertices is connected by an edge.
  2. Independent Set (Maximum Independent Set): Involves finding a subset of vertices where no pair of vertices is connected by an edge.
  3. Vertex Cover (Minimum Vertex Cover): Involves finding a subset of vertices such that every edge has at least one endpoint in the subset.
    • basically, minimum set of vertices, so it covers all the edges in the graph.
  • Clique $\le_p$ IS: By constructing the complement $\bar{G}$, where edges become non-edges and vice versa, we can transform instances of Clique into instances of IS. This reduction preserves the existence of cliques and independent sets.
  • IS $\le_p$ Clique: Similarly, we can perform the same transformation to show that instances of IS can be transformed into instances of Clique, thus establishing the equivalence.

Conclusion: We can establish polynomial-time reductions between Clique, Independent Set, and Vertex Cover problems by utilizing the complement graph transformation.

Lemma

Assume $G = (V, E)$ is a graph.

$S \subseteq V$ is a vertex cover of $G$ $\iff$ $V \setminus S$ is an independent set in $G$.

Too tired to write what he’s writing. Include picture of his proof I guess.

Prof’s proof of the lemma:

  • Assume $S$ is a vertex cover in $G$. We claim that $V \setminus S$ is an IS. If it is not true, then we have $u, v \in V \setminus S$ such that $\{u, v\} \in E$. By definition of a VC, at least one of $u$ or $v$ must be in $S$, which is a contradiction.
  • Assume that $V \setminus S$ is an independent set. We claim $S$ is a vertex cover. If not, there is an edge $\{u, v\} \in E$ such that $u \notin S$ and $v \notin S$. This means that $u, v \in V \setminus S$ and there is an edge between them, which is a contradiction.

Ok so let me wrap my head around this. Proof:

  • ($\Rightarrow$): $S$ is a Vertex Cover $\Rightarrow$ $V \setminus S$ is an Independent Set
    • We claim that $V \setminus S$ is an independent set in $G$.
    • If $V \setminus S$ is not an independent set, then there exist vertices $u, v \in V \setminus S$ such that $\{u, v\} \in E$.
    • However, by the definition of a vertex cover, at least one of $u$ or $v$ must be in $S$, otherwise the edge $\{u, v\}$ would not be covered by $S$.
    • This contradiction implies that $V \setminus S$ must be an independent set.
  • ($\Leftarrow$): $V \setminus S$ is an Independent Set $\Rightarrow$ $S$ is a Vertex Cover
    • Assume $V \setminus S$ is an independent set in $G$.
    • We claim that $S$ is a vertex cover of $G$.
    • If $S$ is not a vertex cover, then there exists an edge $\{u, v\} \in E$ such that $u \notin S$ and $v \notin S$.
    • However, this contradicts the assumption that $V \setminus S$ is an independent set, as $u$ and $v$ would both be in $V \setminus S$ and adjacent, violating the independence property.
    • This contradiction implies that $S$ must be a vertex cover.

Thu Mar 21 2024

Continues on slide 11/15. He was writing something…?

VC and IS

  • The previous lemma shows that:
  • $G$ has a vertex cover of size at most $k$ $\iff$ $G$ has an independent set of size at least $n - k$ (where $n = |V|$).
  • So, our reduction algorithm maps $(G, k)$ for VC to $(G, n - k)$ for IS (see the sketch after this list).
  • It runs in poly-time and maps yes/no instances for VC to yes/no instances of IS (respectively).
  • The above results + transitivity show that Clique, IS, and VC are all polynomial-time equivalent.
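
A minimal sketch of that mapping (my own code, assuming the same graph representation as the earlier sketch):

```python
def vc_to_is(G, k):
    """Map a VC instance (G, k) to an IS instance (G, n - k).

    By the lemma, G has a vertex cover of size <= k iff
    G has an independent set of size >= n - k.
    """
    V, E = G
    return (V, E), len(V) - k
```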

Summarize main points

Equivalence between vertex cover (VC) and independent set (IS)

  • The lemma establishes a clear equivalence between the existence of a vertex cover of size at most $k$ and the existence of an independent set of size at least $n - k$ in a graph $G$.
  • This equivalence allows us to construct a polynomial-time reduction from Vertex Cover to Independent Set and vice versa by transforming instances of one problem into instances of the other while preserving the yes/no answers.
  • By applying the transitive property of polynomial reductions, we can conclude that the complexities of the Vertex Cover, Independent Set, and Clique problems are equivalent, denoted as: Clique $\equiv_p$ IS $\equiv_p$ VC.

Implications:

  • This equivalence implies that if any of these problems can be solved efficiently (in polynomial time), then all of them can be solved efficiently. Similarly, if any of them is proven to be NP-hard, then all of them are NP-hard.

More Simple Reductions

Hamiltonian Cycle (HC)

A cycle is a Hamiltonian cycle if it touches every vertex exactly once. Input: undirected graph $G = (V, E)$. Output: (the answer to the question) does $G$ have a Hamiltonian cycle?

Hamiltonian Path (HP)

A path is a Hamiltonian path if it touches every vertex exactly once. Input: undirected graph $G = (V, E)$. Output: (the answer to the question) does $G$ have a Hamiltonian path?

Proposition

HP $\le_p$ HC.

Proof:

  • Note that he is not trying to solve it, merely trying to reduce it: by constructing an input for the RHS problem from the input for the LHS problem, an answer for the RHS gives an answer for the LHS.

If he adds an extra vertex with edges to everything, his life would be good. The goal is to convert, not to solve the problem.

The programming question is going to be similar to what the prof is going to do here.

If you have a large network, use MST.

He creates a node that connects to every vertex in $G$, including $u$ and $w$.

  • Given $G$ for HP, we have to transform it to $G'$ for HC.
  • We construct $G'$ in the following way:
  • add a vertex $v_0$ to $V$: $V' = V \cup \{v_0\}$
  • add edges $\{v_0, u\}$ for every $u \in V$
  • If you give me graph $G$, convert it using $f$ to $G'$: just one extra vertex which is connected to everybody (insert the graph)
  • It is easy to see that $f$ runs in poly-time.

Reduction from HP to HC

Easy to understand.

To perform the reduction, the algorithm adds a new vertex $v_0$ to the graph $G$, connected to every vertex in $G$, including the endpoints of the desired path in $G$. This new graph $G'$ is constructed such that $G'$ has a Hamiltonian cycle if and only if $G$ has a Hamiltonian path.
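
A minimal sketch of this construction (my own code; the fresh-vertex label is an assumption):

```python
def hp_to_hc(G):
    """Map an HP instance G to an HC instance G'.

    Adds a universal vertex v0 adjacent to every vertex of G, so that
    G' has a Hamiltonian cycle iff G has a Hamiltonian path.
    """
    V, E = G
    v0 = "v0"  # assumed fresh label, not already in V
    V2 = set(V) | {v0}
    E2 = set(E) | {frozenset({v0, u}) for u in V}
    return (V2, E2)
```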

Implications:

  • The reduction algorithm constructs $G'$ from $G$ in polynomial time.
  • The transformation ensures that $G$ has a Hamiltonian path if and only if $G'$ has a Hamiltonian cycle, establishing the equivalence between the HP and HC problems

Claim: $G$ has a Hamiltonian path if and only if $G'$ has a Hamiltonian cycle.

Proof:

  • Assume $P = u_1, u_2, \ldots, u_n$ is a Hamiltonian path in $G$, with endpoints $u_1$ and $u_n$. Then $u_1, u_2, \ldots, u_n, v_0, u_1$ is a Hamiltonian cycle in $G'$.
  • Assume $C$ is a Hamiltonian cycle in $G'$. There must exist two edges incident on $v_0$; name them $\{u, v_0\}$ and $\{v_0, w\}$. Then by removing $\{u, v_0\}$ and $\{v_0, w\}$ from $C$, we get a $u$–$w$ path, which is a Hamiltonian path in $G$.

Pretty straightforward.

Now, let’s see HC $\le_p$ HP.

Reduction is not about solving, but about converting. Instead of solving, he creates a graph in which any Hamiltonian path must start and end on two added vertices of degree one. If we want to add those two endpoints, what do we do?

So given a graph $G$, you pick an arbitrary vertex $v$, and you want to create $v'$ with the property of being a copy of $v$. So whoever was a neighbour of $v$ becomes a neighbour of $v'$ in $G'$. This can be done in polynomial time. Super easy. We also want to create those two nodes mentioned earlier. So we create vertices $u$ and $w$ with degree 1. This can be done in polynomial time. $u$ is connected to $v$ and $w$ is connected to $v'$. If we have a Hamiltonian cycle in $G$, then $v$ should be part of it, between some neighbours $x$ and $y$. We can construct a Hamiltonian path: from $u$ to $v$, around the cycle to $y$, then to $v'$, then to $w$.

Prof’s notes:

  • Assume $G$ is given for HC. We construct $G'$ in the following way: choose an arbitrary vertex $v$
    • add a duplicate $v'$ of $v$ (with the same neighbours as $v$)
    • add vertices $u$ and $w$ (with degree one) with edges $\{u, v\}$ and $\{w, v'\}$.
    • It is a polynomial-time computation (a sketch follows below).
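
A minimal sketch of this construction (my own code; the labels for the new vertices are assumptions):

```python
def hc_to_hp(G, v):
    """Map an HC instance G to an HP instance G'.

    Duplicates the chosen vertex v as v2 (same neighbours), then
    attaches degree-one vertices u (to v) and w (to v2).
    """
    V, E = G
    v2, u, w = "v'", "u*", "w*"  # assumed fresh labels
    V2 = set(V) | {v2, u, w}
    E2 = set(E)
    for e in E:  # v2 copies every edge incident to v
        if v in e:
            (x,) = e - {v}
            E2.add(frozenset({v2, x}))
    E2 |= {frozenset({u, v}), frozenset({w, v2})}
    return (V2, E2)
```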

Proof outline (sanity check)

  • ($\Rightarrow$):
    • Assume $G$ has a Hamiltonian cycle.
    • Choose an arbitrary vertex $v$ from $G$.
    • Construct a graph $G'$ by adding a duplicate vertex $v'$ of $v$ and two new vertices $u$ and $w$ with degree 1.
    • Connect $u$ to $v$ and $w$ to $v'$.
    • Since $G$ has a Hamiltonian cycle, $v$ must be part of it.
    • Hence, a Hamiltonian path in $G'$ can be constructed starting from $u$, traversing $v$, then visiting the vertices of $G$ in the order of the Hamiltonian cycle, ending at $v'$ and finally $w$.
  • ($\Leftarrow$):
    • Assume $G'$ has a Hamiltonian path.
    • If $G'$ has a Hamiltonian path, it means that there exists a path that visits all vertices of $G'$ exactly once.
    • Since $u$ and $w$ are of degree 1 and adjacent to $v$ and $v'$ respectively, this path must start at $u$ and end at $w$.
    • Therefore, the vertices visited between $v$ and $v'$ correspond to the vertices of $G$.
    • Hence, there exists a Hamiltonian cycle in $G$.

Claim: G has a Hamiltonian cycle if and only if G' has a Hamiltonian path

Proof:

  • Assume there is a Hamiltonian cycle in $G$; it must pass through $v$, say via neighbours $x$ and $y$. So we still have a Hamiltonian cycle.
    • Of course, there are other vertices in the graph that he doesn’t need to draw, because we care about the cycle.
    • Then in $G'$, we see that in red, we have a Hamiltonian path: $u, v, x, \ldots, y, v', w$ is a Hamiltonian path.
  • Assume there is a Hamiltonian path in $G'$. It should be of the form (insert picture). Basically $u, v, \ldots, y, v', w$. There must be an edge from $v$ to $y$ in $G$, because we have an edge from $v'$ to $y$ in $G'$.
    • The two endpoints must be $u$ and $w$ (they are the degree-one vertices). $v'$ has two neighbours on the path: $w$ and a vertex $y$. We also know that $\{v, y\}$ is an edge in $G$, as $v'$ is a copy of $v$. So, by removing $u$, $w$, and $v'$ and adding $\{v, y\}$, we form a Hamiltonian cycle.

If we don’t have $u$ and $w$, something like this may happen (drawn in the box, but he erased it; he said something like he was going to do a counterexample). We have a Hamiltonian path here (in the transformed graph), but we can’t create a Hamiltonian cycle. So we need $u$ and $w$ to force the path to start and end where we want. to-understand fully: it makes sense, but I’m trying to see why each piece of the construction is needed.

An Important Problem

  • $x_1, \ldots, x_n$, where the $x_i$’s are Boolean variables
  • A literal (term) is either $x_i$ or $\bar{x}_i$
  • A clause is a disjunction $\ell_1 \vee \ell_2 \vee \cdots \vee \ell_k$ of distinct literals, where $k \geq 1$. We say that the clause is of length $k$.
  • an assignment satisfies a clause $C$ if it causes $C$ to evaluate to true.
  • a conjunction of a finite set of clauses, $F = C_1 \wedge C_2 \wedge \cdots \wedge C_s$, is called a formula in Conjunctive Normal Form (CNF).

3-SAT (Theorem Cook-Levin)

Input: A CNF-formula in which each clause has at most three literals.

Output: (the answer to the question) is there a truth assignment to the variables that satisfies all the clauses?

  • the distinct literals $x_i$ or $\bar{x}_i$ in each clause are all over the variables $x_1, \ldots, x_n$

He gives an example, which gives a true value:

How do we get each of these clauses to be true? We can just grab one literal from each clause and check if it can be set to True.

Theorem

3-SAT $\le_p$ IS.

Pick one literal per clause and assign it to be True. Every time, he takes one literal out of the clause and makes sure it doesn’t conflict with any previous picks; no relations between these. Translated into a graph, this means picking one vertex per clause such that the vertices we pick are not connected to each other.

The thing he drew down there is called a first-type edge. For each clause, create one node per literal and connect them to each other. Then for the second type, we connect vertices across clauses if their literals are negations of each other, as seen below. (If a clause only has one vertex, there is no first-type edge to draw.) For every single clause, we create a small graph, not connected at first, and continue like this.

Proof Outline

Reduction from 3-SAT to IS:

  • Given a 3-SAT formula, construct a graph with one vertex per literal occurrence: for each clause, create a vertex for each of its (at most three) literals.
  • Connect the vertices corresponding to the literals within the same clause to each other (a triangle for a clause of length three), so that an independent set can pick at most one literal per clause.
  • Connect any two vertices corresponding to opposite literals $x_i$ and $\bar{x}_i$ in different clauses. This prevents assigning conflicting truth values to the same variable.
  • The objective is to find an independent set of size equal to the number of clauses, where no two vertices in the set are adjacent. This corresponds to finding a truth assignment to the variables that satisfies all the clauses in the 3-SAT formula.

Tues March 26

He started with the last slide of Lecture 15. Include the picture and his annotations.

Make sure to connect the nodes between the literals if they are opposites of each other.

Transform the CNF into an IS instance. There are two inputs: the first input is the graph, the second is the integer $k$. Set $k$ to the number of clauses.

Take the true literals: if we choose a literal to be true in the first clause, its vertex goes into the independent set. What’s the whole thing about the size-$k$ independent set??

Prof’s notes:

  • A CNF with clauses of length at most 3 is given. For each clause we form a triangle with vertices labeled as the literals of the clause.
  • A clause with 2 literals will be an edge from one literal to the other
  • A clause with only one literal is easy to deal with.
  • To force exactly one choice from each clause, we set $k$ to be the number of clauses.
  • We have to make sure that we are not choosing opposite literals $x_i$ and $\bar{x}_i$ in different clauses. So, we create an edge between any two vertices that correspond to opposite literals.
  • This construction takes polynomial time.

Summary

Graph Construction

  1. For Each Clause:
    • For each clause in the CNF formula, create a triangle in the graph.
    • Each vertex of the triangle represents a literal in the clause.
    • The triangle edges ensure that an independent set picks at most one literal from each clause.
  2. Handling Clauses with 2 Literals:
    • For clauses with only two literals, create an edge between the two corresponding vertices in the graph.
  3. Handling Clauses with 1 Literal:
    • Clauses with only one literal are easy to handle, as they are represented by a single vertex in the graph.
  4. Connecting Opposite Literals:
    • To ensure that opposite literals (e.g., $x_i$ and $\bar{x}_i$) from different clauses are not chosen together, create an edge between any two vertices representing opposite literals.

Independent Set Construction

  • Each triangle of vertices in the graph represents a clause in the CNF formula.
  • The objective is to choose one vertex from each triangle (clause) such that no two chosen vertices are adjacent (neither two literals from the same clause nor a pair of opposite literals).
  • Setting $k$ to be the number of clauses ensures that exactly one vertex is chosen from each clause (a code sketch follows below).
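
A minimal sketch of the whole construction (my own code; the encoding of a literal as a (variable, sign) pair is an assumption):

```python
from itertools import combinations

def threesat_to_is(clauses):
    """Map a 3-SAT instance to an IS instance (G, k).

    `clauses` is a list of clauses; each clause is a list of literals,
    and a literal is a pair (var, positive), e.g. (1, False) for x1-bar.
    """
    vertices, edges = set(), set()
    for ci, clause in enumerate(clauses):
        group = [(ci, li, lit) for li, lit in enumerate(clause)]
        vertices |= set(group)
        # first-type edges: connect literals within the same clause
        for a, b in combinations(group, 2):
            edges.add(frozenset({a, b}))
    # second-type edges: connect opposite literals of the same variable
    for a, b in combinations(vertices, 2):
        (_, _, (va, sa)), (_, _, (vb, sb)) = a, b
        if va == vb and sa != sb:
            edges.add(frozenset({a, b}))
    return (vertices, edges), len(clauses)
```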

Claim

Suppose the CNF formula has $s$ clauses. The formula is satisfiable if and only if there is an independent set of size $s$ in the graph.

Proof:

    • ($\Rightarrow$) If there is a satisfying assignment, then we choose one literal that is set to true in each clause, and the corresponding vertex will be in the independent set.
    • Since the CNF is satisfiable, there is at least one true literal in each clause, and so the set has exactly $s$ vertices.
    • The vertices form an independent set, as there are no edges of the first type between them, since we choose only one literal vertex in each clause. Also, there are no second-type edges between the chosen vertices, as we won’t choose both $x_i$ and $\bar{x}_i$ in the satisfying assignment.
    • ($\Leftarrow$) Assume there is an independent set of size $s$ in $G$.
    • Any independent set can choose only one vertex from each clause, since there are edges between them. We have only $s$ clauses, so an independent set of size $s$ chooses exactly one vertex from each clause.
    • Moreover, for each variable we choose at most one of the literals $x_i$ or $\bar{x}_i$, since there is a second-type edge between them.
    • If $x_i$ is chosen in the independent set, set $x_i$ to true; otherwise set it to false.
    • This assignment satisfies the CNF.

Trying to understand explanation

Satisfiability of CNF formula:

  • A CNF formula with $s$ clauses is satisfiable if there exists a truth assignment to the variables that satisfies all clauses simultaneously.

Independent Set in Graph:

  • An independent set in the graph representation of the CNF formula corresponds to a selection of vertices (literals) such that no two selected vertices share an edge (no two literals from the same clause, and no pair of conflicting literals).
  • The size of the independent set equals the number of clauses from which a literal has been selected.

Relationship:

  • The claim asserts that the CNF formula is satisfiable if and only if there exists an independent set of size $s$ in the graph.
  • If the CNF formula is satisfiable, then there exists a truth assignment satisfying all clauses, which translates to the existence of an independent set of size $s$ in the graph.
  • Conversely, if there exists an independent set of size $s$ in the graph, then it corresponds to selecting one literal from each clause, ensuring that the selected literals satisfy all clauses simultaneously, thus proving the satisfiability of the CNF formula.

Lecture 16 - NP-completeness

Counting the number of bits.

Definition: NP

A problem $X$ is in NP if there is a polynomial-time verification algorithm $V$ such that the input $x$ is a yes-instance iff there is a proof (certificate) $y$, which is a binary string of length poly($|x|$), so that $V(x, y)$ returns yes.

Trying to Understand

Verification Algorithm:

  • For a problem $X$ to be in NP, there must exist a verification algorithm $V$ that runs in polynomial time.
  • This verification algorithm takes two inputs:
    • The problem instance $x$.
    • A proof or certificate $y$, represented as a binary string of length polynomial in the size of $x$.

Polynomial Time Verification:

  • The verification algorithm verifies whether the given instance $x$ is a “yes-instance” by checking the validity of the proof $y$.
  • If $x$ is indeed a “yes-instance” of the problem $X$, there exists a proof $y$ such that $V(x, y)$ returns “yes” in polynomial time.

Relationship with Yes-Instances:

  • The problem instance $x$ is considered a “yes-instance” of $X$ if and only if there exists a valid proof $y$ such that $V(x, y)$ returns “yes”.
  • In other words, the existence of a valid proof $y$ serves as evidence that $x$ is indeed a “yes-instance” of $X$.

The definition of NP highlights the significance of polynomial-time verification algorithms in determining whether a given problem instance belongs to the complexity class NP. Problems in NP are characterized by the existence of efficient verification procedures that can confirm the correctness of proposed solutions, given suitable proofs or certificates.

Examples

Example 1: Vertex Cover

  • $x$ here is an input graph $G$ and an integer $k$
  • $y$ here is a subset $S$ of $V$ with $|S| \le k$: go through all edges $\{u, v\} \in E$ and check that $S$ covers them, and check $|S| \le k$ (a verifier sketch follows below).
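
A minimal sketch of such a verifier (my own code, using the same graph representation as the earlier sketches):

```python
def verify_vc(x, y):
    """Polynomial-time verifier for Vertex Cover.

    x = (G, k) is the instance; y = S is the certificate.
    Accepts iff S is a subset of V, |S| <= k, and every edge
    has at least one endpoint in S.
    """
    (V, E), k = x
    S = set(y)
    if not S <= set(V) or len(S) > k:
        return False
    return all(S & set(e) for e in E)
```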

Example 2: 3-SAT

  • $x$ here is a 3-SAT formula $F$
  • $y$ here is a truth assignment $t$: check whether $t$ satisfies all clauses.
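
And a matching sketch for the 3-SAT verifier (my own code, with the same literal encoding as the reduction sketch above):

```python
def verify_3sat(clauses, assignment):
    """Polynomial-time verifier for 3-SAT.

    x = `clauses` (list of clauses of (var, positive) literals);
    y = `assignment` (dict mapping each variable to True/False).
    Accepts iff every clause contains at least one true literal.
    """
    return all(
        any(assignment[var] == positive for (var, positive) in clause)
        for clause in clauses
    )
```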

Exercise: Clique, IS, HC, HP, Subset-Sum are all in NP.

  • $y$ is a solution to that problem (usually)
  • So $y$ is some sort of satisfying assignment of a 3-SAT formula or what????gap-in-knowledge
  • Is 3-SAT in NP?
    • Yes, it is an NP problem
  • Not all problems are in NP

See slide 11 for Subset-Sum:

You can now refer to slide 11 without showing any proof, and you are allowed to use that information in the assignment.

  • We won’t see a lot of co-NP.

Definition

$P$ is the set of decision problems that can be solved in polynomial time.

  • if you can solve the problem in poly-time, you can generate the certificate from the solution (so $P \subseteq NP$).

NP-completeness

NP-complete

A problem $X$ is NP-complete if $X \in$ NP and $Y \le_p X$ for all $Y \in$ NP.

Fact: $P = NP$ if and only if some NP-complete problem can be solved in polynomial time.

Theorem (Cook-Levin)

3-SAT is NP-complete.

Consequences:

  • If we can prove 3-SAT $\le_p X$ for a problem $X \in$ NP, then $X$ is NP-complete. For example IS, since 3-SAT $\le_p$ IS.
  • To prove that a problem $X$ is NP-complete, we just need to show $X \in$ NP, find an NP-complete problem $Y$, and prove that $Y \le_p X$.

Understanding...

Definition of NP-Completeness:

  • A decision problem $X$ is said to be NP-complete if it belongs to the complexity class NP and every problem in NP can be reduced to $X$ in polynomial time.
  • In other words, if there exists a polynomial-time reduction from any problem in NP to $X$, then $X$ is NP-complete.

Implications of NP-completeness:

  • NP-complete problems are among the hardest problems in NP, as they capture the computational complexity of all problems in NP
  • If a polynomial-time algorithm can be found for any NP-complete problem, it would imply that $P = NP$, meaning every problem in NP can be solved in polynomial time.

Cook-Levin Theorem:

  • The Cook-Levin theorem states that the 3-SAT problem is NP-complete.
  • This theorem is significant as it provides a foundational example of an NP-complete problem and demonstrates that 3-SAT captures the computational complexity of all problems in NP.

Consequences: Identifying NP-Complete Problems

  • To prove that a problem $X$ is NP-complete, one needs to demonstrate a polynomial-time reduction from a known NP-complete problem to $X$ (and show that $X \in$ NP).

Solving NP-Complete Problems:

  • Solving an NP-complete problem efficiently would imply that $P = NP$, which is a major open question in computer science.
  • Therefore, many researchers focus on identifying efficient algorithms for specific NP-complete problems or approximating solutions to these problems.

  • If we can solve B in polynomial time, we can solve A in polynomial time. Then if B is the hardest problem, every single NP problem can be reduced to B.

If we can solve an NP-complete problem efficiently, we can solve every NP problem efficiently:

  • Every NP problem can be reduced to max independent set.
  • 3-SAT is proven to be reducible to IS

On exam:

  • Show that your problem is in NP
  • Show that some NP-complete problem can be reduced to yours.
  • Then it is NP-complete

He created a red box? around true?

What’s the only thing we know about 3-SAT? There exists a polynomial-time verifier for it to which we can provide a certificate.

  • so in this case, we found that Circuit-SAT is NP-complete; then if we show that Circuit-SAT is reducible to 3-SAT, all the NP problems reduce to 3-SAT as well
  • This is something different

  • We want $y_1$ and $y_2$ to not be in the set at the same time, and the second clause checks that

Thu March 28: Goes back a couple slides and continues from slide 6/11

  • Given a certificate $y$,
  • Need to first check if the size $|S| \le k$. If so, it meets the first condition. Among $y_1$ and $y_2$, at most one of them can be in my set.
  • If the second condition is set to T, then initially $y_1$ and $y_2$ are false.
  • This is a no-instance.
  • What exactly is $y$?

Get an instance of IS, which is a graph + an integer (? I think), and transform it into a circuit: look at the algorithm that verifies it. Circuit-SAT is in NP. Use the certificate $y$ and the verification algorithm that takes in $x$ and $y$ to compute a logical formula; this can be translated into a circuit, and that would be my circuit.

  • From logic we know $p \leftrightarrow q$ if and only if $[(p \vee \bar{q}) \wedge (\bar{p} \vee q)]$

  • We want to reduce Circuit-SAT into 3-SAT
  • Note this slide doesn’t write conjunctions

Insert picture he drew (slide).

Insert the second slide he drew.

He is basically translating gates to CNF.

  • write out the CNF

  • (include annotated slide instead over here)
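
For reference, here is my own summary (not from the slides) of the standard gate-to-CNF clauses, introducing a fresh variable $z$ for each gate output; each clause has at most 3 literals, which is why the result is a 3-CNF:

```latex
z \leftrightarrow (x \wedge y):\quad (\bar{z} \vee x) \wedge (\bar{z} \vee y) \wedge (z \vee \bar{x} \vee \bar{y})
z \leftrightarrow (x \vee y):\quad (z \vee \bar{x}) \wedge (z \vee \bar{y}) \wedge (\bar{z} \vee x \vee y)
z \leftrightarrow \bar{x}:\quad (z \vee x) \wedge (\bar{z} \vee \bar{x})
```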

I’m very confused. Will need to review and understand this set of slides.

This module:

  • prof takes care of hard part, we take care of easier part
  • Next two sets of slides, you just need to remember the results for the final!!!!!!
  • NP problems on the exam will be hard. You need to solve the NP problem by yourself, and understand it.

Tip, hint for future?

Hint for the future: which NP-complete problem to choose, when we don’t know where to start? Start with 3-SAT. It has a minimal set of requirements, and we can try to reduce 3-SAT to other problems.

Lecture 17 - NP-completeness Part 2

Starting the construction

Colour is not important; be able to recognize which part of the construction each piece is instead.

(include slide 3/9)

  • For each clause, we create 3 nodes (2 sets?). Make sure that you go from left to right or from right to left. We enforce that there are no edges in between, so we have the option to traverse a row from left to right or right to left before moving on to the next row.
  • Create a source node $s$ and a sink node $t$
  • He still wants a Hamiltonian cycle, so from $t$ you can go back to $s$
  • Think about how he is constructing it, not why, for now. 3 nodes per clause for each variable; $x_1, \ldots, x_n$ are the variables.
  • The problem is to reduce 3-SAT to Directed Hamiltonian Cycle

Observations:

  • Do you see any Hamiltonian cycles here? There exist Hamiltonian cycles here, no matter which CNF is given.

Write very short essay trying to explain what happened: try it

Learn efficient techniques. Learn how to learn more efficiently.

Tue April 2

Didn’t attend. Ask for notes. Bless mattloulou.todo

Thu April 4

Started Lecture 18.

  • the sets $X$, $Y$, $Z$ are disjoint
  • include his drawings

  • Trying to form a gadget? We have a 3-SAT instance that we want to reduce. For each variable, we want to create a gadget.
  • A line of vertices per variable, like the lines of nodes he was creating for the Hamiltonian cycle construction in the previous lecture.
  • Include what he drew again.
  • Encode true and false depending on the moving direction (left to right or vice versa). That was the purpose of last lecture.
  • The CNF is satisfiable if and only if the construction has a perfect matching??
  • $n$ and $s$ are the input sizes (numbers of variables and clauses)?
  • Include the drawing (aka two eggs)
    • If you can solve B you can solve A
    • Doesn’t care if he has an injective map or … something else; don’t care about the direction from B to A. The two directions, bottom right: yes-instances map to yes-instances.

  • do this per variable; we have $s$ clauses.
  • per clause we create two nodes (core vertices?), the first number referring to the variable; a T version and an F version?
  • z, odd in x, even in y
  • hyperedges: he drew 4 weird doritos that represent the fidget-spinner gadget

  • how do you pick hyperedges? from the total construction
    • 2 ways
      • take all hyperedges including the T version, or the false one
      • he picks the top one on the left. Can’t take left and right because they share nodes. Then he can only choose the bottom one.

  • T version of the tip to a1 and b1
  • If I have the negation of something, connect a1 and b1 to the F version.
  • x1 and x2 and x3 are set to F.
    • We can ask one thing: does this truth assignment satisfy my CNF?
    • Yes, we have a truth assignment that satisfies my CNF.

  • C1: x3 is the one causing something
  • C2: x1 is the one making sure
  • There are tips that are not covered. So we need more vertices!!!!!!!!!!
    • For each of these tips, we need to figure out how many are left. The total is $2ns = 12$ tips.
    • $s$ of them are covered for sure.
    • At most $2ns - s = 10$ tips are left not picked. This is the upper bound.
    • We can create dummy vertices to cover these.

  • make sure to pick exactly one per clause!
  • so apparently we do have a perfect matching
  • Now from the other direction: they satisfy too. Just read the bullet points.

todo go home. He is not going to ask any proof. Good to understand the results.

  • 3D Matching is in NP. The certificate is a potential solution. The algorithm checks that the chosen hyperedges cover every vertex exactly once. Can be done in poly-time.
  • the certificate is a subset of these: in the slide.
  • Now, is it NP-complete?

Include his drawing

  • The digit must be 1; the only way to get 1 is to pick a $v_i$ with a 1 in that position.
  • We are picking some of these vectors
  • If we have a 3D matching, we are covering every vertex exactly once.
  • These are not numbers, but vectors so far.
  • In CS240, we saw that we can transform a vector into a number?

  • choose the base $b$ to be $m + 1$ (for a reason)
  • $m$ is the number of hyperedges.
  • so far, for each vector, we create a number. Translate hyperedges to numbers. Forming a vector and converting it to a number are both polynomial time.
  • How large can a number be? When everything is 1.
  • The formula at the end: $(m+1)^{3n+1}$ will always be larger; every number is bounded by it.

  • What is $K$?
  • If we have a set $S$, let $c_j$ be the number of $v_i$’s in $S$ with $v_{i,j} = 1$; then the sum of the corresponding numbers has digit $c_j$ in position $j$ (base $m+1$).
    • If this equality holds, then the later steps are easier to see.
    • Remember the base is $m+1$ (it needs to be).
      • It is possible for up to $m$ vectors to have a 1 in the same position, so the base $m+1$ makes sure the digit sums never carry (a sketch follows below).
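
A minimal sketch of this vector-to-number encoding (my own code; the exact target $K$ and the vertex numbering are assumptions consistent with the notes):

```python
def threedm_to_subset_sum(triples, n):
    """Encode a 3D-Matching instance as Subset-Sum.

    Vertices of X, Y, Z are 0..n-1 each; `triples` is a list of
    (x, y, z) hyperedges. Each triple becomes a 0/1 indicator vector
    of length 3n, read as a number in base b = m + 1, so that digit
    sums can never carry (at most m vectors share a position).
    """
    m = len(triples)
    b = m + 1
    numbers = []
    for x, y, z in triples:
        digits = [0] * (3 * n)
        digits[x] = digits[n + y] = digits[2 * n + z] = 1
        numbers.append(sum(d * b**j for j, d in enumerate(digits)))
    # Target K: the number whose 3n digits (base b) are all 1,
    # i.e. every vertex is covered exactly once.
    K = sum(b**j for j in range(3 * n))
    return numbers, K
```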

The End. He showed a Kung Fu Panda clip with Master Oogway and Shifu.