CS341

Study CS341

Final W2021

1

  • True or False. Given an array A of n integers, there is no linear time algorithm to find the k-th smallest element in A. Briefly justify your answer.
    • False. There are linear time algorithms to find the k-th smallest element in an array A of n integers, such as the Quickselect Algorithm. QuickSelect is an efficient algorithm based on the quicksort paradigm, which can find the k-th smallest element in expected linear time O(n) on average. It achieves this by partitioning the array similar to quicksort but only recursively exploring the partition that contains the desired element, effectively reducing the search space by half at each step.
  • True or false. If a tree T is both DFS tree and BFS tree of an undirected graph G, then G=T. Briefly justify your answer.
    • True. Suppose , then can’t be a tree (otherwise ). There will be a cycle in . Say it is . We claim that the BFS and DFS will be different. WLOG, say now visit in BFS and DFS, and it’s the first vertex we encounter in the loop.
    • BFS will give a structure: , all the way to . However, DFS will branch at . Thus, the BFS and DFS can’t equal in this case.
    • If there exists a cycle in the graph G, then BFS and DFS will produce different structures for the tree T. This is because BFS explores vertices level by level, while DFS explores vertices along paths, and encountering a cycle will lead to branching in DFS. Therefore, if G contains a cycle, T cannot be both the DFS tree and BFS tree of G. So, the statement is indeed true.

2. Short Answer

  • b) Given a sorted array of distinct integers , design an time algorithm to check whether there exists an index such that . Briefly describe your idea, then provide your pseudocode. You don’t need to prove correctness or analyze run-time.
    • Use similar approach as Binary Search. Look at the mid element first. If , return true. If , then repeat on larger half, otherwise, perform on lower half.
    • Pseudocode
A[1..n]
i=1, j=1
while(i <= j)
	mid = (i+j)/2
	if A[mid] = mid
		return true
	else if A[mid]<mid then
		i = mid + 1
	else
		j = mid - 1
end while
return false

3. Greedy

Suppose we are given an array of positive integers and a cap value . We want to partition the array into minimum number of contiguous blocks such that each block has sum at most .

Example. and . Then, one optimal solution is to divide into 3 blocks: , where the sums of the blocks are .

  • a) Consider a greedy strategy where we repeatedly select the longest contiguous subsequence with a sum at most that does not overlap with the previously selected blocks. Provide an example to show that this greedy strategy does not always produce an optimal solution.

    • Here is a counterexample, let’s say we have , with . The strategy given will separate it in 3 blocks: . However the optimal solution would be having 2 blocks: .
  • b) Design a Greedy Algorithm to solve this problem.

Pseudocode

def partition_blocks(A, C):
    blocks = []
    current_block = []
    current_sum = 0
    
    for a in A:
        if current_sum + a <= C:
            current_block.append(a)
            current_sum += a
        else:
            blocks.append(current_block)
            current_block = [a]
            current_sum = a
    
    blocks.append(current_block)
    
    return blocks
 
# Example usage:
A = [2, 7, 1, 4, 2, 2, 2, 2, 2, 3]
C = 10
print(partition_blocks(A, C))

Idea: The approach of this greedy algorithm is to iteratively construct blocks such that each block has a sum at most , while minimizing the number of blocks used.

The idea is to traverse the array of integers and maintain a current block. At each step, we add elements from to the current block as long as the sum of the current block remains less than or equal to . If adding an element would exceed the sum , we finalize the current block, append it to the list of blocks, and start a new block with the current element.

This process continues until we have traversed all elements of . The resulting list of blocks represents the partition of into contiguous blocks, each with a sum at most . By constructing blocks greedily in this manner, we aim to minimize the total number of blocks required to satisfy the given constraint.

We start from the beginning of the array, then get the largest contiguous subsequence with a sum at most then repeatedly doing this until we reach the end.

Pseudocode:

A[1...n], C
start = 1, sum = 0
for end from 1 to n do:
	if sum + A[end] > C then:
		addBLock(A[start...end-1])
		sum = 0
		start = end
	end if
	sum = sum + A[end]
end for

The provided pseudocode outlines a greedy algorithm to partition the array AA into contiguous blocks with sums at most CC. Here’s a brief explanation of the approach:

  1. Initialize variables start=1start=1 (indicating the start index of the current contiguous subsequence) and sum=0sum=0 (indicating the sum of the current contiguous subsequence).
  2. Iterate over each element of the array AA from index 1 to nn:
    • If adding the current element to the current sum exceeds CC, it means that the current contiguous subsequence has reached its maximum sum. Therefore, add the current contiguous subsequence (from index start to end−1) as a block, reset the sum to 0, and update the start index to end.
    • Otherwise, add the current element to the sum.
  3. After iterating through all elements, add the remaining contiguous subsequence (if any) as the last block.

This algorithm effectively partitions the array into contiguous blocks with sums at most by greedily selecting the largest contiguous subsequence at each step. This approach ensures that the number of blocks used is minimized.

4. Dynamic Programming

We consider a variant of the 0/1-Knapsack Problem in Lecture 10. Suppose there are items that come in a sequence. Each item has a weight and a value (both and are nonnegative integers). You are given a knapsack of size (a nonnegative integer), and the task is to select a subsequence of items with increasing values to maximize the total value while satisfying the capacity constraint. Formally, we want to find a subsequence to maximize the total value subject to the constraints and . Example. Let be three items, and the capacity of the knapsack is . Although the capacity of the knapsack is enough for all three items, the optimal solution of this example can only pick the last two items, i.e., , with total value .

Design a dynamic programming algorithm to compute the optimal value of the above problem. Your algorithm should run in time polynomial in and (i.e., the maximum value of an item).

Your solution must consist of:

  • A description of the subproblems you will solve and the order in which you will solve them.
  • A formula for solving one subproblem in terms of previously solved subproblems, and a justification of correctness.
  • Pseudocode of your algorithm
  • An analysis of run-time using big-O notation.

Subproblems: : the optimal solution we can get with weight , and with item picked. This means is the highest value we’ve seen so far. We will solve them from bottom-up so .

The base case is trivial as we defined, since we cannot choose anything if weight limit is 0.

Consider , since we must choose , if , we cannot choose any item. If then we can only choose .

For , before we choose , the bag has a limit of . And, we will only consider previous item with to permit our property.

Consider the max value we can get: .

which is given by our formula.

  • is an optimal selection at some
  • second line: for

Then, to find the real optimal at item , we do for since if we choose some item, then the sequence must end with some item between and .

Pseudocode:

Use table M[0...n][0...W]
Initialize M[0,w]:= 0 for w=0...W
for i from 1 to n do:
	for w from 0 to W do:
		if w_i > w then do:
			M[i,w] = 0
		else do
			M[i,w] = v_i
			for k from 1 to i-1 do:
				if v_k <= v_i then do:
					M[i,w]:= max{M[i,w], M[k, w-w_i] + v_i}
				end if
			end for
	end for
end for

ans:= 0
for i from 1 to n do:
	ans:= max(ans, M[i,W])
return ans

The runtime is

  • : number of subproblems
  • : runtime to solve each subproblem
  • : final step

Wrapping my head around this

Subproblems:

  • Define subproblems as the optimal solution with weight ww and item picked, where vivi​ is the highest value seen so far.
  • Solve subproblems bottom-up, starting from .
  • Base case: for (no items can be chosen if weight limit is 0).

Formula for Solving Subproblems:

  • For , we cannot choose any item.
  • For ​, we can only choose item ii.
  • For , consider the maximum value achievable by including item ii or not:
    • where iterates over previous items with ​.

Pseudocode:

  • Initialize table with base cases.
  • Iterate over items ii and weights ww, updating using the provided formula.
  • Compute the maximum value achievable by considering for .

5. Graph Algorithms

In the Australian outback, imagine a network of feeding grounds for kangaroos. Each feeding ground is modelled by vertices, and if there is a trail between two feeding grounds it is modelled by an undirected edge. Each edge is weighted with the height of the tallest fence along the trail, in meters, or 0 if there is no fence at all.

Describe a preprocessing algorithm that produces in time a data structure for the following task: given a pair of feeding grounds and , determine in time whether or not it is possible for a kangaroo that is able to jump over fences of height at most to get from feeding ground to feeding ground (possibly going through other feeding grounds along the way). Note that all of and are arbitrary.

Hint: consider dynamic programming.

If you are using dynamic programming, your solution must consist of:

  • A description of the subproblems you will solve and the order in which you will solve them.
  • A formula for solving one subproblem in terms of previously solved subproblems, and a justification of correctness.
  • Pseudocode of your algorithm
  • An analysis of run-time using big-O notation.

Solution:

We will use a 2-d array to store the lowest fence weight among the highest fence height a of each path from to .

This question is equivalently asking if there exists a path such that the highest weight of an edge along the path is no greater than .

My understanding so far

  • Description of Subproblems: You define a subproblem as determining the lowest fence weight among the highest fence heights of each path from feeding ground uu to feeding ground vv.
  • Order of Solving Subproblems: The subproblems are solved using dynamic programming, where you compute the lowest fence weight among the highest fence heights for all pairs of feeding grounds uu and vv. This is done using a bottom-up approach.
  • Formula for Solving Subproblems: The lowest fence weight among the highest fence heights for a path from uu to vv can be computed by considering all intermediate vertices ww and selecting the maximum of the lowest fence weight among the highest fence heights for paths from uu to ww and from ww to vv, and the weight of the edge connecting uu and vv.

Subproblems: is the lowest among the largest weight of each path, with the condition that the path can only use vertices .

Suppose the modelled graph is . We solve subproblems for all as goes from 0 to .

  • , if .
  • , if
  • , otherwise

Formular

Proof: ? Do i need to do it?

Pseudocode:

Initialize M[u,v] as above
for i from 1 to n do:
	for u from 1 to n do:
		for v from 1 to n do:
			temp:= max{M[u,i], M[i,v]}
			M[u,v]:= min{temp, M[u,v]}
		end for
	end for
end for

input: u,v,h
	if M[u,v] > h: return false
	else: return true.
  • Iterate over all possible intermediate vertices from 1 to , where is the total number of vertices.
  • For each pair of vertices and , update by considering all possible intermediate vertices .
  • Compute the lowest weight of the heaviest edge along the path from to by taking the minimum of the maximum weights of the paths from to and from to , considering all possible intermediate vertices .
  • Once the preprocessing step is completed and values are computed for all pairs of vertices, you can answer queries in constant time.
  • Given a query with vertices and and a maximum fence height , check if is greater than . If it is, then it’s not possible for a kangaroo to jump from to with a maximum fence height of . Otherwise, it is possible.

Runtime: (triple loop, constant operation in each iteration)

OH I see.

From all the path , we look for the lowest of all the heaviest on all paths to .

We do this for every single path in the graph. Insane.

6. NP-completeness

Recall that a literal is either or for some Boolean variable . We say that literals in the form of are positive, and literals in the form of are negative.

Prove the following special case of TRUE-FALSE-SAT in Assignment 10 is NP-Complete.

POSITIVE-TRUE-FALSE-SAT

  • Input: Boolean variables , clauses , where each contains an arbitrary number of positive literals.
  • Output: Does there exist a True/False assignment to such that all the clauses contain literals with different truth values? (YES or NO)

Examples.

  • is a YES-instance, where the assignment is a YES-witness.
  • is a NO-instance. We can check that there is no assignment that can satisfy the required condition.

Solution:

To prove that POSITIVE-TRUE-FALSE-SAT is NP-Complete, we need to show two things:

  1. POSITIVE-TRUE-FALSE-SAT is in NP.
  2. POSITIVE-TRUE-FALSE-SAT is NP-hard, meaning every problem in NP can be reduced to it in polynomial time.

i) certificate: an assignment of verification: We check each clause with the assignment to see if it contains one true and one false. We can also check along the way that every literal is positive. If every clause has one true and one false, then YES. We only loop once so polynomial time.

ii) We use TRUE-FALSE-SAT

TRUE-FALSE-SAT POSITIVE-TRUE-FALSE-SAT

iii) Input Transformation: input: Boolean variables , clauses .

Construct an input as follows:

  • For each , we add an auxiliary variable and add a clause . Then check each clause .
  • If we found a negative literal say , then we substituted it with .

iv) Proof:

  • Assume there is an assignment such that each clause contains literals with opposite truth values.
    • Then wee use the same assignment and assign each to be . i.e. .
    • Then, for , since wee simply substitute with , and now , the assignments of these clauses remains the same. Therefore, there is at least one tru and one false in each .
    • Consider since . has exactly one true and one false.
    • Therefore, there exists an assignment , such that all the clauses contain literals with different values. i.e. a YES for POSITIVE-TRUE-FALSE-SAT.
  • Assume that there exists an assignment , such that all the clauses contain literals with different values.
    • For , since it contains literals with different values, then or in the assignment.
    • Then, for each , we replace with . Since contains literals with different values and , we are not changing any values in the substitution, thus each contains different values.
    • Therefore, a YES for TRUE-FALSE-SAT.

v) We only need to make a new variable and one new clause for each . And check each clause to substitute once. This can be done in polynomial time.

7. Exchange Algorithm for MST

We have seen that actually: https://piazza.com/class/lr45czd8jab7cz/post/1265, according to this piazza post…

Note:

This is intended to be a harder question. Your answer will be marked based on the elegance of your proof and the quality of your bound. If you are short of time, or having difficulties, we recommend skipping this question.

Given an undirected connected graph with vertices and edges. The edges in are associated with pairwise distinct edge weights . We consider the following exchange algorithm for finding Minimum Spanning Tree (which is less efficient than Kruskal’s Algorithm and Prim’s Algorithm).

Exchange
	Let T be an arbitrary spanning tree of G.
	while there exists (e,e') that admits a valid exchange and w(T - e + e') < w(T)
		T:= T - e + e'
	return T

Hint: you may find some minimum spanning tree properties in Assignment 6 useful.

  • a) Prove that Exchange returns a minimum spanning tree of when it terminates.
  • b) Show that the algorithm always terminates by providing an upper bound on the number of iterations in terms of and .

Midterm F2023

Problem 4 - Greedy

The input is an unsorted sequence of integers representing the values of some collectible assets, and a number of years . At the start of each year you get to collect one of theses assets, and add it to your investment portfolio. At the end of each year, the total value of your entire investment portfolio doubles. (It starts at 0)

So for example, if at the start of the year you have a portfolio that is worth 100 dollars, and you collect a new asset worth 20 dollars, at the end of the year your portfolio is worth 240 dollars.

    1. Give a greedy algorithm to solve this problem. It should output the value of your portfolio after years.
FindGreedyPortfolioValue(v, k)
v <- sort v by value of v_i in non-decreasing order
profit <- 0
for i from 1 to k do:
	profit = profit + v[i]
	profit = profit x 2
end for
return profit
    1. Analyze the runtime of your algorithm in the unit cost model.
    • Due to the for loop we have operations and the sorting take time. Thus the runtime is .
    1. Prove your algorithm is correct (i.e., optimal).

To prove the correctness of the algorithm, we need to show that it always produces the maximum possible portfolio value after years.

Let’s consider the algorithm’s strategy:

  1. Sorting: The algorithm sorts the assets in non-decreasing order of their values. This ensures that the assets with the highest values are collected first.
  2. Collecting Assets: For each year from 1 to , the algorithm collects the asset with the -th highest value available. After collecting an asset, the total value of the portfolio is doubled.

The correctness of the algorithm can be proven by induction on the number of years .

Base Case ( = 1):

  • When , the algorithm collects the asset with the highest value available. This results in the maximum possible portfolio value after 1 year, as there is only one asset to collect.

Inductive Step:

  • Assume that the algorithm correctly selects assets to maximize the portfolio value after years.
  • In the -th year, the algorithm collects the asset with the -th highest value available. Since the assets are sorted in non-decreasing order, this is the -th highest value overall. Therefore, the portfolio value after years is maximized by this selection.
  • Doubling the portfolio value at the end of each year ensures that the total value after kk years is maximized.

Since the algorithm selects assets in a way that maximizes the portfolio value after each year, it follows that the overall portfolio value after years is also maximized. Therefore, the algorithm is correct (i.e., optimal).

Problem 5 - DP

Let be a sequence of distinct positive integers. We say that a subsequence is legal if it includes at least one element from every two consecutive elements in , and it includes the last element of . Formally, and .

For example, given the sequence , the subsequence is legal while is not.

In this question, given a sequence , we would like to find a legal subsequence which has minimum value, where the value of a subsequence is defined as the sum of its elements. For example, for the sequence above, the subsequence is an optimal solution with value 20.

    1. Prove or disprove: For any and any sequence , if is a legal subsequence with minimum value, then it does not include two consecutive elements of the original sequence.

False, Consider .

Must contain 5 since it’s the last element. Now all the pairs: . So the legal subsequence with minimum value is and includes two consecutive elements that is 2 and 4 from the original sequence.

    1. Given sequence , let be an array such that is the minimum value of a legal subsequence of . Write a recurrence relation for . You can assume that .
    1. Write detailed pseudocode for an algorithm that takes as its input a sequence and returns the minimum value of a legal subsequence.
FindMinLegal(A(a_1,...,a_n)):
    s <- dp table of size (n+1)
    s[0] <- 0
    for i in (1...n) do:
        if (i == 1):
            s[i] = a_1
        else if (s[i-1] >= s[i-2]):
            s[i] = a_i + s[i-2]
        else if (s[i-1] < s[i-2]):
            s[i] = a_i + s[i-1]
    return s[n]

This algorithm initializes the dynamic programming table s with size (n+1) and computes the minimum value of a legal subsequence ending with each element of A. It uses dynamic programming to store the minimum values in the table s. Finally, it returns s[n], which represents the minimum value of a legal subsequence ending with the last element of A.

Understanding

Ohhhh I see, kinda slowly.

So at first, I was confused to why we compare s[i] values, but notice first line of the algo, we initialize s, as a dp table which creates and store the values of A. It will serve as storing the minimum value of legal subsequence at different length .

Another thing, is in the main loop, we slowly compute the minimum subsequence at each i. So that is why we always add which was something else I was confused about. So we slowly build up our dp table up to , the actually value we care about, which we return.

That is why, we update s[i] for every loop. We return , which represents the minimum value of a legal subsequence actually ending with the last element of .

    1. Analyze the running time of your algorithm from part iii) in the unit cost model. The table is of the size . Going over the table should take time only due to the for loop that only does operation.

Final S2018

True or False

  • Any tree with at least 7 vertices must have a vertex degree at least 3.
    • False
    • Consider the graph
    • This graph is a tree, but each vertex has degree at most 2.
  • Let be a directed graph and let be any DFS tree of . Then for any edge that is not in we must have .
    • False. We can have an edge from an ancestor to a descendent like the dashed line below in DFS.
  • Recall that we can modify BFS to check if a graph is bipartite. We can also modify BFS to check if a graph is tripartite. A graph is tripartite if there are disjoint sets of vertices such that and , are not in the same set .
    • False. A tripartite graph is the same as a 3-Colourable graph, and 3-Colourable is NP-complete.
  • The complement independent set

HW 2023