DSA Midtermcheets

.docx

School

New York University

Course

ECE-GY 9343

Subject

Computer Science

Date

Dec 17, 2024

Uploaded by JudgeEnergy15861

4. (12 points) Solve the following recurrences:
(a) (4 points) Use the iteration method to solve T(n) = (1/2)(T(αn) + T(βn)) + n, 0.5 < α, β < 1;
(b) (4 points) Use the substitution method to verify your solution for Question 4a.
(c) (4 points) Solve the recurrence T(n) = 4T(n/2) + n^2.5 using the master method.

Answer:
(a) At the k-th level of the recursion tree, the total cost is ((α + β)/2)^k n. Since 0.5 < α, β < 1, the ratio (α + β)/2 is less than 1, so by the summation formula for a geometric progression,
Σ_{k≥0} ((α + β)/2)^k n = n / (1 - (α + β)/2) = 2n / (2 - α - β),
and therefore T(n) = O(n). Grading: recursion tree (2 pt), correct T(n) (2 pt).
(b) Suppose T(n) ≤ cn + b. Substituting into the original equation, we get
T(n) ≤ (1/2)(cαn + b + cβn + b) + n = ((α + β)c/2 + 1) n + b ≤ cn + b,
where the last inequality holds for any c ≥ 2/(2 - α - β). This proves that T(n) = O(n).
(c) In the formula above, a = 4, b = 2, log_b a = 2 and f(n) = n^2.5 = Ω(n^(2+ε)) with ε = 0.5, which is case 3 of the master theorem. The regularity condition holds because 4(n/2)^2.5 = 2^(-0.5) n^2.5. Therefore T(n) = Θ(n^2.5).

Solve the recurrence: T(n) = T(n/2) + 2T(n/3) + n^2

Part 1 - recursion tree: The recursion tree is asymmetric and unbalanced. The minimum depth is log_3 n (always following an n/3 child), while the maximum depth is log_2 n (always following the n/2 child). The cost at each depth is reduced by a factor of (1/2)^2 + 2(1/3)^2 = 17/36 before reaching depth log_3 n. In other words, the merging cost at depth k is bounded by n^2 (17/36)^k. Then we can bound T(n) from above and below by
T_upper(n) = (1 + 17/36 + (17/36)^2 + ... + (17/36)^(log_2 n)) n^2
T_lower(n) = (1 + 17/36 + (17/36)^2 + ... + (17/36)^(log_3 n)) n^2
T_lower(n) ≤ T(n) ≤ T_upper(n)
Both T_upper(n) and T_lower(n) are Θ(n^2), since lim T_upper(n)/n^2 = lim T_lower(n)/n^2 = 1/(1 - 17/36) = 36/19. Therefore, T(n) = Θ(n^2).

Part 2 - substitution method:
The upper bound: IH: T(k) ≤ dk^2, ∀k < n ⟹ T(n) ≤ dn^2, so we have:
T(n) = T(n/2) + 2T(n/3) + n^2 ≤ d(n/2)^2 + 2d(n/3)^2 + n^2 = (17d/36 + 1) n^2 ≤ dn^2 whenever d ≥ 36/19.
Thus, if we set d ≥ 36/19, then for all n, T(n) ≤ dn^2 ⟹ T(n) = O(n^2).
The lower bound: IH: T(k) ≥ ck^2, ∀k < n ⟹ T(n) ≥ cn^2, so we have:
T(n) = T(n/2) + 2T(n/3) + n^2 ≥ c(n/2)^2 + 2c(n/3)^2 + n^2 = (17c/36 + 1) n^2 ≥ cn^2 whenever c ≤ 36/19.
Thus, if we set c ≤ 36/19, then for all n, T(n) ≥ cn^2 ⟹ T(n) = Ω(n^2).
With both upper and lower bounds, we get T(n) = Θ(n^2).

Solve the recurrence: T(n) = 2T(√n) + (log log n)^2
(Hint: How can you change variables so that you can apply the master method?)
Solution: Let m = log n, so n = 2^m, and
T(2^m) = 2T(√(2^m)) + (log m)^2
T(2^m) = 2T((2^m)^(1/2)) + (log m)^2
T(2^m) = 2T(2^(m/2)) + (log m)^2
Let S(m) = T(2^m), then S(m) = 2S(m/2) + (log m)^2.
By the master method, we have a = 2, b = 2, d = log_b a = 1, and f(m) = (log m)^2 = O(m^(1-ε)) for some ε > 0, so S(m) = Θ(m) and T(n) = Θ(log n).
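The Θ(m) bound on S(m) in the change-of-variables solution can be sanity-checked numerically. The sketch below is only an illustration: it assumes base-2 logarithms, restricts m to powers of two, and picks the base case S(1) = 1, none of which is fixed by the problem statement. The function name s_over_m is hypothetical.

```python
def s_over_m(j: int) -> float:
    """Return S(2^j) / 2^j for S(m) = 2 S(m/2) + (log2 m)^2, with S(1) = 1 (assumed)."""
    s = 1  # assumed base case S(1) = 1
    for level in range(1, j + 1):
        # Here m = 2^level, so log2(m) = level and S(m) = 2 S(m/2) + level^2.
        s = 2 * s + level ** 2
    return s / 2 ** j


for j in (5, 10, 20, 40):
    # If S(m) = Theta(m), this ratio should settle to a constant.
    print(j, s_over_m(j))
```

Under these assumptions the ratio S(m)/m equals 1 + Σ_{l=1}^{j} l^2/2^l, which approaches 7 as j grows, consistent with S(m) = Θ(m) and hence T(n) = Θ(log n).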

8. (11 points) An array of n distinct keys was inserted into a hash table of size m sequentially over n time slots. Suppose chaining was used to resolve collisions, and let X_k be the random variable of the number of elements examined when searching for the k-th inserted key (the key inserted in the k-th time slot, 1 ≤ k ≤ n).
(a) (5 points) What is the probability mass function of X_k, i.e., calculate P(X_k = i), i > 0?
(b) (3 points) What is the expected value of X_k?
(c) (3 points) What is the expected number of elements examined when searching for a key randomly selected from the array?

Answer:
(a) There were n - k keys inserted after the k-th key, and each of them has a probability of 1/m (1 pt) of being inserted into the same chain as the k-th key. Let Y_k be the number of keys positioned before the k-th key in its chain. It counts the successes in n - k Bernoulli trials, each with success probability 1/m, so Y_k follows a binomial distribution with parameters (n - k, 1/m), and X_k = Y_k + 1. Writing C(a, b) for the binomial coefficient, the p.m.f. of Y_k is
P(Y_k = i) = C(n - k, i) (1/m)^i (1 - 1/m)^(n - k - i), 0 ≤ i ≤ n - k (2 pt).
The p.m.f. of X_k is
P(X_k = i) = P(Y_k = i - 1) = C(n - k, i - 1) (1/m)^(i - 1) (1 - 1/m)^(n - k - i + 1), 1 ≤ i ≤ n - k + 1 (2 pt).
(b) The average of X_k is E[X_k] = E[Y_k] + 1 = (n - k)/m + 1.
(c) The average of searching for a random key is (1/n) Σ_{k=1}^{n} E[X_k] = 1 + (1/(nm)) Σ_{k=1}^{n} (n - k) = 1 + (n - 1)/(2m).

13. (7 points) Answer the following questions about a hash table with collisions resolved by chaining.
(a) (3 points) If the hash table has m slots, and n keys are sequentially inserted into the hash table, the slot of each key is determined by a simple universal hash function h(·) with value range of 1 to m. What is the probability that the number of keys inserted into a given slot is larger than K, 0 < K < n?
(b) (4 points) If the hash table has 2m slots, and n keys are sequentially inserted into the hash table, the slot of each key is determined by the sum of two simple universal hash functions h1(·) + h2(·), each with value range of 1 to m. What is the probability that a key k is inserted into the chain stored at hash table slot j (1 ≤ j ≤ 2m)?

Answer:
a) Let X be the random variable of the number of keys inserted into any given slot. X follows the binomial distribution with n trials and p = 1/m. Its p.m.f. is
P(X = k) = C(n, k) (1/m)^k (1 - 1/m)^(n - k).
So the probability of X > K is
P(X > K) = Σ_{k=K+1}^{n} C(n, k) (1/m)^k (1 - 1/m)^(n - k).
b) Let h1 and h2 be the two hash values of a key. The key is placed into the chain at slot j iff h1 + h2 = j, which happens with probability
P_j = Σ_{a=1}^{m} Σ_{b=1}^{m} (1/m^2) I(a + b = j),
where I(·) is the indicator function. More specifically,
(a) P_1 = 0;
(b) P_j = (j - 1)/m^2 for 2 ≤ j ≤ m + 1;
(c) P_j = (2m - j + 1)/m^2 for m + 1 < j ≤ 2m.

Indicate, for each pair of expressions (A, B) in the table below, whether A is O, o, Ω, ω, or Θ of B. Assume that k ≥ 1, ε > 0, and c > 1 are constants. Your answer should be in the form of the table with "yes" or "no" written in each box.

   | A        | B         | O   | o   | Ω   | ω   | Θ
 a | lg^k n   | n^ε       | yes | yes | no  | no  | no
 b | n^k      | c^n       | yes | yes | no  | no  | no
 c | √n       | n^(sin n) | no  | no  | no  | no  | no
 d | 2^n      | 2^(n/2)   | no  | no  | yes | yes | no
 e | n^(lg c) | c^(lg n)  | yes | no  | yes | no  | yes
 f | lg(n!)   | lg(n^n)   | yes | no  | yes | no  | yes
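The piecewise probabilities for Problem 13(b) can be checked by brute-force enumeration. The sketch below is an illustration under the assumption that h1 and h2 are independent and each uniform on 1..m; the function names slot_distribution and closed_form are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def slot_distribution(m: int) -> dict:
    """Exact P_j for j = h1 + h2, with h1, h2 independent and uniform on 1..m (assumed)."""
    # Count how many of the m*m equally likely (h1, h2) pairs land in each slot.
    counts = Counter(a + b for a in range(1, m + 1) for b in range(1, m + 1))
    return {j: Fraction(counts.get(j, 0), m * m) for j in range(1, 2 * m + 1)}


def closed_form(m: int, j: int) -> Fraction:
    """Piecewise formula: 0, (j - 1)/m^2, or (2m - j + 1)/m^2."""
    if j == 1:
        return Fraction(0)
    if j <= m + 1:
        return Fraction(j - 1, m * m)
    return Fraction(2 * m - j + 1, m * m)


m = 6
dist = slot_distribution(m)
# Compare the enumeration with the closed form for every slot 1..2m.
print(all(dist[j] == closed_form(m, j) for j in range(1, 2 * m + 1)))
```

Exact fractions are used instead of floats so the comparison is not affected by rounding.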

1. Prove the following properties of asymptotic notation:

(a) n = ω(√n)
Following the ω definition, we need: ∀c > 0, ∃n0 > 0, such that n > c√n, ∀n ≥ n0. Since n > c√n holds whenever √n > c, choosing n0 = c^2 + 1 works.
∴ n = ω(√n)

(b) If f(n) = Ω(g(n)), and h(n) = Θ(g(n)), then f(n) = Ω(h(n))
For the relationship between h(n) and g(n), h(n) = Θ(g(n)) ⟹ h(n) = O(g(n)), and the big-Omega part is not needed in the proof:
f(n) = Ω(g(n)) ⟹ ∃c1 > 0, n1 > 0, such that f(n) ≥ c1 g(n), ∀n ≥ n1
h(n) = Θ(g(n)) ⟹ ∃c2 > 0, n2 > 0, such that h(n) ≤ c2 g(n), ∀n ≥ n2, so g(n) ≥ (1/c2) h(n), ∀n ≥ n2
∴ ∃ c1/c2 > 0, n3 = max(n1, n2) > 0, such that ∀n ≥ n3, f(n) ≥ c1 g(n) ≥ (c1/c2) h(n)
∴ f(n) = Ω(h(n))

(c) f(n) = O(g(n)) if and only if g(n) = Ω(f(n)) (Transpose Symmetry property)
For the if and only if type of questions, we must prove both directions:
Part 1: f(n) = O(g(n)) ⟹ g(n) = Ω(f(n))
f(n) = O(g(n)) ⟹ ∃c1 > 0, n1 > 0, such that f(n) ≤ c1 g(n), ∀n ≥ n1
⟹ ∀n ≥ n1, g(n) ≥ (1/c1) f(n) ⟹ g(n) = Ω(f(n))
Part 2: f(n) = O(g(n)) ⟸ g(n) = Ω(f(n))
g(n) = Ω(f(n)) ⟹ ∃c2 > 0, n2 > 0, such that g(n) ≥ c2 f(n), ∀n ≥ n2
⟹ ∀n ≥ n2, f(n) ≤ (1/c2) g(n) ⟹ f(n) = O(g(n))

7. (4 points) Prove that if f(n) = o(g(n)) and g(n) = o(h(n)) then h(n) = ω(f(n)), where o(·) stands for little-o and ω(·) stands for little-ω.
Answer:
f(n) = o(g(n)) ⟹ ∀c1 > 0, ∃n1 > 0, such that f(n) < c1 g(n), ∀n ≥ n1
g(n) = o(h(n)) ⟹ ∀c2 > 0, ∃n2 > 0, such that g(n) < c2 h(n), ∀n ≥ n2
To prove h(n) = ω(f(n)), we need to show ∀C > 0, ∃N > 0, such that h(n) > C f(n), ∀n ≥ N.
We choose c1, c2 > 0 with 1/(c1 c2) = C (for example c1 = c2 = 1/√C). Since f(n) = o(g(n)) and g(n) = o(h(n)), we can find the corresponding n1 and n2 in the above statements, and let N = max(n1, n2), so that
f(n) < c1 g(n), ∀n ≥ N ≥ n1
g(n) < c2 h(n), ∀n ≥ N ≥ n2
∴ ∀n ≥ N, h(n) > (1/c2) g(n) > (1/(c1 c2)) f(n) = C f(n)
∴ h(n) = ω(f(n))

4. Largest i numbers in sorted order. Given a set of n numbers, we wish to find the i largest in sorted order using a comparison-based algorithm. Find the algorithm that implements each of the following methods with the best asymptotic worst-case running time, and analyze the running times of the algorithms in terms of n and i.
(a) Sort the numbers, and list the i largest.
Solution: The running time of sorting the numbers is O(n lg n), and the running time of listing the i largest numbers is O(i). Therefore, the total running time is O(n lg n + i) = O(n lg n).
(b) Build a max-priority queue from the numbers, and call EXTRACT-MAX i times.
Solution: The running time of building a max-priority queue (using a heap) is O(n), and the running time of each call to EXTRACT-MAX is O(lg n). Therefore, the total running time is O(n + i lg n).
(c) Use an order-statistic algorithm to find the i-th largest number, partition around that number, and sort the i largest numbers.
Solution: The running time of finding and partitioning around the i-th largest number is O(n), and the running time of sorting the i largest numbers is O(i lg i). Therefore, the total running time is O(n + i lg i).
Notes: For the best worst-case running time, do not use quicksort, whose worst case is O(n^2). Similarly, use the worst-case linear-time order-statistic algorithm rather than randomized partitioning, whose worst case is also O(n^2).
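Methods (a) and (b) can be sketched with Python's standard library. This is a minimal illustration, not a reference implementation: the function names are hypothetical, the max-priority queue is simulated by negating values because heapq provides a min-heap, and method (c) is omitted since the standard library has no worst-case linear-time selection routine.

```python
import heapq


def largest_by_sorting(nums, i):
    """Method (a): sort everything in O(n lg n), then list the i largest."""
    return sorted(nums, reverse=True)[:i]


def largest_by_heap(nums, i):
    """Method (b): build a heap in O(n), then extract the maximum i times."""
    heap = [-x for x in nums]  # negate so the min-heap behaves like a max-heap
    heapq.heapify(heap)        # O(n) heap construction
    # Each pop costs O(lg n), so the i extractions cost O(i lg n) in total.
    return [-heapq.heappop(heap) for _ in range(i)]


data = [5, 1, 9, 3, 7, 2]
print(largest_by_sorting(data, 3))
print(largest_by_heap(data, 3))
```

Both functions return the i largest values in decreasing order; they differ only in how the work is split between preprocessing and extraction.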

14. (12 points) Suppose there are n min-heaps, each min-heap contains m numbers (1 < m < n). Develop an algorithm that selects the i-th (1 < i < n) order statistic from the combined nm numbers from all min-heaps with the complexity of Θ(n + i(log n + log m)).
(a) (4 points) Describe the main steps of your algorithm (no need to write down the detailed pseudocode);
(b) (4 points) Analyze the complexity of your algorithm;
(c) (4 points) We showed that there is a worst-case linear time SELECT algorithm that selects the i-th order statistic from n numbers with T(n) = Θ(n) comparisons. Briefly describe how you can utilize this algorithm to update your algorithm in (a) to reduce its complexity to Θ(n + i(log i + log m)).

Answer:
a) The root of each min-heap is the minimum of the m numbers in that min-heap. Form another min-heap H with the roots of the n min-heaps, extract the minimum x from H, also extract x from its original min-heap Z, insert the new root y of Z into H, and repeat the previous steps i times to get the i-th smallest of the nm numbers.
b) Getting the root of each min-heap takes Θ(1), a total of Θ(n). Building the new min-heap H takes Θ(n) time. The complexity of each of the following i steps: extracting the minimum of H takes Θ(log n), extracting x from Z and obtaining its new root y takes Θ(log m), and finally inserting y into H takes Θ(log n), so the total complexity is Θ(n + i(log n + log m)).
c) Apply the linear time SELECT algorithm on all the roots of the min-heaps to find the i-th smallest root, let it be z. Then select the i min-heaps whose roots are smaller than or equal to z; the i-th minimum of the combined nm numbers must be in the selected i min-heaps. (Any element in the non-selected heaps will be larger than the i selected roots, so it cannot be the i-th smallest among all elements.) Form another min-heap H with the roots of the i min-heaps, extract the minimum x from H, also extract x from its original min-heap Z, insert the new root y of Z into H, and repeat the previous steps i times to get the i-th smallest of the nm numbers.
Updated complexity: the initial selection to find z takes Θ(n). The complexity of each of the following i steps: extracting the minimum of H takes Θ(log i), extracting x from Z and obtaining its new root y takes Θ(log m), and finally inserting y into H takes Θ(log i), so the total complexity is Θ(n + i(log i + log m)).

(A[p, i] represents the numbers in the sub-array starting at index p and ending at index i.)
Loop invariant (LI): before each loop, A[p, i] ≤ x and A[j, r] ≥ x, where x is the selected pivot.
Initialization: Before the first while loop, i holds p - 1 and j holds r + 1, so both A[p, i] and A[j, r] are empty, and the LI holds.
Maintenance: Assume the LI holds before an iteration, during which the algorithm does not return. We have A[p, i_old] ≤ x and A[j_old, r] ≥ x. During the iteration, j decreases to j_new and i increases to i_new, and we can see A[p, i_new - 1] ≤ x, A[i_new] ≥ x, A[j_new + 1, r] ≥ x, A[j_new] ≤ x. After the exchange, A[i_new] ≤ x, A[j_new] ≥ x, and so A[p, i_new] ≤ x, A[j_new, r] ≥ x. So the LI still holds after the iteration.
Termination: The loop terminates when i ≥ j, meaning that the two indices meet. We have A[p, i - 1] ≤ x, A[j + 1, r] ≥ x, A[j] ≤ x, and i ≥ j ⟹ A[p, j - 1] ≤ x. So A[p, j] ≤ x ≤ A[j + 1, r]. Around index j, the array is partitioned into two sub-arrays. All values in the first sub-array are smaller than or equal to the pivot, while all values in the second sub-array are larger than or equal to the pivot. Hoare's partition is correct.

9. (20 points) A k-way merge operation. Suppose you have k sorted arrays, each with n elements, and you want to combine them into a single sorted array of kn elements.
(a) (5 points) Here's one strategy: Using the merge procedure, merge the first two arrays, then merge in the third, then merge in the fourth, and so on. What is the time complexity of this algorithm, in terms of k and n?
Answer: 2n + 3n + ... + kn = ((k + 2)(k - 1)/2) n = O(k^2 n)
(b) (8 points) Design and analyze a more efficient algorithm, using divide-and-conquer, that solves this problem in O(kn log k) time.
Answer: Recursive pairwise merge. Split the k arrays into two groups of k/2 arrays, recursively merge each group, then merge the two resulting arrays in O(kn) time. This gives T(k) = 2T(k/2) + kn, a recurrence in terms of just k, not kn. The complexity is O(kn log k), not O(kn log kn).
(c) (7 points) Design and analyze another efficient algorithm, using a min-heap, that solves this problem in O(kn log k) time.
Answer: Take the smallest element from each sorted array and form a min-heap with k elements. Do extract-min; that will be the minimum of all k arrays. Insert into the min-heap the next larger element from the array from which the minimum was taken, do extract-min again to get the second minimum of all k arrays, and repeat kn times to get all elements in sorted order. Total complexity: O(k) + nk O(log k) = O(nk log k).
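The min-heap strategy of part (c) can be sketched as follows. This is a minimal illustration using Python's heapq; the function name k_way_merge and the (value, array index, position) tuple layout are assumptions made for the example, not part of the original answer.

```python
import heapq


def k_way_merge(arrays):
    """Merge k sorted lists with a min-heap holding at most one entry per list."""
    # Each heap entry is (value, array index, position inside that array).
    heap = [(arr[0], idx, 0) for idx, arr in enumerate(arrays) if arr]
    heapq.heapify(heap)  # O(k) to build the initial heap
    merged = []
    while heap:
        value, idx, pos = heapq.heappop(heap)  # O(log k) extract-min
        merged.append(value)
        if pos + 1 < len(arrays[idx]):
            # Insert the next element of the array the minimum came from.
            heapq.heappush(heap, (arrays[idx][pos + 1], idx, pos + 1))
    return merged


print(k_way_merge([[1, 4, 7], [2, 5, 8], [3, 6, 9]]))
```

The heap never holds more than k entries, so each of the kn extract and insert steps costs O(log k), matching the O(nk log k) bound. Including the array index in each tuple also breaks ties between equal values without comparing anything else.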