
Contents

Preface

I The Basics

1 Introduction
1.1 Why Study Algorithms?
1.2 Integer Multiplication
1.3 Karatsuba Multiplication
1.4 MergeSort: The Algorithm
1.5 MergeSort: The Analysis
1.6 Guiding Principles for the Analysis of Algorithms
Problems

2 Asymptotic Notation
2.1 The Gist
2.2 Big-O Notation
2.3 Two Basic Examples
2.4 Big-Omega and Big-Theta Notation
2.5 Additional Examples
Problems

3 Divide-and-Conquer Algorithms
3.1 The Divide-and-Conquer Paradigm
3.2 Counting Inversions in O(n log n) Time
3.3 Strassen’s Matrix Multiplication Algorithm
*3.4 An O(n log n)-Time Algorithm for the Closest Pair
Problems

4 The Master Method
4.1 Integer Multiplication Revisited
4.2 Formal Statement
4.3 Six Examples
*4.4 Proof of the Master Method
Problems

5 QuickSort
5.1 Overview
5.2 Partitioning Around a Pivot Element
5.3 The Importance of Good Pivots
5.4 Randomized QuickSort
*5.5 Analysis of Randomized QuickSort
*5.6 Sorting Requires Ω(n log n) Comparisons
Problems

6 Linear-Time Selection
6.1 The RSelect Algorithm
*6.2 Analysis of RSelect
*6.3 The DSelect Algorithm
*6.4 Analysis of DSelect
Problems

II Graph Algorithms and Data Structures

7 Graphs: The Basics
7.1 Some Vocabulary
7.2 A Few Applications
7.3 Measuring the Size of a Graph
7.4 Representing a Graph
Problems

8 Graph Search and Its Applications
8.1 Overview
8.2 Breadth-First Search and Shortest Paths
8.3 Computing Connected Components
8.4 Depth-First Search
8.5 Topological Sort
*8.6 Computing Strongly Connected Components
8.7 The Structure of the Web
Problems

9 Dijkstra’s Shortest-Path Algorithm
9.1 The Single-Source Shortest Path Problem
9.2 Dijkstra’s Algorithm
*9.3 Why Is Dijkstra’s Algorithm Correct?
9.4 Implementation and Running Time
Problems

10 The Heap Data Structure
10.1 Data Structures: An Overview
10.2 Supported Operations
10.3 Applications
10.4 Speeding Up Dijkstra’s Algorithm
*10.5 Implementation Details
Problems

11 Search Trees
11.1 Sorted Arrays
11.2 Search Trees: Supported Operations
*11.3 Implementation Details
*11.4 Balanced Search Trees
Problems

12 Hash Tables and Bloom Filters
12.1 Supported Operations
12.2 Applications
*12.3 Implementation: High-Level Ideas
*12.4 Further Implementation Details
12.5 Bloom Filters: The Basics
*12.6 Bloom Filters: Heuristic Analysis
Problems

III Greedy Algorithms and Dynamic Programming

13 Introduction to Greedy Algorithms
13.1 The Greedy Algorithm Design Paradigm
13.2 A Scheduling Problem
13.3 Developing a Greedy Algorithm
13.4 Proof of Correctness
Problems

14 Huffman Codes
14.1 Codes
14.2 Codes as Trees
14.3 Huffman’s Greedy Algorithm
*14.4 Proof of Correctness
Problems

15 Minimum Spanning Trees
15.1 Problem Definition
15.2 Prim’s Algorithm
*15.3 Speeding Up Prim’s Algorithm via Heaps
*15.4 Prim’s Algorithm: Proof of Correctness
15.5 Kruskal’s Algorithm
*15.6 Speeding Up Kruskal’s Algorithm via Union-Find
*15.7 Kruskal’s Algorithm: Proof of Correctness
15.8 Application: Single-Link Clustering
Problems

16 Introduction to Dynamic Programming
16.1 The Weighted Independent Set Problem
16.2 A Linear-Time Algorithm for WIS in Paths
16.3 A Reconstruction Algorithm
16.4 The Principles of Dynamic Programming
16.5 The Knapsack Problem
Problems

17 Advanced Dynamic Programming
17.1 Sequence Alignment
*17.2 Optimal Binary Search Trees
Problems

18 Shortest Paths Revisited
18.1 Shortest Paths with Negative Edge Lengths
18.2 The Bellman-Ford Algorithm
18.3 The All-Pairs Shortest Path Problem
18.4 The Floyd-Warshall Algorithm
Problems

IV Algorithms for NP-Hard Problems

19 What Is NP-Hardness?
19.1 MST vs. TSP: An Algorithmic Mystery
19.2 Possible Levels of Expertise
19.3 Easy and Hard Problems
19.4 Algorithmic Strategies for NP-Hard Problems
19.5 Proving NP-Hardness: A Simple Recipe
19.6 Rookie Mistakes and Acceptable Inaccuracies
Problems

20 Compromising on Correctness: Efficient Inexact Algorithms
20.1 Makespan Minimization
20.2 Maximum Coverage
*20.3 Influence Maximization
20.4 The 2-OPT Heuristic Algorithm for the TSP
20.5 Principles of Local Search
Problems

21 Compromising on Speed: Exact Inefficient Algorithms
21.1 The Bellman-Held-Karp Algorithm for the TSP
*21.2 Finding Long Paths by Color Coding
21.3 Problem-Specific Algorithms vs. Magic Boxes
21.4 Mixed Integer Programming Solvers
21.5 Satisfiability Solvers
Problems

22 Proving Problems NP-Hard
22.1 Reductions Revisited
22.2 3-SAT and the Cook-Levin Theorem
22.3 The Big Picture
22.4 A Template for Reductions
22.5 Independent Set Is NP-Hard
*22.6 Directed Hamiltonian Path Is NP-Hard
22.7 The TSP Is NP-Hard
22.8 Subset Sum Is NP-Hard
Problems

23 P, NP, and All That
*23.1 Amassing Evidence of Intractability
*23.2 Decision, Search, and Optimization
*23.3 NP: Problems with Easily Recognized Solutions
*23.4 The P ≠ NP Conjecture
*23.5 The Exponential Time Hypothesis
*23.6 NP-Completeness
Problems

24 Case Study: The FCC Incentive Auction
24.1 Repurposing Wireless Spectrum
24.2 Greedy Heuristics for Buying Back Licenses
24.3 Feasibility Checking
24.4 Implementation as a Descending Clock Auction
24.5 The Final Outcome
Problems

A Quick Review of Proofs By Induction
A.1 A Template for Proofs by Induction
A.2 Example: A Closed-Form Formula
A.3 Example: The Size of a Complete Binary Tree

B Quick Review of Discrete Probability
B.1 Sample Spaces
B.2 Events
B.3 Random Variables
B.4 Expectation
B.5 Linearity of Expectation
B.6 Example: Load Balancing

Epilogue: A Field Guide to Algorithm Design
Hints and Solutions
Index

Preface

This book has only one goal: to teach the basics of algorithms in the most accessible way possible. Think of it as a transcript of what an expert algorithms tutor would say to you over a year of one-on-one lessons. This book is inspired by my online algorithms courses that have been running regularly since 2012, which in turn are based on an undergraduate course that I taught many times at Stanford University. People of all ages, backgrounds, and walks of life are well represented in these online courses, with large numbers of students (high-school, college, etc.), software engineers (both current and aspiring), scientists, and professionals hailing from all corners of the world.

What We’ll Cover

Algorithms Illuminated will transform you from an algorithms newbie (say, a rising third-year undergraduate) to a seasoned veteran with expertise comparable to graduates of the world’s best computer science master’s degree programs. Specifically, this book will help you master the following topics.

Asymptotic analysis and big-O notation. Asymptotic notation provides the basic vocabulary for discussing the design and analysis of algorithms. The key concept here is “big-O” notation, which is a modeling choice about the granularity with which we measure the running time of an algorithm. We’ll see that the sweet spot for clear high-level thinking about algorithm design is to ignore constant factors and lower-order terms, and to concentrate on how an algorithm’s performance scales with the size of the input.

Divide-and-conquer algorithms and the master method. There’s no silver bullet in algorithm design, no single problem-solving method that cracks all computational problems. However, there are a few general algorithm design techniques that find successful application across a range of different domains. In the “divide-and-conquer” technique, the key idea is to break a problem into smaller subproblems, solve the subproblems recursively, and then quickly combine their solutions into one for the original problem. We’ll see fast divide-and-conquer algorithms for sorting, integer and matrix multiplication, and a basic problem in computational geometry. We’ll also cover the master method, which is a powerful tool for analyzing the running time of divide-and-conquer algorithms.

Randomized algorithms. A randomized algorithm “flips coins” as it runs, and its behavior can depend on the outcomes of these coin flips. Surprisingly often, randomization leads to simple, elegant, and practical algorithms. Among other randomized algorithms, we’ll describe and analyze in detail the canonical example of randomized QuickSort.
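
The two ideas just described, divide-and-conquer and coin flipping, meet in randomized QuickSort. The following is a minimal Python sketch for illustration only. The function name `randomized_quicksort`, the optional `rng` parameter, and the out-of-place three-way split are assumptions made here for brevity, and need not match how the book itself presents the algorithm.

```python
import random


def randomized_quicksort(items, rng=random):
    """Return a new sorted list, using QuickSort with a uniformly random pivot.

    Illustrative sketch: builds new lists rather than partitioning in place.
    """
    # Base case: a list with at most one element is already sorted.
    if len(items) <= 1:
        return list(items)

    # "Flip coins": pick the pivot uniformly at random from the input.
    pivot = items[rng.randrange(len(items))]

    # Divide: split the input into elements below, equal to, and above the pivot.
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]

    # Conquer and combine: sort each side recursively, then concatenate.
    return randomized_quicksort(smaller, rng) + equal + randomized_quicksort(larger, rng)
```

Calling `randomized_quicksort([5, 2, 9, 2])` returns a new list and leaves the input unchanged. Passing a seeded `random.Random` instance as `rng` makes the pivot choices reproducible.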

Sorting and selection. As a byproduct of studying the first three topics, we’ll learn several famous algorithms for sorting and selection, including MergeSort, QuickSort, and linear-time selection (both randomized and deterministic). These computational primitives are so blazingly fast that they do not take much more time than that needed just to read the input. This book will help you cultivate a collection of such “for-free primitives,” both to apply directly to data and to use as the building blocks for solutions to more difficult problems.

Graph search and applications. Graphs model many different types of networks, including road networks, communication networks, social networks, and networks of dependencies between tasks. Graphs can get complex, but there are several blazingly fast primitives for reasoning about graph structure. We begin with linear-time algorithms for searching a graph, with applications ranging from network analysis to task sequencing.

Shortest paths. In the shortest-path problem, the goal is to compute the best route in a network from point A to point B. The problem has obvious applications, like computing driving directions, and also shows up in disguise in many more general planning problems. We’ll generalize one of our graph search algorithms and arrive at Dijkstra’s famous shortest-path algorithm.

Data structures. This book will turn you into an educated client of several different data structures for maintaining an evolving set of objects with keys. The primary goal is to develop your intuition about which data structure is the right one for your application. The optional advanced sections provide guidance in how to implement these data structures from scratch.

We first discuss heaps, which can quickly identify the stored object with the smallest key and are useful for sorting, implementing a priority queue, and implementing Dijkstra’s algorithm in near-linear time. Search trees maintain a total ordering over the keys of the stored objects and support an even richer array of operations. Hash tables are optimized for super-fast lookups and are ubiquitous in modern programs. We’ll also cover the bloom filter, a close cousin of the hash table that uses less space at the expense of occasional errors, and the union-find (disjoint-set) data structure.

Greedy algorithms and applications. Greedy algorithms solve problems by making a sequence of myopic and irrevocable decisions. For many problems, they are easy to devise and often blazingly fast. Most greedy algorithms are not guaranteed to be correct, but we’ll cover several killer applications that are exceptions to this rule. Examples include scheduling problems, optimal compression, and minimum spanning trees of graphs.

Dynamic programming and applications. Few benefits of a serious study of algorithms rival the empowerment that comes from mastering dynamic programming. This design paradigm takes a lot of practice to perfect, but it has countless applications to problems that appear unsolvable using any simpler method. Our dynamic programming boot camp will double as a tour of some of the paradigm’s killer applications, including the knapsack problem, the Needleman-Wunsch genome sequence alignment algorithm, Knuth’s algorithm for optimal binary search trees, and the Bellman-Ford and Floyd-Warshall shortest-path algorithms.

Algorithmic tools for tackling NP-hard problems. Many real-world problems are “NP-hard” and appear unsolvable by always-correct and always-fast algorithms. When an NP-hard problem shows up in your own work, you must compromise on either correctness or speed. We’ll see techniques old (like greedy algorithms) and new (like local search) for devising fast heuristic algorithms that are “approximately correct,” with applications to scheduling, influence maximization in social networks, and the traveling salesman problem. We’ll also cover techniques old (like dynamic programming) and new (like MIP and SAT solvers) for developing correct algorithms that improve dramatically over exhaustive search; applications here include the traveling salesman problem (again), finding signaling pathways in biological networks, and television station repacking in a recent and high-stakes spectrum auction in the United States.

Recognizing NP-hard problems. This book will also train you to quickly recognize an NP-hard problem so that you don’t inadvertently waste time trying to design a too-good-to-be-true algorithm for it. You’ll acquire familiarity with many famous and basic NP-hard problems, ranging from satisfiability to graph coloring to the Hamiltonian path problem. Through practice, you’ll learn the tricks of the trade in proving problems NP-hard via reductions.

For a more detailed look into the book’s contents, check out the “Upshot” sections that conclude each chapter and highlight the most important points. The “Field Guide to Algorithm Design” on page 630 provides a bird’s-eye view of how to apply the algorithmic toolbox you’ll acquire from this book to problems that you encounter in your own work.

The starred sections of the book are the most advanced ones. The time-constrained reader can skip these sections on a first reading without any loss of continuity.

Skills You’ll Learn

Mastering algorithms takes time and effort. Why bother?

Become a better programmer. You’ll learn several blazingly fast subroutines for processing data as well as several useful data structures for organizing data that you can deploy directly in your own programs. Implementing and using these algorithms will stretch and improve your programming skills. You’ll also learn general algorithm design paradigms that are relevant to many different problems across different domains, as well as tools for predicting the performance of such algorithms. These “algorithmic design patterns” can help you come up with new algorithms for problems that arise in your own work.

Sharpen your analytical skills. You’ll get lots of practice describing and reasoning about algorithms. Through mathematical analysis, you’ll gain a deep understanding of the specific algorithms and data structures that this book covers. You’ll acquire facility with several mathematical techniques that are broadly useful for analyzing algorithms.

Think algorithmically. After you learn about algorithms, you’ll start seeing them everywhere, whether you’re riding an elevator, watching a flock of birds, managing your investment portfolio, or even watching an infant learn. Algorithmic thinking is increasingly useful and prevalent in disciplines outside of computer science, including biology, statistics, and economics.

Literacy with computer science’s greatest hits. Studying algorithms can feel like watching a highlight reel of many of the greatest hits from the last sixty years of computer science. No longer will you feel excluded at that computer science cocktail party when someone cracks a joke about Dijkstra’s algorithm. After reading this book, you’ll know exactly what they mean.

Ace your technical interviews. Over the years, countless students have regaled me with stories about how mastering the concepts in this book enabled them to ace every technical interview question they were ever asked.

Using this Book in a Course

All of this book’s material has been battle-tested in a university setting, by yours truly at Stanford and by many instructors at other schools. Parts I–III are tailor-made for serving as the primary text in an introductory undergraduate course on algorithms and data structures, focusing on the basics (Part I), graph algorithms and data structures (Part II), and greedy algorithms and dynamic programming (Part III). For example, I cover 75–80% of this content over nineteen 75-minute lectures; a semester-long course could accommodate the rest of it, or selected topics from Chapter 19 about NP-hard problems. Parts III and IV of the book form an ideal basis for a traditional master’s level course on basic and advanced algorithms, emphasizing greedy algorithms and their applications, dynamic programming and its applications, the recognition of NP-hard problems, and algorithmic tools for tackling such problems.

Many chapters of this book are logically independent of each other. For example, the design paradigms of divide-and-conquer algorithms (Chapters 3–6), greedy algorithms (Chapters 13–15), and dynamic programming (Chapters 16–18) can be covered in any order. Similarly, data structures (Chapters 10–12) can be taught before or after basic graph algorithms (Chapters 7–9), with the exception of the heap-based implementation of Dijkstra’s algorithm in Section 10.4.

As for prerequisites, students in an algorithms course generally have at least a little background in programming (including, for example, the use of arrays and recursion) and in mathematical reasoning (such as proofs by induction and by contradiction). Readers can level up their programming and mathematical skills with any number of free online resources (see www.algorithmsilluminated.org for specific recommendations). Appendices A and B also offer quick reviews of induction and discrete probability, respectively. Alternatively, readers without any programming experience can learn the basics of algorithm design and analysis from this book at the level of pseudocode descriptions (if not concrete implementations), and those with little mathematical background can focus squarely on algorithm design techniques (if not detailed algorithm analysis).

I’ve recorded over 200 YouTube videos that cover in detail every section of the book, as well as additional advanced content (on Karger’s random contraction algorithm, path compression in disjoint set data structures, Johnson’s all-pairs shortest-path algorithm, and more). These videos, which are catalogued at www.algorithmsilluminated.org, can serve an instructor in three different ways: (i) as part of a flipped classroom, with students watching lecture videos in advance of class time, which can then be used for discussion and problem-solving exercises; (ii) as a way to help students fill in missing prerequisites without taking up class time; and (iii) as supplementary material above and beyond what is covered during class time (perhaps for an honors section or extra credit).

Additional Features and Resources

This book is based on online courses that are currently running on the Coursera and EdX platforms. I’ve made several resources available to help you replicate as much of the online course experience as you like.

Videos. If you’re more in the mood to watch and listen than to read, check out the YouTube video playlists available at www.algorithmsilluminated.org. These videos feature yours truly teaching all the topics in this book, as well as additional advanced topics. I hope they exude a contagious enthusiasm for algorithms that, alas, is impossible to replicate fully on the printed page.

Quizzes. How can you know if you’re truly absorbing the concepts in this book? Over 100 quizzes with solutions and explanations are scattered throughout the text; when you encounter one, I encourage you to pause and think about the answer before reading on.

End-of-chapter problems. At the end of each chapter, you’ll find several relatively straightforward questions that test your understanding, followed by harder and more open-ended challenge problems. Hints or solutions to all of these problems (as indicated by an “(H)” or “(S),” respectively) are included at the end of the book. Readers can interact with me and each other about the end-of-chapter problems through the book’s discussion forum (see below).

Programming problems. Most of the chapters conclude with suggested programming projects whose goal is to help you develop a detailed understanding of an algorithm by creating your own working implementation of it. Data sets, along with test cases and their solutions, can be found at www.algorithmsilluminated.org.

Discussion forums. A big reason for the success of online courses is the opportunities they provide for participants to help each other understand the course material and debug programs through discussion forums. Readers of this book have the same opportunity via the forums available at www.algorithmsilluminated.org.

Acknowledgments

This book would not exist without the passion and hunger supplied by the hundreds of thousands of participants in my algorithms courses over the years. I am particularly grateful to those who supplied detailed feedback on earlier drafts: Tonya Blust, Yuan Cao, Lauren Cowles, Leslie Damon, Tyler Dae Devlin, Roman Gafiteanu, Carlos Guia, Blanca Huergo, Jim Humelsine, Tim Kearns, Vladimir Kokshenev, Bayram Kuliyev, Patrick Monkelban, Kyle Schiller, Nissanka Wickremasinghe, Clayton Wong, Lexin Ye, Daniel Zingaro, several anonymous reviewers, and many pseudonymous contributors to the book’s discussion forums. Thanks also to several experts who provided technical advice: Amir Abboud, Vincent Conitzer, Christian Kroer, Aviad Rubinstein, and Ilya Segal.

I always appreciate suggestions and corrections from readers. These are best communicated through the discussion forums mentioned above.

Tim Roughgarden
New York, NY
April 2022