Pennsylvania State University

Lecture 10
CMPSC 465 Fall 2024
Mingfu Shao

DFS with Timing

The Algorithm

DFS-with-timing is a variant of DFS that records the time at which the exploration of each vertex starts and finishes. It uses the following data structures (we assume $n = |V|$). These data structures are global variables, so that the explore function can read and modify them.

1. variable clock serves as a timer that stores the current time;
2. binary array visited[1..n], where visited[i] indicates whether $v_i$ has been explored/visited, $1 \le i \le n$;
3. array pre[1..n], where pre[i] records the time at which exploring $v_i$ starts, $1 \le i \le n$;
4. array post[1..n], where post[i] records the time at which exploring $v_i$ finishes, $1 \le i \le n$;
5. array postlist, which stores the vertices in decreasing order of post[·].

The pseudo-code of DFS with timing is given below.

    function DFS-with-timing (G = (V, E))
        clock = 1;
        postlist = ∅;
        visited[i] = 0, pre[i] = post[i] = −1, for 1 ≤ i ≤ n;
        for i = 1 → |V|
            if (visited[i] = 0): explore (G, v_i);
        end for;
    end algorithm;

    function explore (G = (V, E), v_i ∈ V)
        visited[i] = 1;
        pre[i] = clock;
        clock = clock + 1;
        for any edge (v_i, v_j) ∈ E
            if (visited[j] = 0): explore (G, v_j);
        end for;
        post[i] = clock;
        clock = clock + 1;
        add v_i to the front of postlist;
    end algorithm;

An example of running DFS with timing is given in Figure 1. Notice that this algorithm partitions all edges into two categories. A solid edge $(u, v)$ means that $v$ is visited for the first time when the edge is examined (so explore $v$ starts right away, and after exploring $v$ the program returns to explore $u$). A dashed edge $(u, v)$ means that $v$ has already been visited at that time (so $v$ is skipped and the next out-edge of $u$ is examined in the for-loop).
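
The pseudo-code above can be sketched in Python as follows. This is a minimal sketch, not the course's reference implementation: the adjacency-dictionary representation, the function name dfs_with_timing, and the two edge lists (tree_edges for solid edges, skipped_edges for dashed edges) are assumptions made for illustration. Vertices are explored in the insertion order of the dictionary, and postlist is built by appending and reversing once at the end, which gives the same result as adding each vertex to the front.

```python
from typing import Dict, List, Tuple


def dfs_with_timing(adj: Dict[int, List[int]]):
    """DFS with timing on a directed graph given as {vertex: [out-neighbours]}.

    Returns (pre, post, postlist, tree_edges, skipped_edges).
    The graph representation and the return shape are illustrative assumptions.
    """
    clock = 1
    visited = {v: False for v in adj}
    pre = {v: -1 for v in adj}
    post = {v: -1 for v in adj}
    finished: List[int] = []                  # vertices in increasing post order
    tree_edges: List[Tuple[int, int]] = []    # "solid" edges
    skipped_edges: List[Tuple[int, int]] = [] # "dashed" edges

    def explore(u: int) -> None:
        nonlocal clock
        visited[u] = True
        pre[u] = clock
        clock += 1
        for w in adj[u]:
            if not visited[w]:
                tree_edges.append((u, w))     # w is visited for the first time
                explore(w)
            else:
                skipped_edges.append((u, w))  # w was already visited: skip it
        post[u] = clock
        clock += 1
        finished.append(u)

    for v in adj:
        if not visited[v]:
            explore(v)

    postlist = finished[::-1]                 # decreasing order of post
    return pre, post, postlist, tree_edges, skipped_edges


# A small hypothetical graph (not the one in Figure 1).
example = {1: [2, 3], 2: [3], 3: [], 4: [1]}
pre, post, postlist, tree_edges, skipped_edges = dfs_with_timing(example)
print(pre, post, postlist)
```

The recursion mirrors the pseudo-code directly; for very deep graphs an explicit stack would be needed to stay within Python's recursion limit.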

[Figure 1: Example of running DFS (with timing) on a directed graph. Left panel: input graph G = (V, E). Right panel: DFS(G) with timing. The [pre, post] interval for each vertex is marked next to each vertex: v1 (1,10), v2 (2,5), v5 (3,4), v3 (6,9), v6 (7,8), v4 (11,14), v7 (12,13). The postlist for this run is postlist = (v4, v7, v1, v3, v6, v2, v5).]

In explore $v_i$, the adjacent vertices $\{v_j \mid (v_i, v_j) \in E\}$ can be examined in an arbitrary order, i.e., all conclusions/properties we show hold regardless of the order in which $\{v_j\}$ gets examined. In practice, though, we might follow a specific order; in Figure 1, we examine $\{v_j\}$ in increasing order of their indices.

The above DFS-with-timing algorithm runs in $\Theta(|V| + |E|)$ time, since each vertex and each edge is examined a constant number of times (once for a directed graph, twice for an undirected graph).

The above DFS-with-timing algorithm gives an interval [pre, post] for each vertex. For two vertices $v_i, v_j \in V$, their corresponding intervals can either be disjoint, i.e., the two intervals do not overlap, or nested, i.e., one interval is within the other. See Figure 2. But two intervals cannot be partially overlapping. Why? This is because the explore function is recursive. Assume that pre[i] < pre[j]; there are only two possibilities. The first is that explore $v_j$ is within explore $v_i$; in this case the recursive behaviour of explore implies that post[j] < post[i], as explore $v_j$ must terminate first and return to explore $v_i$. In this case the two intervals are nested. The second is that explore $v_j$ starts after explore $v_i$ finishes; in this case the two intervals are disjoint.

[Figure 2: Relations between two [pre, post] intervals along the time axis. Nested: pre[i] < pre[j] < post[j] < post[i]. Disjoint: pre[i] < post[i] < pre[j] < post[j]. Partially overlapping (not possible): pre[i] < pre[j] < post[i] < post[j].]

Claim 1. If [pre[j], post[j]] is nested within [pre[i], post[i]], then $v_j$ is reachable from $v_i$.

Proof. Consider when an explore is called within another explore: explore $v_j$ is called within explore $v_i$ only if there is an edge $(v_i, v_j) \in E$ (and visited[j] = 0). Consequently, the time interval for $v_j$ will be within the interval for $v_i$. Note that explore $v_j$ might in turn call explore on other vertices, such as explore $v_k$. When this happens, the time interval for $v_k$ will be within the interval for $v_j$, and therefore within the interval for $v_i$. But again this happens only if there exists an edge $(v_j, v_k)$, and hence a path $v_i \to v_j \to v_k$. This argument extends to longer paths, which proves the claim.
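
The interval structure and Claim 1 can be checked experimentally. The sketch below is illustrative only: the helper names dfs_intervals, relation and reachable, and the small test graph, are assumptions and do not come from the lecture. It classifies every pair of [pre, post] intervals and confirms that a nested interval always belongs to a reachable vertex.

```python
from itertools import combinations


def dfs_intervals(adj):
    """Compact DFS with timing; returns {vertex: (pre, post)}."""
    clock = 1
    interval = {}

    def explore(u):
        nonlocal clock
        start = clock
        clock += 1
        interval[u] = (start, None)       # presence in the dict marks u as visited
        for w in adj[u]:
            if w not in interval:
                explore(w)
        interval[u] = (start, clock)
        clock += 1

    for v in adj:
        if v not in interval:
            explore(v)
    return interval


def relation(a, b):
    """Classify two (pre, post) intervals as nested, disjoint or partially overlapping."""
    (s1, e1), (s2, e2) = sorted([a, b])   # the interval that starts first comes first
    if e1 < s2:
        return "disjoint"
    if e2 < e1:
        return "nested"
    return "partially overlapping"


def reachable(adj, src):
    """Set of vertices reachable from src (plain iterative search)."""
    seen, stack = {src}, [src]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


graph = {1: [2, 3], 2: [3], 3: [], 4: [1]}    # hypothetical example graph
iv = dfs_intervals(graph)
for i, j in combinations(graph, 2):
    kind = relation(iv[i], iv[j])
    assert kind != "partially overlapping"
    if kind == "nested":
        outer, inner = (i, j) if iv[i][0] < iv[j][0] else (j, i)
        assert inner in reachable(graph, outer)   # Claim 1
```

Note that the converse of Claim 1 does not hold in general: in this graph vertex 4 can reach vertex 1, yet their intervals are disjoint because vertex 1 was explored first.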

Determining the Existence of Cycles

Let's see the first application of DFS-with-timing: deciding whether a given (directed) graph contains cycles or not. We can simply modify the explore function, as given below, and use the same DFS-with-timing function. Specifically, when the algorithm examines an edge $(v_i, v_j)$: if $v_j$ has been visited and its post-number has not been set yet, then the algorithm reports that G contains a cycle.

    function explore (G = (V, E), v_i ∈ V)
        visited[i] = 1;
        pre[i] = clock;
        clock = clock + 1;
        for any edge (v_i, v_j) ∈ E
            if (visited[j] = 0): explore (G, v_j);
            else if (post[j] = −1): report “G contains cycle”;
        end for;
        post[i] = clock;
        clock = clock + 1;
        add v_i to the front of postlist;
    end algorithm;

Now let's prove that this algorithm is correct. We first prove that if G contains a cycle then the algorithm will always report it at some point. Let C be the cycle. Let $v_j \in C$ be the first vertex in C that is explored. Let $(v_i, v_j) \in E$ be an edge in C. As $v_j$ can reach $v_i$ (reason: both are in cycle C), $v_i$ gets explored at some point within exploring $v_j$. In explore $v_i$, consider the time of examining edge $(v_i, v_j)$: at this time visited[j] has been set to 1, but its post-number has not been set, as the algorithm is still within exploring $v_j$. Therefore, the algorithm will report that G contains a cycle.

We then prove that if the algorithm reports, then G must contain a cycle. Suppose that the algorithm is exploring $v_i$, examines edge $(v_i, v_j)$, and finds visited[j] = 1 and post[j] = −1. The fact that post[j] has not been set implies that the algorithm is within exploring $v_j$. Therefore the interval for $v_i$ must be nested within the interval for $v_j$. By Claim 1, $v_j$ can reach $v_i$. In addition, there exists the edge $(v_i, v_j)$. Combined, G contains a cycle.

Note that this algorithm essentially determines whether there exists an edge $(v_i, v_j) \in E$ such that the interval [pre[i], post[i]] is within the interval [pre[j], post[j]].
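
The modified explore function can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the adjacency-dictionary input and the name find_cycle_edges are hypothetical, and instead of printing a report the function collects every edge $(v_i, v_j)$ that triggers the report, so the graph contains a cycle exactly when the returned list is non-empty.

```python
def find_cycle_edges(adj):
    """Return all edges (u, w) examined while w is visited but post[w] is still unset.

    adj is {vertex: [out-neighbours]}; this representation is an assumption.
    """
    clock = 1
    visited = {v: False for v in adj}
    pre = {v: -1 for v in adj}
    post = {v: -1 for v in adj}
    reported = []

    def explore(u):
        nonlocal clock
        visited[u] = True
        pre[u] = clock
        clock += 1
        for w in adj[u]:
            if not visited[w]:
                explore(w)
            elif post[w] == -1:
                # w is still being explored, so w reaches u and (u, w) closes a cycle
                reported.append((u, w))
        post[u] = clock
        clock += 1

    for v in adj:
        if not visited[v]:
            explore(v)
    return reported


print(find_cycle_edges({1: [2], 2: [3], 3: [1]}))      # a 3-cycle
print(find_cycle_edges({1: [2, 3], 2: [3], 3: []}))    # a DAG
```

Collecting the edges rather than stopping at the first report keeps the sketch close to the pseudo-code; a version that only needs a yes/no answer could return as soon as one such edge is found.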
(Edges $(v_i, v_j) \in E$ whose interval [pre[i], post[i]] is within [pre[j], post[j]] are called back edges in the textbook [DPV], page 95.)

Key Property

Before seeing more applications, we prove an important property that relates the post values and the meta-graph.

Claim 2. Let $C_i$ and $C_j$ be two (strongly) connected components of a directed graph G = (V, E), i.e., $C_i$ and $C_j$ are two vertices in its corresponding meta-graph $G_M = (V_M, E_M)$. If $(C_i, C_j) \in E_M$, then we must have $\max_{u \in C_i} \text{post}[u] > \max_{v \in C_j} \text{post}[v]$.

Intuitively, following an edge in the meta-graph, the largest post value decreases. Before seeing a formal proof, please see the example in Figure 3: the largest post values for $C_1$, $C_2$, $C_3$, and $C_4$ are 9, 6, 10, and 14, and you may verify that following any edge in the meta-graph, the largest post value always decreases.

[Figure 3: Example of running DFS (with timing) on a directed graph. Panels: input graph G = (V, E); DFS-with-timing(G); meta-graph $G_M = (V_M, E_M)$ of G, with $C_1 = \{v_2\}$, $C_2 = \{v_5\}$, $C_3 = \{v_1, v_3, v_6\}$, $C_4 = \{v_7, v_4\}$. The [pre, post] interval for each vertex is marked next to each vertex; the intervals shown are (1,10), (8,9), (5,6), (2,7), (12,13), (3,4), (11,14).]

Proof. Let $u^* := \arg\min_{u \in C_i \cup C_j} \text{pre}[u]$, i.e., $u^*$ is the first explored vertex in $C_i \cup C_j$. Consider the two cases.

[Figure 4: Two cases in proving the above claim: $u^* \in C_i$ and $u^* \in C_j$.]

First, assume that $u^* \in C_i$. Then $u^*$ can reach all vertices in $C_i \cup C_j \setminus \{u^*\}$, through paths that stay inside $C_i \cup C_j$, and all vertices on these paths are still unvisited when explore $u^*$ starts. Hence, all vertices in $C_i \cup C_j \setminus \{u^*\}$ will be explored within exploring $u^*$. In other words, for any vertex $v \in C_i \cup C_j \setminus \{u^*\}$, the interval [pre[v], post[v]] is a subset of [pre[$u^*$], post[$u^*$]]. This gives two facts: $\max_{u \in C_i} \text{post}[u] = \text{post}[u^*]$ and $\max_{v \in C_j} \text{post}[v] < \text{post}[u^*]$. Combined, we have $\max_{u \in C_i} \text{post}[u] > \max_{v \in C_j} \text{post}[v]$.

Second, assume that $u^* \in C_j$. Then $u^*$ cannot reach any vertex in $C_i$; otherwise $C_i \cup C_j$ would form a single connected component, contradicting the fact that any connected component must be maximal. Hence, all vertices in $C_i$ will remain unexplored after exploring $u^*$. In other words, for any vertex $v \in C_i$, the interval [pre[v], post[v]] comes after (and is disjoint from) [pre[$u^*$], post[$u^*$]]. This gives $\max_{u \in C_i} \text{post}[u] > \text{post}[u^*]$. Besides, we also have $\max_{v \in C_j} \text{post}[v] = \text{post}[u^*]$, as all vertices in $C_j \setminus \{u^*\}$ will be explored within exploring $u^*$. Combined, we have $\max_{u \in C_i} \text{post}[u] > \max_{v \in C_j} \text{post}[v]$.

Finding a Linearization of a DAG

The above key property holds for all directed graphs. We now consider DAGs. Note that each connected component of a DAG G contains exactly one vertex, i.e., each vertex in a DAG G forms a connected component of its own. (Can you spot this using Figure 5?) This is because, if a connected component contained at least two vertices $u$ and $v$, then $u$ could reach $v$ and $v$ could reach $u$, so a cycle would exist.
Consequently, the meta-graph $G_M$ of any DAG G is G itself, i.e., $G = G_M$.

Now let's interpret the above key property in the context of DAGs. For a DAG G = (V, E), components $C_i$ and $C_j$ degenerate into two vertices, say $v_i$ and $v_j$, edge $(C_i, C_j) \in E_M$ becomes $(v_i, v_j) \in E$, and $\max_{u \in C_i} \text{post}[u] = \text{post}[i]$ and $\max_{v \in C_j} \text{post}[v] = \text{post}[j]$. We have

Corollary 1. Let G = (V, E) be a DAG. If $(v_i, v_j) \in E$, then post[i] > post[j].

Now recall the definition of linearization: X is a linearization if and only if for every edge $(v_i, v_j) \in E$, $v_i$ is before $v_j$ in X. Since it is guaranteed above that, for every edge $(v_i, v_j) \in E$, post[i] > post[j], we can immediately conclude that the vertices sorted in decreasing order of post-values form a linearization! This order is nothing else but the postlist.

Corollary 2. Let G = (V, E) be a DAG. The postlist generated by the DFS-with-timing algorithm is a linearization of G.

[Figure 5: Example of running DFS (with timing) on a DAG G. The [pre, post] interval for each vertex is marked next to each vertex: v1 (1,12), v2 (2,9), v3 (3,4), v4 (5,8), v6 (6,7), v5 (10,11). The postlist for this run is (v1, v5, v2, v4, v6, v3), which is a linearization of G.]
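
Corollary 2 translates into a short linearization routine. The sketch below is a minimal illustration, not the course's implementation: the function names linearize and is_linearization are assumptions, and the edge set of the example DAG is hypothetical, chosen only so that DFS visits the vertices in the order suggested by the intervals listed for Figure 5 (the actual figure may contain additional edges). Since only the order of finishing matters, the clock is omitted and vertices are simply recorded as they finish.

```python
def linearize(adj):
    """Return the postlist of a DFS on adj ({vertex: [out-neighbours]}).

    For a DAG this list is a linearization (Corollary 2).
    """
    visited = set()
    finished = []                      # vertices in increasing order of post

    def explore(u):
        visited.add(u)
        for w in adj[u]:
            if w not in visited:
                explore(w)
        finished.append(u)             # u finishes after everything explored within it

    for v in adj:
        if v not in visited:
            explore(v)
    return finished[::-1]              # decreasing order of post


def is_linearization(adj, order):
    """Check that every edge (u, w) has u before w in order."""
    position = {v: k for k, v in enumerate(order)}
    return all(position[u] < position[w] for u in adj for w in adj[u])


# Hypothetical edge set for a six-vertex DAG.
dag = {1: [2, 5], 2: [3, 4], 3: [], 4: [6], 5: [], 6: []}
order = linearize(dag)
print(order, is_linearization(dag, order))
```

The check in is_linearization is exactly the definition recalled above, so it can also be used to test orders produced by other methods.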