332 Chapter 7 Sorting

make decisions based on ordering information only. Naturally, if there is extra information available, we should expect to find a more efficient algorithm, since otherwise the extra information would be wasted.

Although bucket sort seems like much too trivial an algorithm to be useful, it turns out that there are many cases where the input is only small integers, so that using a method like quicksort is really overkill. One such example is radix sort.

Radix sort is sometimes known as card sort because it was used until the advent of modern computers to sort old-style punch cards. Suppose we have 10 numbers in the range 0 to 999 that we would like to sort. In general, this is N numbers in the range 0 to b^p − 1 for some constant p. Obviously we cannot use bucket sort; there would be too many buckets. The trick is to use several passes of bucket sort. The natural algorithm would be to bucket sort by the most significant “digit” (digit is taken to base b), then next most significant, and so on. But a simpler idea is to perform bucket sorts in the reverse order, starting with the least significant “digit” first. Of course, more than one number could fall into the same bucket, and unlike the original bucket sort, these numbers could be different, so we keep them in a list. Each pass is stable: Items that agree in the current digit retain the ordering determined in prior passes.

The trace in Figure 7.24 shows the result of sorting 64, 8, 216, 512, 27, 729, 0, 1, 343, 125, which is the first ten cubes arranged randomly (we use 0s to make clear the tens and hundreds digits). After the first pass, the items are sorted on the least significant digit, and in general, after the kth pass, the items are sorted on the k least significant digits. So at the end, the items are completely sorted. To see that the algorithm works, notice that the only possible failure would occur if two numbers came out of the same bucket in the wrong order. But the previous passes ensure that when several numbers enter a bucket, they enter in sorted order according to the k − 1 least significant digits. The running time is O(p(N + b)), where p is the number of passes, N is the number of elements to sort, and b is the number of buckets.

One application of radix sort is sorting strings. If all the strings have the same length L, then by using buckets for each character, we can implement a radix sort in O(NL) time. The most straightforward way of doing this is shown in Figure 7.25. In our code, we assume that all characters are ASCII, residing in the first 256 positions of the Unicode character set. In each pass, we add an item to the appropriate bucket, and then after all the buckets are populated, we step through the buckets dumping everything back to the array. Notice that when a bucket is populated and emptied in the next pass, the order from the current pass is preserved.

Counting radix sort is an alternative implementation of radix sort that avoids using vectors to represent buckets. Instead, we maintain a count of how many items would go in each bucket; this information can go into an array count, so that count[k] is the number of items that are in bucket k.
INITIAL ITEMS:          064, 008, 216, 512, 027, 729, 000, 001, 343, 125
SORTED BY 1’s digit:    000, 001, 512, 343, 064, 125, 216, 027, 008, 729
SORTED BY 10’s digit:   000, 001, 008, 512, 216, 125, 027, 729, 343, 064
SORTED BY 100’s digit:  000, 001, 008, 027, 064, 125, 216, 343, 512, 729

Figure 7.24 Radix sort trace
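The integer version of the algorithm traced above is not one of the text's figures; the following is a minimal sketch of it, assuming nonnegative int values and a fixed radix of 10 (the function name lsdRadixSort and the numDigits parameter are our own choices):

#include <vector>
using namespace std;

// Least-significant-digit radix sort for nonnegative integers.
// numDigits is the number of base-10 digits of the largest value
// (numDigits = 3 for the example in Figure 7.24).
void lsdRadixSort( vector<int> & arr, int numDigits )
{
    const int BASE = 10;
    vector<vector<int>> buckets( BASE );

    int divisor = 1;
    for( int pass = 0; pass < numDigits; ++pass )
    {
        for( int x : arr )                        // distribute by the current digit
            buckets[ ( x / divisor ) % BASE ].push_back( x );

        int idx = 0;
        for( auto & thisBucket : buckets )        // collect in bucket order
        {
            for( int x : thisBucket )
                arr[ idx++ ] = x;
            thisBucket.clear( );
        }
        divisor *= BASE;
    }
}

Each pass appends to the buckets in array order and empties them in index order, so items that agree in the current digit keep their relative order; this is the stability property that the correctness argument above relies on.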
7.11 Linear-Time Sorts: Bucket Sort and Radix Sort 333

 1 /*
 2  * Radix sort an array of Strings
 3  * Assume all characters are ASCII
 4  * Assume all have same length
 5  */
 6 void radixSortA( vector<string> & arr, int stringLen )
 7 {
 8     const int BUCKETS = 256;
 9     vector<vector<string>> buckets( BUCKETS );
10
11     for( int pos = stringLen - 1; pos >= 0; --pos )
12     {
13         for( string & s : arr )
14             buckets[ s[ pos ] ].push_back( std::move( s ) );
15
16         int idx = 0;
17         for( auto & thisBucket : buckets )
18         {
19             for( string & s : thisBucket )
20                 arr[ idx++ ] = std::move( s );
21
22             thisBucket.clear( );
23         }
24     }
25 }

Figure 7.25 Simple implementation of radix sort for strings, using a vector of buckets

Then we can use another array offset, so that offset[k] represents the number of items whose value is strictly smaller than k. Then when we see a value k for the first time in the final scan, offset[k] tells us a valid array spot where it can be written to (but we have to use a temporary array for the write), and after that is done, offset[k] can be incremented. Counting radix sort thus avoids the need to keep lists. As a further optimization, we can avoid using offset by reusing the count array. The modification is that we initially have count[k+1] represent the number of items that are in bucket k. Then after that information is computed, we can scan the count array from the smallest to largest index and increment count[k] by count[k-1]. It is easy to verify that after this scan, the count array stores exactly the same information that would have been stored in offset.

Figure 7.26 shows an implementation of counting radix sort. Lines 18 to 27 implement the logic above, assuming that the items are stored in array in and the result of a single pass is placed in array out. Initially, in represents arr and out represents the temporary array, buffer. After each pass, we switch the roles of in and out. If there are an even number of passes, then at the end, out is referencing arr, so the sort is complete. Otherwise, we have to move from the buffer back into arr.
334 Chapter 7 Sorting

 1 /*
 2  * Counting radix sort an array of Strings
 3  * Assume all characters are ASCII
 4  * Assume all have same length
 5  */
 6 void countingRadixSort( vector<string> & arr, int stringLen )
 7 {
 8     const int BUCKETS = 256;
 9
10     int N = arr.size( );
11     vector<string> buffer( N );
12
13     vector<string> *in = &arr;
14     vector<string> *out = &buffer;
15
16     for( int pos = stringLen - 1; pos >= 0; --pos )
17     {
18         vector<int> count( BUCKETS + 1 );
19
20         for( int i = 0; i < N; ++i )
21             ++count[ (*in)[ i ][ pos ] + 1 ];
22
23         for( int b = 1; b <= BUCKETS; ++b )
24             count[ b ] += count[ b - 1 ];
25
26         for( int i = 0; i < N; ++i )
27             (*out)[ count[ (*in)[ i ][ pos ] ]++ ] = std::move( (*in)[ i ] );
28
29         // swap in and out roles
30         std::swap( in, out );
31     }
32
33     // if odd number of passes, in is buffer, out is arr; so move back
34     if( stringLen % 2 == 1 )
35         for( int i = 0; i < arr.size( ); ++i )
36             (*out)[ i ] = std::move( (*in)[ i ] );
37 }

Figure 7.26 Counting radix sort for fixed-length strings

Generally, counting radix sort is preferable to using vectors to store buckets, but it can suffer from poor locality (out is filled in non-sequentially) and thus, surprisingly, it is not always faster than using a vector of vectors.

We can extend either version of radix sort to work with variable-length strings. The basic algorithm is to first sort the strings by their length. Instead of looking at all the strings,
7.11 Linear-Time Sorts: Bucket Sort and Radix Sort 335

 1 /*
 2  * Radix sort an array of Strings
 3  * Assume all characters are ASCII
 4  * Assume all have length bounded by maxLen
 5  */
 6 void radixSort( vector<string> & arr, int maxLen )
 7 {
 8     const int BUCKETS = 256;
 9
10     vector<vector<string>> wordsByLength( maxLen + 1 );
11     vector<vector<string>> buckets( BUCKETS );
12
13     for( string & s : arr )
14         wordsByLength[ s.length( ) ].push_back( std::move( s ) );
15
16     int idx = 0;
17     for( auto & wordList : wordsByLength )
18         for( string & s : wordList )
19             arr[ idx++ ] = std::move( s );
20
21     int startingIndex = arr.size( );
22     for( int pos = maxLen - 1; pos >= 0; --pos )
23     {
24         startingIndex -= wordsByLength[ pos + 1 ].size( );
25
26         for( int i = startingIndex; i < arr.size( ); ++i )
27             buckets[ arr[ i ][ pos ] ].push_back( std::move( arr[ i ] ) );
28
29         idx = startingIndex;
30         for( auto & thisBucket : buckets )
31         {
32             for( string & s : thisBucket )
33                 arr[ idx++ ] = std::move( s );
34
35             thisBucket.clear( );
36         }
37     }
38 }

Figure 7.27 Radix sort for variable-length strings

we can then look only at strings that we know are long enough. Since the string lengths are small numbers, the initial sort by length can be done by—bucket sort! Figure 7.27 shows this implementation of radix sort, with vectors to store buckets. Here, the words are grouped into buckets by length at lines 13 and 14 and then placed back into the array at lines 16 to 19. Lines 26 and 27 look at only those strings that have a character at position
336 Chapter 7 Sorting

pos, by making use of the variable startingIndex, which is maintained at lines 21 and 24. Except for that, lines 21 to 37 in Figure 7.27 are the same as lines 11 to 24 in Figure 7.25. The running time of this version of radix sort is linear in the total number of characters in all the strings (each character appears exactly once at line 27, and the statement at line 33 executes precisely as many times as the line 27).

Radix sort for strings will perform especially well when the characters in the string are drawn from a reasonably small alphabet and when the strings either are relatively short or are very similar. Because the O(N log N) comparison-based sorting algorithms will generally look only at a small number of characters in each string comparison, once the average string length starts getting large, radix sort’s advantage is minimized or evaporates completely.

7.12 External Sorting

So far, all the algorithms we have examined require that the input fit into main memory. There are, however, applications where the input is much too large to fit into memory. This section will discuss external sorting algorithms, which are designed to handle very large inputs.

7.12.1 Why We Need New Algorithms

Most of the internal sorting algorithms take advantage of the fact that memory is directly addressable. Shellsort compares elements a[i] and a[i-hk] in one time unit. Heapsort compares elements a[i] and a[i*2+1] in one time unit. Quicksort, with median-of-three partitioning, requires comparing a[left], a[center], and a[right] in a constant number of time units. If the input is on a tape, then all these operations lose their efficiency, since elements on a tape can only be accessed sequentially. Even if the data are on a disk, there is still a practical loss of efficiency because of the delay required to spin the disk and move the disk head.

To see how slow external accesses really are, create a random file that is large, but not too big to fit in main memory. Read the file in and sort it using an efficient algorithm. The time it takes to read the input is certain to be significant compared to the time to sort the input, even though sorting is an O(N log N) operation and reading the input is only O(N). (A small timing sketch of this experiment appears just before Section 7.12.3.)

7.12.2 Model for External Sorting

The wide variety of mass storage devices makes external sorting much more device dependent than internal sorting. The algorithms that we will consider work on tapes, which are probably the most restrictive storage medium. Since access to an element on tape is done by winding the tape to the correct location, tapes can be efficiently accessed only in sequential order (in either direction).

We will assume that we have at least three tape drives to perform the sorting. We need two drives to do an efficient sort; the third drive simplifies matters. If only one tape drive is present, then we are in trouble: Any algorithm will require Ω(N²) tape accesses.
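The read-versus-sort experiment suggested in Section 7.12.1 takes only a few lines; here is a minimal sketch, assuming a previously created file named random.txt containing one integer per line (the file name and record format are placeholders, not from the text):

#include <algorithm>
#include <chrono>
#include <fstream>
#include <iostream>
#include <vector>

int main( )
{
    using namespace std::chrono;

    std::ifstream in{ "random.txt" };        // hypothetical large input file
    std::vector<long long> v;

    auto t0 = steady_clock::now( );
    for( long long x; in >> x; )             // time the external read
        v.push_back( x );
    auto t1 = steady_clock::now( );

    std::sort( v.begin( ), v.end( ) );       // time the internal sort
    auto t2 = steady_clock::now( );

    std::cout << "read: " << duration_cast<milliseconds>( t1 - t0 ).count( ) << " ms, "
              << "sort: " << duration_cast<milliseconds>( t2 - t1 ).count( ) << " ms\n";
    return 0;
}

On typical hardware the read time dominates or at least rivals the sort time, which is the point the text is making.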
7.12 External Sorting 337

7.12.3 The Simple Algorithm

The basic external sorting algorithm uses the merging algorithm from mergesort. Suppose we have four tapes, Ta1, Ta2, Tb1, Tb2, which are two input and two output tapes. Depending on the point in the algorithm, the a and b tapes are either input tapes or output tapes. Suppose the data are initially on Ta1. Suppose further that the internal memory can hold (and sort) M records at a time. A natural first step is to read M records at a time from the input tape, sort the records internally, and then write the sorted records alternately to Tb1 and Tb2. We will call each set of sorted records a run. When this is done, we rewind all the tapes. Suppose we have the same input as our example for Shellsort.

Ta1   81 94 11 96 12 35 17 99 28 58 41 75 15
Ta2
Tb1
Tb2

If M = 3, then after the runs are constructed, the tapes will contain the data indicated in the following figure.

Ta1
Ta2
Tb1   11 81 94   17 28 99   15
Tb2   12 35 96   41 58 75

Now Tb1 and Tb2 contain a group of runs. We take the first run from each tape and merge them, writing the result, which is a run twice as long, onto Ta1. Recall that merging two sorted lists is simple; we need almost no memory, since the merge is performed as Tb1 and Tb2 advance. Then we take the next run from each tape, merge these, and write the result to Ta2. We continue this process, alternating between Ta1 and Ta2, until either Tb1 or Tb2 is empty. At this point either both are empty or there is one run left. In the latter case, we copy this run to the appropriate tape. We rewind all four tapes and repeat the same steps, this time using the a tapes as input and the b tapes as output. This will give runs of 4M. We continue the process until we get one run of length N.

This algorithm will require ⌈log(N/M)⌉ passes, plus the initial run-constructing pass. For instance, if we have 10 million records of 128 bytes each, and four megabytes of internal memory, then the first pass will create 320 runs. We would then need nine more passes to complete the sort. Our example requires ⌈log(13/3)⌉ = 3 more passes, which are shown in the following figures:

Ta1   11 12 35 81 94 96   15
Ta2   17 28 41 58 75 99
Tb1
Tb2
338 Chapter 7 Sorting

Ta1
Ta2
Tb1   11 12 17 28 35 41 58 75 81 94 96 99
Tb2   15

Ta1   11 12 15 17 28 35 41 58 75 81 94 96 99
Ta2
Tb1
Tb2

7.12.4 Multiway Merge

If we have extra tapes, then we can expect to reduce the number of passes required to sort our input. We do this by extending the basic (two-way) merge to a k-way merge.

Merging two runs is done by winding each input tape to the beginning of each run. Then the smaller element is found, placed on an output tape, and the appropriate input tape is advanced. If there are k input tapes, this strategy works the same way, the only difference being that it is slightly more complicated to find the smallest of the k elements. We can find the smallest of these elements by using a priority queue. To obtain the next element to write on the output tape, we perform a deleteMin operation. The appropriate input tape is advanced, and if the run on the input tape is not yet completed, we insert the new element into the priority queue. Using the same example as before, we distribute the input onto the three tapes.

Ta1
Ta2
Ta3
Tb1   11 81 94   41 58 75
Tb2   12 35 96   15
Tb3   17 28 99

We then need two more passes of three-way merging to complete the sort.

Ta1   11 12 17 28 35 81 94 96 99
Ta2   15 41 58 75
Ta3
Tb1
Tb2
Tb3
7.12 External Sorting 339

Ta1
Ta2
Ta3
Tb1   11 12 15 17 28 35 41 58 75 81 94 96 99
Tb2
Tb3

After the initial run construction phase, the number of passes required using k-way merging is ⌈log_k(N/M)⌉, because the runs get k times as large in each pass. For the example above, the formula is verified, since ⌈log_3(13/3)⌉ = 2. If we have 10 tapes, then k = 5, and our large example from the previous section would require ⌈log_5 320⌉ = 4 passes.

7.12.5 Polyphase Merge

The k-way merging strategy developed in the last section requires the use of 2k tapes. This could be prohibitive for some applications. It is possible to get by with only k + 1 tapes. As an example, we will show how to perform two-way merging using only three tapes.

Suppose we have three tapes, T1, T2, and T3, and an input file on T1 that will produce 34 runs. One option is to put 17 runs on each of T2 and T3. We could then merge this result onto T1, obtaining one tape with 17 runs. The problem is that since all the runs are on one tape, we must now put some of these runs on T2 to perform another merge. The logical way to do this is to copy the first eight runs from T1 onto T2 and then perform the merge. This has the effect of adding an extra half pass for every pass we do.

An alternative method is to split the original 34 runs unevenly. Suppose we put 21 runs on T2 and 13 runs on T3. We would then merge 13 runs onto T1 before T3 was empty. At this point, we could rewind T1 and T3, and merge T1, with 13 runs, and T2, which has 8 runs, onto T3. We could then merge 8 runs until T2 was empty, which would leave 5 runs left on T1 and 8 runs on T3. We could then merge T1 and T3, and so on. The following table shows the number of runs on each tape after each pass:

        Run      After    After    After    After    After    After    After
        Const.   T3 + T2  T1 + T2  T1 + T3  T2 + T3  T1 + T2  T1 + T3  T2 + T3
T1      0        13       5        0        3        1        0        1
T2      21       8        0        5        2        0        1        0
T3      13       0        8        3        0        2        1        0

The original distribution of runs makes a great deal of difference. For instance, if 22 runs are placed on T2, with 12 on T1, then after the first merge, we obtain 12 runs on T3 and 10 runs on T2. After another merge, there are 10 runs on T1 and 2 runs on T3. At this point the going gets slow, because we can only merge two sets of runs before T3 is exhausted. Then T1 has 8 runs and T2 has 2 runs. Again, we can only merge two sets of runs, obtaining T1 with 6 runs and T3 with 2 runs. After three more passes, T2 has two
340 Chapter 7 Sorting

runs and the other tapes are empty. We must copy one run to another tape, and then we can finish the merge.

It turns out that the first distribution we gave is optimal. If the number of runs is a Fibonacci number F_N, then the best way to distribute them is to split them into two Fibonacci numbers F_{N−1} and F_{N−2}. Otherwise, it is necessary to pad the tape with dummy runs in order to get the number of runs up to a Fibonacci number. We leave the details of how to place the initial set of runs on the tapes as an exercise. We can extend this to a k-way merge, in which case we need kth order Fibonacci numbers for the distribution, where the kth order Fibonacci number is defined as F^(k)(N) = F^(k)(N − 1) + F^(k)(N − 2) + · · · + F^(k)(N − k), with the appropriate initial conditions F^(k)(N) = 0 for 0 ≤ N ≤ k − 2, and F^(k)(k − 1) = 1.

7.12.6 Replacement Selection

The last item we will consider is construction of the runs. The strategy we have used so far is the simplest possible: We read as many records as possible and sort them, writing the result to some tape. This seems like the best approach possible, until one realizes that as soon as the first record is written to an output tape, the memory it used becomes available for another record. If the next record on the input tape is larger than the record we have just output, then it can be included in the run.

Using this observation, we can give an algorithm for producing runs. This technique is commonly referred to as replacement selection. Initially, M records are read into memory and placed in a priority queue. We perform a deleteMin, writing the smallest (valued) record to the output tape. We read the next record from the input tape. If it is larger than the record we have just written, we can add it to the priority queue. Otherwise, it cannot go into the current run. Since the priority queue is smaller by one element, we can store this new element in the dead space of the priority queue until the run is completed and use the element for the next run. Storing an element in the dead space is similar to what is done in heapsort. We continue doing this until the size of the priority queue is zero, at which point the run is over. We start a new run by building a new priority queue, using all the elements in the dead space. Figure 7.28 shows the run construction for the small example we have been using, with M = 3. Dead elements are indicated by an asterisk.

In this example, replacement selection produces only three runs, compared with the five runs obtained by sorting. Because of this, a three-way merge finishes in one pass instead of two. If the input is randomly distributed, replacement selection can be shown to produce runs of average length 2M. For our large example, we would expect 160 runs instead of 320 runs, so a five-way merge would require four passes. In this case, we have not saved a pass, although we might if we get lucky and have 125 runs or less. Since external sorts take so long, every pass saved can make a significant difference in the running time.

As we have seen, it is possible for replacement selection to do no better than the standard algorithm. However, the input is frequently sorted or nearly sorted to start with, in which case replacement selection produces only a few very long runs. This kind of input is common for external sorts and makes replacement selection extremely valuable.
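A compact sketch of replacement selection as just described, using std::priority_queue as the min-heap and a separate dead-element buffer rather than the in-array dead space; the tapes are modeled as vectors, and the function name is our own (an illustration, not one of the text's figures):

#include <functional>
#include <queue>
#include <vector>
using namespace std;

// Produce runs by replacement selection with memory for M records.
vector<vector<int>> replacementSelection( const vector<int> & input, int M )
{
    vector<vector<int>> runs;
    priority_queue<int, vector<int>, greater<int>> heap;   // the M records in memory
    vector<int> dead;                                       // records saved for the next run

    size_t next = 0;
    while( next < input.size( ) && heap.size( ) < (size_t) M )
        heap.push( input[ next++ ] );                       // initial fill

    while( !heap.empty( ) )
    {
        vector<int> run;
        while( !heap.empty( ) )
        {
            int smallest = heap.top( ); heap.pop( );        // deleteMin
            run.push_back( smallest );                      // "write to the output tape"

            if( next < input.size( ) )
            {
                int x = input[ next++ ];                    // read the next record
                if( x >= smallest )
                    heap.push( x );                         // can still go in this run
                else
                    dead.push_back( x );                    // dead: held for the next run
            }
        }
        runs.push_back( std::move( run ) );

        for( int x : dead )                                 // rebuild the heap for a new run
            heap.push( x );
        dead.clear( );
    }
    return runs;
}

Run on the input of Figure 7.28 (81 94 11 96 12 35 17 99 28 58 41 75 15 with M = 3), this produces the same three runs: 11 81 94 96, then 12 17 28 35 41 58 99, then 15.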
Exercises 341

3 Elements in Heap Array          Output        Next Element Read
  h[1]    h[2]    h[3]

Run 1
  11      94      81              11            96
  81      94      96              81            12*
  94      96      12*             94            35*
  96      35*     12*             96            17*
  17*     35*     12*             End of Run    Rebuild Heap

Run 2
  12      35      17              12            99
  17      35      99              17            28
  28      99      35              28            58
  35      99      58              35            41
  41      99      58              41            15*
  58      99      15*             58            End of Tape
  99      15*                     99
  15*                             End of Run    Rebuild Heap

Run 3
  15                              15

Figure 7.28 Example of run construction

Summary

Sorting is one of the oldest and most well-studied problems in computing. For most general internal sorting applications, an insertion sort, Shellsort, mergesort, or quicksort is the method of choice. The decision regarding which to use depends on the size of the input and on the underlying environment.

Insertion sort is appropriate for very small amounts of input. Shellsort is a good choice for sorting moderate amounts of input. With a proper increment sequence, it gives excellent performance and uses only a few lines of code. Mergesort has O(N log N) worst-case performance but requires additional space. However, the number of comparisons that are used is nearly optimal, because any algorithm that sorts by using only element comparisons must use at least ⌈log(N!)⌉ comparisons for some input sequence. Quicksort does not by itself provide this worst-case guarantee and is tricky to code. However, it has almost certain O(N log N) performance and can be combined with heapsort to give an O(N log N) worst-case guarantee. Strings can be sorted in linear time using radix sort, and this may be a practical alternative to comparison-based sorts in some instances.

Exercises

7.1  Sort the sequence 3, 1, 4, 1, 5, 9, 2, 6, 5 using insertion sort.
7.2  What is the running time of insertion sort if all elements are equal?
342 Chapter 7 Sorting 7.3 Suppose we exchange elements a[i] and a[i+k], which were originally out of order. Prove that at least 1 and at most 2k − 1 inversions are removed. 7.4 Show the result of running Shellsort on the input 9, 8, 7, 6, 5, 4, 3, 2, 1 using the increments {1, 3, 7}. 7.5 a. What is the running time of Shellsort using the two-increment sequence {1, 2}? b. Show that for any N, there exists a three-increment sequence such that Shellsort runs in O(N5/3) time. c. Show that for any N, there exists a six-increment sequence such that Shellsort runs in O(N3/2) time. 7.6 a. Prove that the running time of Shellsort is (N2) using increments of the form 1, c, c2, . . . , ci for any integer c. b. Prove that for these increments, the average running time is (N3/2). 7.7 Prove that if a k-sorted file is then h-sorted, it remains k-sorted. 7.8 Prove that the running time of Shellsort, using the increment sequence suggested by Hibbard, is (N3/2) in the worst case. (Hint: You can prove the bound by con- sidering the special case of what Shellsort does when all elements are either 0 or 1.) Set a[i] = 1 if i is expressible as a linear combination of ht, ht−1, . . . , h t/2 +1 and 0 otherwise. 7.9 Determine the running time of Shellsort for a. sorted input b. reverse-ordered input 7.10 Do either of the following modifications to the Shellsort routine coded in Figure 7.6 affect the worst-case running time? a. Before line 8, subtract one from gap if it is even. b. Before line 8, add one to gap if it is even. 7.11 Show how heapsort processes the input 142, 543, 123, 65, 453, 879, 572, 434, 111, 242, 811, 102. 7.12 What is the running time of heapsort for presorted input? 7.13 Show that there are inputs that force every percolateDown in heapsort to go all the way to a leaf. (Hint: Work backward.) 7.14 Rewrite heapsort so that it sorts only items that are in the range low to high, which are passed as additional parameters. 7.15 Sort 3, 1, 4, 1, 5, 9, 2, 6 using mergesort. 7.16 How would you implement mergesort without using recursion? 7.17 Determine the running time of mergesort for a. sorted input b. reverse-ordered input c. random input 7.18 In the analysis of mergesort, constants have been disregarded. Prove that the number of comparisons used in the worst case by mergesort is N log N − 2 log N + 1.
Exercises 343 7.19 Sort 3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5 using quicksort with median-of-three partitioning and a cutoff of 3. 7.20 Using the quicksort implementation in this chapter, determine the running time of quicksort for a. sorted input b. reverse-ordered input c. random input 7.21 Repeat Exercise 7.20 when the pivot is chosen as a. the first element b. the larger of the first two distinct elements c. a random element d. the average of all elements in the set 7.22 a. For the quicksort implementation in this chapter, what is the running time when all keys are equal? b. Suppose we change the partitioning strategy so that neither i nor j stops when an element with the same key as the pivot is found. What fixes need to be made in the code to guarantee that quicksort works, and what is the running time when all keys are equal? c. Suppose we change the partitioning strategy so that i stops at an element with the same key as the pivot, but j does not stop in a similar case. What fixes need to be made in the code to guarantee that quicksort works, and when all keys are equal, what is the running time of quicksort? 7.23 Suppose we choose the element in the middle position of the array as pivot. Does this make it unlikely that quicksort will require quadratic time? 7.24 Construct a permutation of 20 elements that is as bad as possible for quicksort using median-of-three partitioning and a cutoff of 3. 7.25 The quicksort in the text uses two recursive calls. Remove one of the calls as follows: a. Rewrite the code so that the second recursive call is unconditionally the last line in quicksort. Do this by reversing the if/else and returning after the call to insertionSort. b. Remove the tail recursion by writing a while loop and altering left. 7.26 Continuing from Exercise 7.25, after part (a), a. Perform a test so that the smaller subarray is processed by the first recursive call, while the larger subarray is processed by the second recursive call. b. Remove the tail recursion by writing a while loop and altering left or right, as necessary. c. Prove that the number of recursive calls is logarithmic in the worst case. 7.27 Suppose the recursive quicksort receives an int parameter, depth, from the driver that is initially approximately 2 log N. a. Modify the recursive quicksort to call heapsort on its current subarray if the level of recursion has reached depth. (Hint: Decrement depth as you make recursive calls; when it is 0, switch to heapsort.)
344 Chapter 7 Sorting

      b. Prove that the worst-case running time of this algorithm is O(N log N).
      c. Conduct experiments to determine how often heapsort gets called.
      d. Implement this technique in conjunction with tail-recursion removal in Exercise 7.25.
      e. Explain why the technique in Exercise 7.26 would no longer be needed.
7.28  When implementing quicksort, if the array contains lots of duplicates, it may be better to perform a three-way partition (into elements less than, equal to, and greater than the pivot) to make smaller recursive calls. Assume three-way comparisons.
      a. Give an algorithm that performs a three-way in-place partition of an N-element subarray using only N − 1 three-way comparisons. If there are d items equal to the pivot, you may use d additional Comparable swaps, above and beyond the two-way partitioning algorithm. (Hint: As i and j move toward each other, maintain five groups of elements as shown below):

            EQUAL   SMALL   UNKNOWN   LARGE   EQUAL
                            i       j

      b. Prove that using the algorithm above, sorting an N-element array that contains only d different values, takes O(dN) time.
7.29  Write a program to implement the selection algorithm.
7.30  Solve the following recurrence: T(N) = (1/N) [ Σ_{i=0}^{N−1} T(i) ] + cN, T(0) = 0.
7.31  A sorting algorithm is stable if elements with equal keys are left in the same order as they occur in the input. Which of the sorting algorithms in this chapter are stable and which are not? Why?
7.32  Suppose you are given a sorted list of N elements followed by f(N) randomly ordered elements. How would you sort the entire list if
      a. f(N) = O(1)?
      b. f(N) = O(log N)?
      c. f(N) = O(√N)?
      d. How large can f(N) be for the entire list still to be sortable in O(N) time?
7.33  Prove that any algorithm that finds an element X in a sorted list of N elements requires Ω(log N) comparisons.
7.34  Using Stirling’s formula, N! ≈ (N/e)^N √(2πN), give a precise estimate for log(N!).
7.35  a. In how many ways can two sorted arrays of N elements be merged?
      b. Give a nontrivial lower bound on the number of comparisons required to merge two sorted lists of N elements by taking the logarithm of your answer in part (a).
7.36  Prove that merging two sorted arrays of N items requires at least 2N − 1 comparisons. You must show that if two elements in the merged list are consecutive and from different lists, then they must be compared.
7.37  Consider the following algorithm for sorting six numbers:
      • Sort the first three numbers using Algorithm A.
      • Sort the second three numbers using Algorithm B.
      • Merge the two sorted groups using Algorithm C.
Exercises 345 Show that this algorithm is suboptimal, regardless of the choices for Algorithms A, B, and C. 7.38 Write a program that reads N points in a plane and outputs any group of four or more colinear points (i.e., points on the same line). The obvious brute-force algorithm requires O(N4) time. However, there is a better algorithm that makes use of sorting and runs in O(N2 log N) time. 7.39 Show that the two smallest elements among N can be found in N + log N − 2 comparisons. 7.40 The following divide-and-conquer algorithm is proposed for finding the simul- taneous maximum and minimum: If there is one item, it is the maximum and minimum, and if there are two items, then compare them, and in one compari- son you can find the maximum and minimum. Otherwise, split the input into two halves, divided as evenly as possibly (if N is odd, one of the two halves will have one more element than the other). Recursively find the maximum and minimum of each half, and then in two additional comparisons produce the maximum and minimum for the entire problem. a. Suppose N is a power of 2. What is the exact number of comparisons used by this algorithm? b. Suppose N is of the form 3 · 2k. What is the exact number of comparisons used by this algorithm? c. Modify the algorithm as follows: When N is even, but not divisible by four, split the input into sizes of N/2 − 1 and N/2 + 1. What is the exact number of comparisons used by this algorithm? 7.41 Suppose we want to partition N items into G equal-sized groups of size N/G, such that the smallest N/G items are in group 1, the next smallest N/G items are in group 2, and so on. The groups themselves do not have to be sorted. For simplicity, you may assume that N and G are powers of two. a. Give an O(N log G) algorithm to solve this problem. b. Prove an (N log G) lower bound to solve this problem using comparison-based algorithms. 7.42 Give a linear-time algorithm to sort N fractions, each of whose numerators and denominators are integers between 1 and N. 7.43 Suppose arrays A and B are both sorted and both contain N elements. Give an O(log N) algorithm to find the median of A ∪ B. 7.44 Suppose you have an array of N elements containing only two distinct keys, true and false. Give an O(N) algorithm to rearrange the list so that all false elements precede the true elements. You may use only constant extra space. 7.45 Suppose you have an array of N elements, containing three distinct keys, true, false, and maybe. Give an O(N) algorithm to rearrange the list so that all false elements precede maybe elements, which in turn precede true elements. You may use only constant extra space. 7.46 a. Prove that any comparison-based algorithm to sort 4 elements requires 5 comparisons. b. Give an algorithm to sort 4 elements in 5 comparisons.
346 Chapter 7 Sorting 7.47 a. Prove that 7 comparisons are required to sort 5 elements using any comparison- based algorithm. b. Give an algorithm to sort 5 elements with 7 comparisons. 7.48 Write an efficient version of Shellsort and compare performance when the following increment sequences are used: a. Shell’s original sequence b. Hibbard’s increments c. Knuth’s increments: hi = 1 (3i + 1) 2 hk+1 d. Gonnet’s increments: ht = N and hk = 2.2 (with h1 = 1 if h2 = 2) 2.2 e. Sedgewick’s increments 7.49 Implement an optimized version of quicksort and experiment with combinations of the following: a. pivot: first element, middle element, random element, median of three, median of five b. cutoff values from 0 to 20 7.50 Write a routine that reads in two alphabetized files and merges them together, forming a third, alphabetized, file. 7.51 Suppose we implement the median-of-three routine as follows: Find the median of a[left], a[center], a[right], and swap it with a[right]. Proceed with the normal partitioning step starting i at left and j at right-1 (instead of left+1 and right-2). a. Suppose the input is 2, 3, 4, . . . , N − 1, N, 1. For this input, what is the running time of this version of quicksort? b. Suppose the input is in reverse order. For this input, what is the running time of this version of quicksort? 7.52 Prove that any comparison-based sorting algorithm requires (N log N) compar- isons on average. 7.53 We are given an array that contains N numbers. We want to determine if there are two numbers whose sum equals a given number K. For instance, if the input is 8, 4, 1, 6, and K is 10, then the answer is yes (4 and 6). A number may be used twice. Do the following: a. Give an O(N2) algorithm to solve this problem. b. Give an O(N log N) algorithm to solve this problem. (Hint: Sort the items first. After that is done, you can solve the problem in linear time.) c. Code both solutions and compare the running times of your algorithms. 7.54 Repeat Exercise 7.53 for four numbers. Try to design an O(N2 log N) algorithm. (Hint: Compute all possible sums of two elements. Sort these possible sums. Then proceed as in Exercise 7.53.) 7.55 Repeat Exercise 7.53 for three numbers. Try to design an O(N2) algorithm. 7.56 Consider the following strategy for percolateDown: We have a hole at node X. The normal routine is to compare X’s children and then move the child up to X if it is larger (in the case of a (max)heap) than the element we are trying to place, thereby pushing the hole down; we stop when it is safe to place the new element in the hole. The alternative strategy is to move elements up and the hole down as far as
References 347 possible, without testing whether the new cell can be inserted. This would place the new cell in a leaf and probably violate the heap order; to fix the heap order, percolate the new cell up in the normal manner. Write a routine to include this idea, and compare the running time with a standard implementation of heapsort. 7.57 Propose an algorithm to sort a large file using only two tapes. 7.58 a. Show that a lower bound of N!/22N on the number of heaps is implied by the fact that buildHeap uses at most 2N comparisons. b. Use Stirling’s formula to expand this bound. 7.59 M is an N-by-N matrix in which the entries in each rows are in increasing order and the entries in each column are in increasing order (reading top to bottom). Consider the problem of determining if x is in M using three-way comparisons (i.e., one comparison of x with M[i][j] tells you either that x is less than, equal to, or greater than M[i][j]). a. Give an algorithm that uses at most 2N − 1 comparisons. b. Prove that any algorithm must use at least 2N − 1 comparisons. 7.60 There is a prize hidden in a box; the value of the prize is a positive integer between 1 and N, and you are given N. To win the prize, you have to guess its value. Your goal is to do it in as few guesses as possible; however, among those guesses, you may only make at most g guesses that are too high. The value g will be specified at the start of the game, and if you make more than g guesses that are too high, you lose. So, for example, if g = 0, you then can win in N guesses by simply guessing the sequence 1, 2, 3, . . . a. Suppose g = log N . What strategy minimizes the number of guesses? b. Suppose g = 1. Show that you can always win in O ( N1/2 ) guesses. c. Suppose g = 1. Show that any algorithm that wins the prize must use ( N1/2 ) guesses. d. Give an algorithm and matching lower bound for any constant g. References Knuth’s book [16] is a comprehensive reference for sorting. Gonnet and Baeza-Yates [5] has some more results, as well as a huge bibliography. The original paper detailing Shellsort is [29]. The paper by Hibbard [9] suggested the use of the increments 2k − 1 and tightened the code by avoiding swaps. Theorem 7.4 is from [19]. Pratt’s lower bound, which uses a more complex method than that sug- gested in the text, can be found in [22]. Improved increment sequences and upper bounds appear in [13], [28], and [31]; matching lower bounds have been shown in [32]. It has been shown that no increment sequence gives an O(N log N) worst-case running time [20]. The average-case running time for Shellsort is still unresolved. Yao [34] has performed an extremely complex analysis for the three-increment case. The result has yet to be extended to more increments, but has been slightly improved [14]. The paper by Jiang, Li, and Vityani [15] has shown an (p N1+1/p) lower bound on the average-case running time of p-pass Shellsort. Experiments with various increment sequences appear in [30].
348 Chapter 7 Sorting Heapsort was invented by Williams [33]; Floyd [4] provided the linear-time algorithm for heap construction. Theorem 7.5 is from [23]. An exact average-case analysis of mergesort has been described in [7]. An algorithm to perform merging in linear time without extra space is described in [12]. Quicksort is from Hoare [10]. This paper analyzes the basic algorithm, describes most of the improvements, and includes the selection algorithm. A detailed analysis and empir- ical study was the subject of Sedgewick’s dissertation [27]. Many of the important results appear in the three papers [24], [25], and [26]. [1] provides a detailed C implemen- tation with some additional improvements, and points out that older implementations of the UNIX qsort library routine are easily driven to quadratic behavior. Exercise 7.27 is from [18]. Decision trees and sorting optimality are discussed in Ford and Johnson [5]. This paper also provides an algorithm that almost meets the lower bound in terms of number of comparisons (but not other operations). This algorithm was eventually shown to be slightly suboptimal by Manacher [17]. The selection lower bounds obtained in Theorem 7.9 are from [6]. The lower bound for finding the maximum and minimum simultaneously is from Pohl [21]. The current best lower bound for finding the median is slightly above 2N comparisons due to Dor and Zwick [3]; they also have the best upper bound, which is roughly 2.95N comparisons [2]. External sorting is covered in detail in [16]. Stable sorting, described in Exercise 7.31, has been addressed by Horvath [11]. 1. J. L. Bentley and M. D. McElroy, “Engineering a Sort Function,” Software—Practice and Experience, 23 (1993), 1249–1265. 2. D. Dor and U. Zwick, “Selecting the Median,” SIAM Journal on Computing, 28 (1999), 1722– 1758. 3. D. Dor and U. Zwick, “Median Selection Requires (2 + ε)n Comparisons,” SIAM Journal on Discrete Math, 14 (2001), 312–325. 4. R. W. Floyd, “Algorithm 245: Treesort 3,” Communications of the ACM, 7 (1964), 701. 5. L. R. Ford and S. M. Johnson, “A Tournament Problem,” American Mathematics Monthly, 66 (1959), 387–389. 6. F. Fussenegger and H. Gabow, “A Counting Approach to Lower Bounds for Selection Problems,” Journal of the ACM, 26 (1979), 227–238. 7. M. Golin and R. Sedgewick, “Exact Analysis of Mergesort,” Fourth SIAM Conference on Discrete Mathematics, 1988. 8. G. H. Gonnet and R. Baeza-Yates, Handbook of Algorithms and Data Structures, 2d ed., Addison-Wesley, Reading, Mass., 1991. 9. T. H. Hibbard, “An Empirical Study of Minimal Storage Sorting,” Communications of the ACM, 6 (1963), 206–213. 10. C. A. R. Hoare, “Quicksort,” Computer Journal, 5 (1962), 10–15. 11. E. C. Horvath, “Stable Sorting in Asymptotically Optimal Time and Extra Space,” Journal of the ACM, 25 (1978), 177–199. 12. B. Huang and M. Langston, “Practical In-place Merging,” Communications of the ACM, 31 (1988), 348–352. 13. J. Incerpi and R. Sedgewick, “Improved Upper Bounds on Shellsort,” Journal of Computer and System Sciences, 31 (1985), 210–224.
References 349 14. S. Janson and D. E. Knuth, “Shellsort with Three Increments,” Random Structures and Algorithms, 10 (1997), 125–142. 15. T. Jiang, M. Li, and P. Vitanyi, “A Lower Bound on the Average-Case Complexity of Shellsort,” Journal of the ACM, 47 (2000), 905–911. 16. D. E. Knuth, The Art of Computer Programming. Volume 3: Sorting and Searching, 2d ed., Addison-Wesley, Reading, Mass., 1998. 17. G. K. Manacher, “The Ford-Johnson Sorting Algorithm Is Not Optimal,” Journal of the ACM, 26 (1979), 441–456. 18. D. R. Musser, “Introspective Sorting and Selection Algorithms,” Software—Practice and Experience, 27 (1997), 983–993. 19. A. A. Papernov and G. V. Stasevich, “A Method of Information Sorting in Computer Memories,” Problems of Information Transmission, 1 (1965), 63–75. 20. C. G. Plaxton, B. Poonen, and T. Suel, “Improved Lower Bounds for Shellsort,” Proceedings of the Thirty-third Annual Symposium on the Foundations of Computer Science (1992), 226–235. 21. I. Pohl, “A Sorting Problem and Its Complexity,” Communications of the ACM, 15 (1972), 462–464. 22. V. R. Pratt, Shellsort and Sorting Networks, Garland Publishing, New York, 1979. (Originally presented as the author’s Ph.D. thesis, Stanford University, 1971.) 23. R. Schaffer and R. Sedgewick, “The Analysis of Heapsort,” Journal of Algorithms, 14 (1993), 76–100. 24. R. Sedgewick, “Quicksort with Equal Keys,” SIAM Journal on Computing, 6 (1977), 240–267. 25. R. Sedgewick, “The Analysis of Quicksort Programs,” Acta Informatica, 7 (1977), 327–355. 26. R. Sedgewick, “Implementing Quicksort Programs,” Communications of the ACM, 21 (1978), 847–857. 27. R. Sedgewick, Quicksort, Garland Publishing, New York, 1978. (Originally presented as the author’s Ph.D. thesis, Stanford University, 1975.) 28. R. Sedgewick, “A New Upper Bound for Shellsort,” Journal of Algorithms, 7 (1986), 159–173. 29. D. L. Shell, “A High-Speed Sorting Procedure,” Communications of the ACM, 2 (1959), 30–32. 30. M. A. Weiss, “Empirical Results on the Running Time of Shellsort,” Computer Journal, 34 (1991), 88–91. 31. M. A. Weiss and R. Sedgewick, “More on Shellsort Increment Sequences,” Information Processing Letters, 34 (1990), 267–270. 32. M. A. Weiss and R. Sedgewick, “Tight Lower Bounds for Shellsort,” Journal of Algorithms, 11 (1990), 242–251. 33. J. W. J. Williams, “Algorithm 232: Heapsort,” Communications of the ACM, 7 (1964), 347–348. 34. A. C. Yao, “An Analysis of (h, k, 1) Shellsort,” Journal of Algorithms, 1 (1980), 14–50.
C H A P T E R 8

The Disjoint Sets Class

In this chapter, we describe an efficient data structure to solve the equivalence problem. The data structure is simple to implement. Each routine requires only a few lines of code, and a simple array can be used. The implementation is also extremely fast, requiring constant average time per operation. This data structure is also very interesting from a theoretical point of view, because its analysis is extremely difficult; the functional form of the worst case is unlike any we have yet seen. For the disjoint sets data structure, we will . . .

• Show how it can be implemented with minimal coding effort.
• Greatly increase its speed, using just two simple observations.
• Analyze the running time of a fast implementation.
• See a simple application.

8.1 Equivalence Relations

A relation R is defined on a set S if for every pair of elements (a, b), a, b ∈ S, a R b is either true or false. If a R b is true, then we say that a is related to b.

An equivalence relation is a relation R that satisfies three properties:

1. (Reflexive) a R a, for all a ∈ S.
2. (Symmetric) a R b if and only if b R a.
3. (Transitive) a R b and b R c implies that a R c.

We will consider several examples. The ≤ relationship is not an equivalence relationship. Although it is reflexive, since a ≤ a, and transitive, since a ≤ b and b ≤ c implies a ≤ c, it is not symmetric, since a ≤ b does not imply b ≤ a.

Electrical connectivity, where all connections are by metal wires, is an equivalence relation. The relation is clearly reflexive, as any component is connected to itself. If a is electrically connected to b, then b must be electrically connected to a, so the relation is symmetric. Finally, if a is connected to b and b is connected to c, then a is connected to c. Thus electrical connectivity is an equivalence relation.

Two cities are related if they are in the same country. It is easily verified that this is an equivalence relation. Suppose town a is related to b if it is possible to travel from a to b by taking roads. This relation is an equivalence relation if all the roads are two-way.
352 Chapter 8 The Disjoint Sets Class 8.2 The Dynamic Equivalence Problem Given an equivalence relation ∼, the natural problem is to decide, for any a and b, if a ∼ b. If the relation is stored as a two-dimensional array of Boolean variables, then, of course, this can be done in constant time. The problem is that the relation is usually not explicitly, but rather implicitly, defined. As an example, suppose the equivalence relation is defined over the five-element set {a1, a2, a3, a4, a5}. Then there are 25 pairs of elements, each of which is either related or not. However, the information a1 ∼ a2, a3 ∼ a4, a5 ∼ a1, a4 ∼ a2 implies that all pairs are related. We would like to be able to infer this quickly. The equivalence class of an element a ∈ S is the subset of S that contains all the elements that are related to a. Notice that the equivalence classes form a partition of S: Every member of S appears in exactly one equivalence class. To decide if a ∼ b, we need only to check whether a and b are in the same equivalence class. This provides our strategy to solve the equivalence problem. The input is initially a collection of N sets, each with one element. This initial repre- sentation is that all relations (except reflexive relations) are false. Each set has a different element, so that Si ∩ Sj = ∅; this makes the sets disjoint. There are two permissible operations. The first is find, which returns the name of the set (that is, the equivalence class) containing a given element. The second operation adds relations. If we want to add the relation a ∼ b, then we first see if a and b are already related. This is done by performing finds on both a and b and checking whether they are in the same equivalence class. If they are not, then we apply union.1 This operation merges the two equivalence classes containing a and b into a new equivalence class. From a set point of view, the result of ∪ is to create a new set Sk = Si ∪ Sj, destroying the originals and preserving the disjointness of all the sets. The algorithm to do this is frequently known as the disjoint set union/find algorithm for this reason. This algorithm is dynamic because, during the course of the algorithm, the sets can change via the union operation. The algorithm must also operate online: When a find is performed, it must give an answer before continuing. Another possibility would be an offline algorithm. Such an algorithm would be allowed to see the entire sequence of unions and finds. The answer it provides for each find must still be consistent with all the unions that were performed up until the find, but the algorithm can give all its answers after it has seen all the questions. The difference is similar to taking a written exam (which is generally offline—you only have to give the answers before time expires) or an oral exam (which is online, because you must answer the current question before proceeding to the next question). Notice that we do not perform any operations comparing the relative values of elements but merely require knowledge of their location. For this reason, we can assume that all the elements have been numbered sequentially from 0 to N − 1 and that the numbering can 1 union is a (little-used) reserved word in C++. We use it throughout in describing the union/find algorithm, but when we write code, the member function will be named unionSets.
8.3 Basic Data Structure 353 be determined easily by some hashing scheme. Thus, initially we have Si = {i} for i = 0 through N − 1.2 Our second observation is that the name of the set returned by find is actually fairly arbitrary. All that really matters is that find(a)==find(b) is true if and only if a and b are in the same set. These operations are important in many graph theory problems and also in compilers which process equivalence (or type) declarations. We will see an application later. There are two strategies to solve this problem. One ensures that the find instruction can be executed in constant worst-case time, and the other ensures that the union instruction can be executed in constant worst-case time. It has recently been shown that both cannot be done simultaneously in constant worst-case time. We will now briefly discuss the first approach. For the find operation to be fast, we could maintain, in an array, the name of the equivalence class for each element. Then find is just a simple O(1) lookup. Suppose we want to perform union(a,b). Suppose that a is in equivalence class i and b is in equivalence class j. Then we scan down the array, changing all is to j. Unfortunately, this scan takes (N). Thus, a sequence of N − 1 unions (the maximum, since then everything is in one set) would take (N2) time. If there are (N2) find operations, this performance is fine, since the total running time would then amount to O(1) for each union or find operation over the course of the algorithm. If there are fewer finds, this bound is not acceptable. One idea is to keep all the elements that are in the same equivalence class in a linked list. This saves time when updating, because we do not have to search through the entire array. This by itself does not reduce the asymptotic running time, because it is still possible to perform (N2) equivalence class updates over the course of the algorithm. If we also keep track of the size of each equivalence class, and when performing unions we change the name of the smaller equivalence class to the larger, then the total time spent for N − 1 merges is O(N log N). The reason for this is that each element can have its equivalence class changed at most log N times, since every time its class is changed, its new equivalence class is at least twice as large as its old. Using this strategy, any sequence of M finds and up to N − 1 unions takes at most O(M + N log N) time. In the remainder of this chapter, we will examine a solution to the union/find problem that makes unions easy but finds hard. Even so, the running time for any sequence of at most M finds and up to N − 1 unions will be only a little more than O(M + N). 8.3 Basic Data Structure Recall that the problem does not require that a find operation return any specific name, just that finds on two elements return the same answer if and only if they are in the same set. One idea might be to use a tree to represent each set, since each element in a tree has the same root. Thus, the root can be used to name the set. We will represent each set by a tree. (Recall that a collection of trees is known as a forest.) Initially, each set contains one element. The trees we will use are not necessarily binary trees, but their representation is 2 This reflects the fact that array indices start at 0.
354 Chapter 8 The Disjoint Sets Class 01234567 Figure 8.1 Eight elements, initially in different sets easy, because the only information we will need is a parent link. The name of a set is given by the node at the root. Since only the name of the parent is required, we can assume that this tree is stored implicitly in an array: Each entry s[i] in the array represents the parent of element i. If i is a root, then s[i] = −1. In the forest in Figure 8.1, s[i] = −1 for 0 ≤ i < 8. As with binary heaps, we will draw the trees explicitly, with the understanding that an array is being used. Figure 8.1 shows the explicit representation. We will draw the root’s parent link vertically for convenience. To perform a union of two sets, we merge the two trees by making the parent link of one tree’s root link to the root node of the other tree. It should be clear that this operation takes constant time. Figures 8.2, 8.3, and 8.4 represent the forest after each of union(4,5), union(6,7), union(4,6), where we have adopted the convention that the new root after the union(x,y) is x. The implicit representation of the last forest is shown in Figure 8.5. A find(x) on element x is performed by returning the root of the tree containing x. The time to perform this operation is proportional to the depth of the node representing x, assuming, of course, that we can find the node representing x in constant time. Using the strategy above, it is possible to create a tree of depth N − 1, so the worst-case running 0123 4 67 Figure 8.2 After union(4,5) 5 0123 4 6 Figure 8.3 After union(6,7) 5 7
8.3 Basic Data Structure 355

Figure 8.4 After union(4,6)

–1   –1   –1   –1   –1    4    4    6
 0    1    2    3    4    5    6    7

Figure 8.5 Implicit representation of previous tree

time of a find is Θ(N). Typically, the running time is computed for a sequence of M intermixed instructions. In this case, M consecutive operations could take Θ(MN) time in the worst case.

The code in Figures 8.6 through 8.9 represents an implementation of the basic algorithm, assuming that error checks have already been performed. In our routine, unions are performed on the roots of the trees. Sometimes the operation is performed by passing any two elements and having the union perform two finds to determine the roots. In previously seen data structures, find has always been an accessor, and thus a const member function. Section 8.5 describes a mutator version that is more efficient. Both versions can

 1 class DisjSets
 2 {
 3   public:
 4     explicit DisjSets( int numElements );
 5
 6     int find( int x ) const;
 7     int find( int x );
 8     void unionSets( int root1, int root2 );
 9
10   private:
11     vector<int> s;
12 };

Figure 8.6 Disjoint sets class interface
356 Chapter 8 The Disjoint Sets Class

1 /**
2  * Construct the disjoint sets object.
3  * numElements is the initial number of disjoint sets.
4  */
5 DisjSets::DisjSets( int numElements ) : s( numElements, -1 )
6 {
7 }

Figure 8.7 Disjoint sets initialization routine

 1 /**
 2  * Union two disjoint sets.
 3  * For simplicity, we assume root1 and root2 are distinct
 4  * and represent set names.
 5  * root1 is the root of set 1.
 6  * root2 is the root of set 2.
 7  */
 8 void DisjSets::unionSets( int root1, int root2 )
 9 {
10     s[ root2 ] = root1;
11 }

Figure 8.8 union (not the best way)

 1 /**
 2  * Perform a find.
 3  * Error checks omitted again for simplicity.
 4  * Return the set containing x.
 5  */
 6 int DisjSets::find( int x ) const
 7 {
 8     if( s[ x ] < 0 )
 9         return x;
10     else
11         return find( s[ x ] );
12 }

Figure 8.9 A simple disjoint sets find algorithm

be supported simultaneously. The mutator is always called, unless the controlling object is unmodifiable.

The average-case analysis is quite hard to do. The least of the problems is that the answer depends on how to define average (with respect to the union operation). For instance, in the forest in Figure 8.4, we could say that since there are five trees, there are 5 · 4 = 20 equally likely results of the next union (as any two different trees can be unioned).
8.4 Smart Union Algorithms 357

Of course, the implication of this model is that there is only a 2/5 chance that the next union will involve the large tree. Another model might say that all unions between any two elements in different trees are equally likely, so a larger tree is more likely to be involved in the next union than a smaller tree. In the example above, there is an 8/11 chance that the large tree is involved in the next union, since (ignoring symmetries) there are 6 ways in which to merge two elements in {0, 1, 2, 3}, and 16 ways to merge an element in {4, 5, 6, 7} with an element in {0, 1, 2, 3}. There are still more models and no general agreement on which is the best. The average running time depends on the model; Θ(M), Θ(M log N), and Θ(MN) bounds have actually been shown for three different models, although the latter bound is thought to be more realistic.

Quadratic running time for a sequence of operations is generally unacceptable. Fortunately, there are several ways of easily ensuring that this running time does not occur.

8.4 Smart Union Algorithms

The unions above were performed rather arbitrarily, by making the second tree a subtree of the first. A simple improvement is always to make the smaller tree a subtree of the larger, breaking ties by any method; we call this approach union-by-size. The three unions in the preceding example were all ties, and so we can consider that they were performed by size.

If the next operation were union(3,4), then the forest in Figure 8.10 would form. Had the size heuristic not been used, a deeper tree would have been formed (Fig. 8.11).

We can prove that if unions are done by size, the depth of any node is never more than log N. To see this, note that a node is initially at depth 0. When its depth increases as a result of a union, it is placed in a tree that is at least twice as large as before. Thus, its depth can be increased at most log N times. (We used this argument in the quick-find algorithm at the end of Section 8.2.) This implies that the running time for a find operation is O(log N), and a sequence of M operations takes O(M log N). The tree in Figure 8.12 shows the worst tree possible after 16 unions and is obtained if all unions are between equal-sized trees (the worst-case trees are binomial trees, discussed in Chapter 6).

To implement this strategy, we need to keep track of the size of each tree. Since we are really just using an array, we can have the array entry of each root contain the negative of

Figure 8.10 Result of union-by-size
the size of its tree. Thus, initially the array representation of the tree is all −1s. When a union is performed, check the sizes; the new size is the sum of the old. Thus, union-by-size is not at all difficult to implement and requires no extra space. It is also fast, on average. For virtually all reasonable models, it has been shown that a sequence of M operations requires O(M) average time if union-by-size is used. This is because when random unions are performed, generally very small (usually one-element) sets are merged with large sets throughout the algorithm.

Figure 8.11 Result of an arbitrary union

Figure 8.12 Worst-case tree for N = 16

An alternative implementation, which also guarantees that all the trees will have depth at most O(log N), is union-by-height. We keep track of the height, instead of the size, of each tree and perform unions by making the shallow tree a subtree of the deeper tree. This is an easy algorithm, since the height of a tree increases only when two equally deep trees are joined (and then the height goes up by one). Thus, union-by-height is a trivial modification of union-by-size. Since heights of zero would not be negative, we actually store the negative of height, minus an additional 1. Initially, all entries are −1. Figure 8.13 shows a forest and its implicit representation for both union-by-size and union-by-height. The code in Figure 8.14 implements union-by-height.
Figure 8.13 Forest with implicit representation for union-by-size and union-by-height. In this forest, 0, 1, and 2 are singleton roots, 4 is the root of a tree with children 3, 5, and 6, and 7 is a child of 6. The arrays are:

    node:             0    1    2    3    4    5    6    7
    union-by-size:   -1   -1   -1    4   -5    4    4    6
    union-by-height: -1   -1   -1    4   -3    4    4    6

/**
 * Union two disjoint sets.
 * For simplicity, we assume root1 and root2 are distinct
 * and represent set names.
 * root1 is the root of set 1.
 * root2 is the root of set 2.
 */
void DisjSets::unionSets( int root1, int root2 )
{
    if( s[ root2 ] < s[ root1 ] )      // root2 is deeper
        s[ root1 ] = root2;            // Make root2 new root
    else
    {
        if( s[ root1 ] == s[ root2 ] )
            --s[ root1 ];              // Update height if same
        s[ root2 ] = root1;            // Make root1 new root
    }
}

Figure 8.14 Code for union-by-height (rank)
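The text gives code only for the height (rank) version. For comparison, the following is a hedged sketch of how union-by-size might be written under the same array conventions (a root's entry holds the negated size of its tree); it is our sketch, not a figure from the text, and the member name unionSetsBySize is ours.

// Sketch of union-by-size (ours): s[root] holds the negative of the tree's
// size, so a more negative entry means a larger tree.
void DisjSets::unionSetsBySize( int root1, int root2 )
{
    if( s[ root2 ] < s[ root1 ] )      // root2's tree is larger
    {
        s[ root2 ] += s[ root1 ];      // New size is the sum of the old sizes
        s[ root1 ] = root2;            // Make root2 the new root
    }
    else
    {
        s[ root1 ] += s[ root2 ];      // New size is the sum of the old sizes
        s[ root2 ] = root1;            // Make root1 the new root
    }
}

Because the sizes are stored negated, the comparison s[root2] < s[root1] selects the larger tree, and updating the size before redirecting the smaller root keeps both pieces of bookkeeping in one place.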
8.5 Path Compression

The union/find algorithm, as described so far, is quite acceptable for most cases. It is very simple and linear on average for a sequence of M instructions (under all models). However, the worst case of O(M log N) can occur fairly easily and naturally. For instance, if we put all the sets on a queue and repeatedly dequeue the first two sets and enqueue the union, the worst case occurs. If there are many more finds than unions, this running time is worse than that of the quick-find algorithm. Moreover, it should be clear that there are probably no more improvements possible for the union algorithm. This is based on the observation that any method to perform the unions will yield the same worst-case trees, since it must break ties arbitrarily. Therefore, the only way to speed the algorithm up, without reworking the data structure entirely, is to do something clever on the find operation.

The clever operation is known as path compression. Path compression is performed during a find operation and is independent of the strategy used to perform unions. Suppose the operation is find(x). Then the effect of path compression is that every node on the path from x to the root has its parent changed to the root. Figure 8.15 shows the effect of path compression after find(14) on the generic worst tree of Figure 8.12.

Figure 8.15 An example of path compression

The effect of path compression is that with an extra two link changes, nodes 12 and 13 are now one position closer to the root and nodes 14 and 15 are now two positions closer. Thus, the fast future accesses on these nodes will pay (we hope) for the extra work to do the path compression.

As the code in Figure 8.16 shows, path compression is a trivial change to the basic find algorithm. The only change to the find routine (besides the fact that it is no longer a const member function) is that s[x] is made equal to the value returned by find; thus, after the root of the set is found recursively, x's parent link references it. This occurs recursively to every node on the path to the root, so this implements path compression.

When unions are done arbitrarily, path compression is a good idea, because there is an abundance of deep nodes and these are brought near the root by path compression. It has been proven that when path compression is done in this case, a sequence of M
operations requires at most O(M log N) time. It is still an open problem to determine what the average-case behavior is in this situation.

/**
 * Perform a find with path compression.
 * Error checks omitted again for simplicity.
 * Return the set containing x.
 */
int DisjSets::find( int x )
{
    if( s[ x ] < 0 )
        return x;
    else
        return s[ x ] = find( s[ x ] );
}

Figure 8.16 Code for disjoint sets find with path compression

Path compression is perfectly compatible with union-by-size, and thus both routines can be implemented at the same time. Since doing union-by-size by itself is expected to execute a sequence of M operations in linear time, it is not clear that the extra pass involved in path compression is worthwhile on average. Indeed, this problem is still open. However, as we shall see later, the combination of path compression and a smart union rule guarantees a very efficient algorithm in all cases.

Path compression is not entirely compatible with union-by-height, because path compression can change the heights of the trees. It is not at all clear how to recompute them efficiently. The answer is do not! Then the heights stored for each tree become estimated heights (sometimes known as ranks), but it turns out that union-by-rank (which is what this has now become) is just as efficient in theory as union-by-size. Furthermore, heights are updated less often than sizes. As with union-by-size, it is not clear whether path compression is worthwhile on average. What we will show in the next section is that with either union heuristic, path compression significantly reduces the worst-case running time.

8.6 Worst Case for Union-by-Rank and Path Compression

When both heuristics are used, the algorithm is almost linear in the worst case. Specifically, the time required in the worst case is Θ(Mα(M, N)) (provided M ≥ N), where α(M, N) is an incredibly slowly growing function that for all intents and purposes is at most 5 for any problem instance. However, α(M, N) is not a constant, so the running time is not linear. In the remainder of this section, we first look at some very slow-growing functions, and then in Sections 8.6.2 to 8.6.4, we establish a bound on the worst case for a sequence of at most N − 1 unions and M find operations in an N-element universe in which union is by rank and finds use path compression. The same bound holds if union-by-rank is replaced with union-by-size.
8.6.1 Slowly Growing Functions

Consider the recurrence

    T(N) = 0                   if N ≤ 1
    T(N) = T( f(N) ) + 1       if N > 1                                   (8.1)

In this equation, T(N) represents the number of times, starting at N, that we must iteratively apply f(N) until we reach 1 (or less). We assume that f(N) is a nicely defined function that reduces N. Call the solution to the equation f∗(N).

We have already encountered this recurrence when analyzing binary search. There, f(N) = N/2; each step halves N. We know that this can happen at most log N times until N reaches 1; hence we have f∗(N) = log N (we ignore low-order terms, etc.). Observe that in this case, f∗(N) is much less than f(N).

Figure 8.17 shows the solution for T(N) for various f(N). In our case, we are most interested in f(N) = log N. The solution T(N) = log∗ N is known as the iterated logarithm. The iterated logarithm, which represents the number of times the logarithm needs to be iteratively applied until we reach one, is a very slowly growing function. Observe that log∗ 2 = 1, log∗ 4 = 2, log∗ 16 = 3, log∗ 65536 = 4, and log∗ 2^65536 = 5. But keep in mind that 2^65536 is a 20,000-digit number. So while log∗ N is a growing function, for all intents and purposes, it is at most 5. But we can still produce even more slowly growing functions. For instance, if f(N) = log∗ N, then T(N) = log∗∗ N. In fact, we can add stars at will to produce functions that grow slower and slower.

8.6.2 An Analysis by Recursive Decomposition

We now establish a tight bound on the running time of a sequence of M = Θ(N) union/find operations. The unions and finds may occur in any order, but unions are done by rank and finds are done with path compression.

    f(N)        f∗(N)
    N − 1       N − 1
    N − 2       N/2
    N − c       N/c
    N/2         log N
    N/c         log_c N
    √N          log log N
    log N       log∗ N
    log∗ N      log∗∗ N
    log∗∗ N     log∗∗∗ N

Figure 8.17 Different values of the iterated function
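As a small illustration of recurrence (8.1) with f(N) = log N, the iterated logarithm can be computed directly by counting how many applications of the logarithm are needed before the value drops to 1 or below. This sketch is ours, not code from the text; the function names iteratedCount and logStar are hypothetical.

#include <cmath>

// Count how many times f must be applied to n before the result is <= 1.
// This is exactly the quantity f*(n) defined by recurrence (8.1).
template <typename F>
int iteratedCount( double n, F f )
{
    int count = 0;
    while( n > 1 )
    {
        n = f( n );
        ++count;
    }
    return count;
}

// log* n: iterate the base-2 logarithm.
int logStar( double n )
{
    return iteratedCount( n, []( double x ) { return std::log2( x ); } );
}

// Example: logStar( 65536 ) returns 4 (65536 -> 16 -> 4 -> 2 -> 1),
// matching log* 65536 = 4 in the text.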
Figure 8.18 A large disjoint set tree (numbers below nodes are ranks)

We begin by establishing two lemmas concerning the properties of the ranks. Figure 8.18 gives a visual picture of both lemmas.

Lemma 8.1
When executing a sequence of union instructions, a node of rank r > 0 must have at least one child of rank 0, 1, . . . , r − 1.

Proof
By induction. The basis r = 1 is clearly true. When a node grows from rank r − 1 to rank r, it obtains a child of rank r − 1. By the inductive hypothesis, it already has children of ranks 0, 1, . . . , r − 2, thus establishing the lemma.

The next lemma seems somewhat obvious but is used implicitly in the analysis.

Lemma 8.2
At any point in the union/find algorithm, the ranks of the nodes on a path from the leaf to a root increase monotonically.

Proof
The lemma is obvious if there is no path compression. If, after path compression, some node v is a descendant of w, then clearly v must have been a descendant of w when only unions were considered. Hence the rank of v is less than the rank of w.

Suppose we have two algorithms, A and B. Algorithm A works and computes all the answers correctly, but algorithm B does not compute correctly, or even produce useful answers. Suppose, however, that every step in algorithm A can be mapped to an equivalent step in algorithm B. Then it is easy to see that the running time for algorithm B describes the running time for algorithm A exactly. We can use this idea to analyze the running time of the disjoint sets data structure. We will describe an algorithm B, whose running time is exactly the same as the disjoint sets structure, and then algorithm C, whose running time is exactly the same as algorithm B. Thus any bound for algorithm C will be a bound for the disjoint sets data structure.
Partial Path Compression

Algorithm A is our standard sequence of union-by-rank and find with path compression operations. We design an algorithm B that will perform the exact same sequence of path compression operations as algorithm A. In algorithm B, we perform all the unions prior to any find. Then each find operation in algorithm A is replaced by a partial find operation in algorithm B. A partial find operation specifies the search item and the node up to which the path compression is performed. The node that will be used is the node that would have been the root at the time the matching find was performed in algorithm A.

Figure 8.19 shows that algorithm A and algorithm B will get equivalent trees (forests) at the end, and it is easy to see that exactly the same number of parent changes is performed by algorithm A's finds as by algorithm B's partial finds. But algorithm B should be simpler to analyze, since we have removed the mixing of unions and finds from the equation. The basic quantity to analyze is the number of parent changes that can occur in any sequence of partial finds, since all but the top two nodes in any find with path compression will obtain new parents.

Figure 8.19 Sequences of union and find operations replaced with equivalent cost of union and partial find operations

A Recursive Decomposition

What we would like to do next is to divide each tree into two halves: a top half and a bottom half. We would then like to ensure that the number of partial find operations in the top half plus the number of partial find operations in the bottom half is exactly the same as the total number of partial find operations. We would then like to write a formula for the total path compression cost in the tree in terms of the path compression cost in the top half plus the path compression cost in the bottom half. Without specifying how we decide which nodes are in the top half, and which nodes are in the bottom half, we can look at Figures 8.20, 8.21, and 8.22 to see how most of what we want to do can work immediately.

In Figure 8.20, the partial find resides entirely in the bottom half. Thus one partial find in the bottom half corresponds to one original partial find, and the charges can be recursively assigned to the bottom half.
Figure 8.20 Recursive decomposition, case 1: Partial find is entirely in bottom

Figure 8.21 Recursive decomposition, case 2: Partial find is entirely in top

Figure 8.22 Recursive decomposition, case 3: Partial find goes from bottom to top
In Figure 8.21, the partial find resides entirely in the top half. Thus one partial find in the top half corresponds to one original partial find, and the charges can be recursively assigned to the top half.

However, we run into lots of trouble when we reach Figure 8.22. Here x is in the bottom half, and y is in the top half. The path compression would require that all nodes from x to y's child acquire y as their parent. For nodes in the top half, that is no problem, but for nodes in the bottom half this is a deal breaker: Any recursive charges to the bottom have to keep everything in the bottom. So as Figure 8.23 shows, we can perform the path compression on the top, but while some nodes in the bottom will need new parents, it is not clear what to do, because the new parents for those bottom nodes cannot be top nodes, and the new parents cannot be other bottom nodes. The only option is to make a loop where these nodes' parents are themselves and make sure these parent changes are correctly charged in our accounting. Although this is a new algorithm because it can no longer be used to generate an identical tree, we don't need identical trees; we only need to be sure that each original partial find can be mapped into a new partial find operation and that the charges are identical.

Figure 8.23 Recursive decomposition, case 3: Path compression can be performed on the top nodes, but the bottom nodes must get new parents; the parents cannot be top nodes, and they cannot be other bottom nodes

Figure 8.24 shows what the new tree will look like, and so the big remaining issue is the accounting.

Looking at Figure 8.24, we see that the path compression charges from x to y can be split into three parts. First, there is the path compression from z (the first top node on the upward path) to y. Clearly those charges are already accounted for recursively. Then there is the charge from the topmost-bottom node w to z. But that is only one unit, and there can be at most one of those per partial find operation. In fact, we can do a little better: There can be at most one of those per partial find operation on the top half. But how do we account for the parent changes on the path from x to w? One idea would be to argue that those changes would be exactly the same cost as if there were a partial find from x to w. But there is a big problem with that argument: It converts an original partial find into a partial find on the top plus a partial find on the bottom, which means the number of operations,
M, would no longer be the same. Fortunately, there is a simpler argument: Since each node on the bottom can have its parent set to itself only once, the number of charges is limited by the number of nodes on the bottom whose parents are also in the bottom (i.e., w is excluded).

Figure 8.24 Recursive decomposition, case 3: The bottom nodes' new parents are the nodes themselves

There is one important detail that we must verify. Can we get in trouble on a subsequent partial find given that our reformulation detaches the nodes between x and w from the path to y? The answer is no. In the original partial find, suppose any of the nodes between x and w are involved in a subsequent original partial find. In that case, it will be with one of y's ancestors, and when that happens, any of those nodes will be the topmost "bottom node" in our reformulation. Thus on the subsequent partial find, the original partial find's parent change will have a corresponding one-unit charge in our reformulation.

We can now proceed with the analysis. Let M be the total number of original partial find operations. Let Mt be the total number of partial find operations performed exclusively on the top half, and let Mb be the total number of partial find operations performed exclusively on the bottom half. Let N be the total number of nodes. Let Nt be the total number of top-half nodes, let Nb be the total number of bottom-half nodes, and let Nnrb be the total number of non-root bottom nodes (i.e., the number of bottom nodes whose parents are also bottom nodes prior to any partial finds).

Lemma 8.3
M = Mt + Mb.

Proof
In cases 2 and 3, each original partial find operation is replaced by a partial find on the top half, and in case 1, it is replaced by a partial find on the bottom half. Thus each partial find is replaced by exactly one partial find operation on one of the halves.

Our basic idea is that we are going to partition the nodes so that all nodes with rank s or lower are in the bottom, and the remaining nodes are in the top. The choice of s will be made later in the proof. The next lemma shows that we can provide a recursive formula
for the number of parent changes by splitting the charges into the top and bottom groups. One of the key ideas is that a recursive formula is written not only in terms of M and N, which would be obvious, but also in terms of the maximum rank in the group.

Lemma 8.4
Let C(M, N, r) be the number of parent changes for a sequence of M finds with path compression on N items whose maximum rank is r. Suppose we partition so that all nodes with rank s or lower are in the bottom, and the remaining nodes are in the top. Assuming appropriate initial conditions,

    C(M, N, r) < C(Mt, Nt, r) + C(Mb, Nb, s) + Mt + Nnrb.

Proof
The path compression that is performed in each of the three cases is covered by C(Mt, Nt, r) + C(Mb, Nb, s). Node w in case 3 is accounted for by Mt. Finally, all the other bottom nodes on the path are non-root nodes that can have their parent set to themselves at most once in the entire sequence of compressions. They are accounted for by Nnrb.

If union-by-rank is used, then by Lemma 8.1, every top node has children of ranks 0, 1, . . . , s prior to the commencement of the partial find operations. Each of those children is definitely a root node in the bottom (its parent is a top node). So for each top node, s + 2 nodes (the s + 1 children plus the top node itself) are definitely not included in Nnrb. Thus, we can reformulate Lemma 8.4 as follows:

Lemma 8.5
Let C(M, N, r) be the number of parent changes for a sequence of M finds with path compression on N items whose maximum rank is r. Suppose we partition so that all nodes with rank s or lower are in the bottom, and the remaining nodes are in the top. Assuming appropriate initial conditions,

    C(M, N, r) < C(Mt, Nt, r) + C(Mb, Nb, s) + Mt + N − (s + 2)Nt.

Proof
Substitute Nnrb < N − (s + 2)Nt into Lemma 8.4.

If we look at Lemma 8.5, we see that C(M, N, r) is recursively defined in terms of two smaller instances. Our basic goal at this point is to remove one of these instances by providing a bound for it. What we would like to do is to remove C(Mt, Nt, r). Why? Because, if we do so, what is left is C(Mb, Nb, s). In that case, we have a recursive formula in which r is reduced to s. If s is small enough, we can make use of a variation of Equation (8.1), namely, that the solution to

    T(N) = 0                   if N ≤ 1
    T(N) = T( f(N) ) + M       if N > 1                                   (8.2)

is O(M f∗(N)). So, let's start with a simple bound for C(M, N, r):
Theorem 8.1
C(M, N, r) < M + N log r.

Proof
We start with Lemma 8.5:

    C(M, N, r) < C(Mt, Nt, r) + C(Mb, Nb, s) + Mt + N − (s + 2)Nt         (8.3)

Observe that in the top half, there are only nodes of rank s+1, s+2, . . . , r, and thus no node can have its parent change more than (r − s − 2) times. This yields a trivial bound of Nt(r − s − 2) for C(Mt, Nt, r). Thus,

    C(M, N, r) < Nt(r − s − 2) + C(Mb, Nb, s) + Mt + N − (s + 2)Nt        (8.4)

Combining terms,

    C(M, N, r) < Nt(r − 2s − 4) + C(Mb, Nb, s) + Mt + N                   (8.5)

Select s = ⌈r/2⌉. Then r − 2s − 4 < 0, so

    C(M, N, r) < C(Mb, Nb, ⌈r/2⌉) + Mt + N                                (8.6)

Equivalently, since according to Lemma 8.3, M = Mb + Mt (the proof falls apart without this),

    C(M, N, r) − M < C(Mb, Nb, ⌈r/2⌉) − Mb + N                            (8.7)

Let D(M, N, r) = C(M, N, r) − M; then

    D(M, N, r) < D(Mb, Nb, ⌈r/2⌉) + N                                     (8.8)

which implies D(M, N, r) < N log r. This yields C(M, N, r) < M + N log r.

Theorem 8.2
Any sequence of N − 1 unions and M finds with path compression makes at most M + N log log N parent changes during the finds.

Proof
The bound is immediate from Theorem 8.1 since r ≤ log N.

8.6.3 An O(M log∗ N) Bound

The bound in Theorem 8.2 is pretty good, but with a little work, we can do even better. Recall that a central idea of the recursive decomposition is choosing s to be as small as possible. But to do this, the other terms must also be small, and as s gets smaller, we would expect C(Mt, Nt, r) to get larger. But the bound for C(Mt, Nt, r) used a primitive estimate, and Theorem 8.1 itself can now be used to give a better estimate for this term. Since the C(Mt, Nt, r) estimate will now be lower, we will be able to use a lower s.

Theorem 8.3
C(M, N, r) < 2M + N log∗ r.
Proof
From Lemma 8.5 we have

    C(M, N, r) < C(Mt, Nt, r) + C(Mb, Nb, s) + Mt + N − (s + 2)Nt         (8.9)

and by Theorem 8.1, C(Mt, Nt, r) < Mt + Nt log r. Thus,

    C(M, N, r) < Mt + Nt log r + C(Mb, Nb, s) + Mt + N − (s + 2)Nt        (8.10)

Rearranging and combining terms yields

    C(M, N, r) < C(Mb, Nb, s) + 2Mt + N − (s − log r + 2)Nt               (8.11)

So choose s = ⌈log r⌉. Clearly, this choice implies that (s − log r + 2) > 0, and thus we obtain

    C(M, N, r) < C(Mb, Nb, ⌈log r⌉) + 2Mt + N                             (8.12)

Rearranging as in Theorem 8.1, we obtain

    C(M, N, r) − 2M < C(Mb, Nb, ⌈log r⌉) − 2Mb + N                        (8.13)

This time, let D(M, N, r) = C(M, N, r) − 2M; then

    D(M, N, r) < D(Mb, Nb, ⌈log r⌉) + N                                   (8.14)

which implies D(M, N, r) < N log∗ r. This yields C(M, N, r) < 2M + N log∗ r.

8.6.4 An O(M α(M, N)) Bound

Not surprisingly, we can now use Theorem 8.3 to improve the bound further:

Theorem 8.4
C(M, N, r) < 3M + N log∗∗ r.

Proof
Following the steps in the proof of Theorem 8.3, we have

    C(M, N, r) < C(Mt, Nt, r) + C(Mb, Nb, s) + Mt + N − (s + 2)Nt         (8.15)

and by Theorem 8.3, C(Mt, Nt, r) < 2Mt + Nt log∗ r. Thus,

    C(M, N, r) < 2Mt + Nt log∗ r + C(Mb, Nb, s) + Mt + N − (s + 2)Nt      (8.16)

Rearranging and combining terms yields

    C(M, N, r) < C(Mb, Nb, s) + 3Mt + N − (s − log∗ r + 2)Nt              (8.17)

So choose s = log∗ r to obtain

    C(M, N, r) < C(Mb, Nb, log∗ r) + 3Mt + N                              (8.18)
Rearranging as in Theorems 8.1 and 8.3, we obtain

    C(M, N, r) − 3M < C(Mb, Nb, log∗ r) − 3Mb + N                         (8.19)

This time, let D(M, N, r) = C(M, N, r) − 3M; then

    D(M, N, r) < D(Mb, Nb, log∗ r) + N                                    (8.20)

which implies D(M, N, r) < N log∗∗ r. This yields C(M, N, r) < 3M + N log∗∗ r.

Needless to say, we could continue this ad infinitum. Thus with a bit of math, we get a progression of bounds:

    C(M, N, r) < 2M + N log∗ r
    C(M, N, r) < 3M + N log∗∗ r
    C(M, N, r) < 4M + N log∗∗∗ r
    C(M, N, r) < 5M + N log∗∗∗∗ r
    C(M, N, r) < 6M + N log∗∗∗∗∗ r

Each of these bounds would seem to be better than the previous since, after all, the more ∗s the slower log∗∗···∗ r grows. However, this ignores the fact that while log∗∗∗∗∗ r is smaller than log∗∗∗∗ r, the 6M term is NOT smaller than the 5M term. Thus what we would like to do is to optimize the number of ∗s that are used.

Define α(M, N) to represent the optimal number of ∗s that will be used. Specifically,

    α(M, N) = min{ i ≥ 1 | log∗∗···∗(log N) ≤ (M/N), where the log is iterated with i ∗s }

Then, the running time of the union/find algorithm can be bounded by O(Mα(M, N)).

Theorem 8.5
For any i ≥ 1, any sequence of N − 1 unions and M finds with path compression makes at most

    (i + 1)M + N log∗∗···∗(log N)    (with i ∗s)

parent changes during the finds.

Proof
This follows from the above discussion and the fact that r ≤ log N.

Theorem 8.6
Any sequence of N − 1 unions and M finds with path compression makes at most Mα(M, N) + 2M parent changes during the finds.

Proof
In Theorem 8.5, choose i to be α(M, N); thus, we obtain a bound of (i + 1)M + N(M/N), or Mα(M, N) + 2M.
8.7 An Application

An example of the use of the union/find data structure is the generation of mazes, such as the one shown in Figure 8.25. In Figure 8.25, the starting point is the top-left corner, and the ending point is the bottom-right corner. We can view the maze as a 50-by-88 rectangle of cells in which the top-left cell is connected to the bottom-right cell, and cells are separated from their neighboring cells via walls.

Figure 8.25 A 50-by-88 maze

A simple algorithm to generate the maze is to start with walls everywhere (except for the entrance and exit). We then continually choose a wall randomly, and knock it down if the cells that the wall separates are not already connected to each other. If we repeat this process until the starting and ending cells are connected, then we have a maze. It is actually better to continue knocking down walls until every cell is reachable from every other cell (this generates more false leads in the maze).

We illustrate the algorithm with a 5-by-5 maze. Figure 8.26 shows the initial configuration. We use the union/find data structure to represent sets of cells that are connected to each other. Initially, walls are everywhere, and each cell is in its own equivalence class.

Figure 8.27 shows a later stage of the algorithm, after a few walls have been knocked down. Suppose, at this stage, the wall that connects cells 8 and 13 is randomly targeted. Because 8 and 13 are already connected (they are in the same set), we would not remove the wall, as it would simply trivialize the maze. Suppose that cells 18 and 13 are randomly targeted next. By performing two find operations, we see that these are in different sets; thus 18 and 13 are not already connected. Therefore, we knock down the wall that separates them, as shown in Figure 8.28. Notice that as a result of this operation, the sets
containing 18 and 13 are combined via a union operation. This is because everything that was connected to 18 is now connected to everything that was connected to 13. At the end of the algorithm, depicted in Figure 8.29, everything is connected, and we are done.

The running time of the algorithm is dominated by the union/find costs. The size of the union/find universe is equal to the number of cells. The number of find operations is proportional to the number of cells, since the number of removed walls is one less than the number of cells, while with care, we see that there are only about twice the number of walls as cells in the first place. Thus, if N is the number of cells, since there are two finds per randomly targeted wall, this gives an estimate of between (roughly) 2N and 4N find operations throughout the algorithm. Therefore, the algorithm's running time can be taken as O(N log∗ N), and this algorithm quickly generates a maze.

Figure 8.26 Initial state: all walls up, all cells in their own set. The cells are numbered 0 through 24, and the sets are {0} {1} {2} {3} {4} {5} {6} {7} {8} {9} {10} {11} {12} {13} {14} {15} {16} {17} {18} {19} {20} {21} {22} {23} {24}

Figure 8.27 At some point in the algorithm: Several walls down, sets have merged; if at this point the wall between 8 and 13 is randomly selected, this wall is not knocked down, because 8 and 13 are already connected. The sets are {0, 1} {2} {3} {4, 6, 7, 8, 9, 13, 14} {5} {10, 11, 15} {12} {16, 17, 18, 22} {19} {20} {21} {23} {24}
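As a rough illustration of the wall-knocking loop described above, here is a hedged sketch in C++. It assumes the DisjSets class of Figures 8.7–8.16 (find with path compression, unionSets taking two roots); the function name chooseWallsToRemove and the representation of a wall as a pair of cell indices are ours. It also shuffles the walls once and scans them, rather than repeatedly sampling a random wall, which yields the same kind of maze.

#include <algorithm>
#include <random>
#include <utility>
#include <vector>

// Sketch (ours, not the book's figure): given the candidate walls of an
// N-cell maze, return the walls to knock down so that every cell ends up
// connected. Each wall is the pair of cell indices it separates.
std::vector<std::pair<int,int>> chooseWallsToRemove(
    int numCells, std::vector<std::pair<int,int>> walls, DisjSets & ds )
{
    std::vector<std::pair<int,int>> removed;
    std::mt19937 rng{ std::random_device{}() };
    std::shuffle( walls.begin(), walls.end(), rng );   // random wall order

    for( const auto & w : walls )
    {
        if( (int) removed.size() == numCells - 1 )
            break;                                     // everything is connected
        int root1 = ds.find( w.first );
        int root2 = ds.find( w.second );
        if( root1 != root2 )                           // cells not yet connected
        {
            ds.unionSets( root1, root2 );              // merge the two regions
            removed.push_back( w );                    // knock this wall down
        }
    }
    return removed;
}

Exactly N − 1 walls are removed, two finds are performed per examined wall, and walls whose cells are already in the same set are left standing, matching the accounting in the running-time discussion above.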
Figure 8.28 Wall between squares 18 and 13 is randomly selected in Figure 8.27; this wall is knocked down, because 18 and 13 are not already connected; their sets are merged. The sets are now {0, 1} {2} {3} {4, 6, 7, 8, 9, 13, 14, 16, 17, 18, 22} {5} {10, 11, 15} {12} {19} {20} {21} {23} {24}

Figure 8.29 Eventually, 24 walls are knocked down; all elements are in the same set {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24}

Summary

We have seen a very simple data structure to maintain disjoint sets. When the union operation is performed, it does not matter, as far as correctness is concerned, which set retains its name. A valuable lesson that should be learned here is that it can be very important to consider the alternatives when a particular step is not totally specified. The union step is flexible; by taking advantage of this, we are able to get a much more efficient algorithm.

Path compression is one of the earliest forms of self-adjustment, which we have seen elsewhere (splay trees, skew heaps). Its use is extremely interesting, especially from a theoretical point of view, because it was one of the first examples of a simple algorithm with a not-so-simple worst-case analysis.
Exercises

8.1 Show the result of the following sequence of instructions: union(1,2), union(3,4), union(3,5), union(1,7), union(3,6), union(8,9), union(1,8), union(3,10), union(3,11), union(3,12), union(3,13), union(14,15), union(16,0), union(14,16), union(1,3), union(1,14) when the unions are
     a. performed arbitrarily
     b. performed by height
     c. performed by size
8.2 For each of the trees in the previous exercise, perform a find with path compression on the deepest node.
8.3 Write a program to determine the effects of path compression and the various unioning strategies. Your program should process a long sequence of equivalence operations using all six of the possible strategies.
8.4 Show that if unions are performed by height, then the depth of any tree is O(log N).
8.5 Suppose f(N) is a nicely defined function that reduces N to a smaller integer. What is the solution to the recurrence T(N) = (N/f(N)) T(f(N)) + N with appropriate initial conditions?
8.6 a. Show that if M = N², then the running time of M union/find operations is O(M).
     b. Show that if M = N log N, then the running time of M union/find operations is O(M).
     c. Suppose M = Θ(N log log N). What is the running time of M union/find operations?
     d. Suppose M = Θ(N log∗ N). What is the running time of M union/find operations?
8.7 Tarjan's original bound for the union/find algorithm defined α(M, N) = min{ i ≥ 1 | A(i, ⌊M/N⌋) > log N }, where
     A(1, j) = 2^j                     for j ≥ 1
     A(i, 1) = A(i − 1, 2)             for i ≥ 2
     A(i, j) = A(i − 1, A(i, j − 1))   for i, j ≥ 2
     Here, A(m, n) is one version of the Ackermann function. Are the two definitions of α asymptotically equivalent?
8.8 Prove that for the mazes generated by the algorithm in Section 8.7, the path from the starting to ending points is unique.
8.9 Design an algorithm that generates a maze that contains no path from start to finish but has the property that the removal of a prespecified wall creates a unique path.
8.10 Suppose we want to add an extra operation, deunion, which undoes the last union operation that has not been already undone.
     a. Show that if we do union-by-height and finds without path compression, then deunion is easy, and a sequence of M union, find, and deunion operations takes O(M log N) time.
     b. Why does path compression make deunion hard?
     c. Show how to implement all three operations so that the sequence of M operations takes O(M log N / log log N) time.
8.11 Suppose we want to add an extra operation, remove(x), which removes x from its current set and places it in its own. Show how to modify the union/find algorithm so that the running time of a sequence of M union, find, and remove operations is O(Mα(M, N)).
8.12 Show that if all of the unions precede the finds, then the disjoint sets algorithm with path compression requires linear time, even if the unions are done arbitrarily.
8.13 Prove that if unions are done arbitrarily, but path compression is performed on the finds, then the worst-case running time is Θ(M log N).
8.14 Prove that if unions are done by size and path compression is performed, the worst-case running time is O(Mα(M, N)).
8.15 The disjoint set analysis in Section 8.6 can be refined to provide tight bounds for small N.
     a. Show that C(M, N, 0) and C(M, N, 1) are both 0.
     b. Show that C(M, N, 2) is at most M.
     c. Let r ≤ 8. Choose s = 2 and show that C(M, N, r) is at most M + N.
8.16 Suppose we implement partial path compression on find(i) by making every other node on the path from i to the root link to its grandparent (where this makes sense). This is known as path halving.
     a. Write a procedure to do this.
     b. Prove that if path halving is performed on the finds and either union-by-height or union-by-size is used, the worst-case running time is O(Mα(M, N)).
8.17 Write a program that generates mazes of arbitrary size. If you are using a system with a windowing package, generate a maze similar to that in Figure 8.25. Otherwise describe a textual representation of the maze (for instance, each line of output represents a square and has information about which walls are present) and have your program generate a representation.

References

Various solutions to the union/find problem can be found in [6], [9], and [11]. Hopcroft and Ullman showed an O(M log∗ N) bound using a nonrecursive decomposition. Tarjan [16] obtained the bound O(Mα(M, N)), where α(M, N) is as defined in Exercise 8.7. A more precise (but asymptotically identical) bound for M < N appears in [2] and [19]. The analysis in Section 8.6 is due to Seidel and Sharir [15]. Various other strategies for path compression and unions also achieve the same bound; see [19] for details.

A lower bound showing that under certain restrictions Ω(Mα(M, N)) time is required to process M union/find operations was given by Tarjan [17]. Identical bounds under less restrictive conditions have been shown in [7] and [14].

Applications of the union/find data structure appear in [1] and [10]. Certain special cases of the union/find problem can be solved in O(M) time [8]. This reduces the running time of several algorithms, such as [1], graph dominance, and reducibility (see references
in Chapter 9) by a factor of α(M, N). Others, such as [10] and the graph connectivity problem in this chapter, are unaffected. The paper lists 10 examples. Tarjan has used path compression to obtain efficient algorithms for several graph problems [18].

Average-case results for the union/find problem appear in [5], [12], [22], and [3]. Results bounding the running time of any single operation (as opposed to the entire sequence) appear in [4] and [13]. Exercise 8.10 is solved in [21]. A general union/find structure, supporting more operations, is given in [20].

1. A. V. Aho, J. E. Hopcroft, and J. D. Ullman, “On Finding Lowest Common Ancestors in Trees,” SIAM Journal on Computing, 5 (1976), 115–132.
2. L. Banachowski, “A Complement to Tarjan’s Result about the Lower Bound on the Complexity of the Set Union Problem,” Information Processing Letters, 11 (1980), 59–65.
3. B. Bollobás and I. Simon, “Probabilistic Analysis of Disjoint Set Union Algorithms,” SIAM Journal on Computing, 22 (1993), 1053–1086.
4. N. Blum, “On the Single-Operation Worst-Case Time Complexity of the Disjoint Set Union Problem,” SIAM Journal on Computing, 15 (1986), 1021–1024.
5. J. Doyle and R. L. Rivest, “Linear Expected Time of a Simple Union Find Algorithm,” Information Processing Letters, 5 (1976), 146–148.
6. M. J. Fischer, “Efficiency of Equivalence Algorithms,” in Complexity of Computer Computation (eds. R. E. Miller and J. W. Thatcher), Plenum Press, New York, 1972, 153–168.
7. M. L. Fredman and M. E. Saks, “The Cell Probe Complexity of Dynamic Data Structures,” Proceedings of the Twenty-first Annual Symposium on Theory of Computing (1989), 345–354.
8. H. N. Gabow and R. E. Tarjan, “A Linear-Time Algorithm for a Special Case of Disjoint Set Union,” Journal of Computer and System Sciences, 30 (1985), 209–221.
9. B. A. Galler and M. J. Fischer, “An Improved Equivalence Algorithm,” Communications of the ACM, 7 (1964), 301–303.
10. J. E. Hopcroft and R. M. Karp, “An Algorithm for Testing the Equivalence of Finite Automata,” Technical Report TR-71-114, Department of Computer Science, Cornell University, Ithaca, N.Y., 1971.
11. J. E. Hopcroft and J. D. Ullman, “Set Merging Algorithms,” SIAM Journal on Computing, 2 (1973), 294–303.
12. D. E. Knuth and A. Schonhage, “The Expected Linearity of a Simple Equivalence Algorithm,” Theoretical Computer Science, 6 (1978), 281–315.
13. J. A. LaPoutre, “New Techniques for the Union-Find Problem,” Proceedings of the First Annual ACM–SIAM Symposium on Discrete Algorithms (1990), 54–63.
14. J. A. LaPoutre, “Lower Bounds for the Union-Find and the Split-Find Problem on Pointer Machines,” Proceedings of the Twenty-second Annual ACM Symposium on Theory of Computing (1990), 34–44.
15. R. Seidel and M. Sharir, “Top-Down Analysis of Path Compression,” SIAM Journal on Computing, 34 (2005), 515–525.
16. R. E. Tarjan, “Efficiency of a Good but Not Linear Set Union Algorithm,” Journal of the ACM, 22 (1975), 215–225.
17. R. E. Tarjan, “A Class of Algorithms Which Require Nonlinear Time to Maintain Disjoint Sets,” Journal of Computer and System Sciences, 18 (1979), 110–127.
18. R. E. Tarjan, “Applications of Path Compression on Balanced Trees,” Journal of the ACM, 26 (1979), 690–715.
19. R. E. Tarjan and J. van Leeuwen, “Worst-Case Analysis of Set Union Algorithms,” Journal of the ACM, 31 (1984), 245–281.
20. M. J. van Kreveld and M. H. Overmars, “Union-Copy Structures and Dynamic Segment Trees,” Journal of the ACM, 40 (1993), 635–652.
21. J. Westbrook and R. E. Tarjan, “Amortized Analysis of Algorithms for Set Union with Backtracking,” SIAM Journal on Computing, 18 (1989), 1–11.
22. A. C. Yao, “On the Average Behavior of Set Merging Algorithms,” Proceedings of Eighth Annual ACM Symposium on the Theory of Computation (1976), 192–195.
CHAPTER 9
Graph Algorithms

In this chapter, we discuss several common problems in graph theory. Not only are these algorithms useful in practice, they are also interesting because in many real-life applications they are too slow unless careful attention is paid to the choice of data structures. We will . . .
• Show several real-life problems, which can be converted to problems on graphs.
• Give algorithms to solve several common graph problems.
• Show how the proper choice of data structures can drastically reduce the running time of these algorithms.
• See an important technique, known as depth-first search, and show how it can be used to solve several seemingly nontrivial problems in linear time.

9.1 Definitions

A graph G = (V, E) consists of a set of vertices, V, and a set of edges, E. Each edge is a pair (v, w), where v, w ∈ V. Edges are sometimes referred to as arcs. If the pair is ordered, then the graph is directed. Directed graphs are sometimes referred to as digraphs. Vertex w is adjacent to v if and only if (v, w) ∈ E. In an undirected graph with edge (v, w), and hence (w, v), w is adjacent to v and v is adjacent to w. Sometimes an edge has a third component, known as either a weight or a cost.

A path in a graph is a sequence of vertices w1, w2, w3, . . . , wN such that (wi, wi+1) ∈ E for 1 ≤ i < N. The length of such a path is the number of edges on the path, which is equal to N − 1. We allow a path from a vertex to itself; if this path contains no edges, then the path length is 0. This is a convenient way to define an otherwise special case. If the graph contains an edge (v, v) from a vertex to itself, then the path v, v is sometimes referred to as a loop. The graphs we will consider will generally be loopless. A simple path is a path such that all vertices are distinct, except that the first and last could be the same.

A cycle in a directed graph is a path of length at least 1 such that w1 = wN; this cycle is simple if the path is simple. For undirected graphs, we require that the edges be distinct. The logic of these requirements is that the path u, v, u in an undirected graph should not be considered a cycle, because (u, v) and (v, u) are the same edge. In a directed graph, these are different edges, so it makes sense to call this a cycle. A directed graph is acyclic if it has no cycles. A directed acyclic graph is sometimes referred to by its abbreviation, DAG.
An undirected graph is connected if there is a path from every vertex to every other vertex. A directed graph with this property is called strongly connected. If a directed graph is not strongly connected, but the underlying graph (without direction to the arcs) is connected, then the graph is said to be weakly connected. A complete graph is a graph in which there is an edge between every pair of vertices.

An example of a real-life situation that can be modeled by a graph is the airport system. Each airport is a vertex, and two vertices are connected by an edge if there is a nonstop flight from the airports that are represented by the vertices. The edge could have a weight, representing the time, distance, or cost of the flight. It is reasonable to assume that such a graph is directed, since it might take longer or cost more (depending on local taxes, for example) to fly in different directions. We would probably like to make sure that the airport system is strongly connected, so that it is always possible to fly from any airport to any other airport. We might also like to quickly determine the best flight between any two airports. “Best” could mean the path with the fewest number of edges or could be taken with respect to one, or all, of the weight measures.

Traffic flow can be modeled by a graph. Each street intersection represents a vertex, and each street is an edge. The edge costs could represent, among other things, a speed limit or a capacity (number of lanes). We could then ask for the shortest route or use this information to find the most likely location for bottlenecks.

In the remainder of this chapter, we will see several more applications of graphs. Many of these graphs can be quite large, so it is important that the algorithms we use be efficient.

9.1.1 Representation of Graphs

We will consider directed graphs (undirected graphs are similarly represented). Suppose, for now, that we can number the vertices, starting at 1. The graph shown in Figure 9.1 represents 7 vertices and 12 edges.

Figure 9.1 A directed graph
One simple way to represent a graph is to use a two-dimensional array. This is known as an adjacency matrix representation. For each edge (u, v), we set A[u][v] to true; otherwise the entry in the array is false. If the edge has a weight associated with it, then we can set A[u][v] equal to the weight and use either a very large or a very small weight as a sentinel to indicate nonexistent edges. For instance, if we were looking for the cheapest airplane route, we could represent nonexistent flights with a cost of ∞. If we were looking, for some strange reason, for the most expensive airplane route, we could use −∞ (or perhaps 0) to represent nonexistent edges.

Although this has the merit of extreme simplicity, the space requirement is Θ(|V|²), which can be prohibitive if the graph does not have very many edges. An adjacency matrix is an appropriate representation if the graph is dense: |E| = Θ(|V|²). In most of the applications that we shall see, this is not true. For instance, suppose the graph represents a street map. Assume a Manhattan-like orientation, where almost all the streets run either north–south or east–west. Therefore, any intersection is attached to roughly four streets, so if the graph is directed and all streets are two-way, then |E| ≈ 4|V|. If there are 3,000 intersections, then we have a 3,000-vertex graph with 12,000 edge entries, which would require an array of size 9,000,000. Most of these entries would contain zero. This is intuitively bad, because we want our data structures to represent the data that are actually there and not the data that are not present.

If the graph is not dense, in other words, if the graph is sparse, a better solution is an adjacency list representation. For each vertex, we keep a list of all adjacent vertices. The space requirement is then O(|E| + |V|), which is linear in the size of the graph.¹ The abstract representation should be clear from Figure 9.2. If the edges have weights, then this additional information is also stored in the adjacency lists. Adjacency lists are the standard way to represent graphs. Undirected graphs can be similarly represented; each edge (u, v) appears in two lists, so the space usage essentially doubles. A common requirement in graph algorithms is to find all vertices adjacent to some given vertex v, and this can be done, in time proportional to the number of such vertices found, by a simple scan down the appropriate adjacency list.

There are several alternatives for maintaining the adjacency lists. First, observe that the lists themselves can be maintained in either vectors or lists. However, for sparse graphs, when using vectors, the programmer may need to initialize each vector with a smaller capacity than the default; otherwise, there could be significant wasted space.

Because it is important to be able to quickly obtain the list of adjacent vertices for any vertex, the two basic options are to use a map in which the keys are vertices and the values are adjacency lists, or to maintain each adjacency list as a data member of a Vertex class. The first option is arguably simpler, but the second option can be faster, because it avoids repeated lookups in the map.
In the second scenario, if the vertex is a string (for instance, an airport name, or the name of a street intersection), then a map can be used in which the key is the vertex name and the value is a Vertex (typically a pointer to a Vertex), and each Vertex object keeps a list of (pointers to the) adjacent vertices and perhaps also the original string name.

¹ When we speak of linear-time graph algorithms, O(|E| + |V|) is the running time we require.
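To make the second scenario concrete, the following is a hedged sketch of the kind of representation the paragraph above describes. It is our illustration, not the book's Figure 9.2; the names Vertex, Graph, addEdge, and getVertex are ours.

#include <map>
#include <memory>
#include <string>
#include <vector>

// Sketch of an adjacency-list representation keyed by vertex name.
struct Vertex
{
    std::string name;               // original string name (e.g., an airport code)
    std::vector<Vertex*> adj;       // pointers to adjacent vertices

    explicit Vertex( std::string n ) : name{ std::move( n ) } { }
};

class Graph
{
  public:
    // Add a directed edge from u to v, creating the vertices if needed.
    void addEdge( const std::string & u, const std::string & v )
    {
        getVertex( u )->adj.push_back( getVertex( v ) );
    }

  private:
    // Look up a vertex by name, creating it on first use.
    Vertex * getVertex( const std::string & name )
    {
        auto & p = vertexMap[ name ];
        if( p == nullptr )
            p = std::make_unique<Vertex>( name );
        return p.get();
    }

    std::map<std::string, std::unique_ptr<Vertex>> vertexMap;
};

With this layout, visiting all neighbors of a vertex v is just a loop over v->adj, which is the linear-time adjacency-list scan described above; the map is consulted only when a vertex is first named.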