Sorting

Site: Saylor Academy
Course: CS102: Introduction to Computer Science II
Book: Sorting
Printed by: Guest user
Date: Friday, April 19, 2024, 6:12 PM

Description

This page explains and implements selection sort, bubble sort, merge sort, quick sort, insertion sort, and shell sort.

1. Objectives

To be able to explain and implement selection sort, bubble sort, merge sort, quick sort, insertion sort, and shell sort.


Source: Runestone Academy, https://runestone.academy/runestone/books/published/cppds/Sort/toctree.html
Creative Commons License This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License.


2. Sorting

Sorting is the process of placing elements from a collection in some kind of order. For example, a list of words could be sorted alphabetically or by length. A list of cities could be sorted by population, by area, or by zip code. We have already seen a number of algorithms that were able to benefit from having a sorted list (recall the final anagram example and the binary search).

There are many, many sorting algorithms that have been developed and analyzed. This suggests that sorting is an important area of study in computer science. Sorting a large number of items can take a substantial amount of computing resources. Like searching, the efficiency of a sorting algorithm is related to the number of items being processed. For small collections, a complex sorting method may be more trouble than it’s worth. The overhead may be too high. On the other hand, for larger collections, we want to take advantage of as many improvements as possible. In this section we will discuss several sorting techniques and compare them with respect to their running time.

Before getting into specific algorithms, we should think about the operations that can be used to analyze a sorting process. First, it will be necessary to compare two values to see which is smaller (or larger). In order to sort a collection, it will be necessary to have some systematic way to compare values to see if they are out of order. The total number of comparisons will be the most common way to measure a sort procedure. Second, when values are not in the correct position with respect to one another, it may be necessary to exchange them. This exchange is a costly operation and the total number of exchanges will also be important for evaluating the overall efficiency of the algorithm.

3. The Bubble Sort

The bubble sort makes multiple passes through an array. It compares adjacent items and exchanges those that are out of order. Each pass through the array places the next largest value in its proper place. In essence, each item "bubbles" up to the location where it belongs.

Figure 1 shows the first pass of a bubble sort. The shaded items are being compared to see if they are out of order. If there are n items in the array, then there are n−1 pairs of items that need to be compared on the first pass. It is important to note that once the largest value in the array is part of a pair, it will continually be moved along until the pass is complete.


Figure 1: bubbleSort: The First Pass

At the start of the second pass, the largest value is now in place. There are n−1 items left to sort, meaning that there will be n−2 pairs. Since each pass places the next largest value in place, the total number of passes necessary will be n−1. After completing the n−1 passes, the smallest item must be in the correct position with no further processing required. ActiveCode 1 shows the complete bubbleSort function. It takes the array as a parameter and modifies it by swapping items as necessary.

Typically, swapping two elements in an array requires a temporary storage location (an additional memory location). A code fragment such as

temp = alist[i];
alist[i] = alist[j];
alist[j] = temp;

will exchange the ith and jth items in the array. Without the temporary storage, one of the values would be overwritten.

Lines 5-7 in ActiveCode 1 perform the exchange of the ith and (i+1)st items using the three-step procedure described earlier. Note that in Python we could also have used simultaneous assignment to swap the items; in C++, std::swap performs the same exchange.


Figure 2: Exchanging Two Values in Python


4. The Selection Sort

The selection sort improves on the bubble sort by making only one exchange for every pass through the first part of the vector. We will call this a step. In order to do this, a selection sort looks for the largest value as it makes a partial pass and, after completing the partial pass, places it in the proper location, ending the step. As with a bubble sort, after the first step, the largest item is in the correct place. After the second step, the next largest is in place. This process continues and requires n−1 steps to sort n items, since the final item must be in place after the (n−1)st step.

On each step, the largest remaining item is selected and then placed in its proper location. The first step places 93, the second step places 77, the third places 55, and so on. The function is shown in ActiveCode 1.

This visualization allows you to step through the algorithm. Yellow bars represent the current element, red represents the element being looked at, and blue represents the last element to look at during a step.

The following visualization shows selection sort in action. Each pass compares the bars in sequential order. The smallest bar is selected on each pass and is set as the minimum, represented in orange. Every remaining bar is then compared to the minimum. If the bar is larger, it is represented in blue; if it is smaller, it becomes the new orange bar. After each pass, a counter increments to mark the bar at which the next pass will start. This increment is represented by a thin black line falling before that bar.

 

 

You may see that the selection sort makes the same number of comparisons as the bubble sort and is therefore also O(n²). However, due to the reduction in the number of exchanges, the selection sort typically executes faster in benchmark studies.


5. The Insertion Sort

The insertion sort, although still O(n²), works in a slightly different way. It always maintains a sorted subvector in the lower positions of the vector. Each new item is then "inserted" back into the previous subvector such that the sorted subvector is one item larger. Figure 4 shows the insertion sorting process. The shaded items represent the ordered subvectors as the algorithm makes each pass.


Figure 4: insertionSort

We begin by assuming that a vector with one item (position 0) is already sorted. On each pass, one for each item 1 through n−1, the current item is checked against those in the already sorted subvector. As we look back into the already sorted subvector, we shift those items that are greater to the right. When we reach a smaller item or the end of the subvector, the current item can be inserted.

Figure 5 shows the fifth pass in detail. At this point in the algorithm, a sorted subvector of five items consisting of 17, 26, 54, 77, and 93 exists. We want to insert 31 back into the already sorted items. The first comparison against 93 causes 93 to be shifted to the right. 77 and 54 are also shifted. When the item 26 is encountered, the shifting process stops and 31 is placed in the open position. Now we have a sorted subvector of six items.


Figure 5: insertionSort: Fifth Pass of the Sort

The implementation of insertionSort (ActiveCode 1) shows that there are again n−1 passes to sort n items. The iteration starts at position 1 and moves through position n−1, as these are the items that need to be inserted back into the sorted subvectors. Line 8 performs the shift operation that moves a value up one position in the vector, making room behind it for the insertion. Remember that this is not a complete exchange as was performed in the previous algorithms.

The maximum number of comparisons for an insertion sort is the sum of the first n−1 integers. Again, this is O(n²). However, in the best case, only one comparison needs to be done on each pass. This would be the case for an already sorted vector.

One note about shifting versus exchanging is also important. In general, a shift operation requires approximately a third of the processing work of an exchange since only one assignment is performed. In benchmark studies, insertion sort will show very good performance.

The visualization above allows you to step through the algorithm. Red bars represent the element being looked at and blue represents the last element to look at during a pass.

The visualization below allows you to examine the individual steps of the algorithm at a slower pace. Bars that are orange indicate that it is being compared to items to the left. Bars that are blue indicate that it is one of the items currently being compared against the orange bar.

 
 

6. The Merge Sort

We now turn our attention to using a divide and conquer strategy as a way to improve the performance of sorting algorithms. The first algorithm we will study is the merge sort. Merge sort is a recursive algorithm that continually splits a vector in half. If the vector is empty or has one item, it is sorted by definition (the base case). If the vector has more than one item, we split the vector and recursively invoke a merge sort on both halves. Once the two halves are sorted, the fundamental operation, called a merge, is performed. Merging is the process of taking two smaller sorted vectors and combining them together into a single, sorted, new vector. Figure 10 shows our familiar example vector as it is being split by mergeSort. Figure 11 shows the simple vectors, now sorted, as they are merged back together.


Figure 10: Splitting the vector in a Merge Sort


Figure 11: vectors as They Are Merged Together

The mergeSort function shown in ActiveCode 1 begins by asking the base case question. If the length of the vector is less than or equal to one, then we already have a sorted vector and no more processing is necessary. If, on the other hand, the length is greater than one, then we use the vector range initializer with .begin() to extract the left and right halves. This is similar to using the Python slice operation to extract the left and right halves. It is important to note that the vector may not have an even number of items. That does not matter, as the lengths will differ by at most one.


Once the mergeSort function is invoked on the left half and the right half (lines 8–9), it is assumed they are sorted. The rest of the function (lines 11–31) is responsible for merging the two smaller sorted vectors into a larger sorted vector. Notice that the merge operation places the items back into the original vector (avector) one at a time by repeatedly taking the smallest item from the sorted vectors.

The mergeSort function has been augmented with a print statement (line 2) to show the contents of the vector being sorted at the start of each invocation. There is also a print statement (line 32) to show the merging process. The transcript shows the result of executing the function on our example vector. Note that the vector with 44, 55, and 20 will not divide evenly. The first split gives [44] and the second gives [55,20]. It is easy to see how the splitting process eventually yields a vector that can be immediately merged with other sorted vectors.


The visualization above allows you to step through the algorithm. Red bars represent the element being looked at and blue represents the last element to look at during a pass.


 

 

The visualization above highlights the individual components of the algorithm itself. The arrows on the bottom indicate the left, middle, and right portions that the algorithm is currently examining. Left and right components are indicated by the color brown, while the middle is indicated by orange. Look for the "divide-and-conquer" aspect of the algorithm here.

In order to analyze the mergeSort function, we need to consider the two distinct processes that make up its implementation. First, the vector is split into halves. We already computed (in a binary search) that we can divide a vector in half log n times, where n is the length of the vector. The second process is the merge. Each item in the vector will eventually be processed and placed on the sorted vector. So the merge operation which results in a vector of size n requires n operations. The result of this analysis is log n splits, each of which costs n, for a total of n log n operations. A merge sort is an O(n log n) algorithm.

Recall that the slicing operator is O(k) where k is the size of the slice. In order to guarantee that mergeSort will be O(n log n), we will need to remove the slice operator. Again, this is possible if we simply pass the starting and ending indices along with the vector when we make the recursive call. We leave this as an exercise.

It is important to notice that the mergeSort function requires extra space to hold the two halves as they are extracted with the slicing operations. This additional space can be a critical factor if the vector is large and can make this sort problematic when working on large data sets. When using lists in Python, the additional space required for the merging process is O(n).

7. The Shell Sort

The shell sort, sometimes called the "diminishing increment sort," improves on the insertion sort by breaking the original vector into a number of smaller subvectors, each of which is sorted using an insertion sort. The unique way that these subvectors are chosen is the key to the shell sort. Instead of breaking the vector into subvectors of contiguous items, the shell sort uses an increment i, sometimes called the gap, to create a subvector by choosing all items that are i items apart.

This can be seen in Figure 6. This vector has nine items. If we use an increment of three, there are three subvectors, each of which can be sorted by an insertion sort. After completing these sorts, we get the vector shown in Figure 7. Although this vector is not completely sorted, something very interesting has happened. By sorting the subvectors, we have moved the items closer to where they actually belong.


Figure 6: A Shell Sort with Increments of Three


Figure 7: A Shell Sort after Sorting Each subvector

Figure 8 shows a final insertion sort using an increment of one; in other words, a standard insertion sort. Note that by performing the earlier subvector sorts, we have now reduced the total number of shifting operations necessary to put the vector in its final order. For this case, we need only four more shifts to complete the process.


Figure 8: ShellSort: A Final Insertion Sort with Increment of 1


Figure 9: Initial Subvectors for a Shell Sort

We said earlier that the way in which the increments are chosen is the unique feature of the shell sort. The function shown in ActiveCode 1 uses a different set of increments. In this case, we begin with n/2 subvectors. On the next pass, n/4 subvectors are sorted. Eventually, a single vector is sorted with the basic insertion sort. Figure 9 shows the first subvectors for our example using this increment.


The following visualization shows the "gap" attribute in the form of brown, vertical bars. There are marker arrows that portray the "current values" being compared during the sort. It finishes by performing a full insertion sort on the full set of bars.

At first glance you may think that a shell sort cannot be better than an insertion sort, since it does a complete insertion sort as the last step. It turns out, however, that this final insertion sort does not need to do very many comparisons (or shifts) since the list has been pre-sorted by earlier incremental insertion sorts, as described above. In other words, each pass produces a list that is "more sorted" than the previous one. This makes the final pass very efficient.

Although a general analysis of the shell sort is well beyond the scope of this text, we can say that it tends to fall somewhere between O(n) and O(n²), based on the behavior described above. For the increments shown in Listing 5, the performance is O(n²). By changing the increment, for example using 2^k − 1 (1, 3, 7, 15, 31, and so on), a shell sort can perform at O(n^(3/2)).

8. The Quick Sort

The quick sort uses divide and conquer to gain the same advantages as the merge sort, while not using additional storage. As a trade-off, however, it is possible that the list may not be divided in half. When this happens, we will see that performance is diminished.

A quick sort first selects a value, which is called the pivot value. Although there are many different ways to choose the pivot value, we will simply use the first item in the list. The role of the pivot value is to assist with splitting the list. The actual position where the pivot value belongs in the final sorted list, commonly called the split point, will be used to divide the list for subsequent calls to the quick sort.

Figure 12 shows that 54 will serve as our first pivot value. Since we have looked at this example a few times already, we know that 54 will eventually end up in the position currently holding 31. The partition process will happen next. It will find the split point and at the same time move other items to the appropriate side of the list, either less than or greater than the pivot value.


Figure 12: The First Pivot Value for a Quick Sort

Partitioning begins by locating two position markers, which we will call leftmark and rightmark, at the beginning and end of the remaining items in the list (positions 1 and 8 in Figure 13). The goal of the partition process is to move items that are on the wrong side with respect to the pivot value while also converging on the split point. Figure 13 shows this process as we locate the position of 54.


Figure 13: Finding the Split Point for 54

We begin by incrementing leftmark until we locate a value that is greater than the pivot value. We then decrement rightmark until we find a value that is less than the pivot value. At this point we have discovered two items that are out of place with respect to the eventual split point. For our example, this occurs at 93 and 20. Now we can exchange these two items and then repeat the process again.

At the point where rightmark becomes less than leftmark, we stop. The position of rightmark is now the split point. The pivot value can be exchanged with the contents of the split point and the pivot value is now in place (Figure 14). In addition, all the items to the left of the split point are less than the pivot value, and all the items to the right of the split point are greater than the pivot value. The list can now be divided at the split point and the quick sort can be invoked recursively on the two halves.


Figure 14: Completing the Partition Process to Find the Split Point for 54

The quickSort function shown in ActiveCode 1 invokes a recursive function, quickSortHelper. quickSortHelper begins with the same base case as the merge sort. If the length of the list is less than or equal to one, it is already sorted. If it is greater, then it can be partitioned and recursively sorted. The partition function implements the process described earlier.

 

 

The visualization above shows how quick sort works in action. Our pivot is represented by the arrow on screen. If an object is bigger than the pivot, it will turn blue and stay where it is. If it is smaller it will turn red and swap to the left side of the pivot. Once an object is sorted, it will turn yellow.

To analyze the quickSort function, note that for a list of length n, if the partition always occurs in the middle of the list, there will again be log n divisions. In order to find the split point, each of the n items needs to be checked against the pivot value. The result is n log n. In addition, there is no need for additional memory as in the merge sort process.

Unfortunately, in the worst case, the split points may not be in the middle and can be very skewed to the left or the right, leaving a very uneven division. In this case, sorting a list of n items divides into sorting a list of 0 items and a list of n−1 items. Then sorting a list of n−1 divides into a list of size 0 and a list of size n−2, and so on. The result is an O(n²) sort with all of the overhead that recursion requires.

We mentioned earlier that there are different ways to choose the pivot value. In particular, we can attempt to alleviate some of the potential for an uneven division by using a technique called median of three. To choose the pivot value, we will consider the first, the middle, and the last element in the list. In our example, those are 54, 77, and 20. Now pick the median value, in our case 54, and use it for the pivot value (of course, that was the pivot value we used originally). The idea is that in the case where the first item in the list does not belong toward the middle of the list, the median of three will choose a better "middle" value. This will be particularly useful when the original list is somewhat sorted to begin with. We leave the implementation of this pivot value selection as an exercise.

9. Summary

  • A sequential search is O(n) for ordered and unordered lists.

  • A binary search of an ordered list is O(log n) in the worst case.

  • Hash tables can provide constant time searching.

  • A bubble sort, a selection sort, and an insertion sort are O(n²) algorithms.

  • A shell sort improves on the insertion sort by sorting incremental sublists. It falls between O(n) and O(n²).

  • A merge sort is O(n log n), but requires additional space for the merging process.

  • A quick sort is O(n log n), but may degrade to O(n²) if the split points are not near the middle of the list. It does not require additional space.

10. Discussion Questions

  1. Generate a random list of integers. Show how this list is sorted by the following algorithms:

    • bubble sort

    • selection sort

    • insertion sort

    • shell sort (you decide on the increments)

    • merge sort

    • quick sort (you decide on the pivot value)

  2. Consider the following list of integers: [1,2,3,4,5,6,7,8,9,10]. Show how this list is sorted by the following algorithms:

    • bubble sort

    • selection sort

    • insertion sort

    • shell sort (you decide on the increments)

    • merge sort

    • quick sort (you decide on the pivot value)

  3. Consider the following list of integers: [10,9,8,7,6,5,4,3,2,1]. Show how this list is sorted by the following algorithms:

    • bubble sort

    • selection sort

    • insertion sort

    • shell sort (you decide on the increments)

    • merge sort

    • quick sort (you decide on the pivot value)

  4. Consider the list of characters: ['P','Y','T','H','O','N']. Show how this list is sorted using the following algorithms:

    • bubble sort

    • selection sort

    • insertion sort

    • shell sort (you decide on the increments)

    • merge sort

    • quick sort (you decide on the pivot value)

  5. Devise alternative strategies for choosing the pivot value in quick sort. For example, pick the middle item. Re-implement the algorithm and then execute it on random data sets. Under what criteria does your new strategy perform better or worse than the strategy from this chapter?


11. Programming Exercises

  1. Using a random number generator, create a list of 500 integers. Perform a benchmark analysis using some of the sorting algorithms from this chapter. What is the difference in execution speed?

  2. Implement the bubble sort using simultaneous assignment.

  3. A bubble sort can be modified to "bubble" in both directions. The first pass moves "up" the list, and the second pass moves "down". This alternating pattern continues until no more passes are necessary. Implement this variation and describe under what circumstances it might be appropriate.

  4. Implement the selection sort using simultaneous assignment.

  5. Perform a benchmark analysis for a shell sort, using different increment sets on the same vector.

  6. One way to improve the quick sort is to use an insertion sort on lists that have a small length (call it the "partition limit"). Why does this make sense? Re-implement the quick sort and use it to sort a random list of integers. Perform an analysis using different list sizes for the partition limit.

  7. Implement the median-of-three method for selecting a pivot value as a modification to quickSort. Run an experiment to compare the two techniques.


12. Glossary

bubble sort

sorting method that makes multiple passes through a collection, comparing adjacent items, and swaps items that are out of order

gap

an increment used to divide a collection into subsets without breaking apart the collection during a shell sort

insertion sort

sorting method that maintains a sorted and unsorted subset of a collection and inserts elements from the unsorted subset into the sorted subset

median of three

method of choosing the pivot value for a quick sort by taking the median of the first, middle, and last element of a collection

merge

part of merge sort that takes two smaller sorted subsets and combines them

merge sort

sorting method that uses recursion to split a collection in half until there is one item and then combines the smaller subsets back into larger sorted subsets

partition

process of quick sort that finds the split point and moves items to the appropriate side of the collection, either less than or greater than the pivot value

pivot value

value selected in a collection during quick sort in order to split a collection

selection sort

sorting method that makes multiple passes through a collection, selecting the largest (ascending) or smallest (descending) unsorted element on each pass and placing it in its correct position by exchanging it with the element currently there

shell sort

sorting method that divides the collection into subsets, sorts the subsets individually using insertion sort, then also sorts the combination of the sorted subsets using insertion sort

short bubble

a modified bubble sort that stops if there are no exchanges to do

sorting

the process of placing elements from a collection in some kind of order

split point

the position of the pivot value in the sorted collection; used to divide the collection for subsequent calls to quick sort

quick sort

sorting method that uses recursion to split a collection in half (without using extra space) and places elements on the proper side of the split point