Topic outline
-
In this unit, students will learn to apply searching and sorting algorithms to arrays. Students will also learn how to conduct worst-case and best-case analysis for these algorithms, determine their efficiency, and assign them generalized Big O notations.
Completing this unit should take approximately 7 hours.
-
-
This lecture introduces search algorithms. It also reviews what we have learned about time-complexity. These lectures use Python for illustration, but we will present code in C++. The point of the lectures is to understand the background of the various approaches; don't focus on Python specifically. This contrasts with other teaching approaches that start with the syntax of a particular language.
We begin with this particular lecture, since it introduces concepts that are important to understanding search and sort, graphs and hashing, and the scalability of various approaches. Though we will look at some other lectures in this series, the entire lecture series covers more than this course's purview, so we will not directly engage with all of them. However, you may wish to check out the course page for the original course at MIT, if you'd like more information on what is covered here.
"Complexity" refers to ALL resources consumed by an algorithm, although most people speak only of runtime (processing time, time to run successfully to completion). Refer to our earlier discussion on time/memory tradeoffs. You must consider the overall cost of running an algorithm. For instance, memory is expensive in cloud computing, so you have to be careful about saving time by using algorithms that are memory-intensive. You also have to consider other resources such as network utilization as data is transferred from one place to another. As you work, continue to think in this manner.
Searching, sorting, graphs, and hashing, using arrays or other data structures, are typically discussed in the context of recursion. While these are "academically elegant" and arguably easier to explain, recursion is memory inefficient. Generally, it assumes that large amounts of memory are available. However, that is rarely the case in industrial applications, especially with embedded systems. Additionally, the memory cost for cloud-based systems is high. Constant expansion of memory use with recursive approaches can eat up a lot of money very quickly. Thus, recursion will not be emphasized here.
-
-
-
Here we emphasize the fact that a computer program is not an algorithm. Rather, a computer program is just one way of expressing an algorithm. Computer languages are just a means of expressing in syntax what we want the computer to do. Understanding this important nuance is essential as you delve into the algorithms we explore in this unit. For instance, sorting is not a matter of computer programming but a matter of algorithm development. It is true that ultimately, the computer has to be made to carry out the task at hand. However, we must start with an algorithm (what we want the computer to do), not with a computer program (how we make a computer do something). Otherwise, the computer becomes a limitation instead of an aid. This lecture also discusses document distance, which will be important for several examples in later lectures.
-
-
-
Why bother with sorting data? This lecture discusses that question and then goes into two different sorting methods. You will notice that many search algorithms assume sorted data, especially with the large data volumes that are common today.
-
Watch this lecture to learn about the divide-and-conquer approach to sorting and its application to merge sorts. This approach is especially worthwhile when multiple computing cores are available, either with or without shared memory. Each divided component can be exercised independent of the other components. The merge step brings it all together. Divide-and-conquer applies to more than just sorting methods, but we focus on that in this lecture.
-
-
Read this efficiency analysis of insertion sort. The pseudocode is easily reformatted for C/C++. Efficiency analysis is important when selecting solution methods appropriate for the application and situation.
-
This page presents an efficiency analysis of merge sort. It will help you choose the sorting algorithm appropriate to your project's requirements and situation. For instance, if data arrives over time, insertion sort using linked lists may well be the best choice. You certainly do not want to re-sort the entire collected data just to add one element. Most demonstrations of sorting methods assume that a large block of data already exists and that the task is to sort that data block. The case of data arriving over time is rarely considered but that is often the case in modern industrial and commercial applications. The "Internet of Things" (IoT) is a prime example, as is ongoing retail sales.
-
-
-
Since the lectures in this unit use Python, review these examples of Insertion and Merge sorts in C++.
-
-
-
-
Fibonacci search expands on linear search in that the steps are greater than one. The comments in this C++ implementation explains.
-
-
-
Watch this lecture about binary search, bubble sort, and selection sort methods. You need to be aware of various approaches to solving sorting requirements.
-
Since the lectures in this unit use Python for illustration, review these examples in C++.
-
-
-
Watch this lecture until 18:44 for an explanation of Quicksort. If you watch through the end, you will be led through a deep analysis that derives Big-O for this algorithm.
-
This recursive implementation of quicksort in C++ closely mirrors the one discussed in the lecture, which used Python.
-
This is a non-recursive version of Quicksort for C/C++. The variable "recursive" is used to show where recursion would normally take place. We present this since the lectures in this unit use Python for illustration.
-
-
-
Take this assessment to see how well you understood this unit.
- This assessment does not count towards your grade. It is just for practice!
- You will see the correct answers when you submit your answers. Use this to help you study for the final exam!
- You can take this assessment as many times as you want, whenever you want.
-