Recursion in C++
Site: | Saylor Academy |
Course: | CS102: Introduction to Computer Science II |
Book: | Recursion in C++ |
Printed by: | Guest user |
Date: | Friday, September 20, 2024, 11:14 PM |
Description
The principles of recursion are the same, regardless of the language used for implementation. This chapter views the topic through the lens of C++. There are a fair number of examples and visualizations. Read through these at a minimum. You might find the exercises useful since the application of a principle is an aid to your understanding. Read Chapter 5 in this book on C++.
Table of contents
- 1. Objectives
- 2. What Is Recursion?
- 3. Calculating the Sum of a Vector of Numbers
- 4. The Three Laws of Recursion
- 5. Converting an Integer to a String in Any Base
- 6. Stack Frames: Implementing Recursion
- 7. Introduction: Visualizing Recursion
- 8. Sierpinski Triangle
- 9. Complex Recursive Problems
- 10. Tower of Hanoi
- 11. Exploring a Maze
- 12. Dynamic Programming
- 13. Summary
- 14. Discussion Questions
- 15. Programming Exercises
- 16. Glossary
1. Objectives
The goals for this chapter are as follows:
-
To understand that complex problems that may otherwise be difficult to solve may have a simple recursive solution.
-
To learn how to formulate programs recursively.
-
To understand and apply the three laws of recursion.
-
To understand recursion as a form of iteration.
-
To implement the recursive formulation of a problem.
-
To understand how recursion is implemented by a computer system.
Source: Runestone Academy, https://runestone.academy/runestone/books/published/cppds/Recursion/toctree.html
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License.
2. What Is Recursion?
Recursion is a method of solving problems that involves breaking a problem down into smaller and smaller subproblems until you get to a small enough problem that it can be solved trivially. Recursion involves a function calling itself. While it may not seem like much on the surface, recursion allows us to write elegant solutions to problems that may otherwise be very difficult to program.
3. Calculating the Sum of a Vector of Numbers
We will begin our investigation with a simple problem that you already know how to solve without using recursion. Suppose that you want to calculate the sum of a vector of numbers such as: [ 1,3 , 5,7 , 9]. An iterative function that computes the sum is shown in ActiveCode 1. The function uses an accumulator variable ( theSum) to compute a running total of all the numbers in the vector by starting with 0 and adding each number in the vector.
xxxxxxxxxx
//Example of summing up a vector without using recursion.
using namespace std;
int vectsum(int numVect[]){
int theSum = 0;
for (int i = 0; i < 5; i++){
theSum += numVect[i];
}
return theSum;
}
int main() {
int numVect[5] = {1, 3, 5, 7, 9};
cout << vectsum(numVect) << endl;
return 0;
}
xxxxxxxxxx
#Example of summing up a list without using recursion.
def listsum(numList):
theSum = 0
for i in numList:
theSum = theSum + i
return theSum
def main():
print(listsum([1, 3, 5, 7, 9]))
main()
Pretend for a minute that you do not have while
loops or for
loops. How would you compute the sum
of a vector of numbers? If you were a mathematician you might start by recalling that addition is a function that is defined for two parameters, a pair of numbers. To redefine the problem from adding a vector to adding pairs of numbers, we could
rewrite the vector as a fully parenthesized expression. Such an expression looks like this:
We can also parenthesize the expression the other way around,
Notice that the innermost set of parentheses, , is a problem that we can solve without a loop or any special constructs. In fact, we can use the following sequence of simplifications to compute a final sum.
total= (1+ ( 3+( 5+( 7+9 ))) )
total= (1 +(3+( 5 +16 )))
total= (1+ (3+ 21))
total= (1+ 24)
total= 25
How can we take this idea and turn it into a C++ program? First, let's restate the sum problem in terms of C++ arrays. We might say the sum of the vector numVect
is the sum
of the first element of the vector (numVect[0]
), and the sum of the numbers in the rest of the array (numVect.erase(numVect.begin()+0)
).
In this equation first( numVect) returns the first element of the array and rest(numVect) returns an array of everything but the first element. This is easily expressed in C++ as shown in ActiveCode 2.
xxxxxxxxxx
//Example of summing a vector by using recursion.
using namespace std;
int vectsum(vector<int> numVect){
if (numVect.size() <= 1){
return numVect[0];
}
else {
vector<int> slice(numVect.begin() + 1, numVect.begin()+numVect.size());
return numVect[0] + vectsum(slice); //function makes a recursive call to itself.
}
}
int main() {
int arr2[] = {1, 3, 5, 7, 9};
vector<int> numVect(arr2, arr2 + (sizeof(arr2) / sizeof(arr2[0]))); //Initializes vector with same items as arr2.
cout << vectsum(numVect) << endl;
return 0;
}
xxxxxxxxxx
#Example of summing a list using recurison.
def listsum(numList):
if len(numList) == 1:
return numList[0]
else:
return numList[0] + listsum(numList[1:]) #function makes a recursive call to itself.
def main():
print(listsum([1, 3, 5, 7, 9]))
main()
There are a few key ideas while using vector to look at. First, on line 6 we are checking to see if the vector is one element long. This check is crucial and is our escape clause from the function. The sum of a vector of length 1 is trivial; it is
just the number in the vector. Second, on line 11 our function calls itself! This is the reason that we call the vectsum
algorithm recursive. A recursive function is a
function that calls itself.
Figure 1 shows the series of recursive calls that are needed to sum the vector [ 1,3, 5,7, 9]. You should think of this series of calls as a series of simplifications. Each time we make a recursive call we are solving a smaller problem, until we reach the point where the problem cannot get any smaller.
When we reach the point where the problem is as simple as it can get, we begin to piece together the solutions of each of the small problems until the initial problem is solved. Figure 2 shows the additions that are performed as vectsum
works its way backward through the series of calls. When vectsum
returns from the topmost problem, we have the solution to the whole problem.
4. The Three Laws of Recursion
A recursive algorithm must have a base case.
A recursive algorithm must change its state and move toward the base case.
A recursive algorithm must call itself, recursively.
Let's look at each one of these laws in more detail and see how it was used in the vectsum
algorithm. First, a base case is the condition that allows the algorithm to stop
recursing. A base case is typically a problem that is small enough to solve directly. In the vectsum
algorithm the base case is a list of length 1.
To obey the second law, we must arrange for a change of state that moves the algorithm toward the base case. A change of state means that some data that the algorithm is using is modified. Usually the data that represents our problem gets smaller
in some way. In the vectsum
algorithm our primary data structure is a vector, so we must focus our state-changing efforts on the vector. Since the base case is a list of
length 1, a natural progression toward the base case is to shorten the vector. This is exactly what happens on line 5 of ActiveCode 2 when we call vectsum
with a shorter list.
The final law is that the algorithm must call itself. This is the very definition of recursion. Recursion is a confusing concept to many beginning programmers. As a novice programmer, you have learned that functions are good because you can take a large problem and break it up into smaller problems. The smaller problems can be solved by writing a function to solve each problem. When we talk about recursion it may seem that we are talking ourselves in circles. We have a problem to solve with a function, but that function solves the problem by calling itself! But the logic is not circular at all; the logic of recursion is an elegant expression of solving a problem by breaking it down into smaller and easier problems.
It is important to note that regardless of whether or not a recursive function implements these three rules, it may still take an unrealistic amount of time to compute (and thus not be particularly useful). A great example of this is the ackermann function, named after Wilhelm Ackermann, which recursively expands to unrealistic proportions. An example as to how many calls this function makes to itself is the case of ackermann (4,3). In this case, it calls itself well over 100 billion times, with an average time of expected completion that falls after the predicted end of the universe. Despite this, it will eventually come to an answer, but you probably won't be around to see it. This goes to show that the efficiency of recursive functions are largely dependent on how they're implemented.
xxxxxxxxxx
//ackermann function example
using namespace std;
unsigned int ackermann(unsigned int m, unsigned int n) {
if (m == 0) {//Base case
return n + 1;
}
if (n == 0) {
return ackermann(m - 1, 1);// subtract, move to base case
}
//notice a call to the ackermann function as a parameter
//for another call to the ackermann function. This is where
//it gets unrealistically complicated.
return ackermann(m - 1, ackermann(m, n - 1));//subtract here too
}
int main(){
//compute the ackermann function.
//Try replacing the 1,2 parameters with 4,3 and see what happens
cout << ackermann(1,2) << endl;
return 0;
}
In the remainder of this chapter we will look at more examples of recursion. In each case we will focus on designing a solution to a problem by using the three laws of recursion.
5. Converting an Integer to a String in Any Base
Suppose you want to convert an integer to a string in some base between binary and hexadecimal. For example, convert the integer 10 to its string representation in decimal as "10"
,
or to its string representation in binary as "1010"
. While there are many algorithms to solve this problem, including the algorithm discussed in the stack section, the
recursive formulation of the problem is very elegant.
Let's look at a concrete example using base 10 and the number 769. Suppose we have a sequence of characters corresponding to the first 10 digits, like convString = "0123456789"
.
It is easy to convert a number less than 10 to its string equivalent by looking it up in the sequence. For example, if the number is 9, then the string is
convString[9]
or "9"
. If we can arrange to break up the number 769 into three single-digit numbers,
7, 6, and 9, then converting it to a string is simple. A number less than 10 sounds like a good base case.
Knowing what our base suggests that the overall algorithm will involve three components:
Reduce the original number to a series of single-digit numbers.
Convert the single digit-number to a string using a lookup.
Concatenate the single-digit strings together to form the final result.
The next step is to figure out how to change state and make progress toward the base case. Since we are working with an integer, let's consider what mathematical operations might reduce a number. The most likely candidates are division and subtraction. While subtraction might work, it is unclear what we should subtract from what. Integer division with remainders gives us a clear direction. Let's look at what happens if we divide a number by the base we are trying to convert to.
Using integer division to divide 769 by 10, we get 76 with a remainder of 9. This gives us two good results. First, the remainder is a number less than our base that can be converted to a string immediately by lookup. Second, we get a number that is smaller than our original and moves us toward the base case of having a single number less than our base. Now our job is to convert 76 to its string representation. Again we will use integer division plus remainder to get results of 7 and 6 respectively. Finally, we have reduced the problem to converting 7, which we can do easily since it satisfies the base case condition of , where . The series of operations we have just performed is illustrated in Figure 3. Notice that the numbers we want to remember are in the remainder boxes along the right side of the diagram.
ActiveCode 1 shows the C++ and Python code that implements the algorithm outlined above for any base between 2 and 16.
xxxxxxxxxx
//Recursive example of converting from int to string.
using namespace std;
string toStr(int n, int base) {
string convertString = "0123456789ABCDEF";
if (n < base) {
return string(1, convertString[n]); // converts char to string, and returns it
} else {
return toStr(n/base, base) + convertString[n%base]; // function makes a recursive call to itself.
}
}
int main() {
cout << toStr(1453, 16);
}
xxxxxxxxxx
#Recursive example of converting an int to str.
def toStr(n,base):
convertString = "0123456789ABCDEF"
if n < base:
return convertString[n]
else:
return toStr(n//base,base) + convertString[n%base] #function makes a recursive call to itself.
def main():
print(toStr(1453,16))
main()
Notice that in line 7 we check for the base case where n
is less than the base we are converting to. When we detect the base case, we stop recursing and simply return the string
from the
convertString
sequence. In line 10 we satisfy both the second and third laws–by making the recursive call and by reducing the problem size–using division.
Let's trace the algorithm again; this time we will convert the number 10 to its base 2 string representation ("1010"
).
Figure 4 shows that we get the results we are looking for, but it looks like the digits are in the wrong order. The algorithm works correctly because we make the
recursive call first on line 6, then we add the string representation of the remainder. If we reversed returning the convertString
lookup and returning the
toStr
call, the resulting string would be backward! But by delaying the concatenation operation until after the recursive call has returned, we get the result in the proper
order. This should remind you of our discussion of stacks back in the previous chapter.
6. Stack Frames: Implementing Recursion
Suppose that instead of concatenating the result of the recursive call to toStr
with the string from convertString
,
we modified our algorithm to push the strings onto a stack instead of making the recursive call. The code for this modified algorithm is shown in
ActiveCode 1.
xxxxxxxxxx
//Example of the toStr function using a stack instead of recursion.
using namespace std;
stack<char> rStack;
string toStr(int n, int base) {
string convertString = "0123456789ABCDEF";
while (n > 0) {
if (n < base) {
rStack.push(convertString[n]); //pushes string n to the stack
} else {
rStack.push(convertString[n % base]); //pushes string n modulo base to the stack.
}
n = n/base;
}
string res;
while (!rStack.empty()) {
//combines all the items in the stack into a full string.
res = res + (string(1, rStack.top()));
rStack.pop();
xxxxxxxxxx
#Example of the toStr function using a stack instead of recursion.
from pythonds.basic.stack import Stack
rStack = Stack()
def toStr(n,base):
convertString = "0123456789ABCDEF"
while n > 0:
if n < base:
rStack.push(convertString[n]) #adds string n to the stack.
else:
rStack.push(convertString[n % base]) #adds string n modulo base to the stack.
n = n // base
res = ""
while not rStack.isEmpty():
#combines all the items in the stack to make the full string.
res = res + str(rStack.pop())
return res
def main():
print(toStr(1453,16))
main()
Each time we make a call to toStr
, we push a character on the stack. Returning to the previous example we can see that after the fourth call to toStr
the stack would look like Figure 5. Notice that now we can simply pop the characters off the stack and concatenate them into the final result, "1010"
.
The previous example gives us some insight into how C++ implements a recursive function call. When a function is called in Python, a stack frame is allocated to handle the local variables of the function. When the function returns, the return value is left on top of the stack for the calling function to access. Figure 6 illustrates the call stack after the return statement on line 4.
Notice that the call to toStr(2//2,2)
leaves a return value of
"1"
on the stack. This return value is then used in place of the function call (toStr(1,2)
) in the
expression "1" + convertString[2%2]
, which will leave the string "10"
on the top of the stack. In this way, the C++ call stack takes the place of the stack we used explicitly in Listing 4. In our list summing example, you can
think of the return value on the stack taking the place of an accumulator variable.
The stack frames also provide a scope for the variables used by the function. Even though we are calling the same function over and over, each call creates a new scope for the variables that are local to the function.
7. Introduction: Visualizing Recursion
In the previous section we looked at some problems that were easy to solve using recursion; however, it can still be difficult to find a mental model or a way of visualizing what is happening in a recursive function. This can make recursion difficult for people to grasp. In this section we will look at a couple of examples of using recursion to draw some interesting pictures. As you watch these pictures take shape you will get some new insight into the recursive process that may be helpful in cementing your understanding of recursion.
The conventional tool used for these kinds of illustrations is Python's turtle graphics module called turtle
. The turtle
module is standard with all versions of Python and is very easy to use. The metaphor is quite simple. You can create a turtle and the turtle can move forward, backward, turn left, turn right, etc. The turtle can have its tail up or down. When
the turtle's tail is down and the turtle moves it draws a line as it moves. To increase the artistic value of the turtle you can change the width of the tail as well as the color of the ink the tail is dipped in.
Here is a simple example to illustrate some turtle graphics basics. We will use the turtle module to draw a spiral recursively.
ActiveCode 1 shows how it is done. After importing the turtle
module we create a turtle.
When the turtle is created it also creates a window for itself to draw in. Next we define the drawSpiral function. The base case for this simple function is when the length of the line we want to draw, as given by the len
parameter, is reduced to zero or less. If the length of the line is longer than zero we instruct the turtle to go forward by len
units and then turn right 90 degrees. The
recursive step is when we call drawSpiral again with a reduced length.
xxxxxxxxxx
namespace ct = cturtle;
void spiral(ct::Turtle& turtle, int length) {
if (length > 0) {
turtle.forward(length);
turtle.right(90);
spiral(turtle, length - 5);
}
}
int main() {
ct::TurtleScreen screen;
ct::Turtle turtle(screen);
spiral(turtle, 100);
screen.bye();
return 0;
}
xxxxxxxxxx
#Creates an inward spiral through recursion.
import turtle
def drawSpiral(myTurtle, lineLen):
if lineLen > 0:
myTurtle.forward(lineLen)
myTurtle.right(90)
drawSpiral(myTurtle,lineLen-5) #function makes recursive call.
def main():
myTurtle = turtle.Turtle()
myWin = turtle.Screen()
drawSpiral(myTurtle,100)
myWin.exitonclick()
main()
That is really about all the turtle graphics you need to know in order to make some pretty impressive drawings. For our next program we are going to draw a fractal tree. Fractals come from a branch of mathematics, and have much in common with recursion. The definition of a fractal is that when you look at it the fractal has the same basic shape no matter how much you magnify it. Some examples from nature are the coastlines of continents, snowflakes, mountains, and even trees or shrubs. The fractal nature of many of these natural phenomenon makes it possible for programmers to generate very realistic looking scenery for computer generated movies. In our next example we will generate a fractal tree.
To understand how this is going to work it is helpful to think of how we might describe a tree using a fractal vocabulary. Remember that we said above that a fractal is something that looks the same at all different levels of magnification. If we translate this to trees and shrubs we might say that even a small twig has the same shape and characteristics as a whole tree. Using this idea we could say that a tree is a trunk, with a smaller tree going off to the right and another smaller tree going off to the left. If you think of this definition recursively it means that we will apply the recursive definition of a tree to both of the smaller left and right trees.
Let's translate this idea to some C++ code. Listing 1 shows how we can use Python with our turtle to generate a fractal tree. Let's look at the code a bit more
closely. You will see that on lines 5 and 7 we are making a recursive call. On line 5 we make the recursive call right after the turtle turns to the right by 20 degrees; this is the right tree mentioned above. Then in line 7 the turtle makes another
recursive call, but this time after turning left by 40 degrees. The reason the turtle must turn left by 40 degrees is that it needs to undo the original 20 degree turn to the right and then do an additional 20 degree turn to the left in order
to draw the left tree. Also notice that each time we make a recursive call to tree
we subtract some amount from the branchLen
parameter; this is to make sure that the recursive trees get smaller and smaller. You should also recognize the initial
if
statement on line 2 as a check for the base case of branchLen
getting too small. The C++ equivalent
to this function is shown below and exists in "Turtle.cpp".
Listing 1
1 2 3 4 5 6 7 8 9 |
def tree(branchLen,t):
|
The complete program for this tree example is shown in ActiveCode 2. Before you run the code think about how you expect to see the tree take shape. Look at the recursive calls and think about how this tree will unfold. Will it be drawn symmetrically with the right and left halves of the tree taking shape simultaneously? Will it be drawn right side first then left side?
xxxxxxxxxx
namespace ct = cturtle;
void tree(ct::Turtle& turtle, int length) {
if(length > 5){
turtle.forward(length);
turtle.right(20);
tree(turtle, length - 15);
turtle.left(40);
tree(turtle, length - 15);
turtle.right(20);
turtle.back(length);
}
}
int main() {
ct::TurtleScreen screen;
screen.tracer(3);//Draw faster.
ct::Turtle turtle(screen);
turtle.pencolor({"green"});
//Make the trees "grow" upwards
turtle.left(90);
xxxxxxxxxx
#Creates a tree by using recursion.
import turtle
def tree(branchLen,t):
if branchLen > 5:
t.forward(branchLen) #Turtle goes forward.
t.right(20)
tree(branchLen-15,t) #Recursive call
t.left(40)
tree(branchLen-15,t) #Recursive call
t.right(20)
t.backward(branchLen) #Turtle must go back the same distance
#as it went forward to draw the tree
#evenly.
def main():
t = turtle.Turtle()
myWin = turtle.Screen()
t.left(90)
t.up()
t.backward(100)
t.down()
t.color("green")
Notice how each branch point on the tree corresponds to a recursive call, and notice how the tree is drawn to the right all the way down to its shortest twig. You can see this in Figure 1. Now, notice how the program works its way back up the trunk until the entire right side of the tree is drawn. You can see the right half of the tree in Figure 2. Then the left side of the tree is drawn, but not by going as far out to the left as possible. Rather, once again the entire right side of the left tree is drawn until we finally make our way out to the smallest twig on the left.
This simple tree program is just a starting point for you, and you will notice that the tree does not look particularly realistic because nature is just not as symmetric as a computer program. The exercises at the end of the chapter will give you some ideas for how to explore some interesting options to make your tree look more realistic.
8. Sierpinski Triangle
Since we can continue to apply the algorithm indefinitely, what is the base case? We will see that the base case is set arbitrarily as the number of times we want to divide the triangle into pieces. Sometimes we call this number the "degree" of the fractal. Each time we make a recursive call, we subtract 1 from the degree until we reach 0. When we reach a degree of 0, we stop making recursive calls. The code that generated the Sierpinski Triangle in Figure 3 is shown in ActiveCode 1.
xxxxxxxxxx
namespace ct = cturtle;
void draw_triangle(ct::Point a, ct::Point b, ct::Point c, ct::Color color, ct::Turtle& myTurtle){
myTurtle.fillcolor(color);
myTurtle.penup();
myTurtle.goTo(a);
myTurtle.pendown();
myTurtle.begin_fill();
myTurtle.goTo(c);
myTurtle.goTo(b);
myTurtle.goTo(a);
myTurtle.end_fill();
}
//getMid already defined as "middle" function in C-Turtle namespace :)
void sierpinski(ct::Point a, ct::Point b, ct::Point c, int degree, ct::Turtle& myTurtle){
const std::string colormap[] = {"blue", "red", "green", "white", "yellow", "violet", "orange"};
draw_triangle(a,b,c, {colormap[degree]}, myTurtle);
if(degree > 0){
sierpinski(a, ct::middle(a, b), ct::middle(a, c), degree - 1, myTurtle);
sierpinski(b, ct::middle(a, b), ct::middle(b, c), degree - 1, myTurtle);
xxxxxxxxxx
#Recursive example of the Sierpinski Triangle.
import turtle
def drawTriangle(points,color,myTurtle):
#Draws a triangle using the diven points and color.
myTurtle.fillcolor(color)
myTurtle.up()
myTurtle.goto(points[0][0],points[0][1])
myTurtle.down()
myTurtle.begin_fill()
myTurtle.goto(points[1][0],points[1][1])
myTurtle.goto(points[2][0],points[2][1])
myTurtle.goto(points[0][0],points[0][1])
myTurtle.end_fill()
def getMid(p1,p2):
return ( (p1[0]+p2[0]) / 2, (p1[1] + p2[1]) / 2)
def sierpinski(points,degree,myTurtle):
colormap = ['blue','red','green','white','yellow',
'violet','orange']
drawTriangle(points,colormap[degree],myTurtle)
if degree > 0:
The program in ActiveCode 1 follows the ideas outlined above. The first thing sierpinski
does
is draw the outer triangle. Next, there are three recursive calls, one for each of the new corner triangles we get when we connect the midpoints. Once again we make use of the standard turtle module that comes with Python. You can learn all the
details of the methods available in the turtle module by using
help('turtle')
from the Python prompt.
Look at the code and think about the order in which the triangles will be drawn. While the exact order of the corners depends upon how the initial set is specified, let's assume that the corners are ordered lower left, top, lower right. Because of
the way the sierpinski
function calls itself, sierpinski
works its way to the smallest allowed
triangle in the lower-left corner, and then begins to fill out the rest of the triangles working back. Then it fills in the triangles in the top corner by working toward the smallest, topmost triangle. Finally, it fills in the lower-right corner,
working its way toward the smallest triangle in the lower right.
Sometimes it is helpful to think of a recursive algorithm in terms of a diagram of function calls. Figure 4 shows that the recursive calls are always made going to the left. The active functions are outlined in black, and the inactive function calls are in gray. The farther you go toward the bottom of Figure 4, the smaller the triangles. The function finishes drawing one level at a time; once it is finished with the bottom left it moves to the bottom middle, and so on.
The sierpinski
function relies heavily on the getMid
function.
getMid
takes as arguments two endpoints and returns the point halfway between them. In addition, ActiveCode 1 has a function that draws a filled triangle using the begin_fill
and end_fill
turtle methods.
Visual Studio can be used to create similar turtle-like graphics in C++ using the provided class "Turtle.cpp". Visual Studio files can be opened together with the as a .sln file. Try downloading and running the following code from GitHub.
Look at the Turtle.cpp file. Try changing the code within the turtle's draw loop and using the predefined functions.
9. Complex Recursive Problems
In the previous sections we looked at some problems that are relatively
easy to solve and some graphically interesting problems that can help us
gain a mental model of what is happening in a recursive algorithm. In
this section we will look at some problems that are really difficult to
solve using an iterative programming style but are very elegant and easy
to solve using recursion. We will finish up by looking at a deceptive
problem that at first looks like it has an elegant recursive solution
but in fact does not.
10. Tower of Hanoi
Although the legend is interesting, you need not worry about the world ending any time soon. The number of moves required to correctly move a tower of 64 disks is . At a rate of one move per second, that is years! Clearly there is more to this puzzle than meets the eye.
How do we go about solving this problem recursively? How would you go about solving this problem at all? What is our base case? Let's think about this problem from the bottom up. Suppose you have a tower of five disks, originally on peg one. If you already knew how to move a tower of four disks to peg two, you could then easily move the bottom disk to peg three, and then move the tower of four from peg two to peg three. But what if you do not know how to move a tower of height four? Suppose that you knew how to move a tower of height three to peg three; then it would be easy to move the fourth disk to peg two and move the three from peg three on top of it. But what if you do not know how to move a tower of three? How about moving a tower of two disks to peg two and then moving the third disk to peg three, and then moving the tower of height two on top of it? But what if you still do not know how to do this? Surely you would agree that moving a single disk to peg three is easy enough, trivial you might even say. This sounds like a base case in the making.
Here is a high-level outline of how to move a tower from the starting pole, to the goal pole, using an intermediate pole:
-
Move a tower of height-1 to an intermediate pole, using the final pole.
-
Move the remaining disk to the final pole.
-
Move the tower of height-1 from the intermediate pole to the final pole using the original pole.
As long as we always obey the rule that the larger disks remain on the bottom of the stack, we can use the three steps above recursively, treating any larger disks as though they were not even there. The only thing missing from the outline above is the identification of a base case. The simplest Tower of Hanoi problem is a tower of one disk. In this case, we need move only a single disk to its final destination. A tower of one disk will be our base case. In addition, the steps outlined above move us toward the base case by reducing the height of the tower in steps 1 and 3. Listing 1 shows the Python code to solve the Tower of Hanoi puzzle.
Listing 1
1 2 3 4 5 6 7 |
int moveTower(int height, char fromPole, char toPole, char withPole){
if (height >= 1){
moveTower(height-1, fromPole, withPole, toPole);
moveDisk(fromPole, toPole);
moveTower(height-1, withPole, toPole, fromPole);
}
}
|
Notice that the code in Listing 1 is almost identical to the English description. The key to the simplicity of the algorithm is that we make two different recursive
calls, one on line 3 and a second on line 5. On line 3 we move all but the bottom disk on the initial tower to an intermediate pole. The next line simply moves the bottom disk to its final resting place. Then on line 5 we move the tower from the
intermediate pole to the top of the largest disk. The base case is detected when the tower height is 0; in this case there is nothing to do, so the moveTower
function simply
returns. The important thing to remember about handling the base case this way is that simply returning from moveTower
is what finally allows the moveDisk
function to be called.
The function moveDisk
, shown in Listing 2, is very simple. All it does is print out
that it is moving a disk from one pole to another. If you type in and run the moveTower
program you can see that it gives you a very efficient solution to the puzzle.
Listing 2
int moveDisk(char fp, char tp){
cout << "moving disk from " << fp << " to " << tp << endl;
}
The program in ActiveCode 1 provides the entire solution for three disks.
xxxxxxxxxx
//Simulation of the tower of hanoi.
using namespace std;
void moveDisk(char fp, char tp){
cout << "moving disk from " << fp << " to " << tp << endl;
}
void moveTower(int height, char fromPole, char toPole, char withPole){
if (height >= 1){
moveTower(height-1, fromPole, withPole, toPole); //Recursive call
moveDisk(fromPole, toPole);
moveTower(height-1, withPole, toPole, fromPole); //Recursive call
}
}
int main() {
moveTower(3, 'A', 'B', 'C');
}
xxxxxxxxxx
#Simulation of the tower of hanoi.
def moveTower(height,fromPole, toPole, withPole):
if height >= 1:
moveTower(height-1,fromPole,withPole,toPole) #Recursive call
moveDisk(fromPole,toPole)
moveTower(height-1,withPole,toPole,fromPole) #Recursive call
def moveDisk(fp,tp):
print("moving disk from",fp,"to",tp)
def main():
moveTower(3,"A","B","C")
main()
Q-3: If you change the tower height in Line 17 from 3 to 6, how many moves must you make to complete the Hanoi tower? (hint, try implementing a counter to return the correct number)
Now that you have seen the code for both moveTower
and moveDisk
, you may be wondering why we do not have
a data structure that explicitly keeps track of what disks are on what poles. Here is a hint: if you were going to explicitly keep track of the disks, you would probably use three Stack
objects, one for each pole. The answer is that C++ provides the stacks that we need implicitly through the call stack.
11. Exploring a Maze
In this section we will look at a problem that has relevance to the expanding world of robotics: How do you find your way out of a maze? If you have a Roomba vacuum cleaner for your dorm room (don't all college students?) you will wish that you could reprogram it using what you have learned in this section. The problem we want to solve is to find an exit to a virtual maze when starting at a pre-defined location. The maze problem has roots as deep as the Greek myth about Theseus who was sent into a maze to kill the minotaur. Theseus used a ball of thread to help him find his way back out again once he had finished off the beast. In our problem we will assume that our starting position is dropped down somewhere into the middle of the maze, a fair distance from any exit. Look at Figure 2 to get an idea of where we are going in this section.
To make it easier for us we will assume that our maze is divided up into "squares". Each square of the maze is either open or occupied by a section of wall. We can only pass through the open squares of the maze. If we bump into a wall, we must try a different direction. We will require a systematic procedure to find our way out of the maze. Here is the procedure:
-
From our starting position we will first try going North one square and then recursively try our procedure from there.
-
If we are not successful by trying a Northern path as the first step then we will take a step to the South and recursively repeat our procedure.
-
If South does not work then we will try a step to the West as our first step and recursively apply our procedure.
-
If North, South, and West have not been successful then apply the procedure recursively from a position one step to our East.
-
If none of these directions works then there is no way to get out of the maze and we fail.
Now, that sounds pretty easy, but there are a couple of details to talk about first. Suppose we take our first recursive step by going North. By following our procedure our next step would also be to the North. But if the North is blocked by a wall we must look at the next step of the procedure and try going to the South. Unfortunately that step to the south brings us right back to our original starting place. If we apply the recursive procedure from there we will just go back one step to the North and be in an infinite loop. So, we must have a strategy to remember where we have been. In this case we will assume that we have a bag of bread crumbs we can drop along our way. If we take a step in a certain direction and find that there is a bread crumb already on that square, we know that we should immediately back up and try the next direction in our procedure. As we will see when we look at the code for this algorithm, backing up is as simple as returning from a recursive function call.
As we do for all recursive algorithms let us review the base cases. Some of them you may already have guessed based on the description in the previous paragraph. In this algorithm, there are four base cases to consider:
-
We have run into a wall. Since the square is occupied by a wall no further exploration can take place.
-
We have found a square that has already been explored. We do not want to continue exploring from this position or we will get into a loop.
-
We have found an outside edge, not occupied by a wall. In other words we have found an exit from the maze.
-
We have explored a square unsuccessfully in all four directions.
For our program to work we will need to have a way to represent the maze. In this instance, we will stick to a text-only representation (ASCII).
-
__init__
Initializes basic variables to default values, and callsreadMazeFile
-
readMazeFile
Reads the text of the maze text file, and callsfindStartPosition
. -
findStartPosition
Finds the row and column of the starting position. -
isOnEdge
Checks to see if the current position is on the edge, and therefore an exit. -
print
Prints the text of the maze to the screen.
The Maze
class also overloads the index operator []
so that our algorithm can easily access the status
of any particular square.
Let's examine the code for the search function which we call
searchFrom
. The code is shown in Listing 3. Notice that this function takes three
parameters: a maze object, the starting row, and the starting column. This is important because as a recursive function the search logically starts again with each recursive call.
def searchFrom(maze, startRow, startColumn): # Check for base cases (Steps 1, 2, and 3): # 1. We have run into an obstacle, return false if maze[startRow][startColumn] == MAZE_OBSTACLE: return False # 2. We have found a square that has already been explored if maze[startRow][startColumn] == MAZE_TRIED: return False # 3. Success, an outside edge not occupied by an obstacle if maze.isOnEdge(startRow, startColumn): maze[startRow][startColumn] = MAZE_PATH return True # 4. Indicate that the currently visited space has been tried. # Refer to step two. maze[startRow][startColumn] = MAZE_TRIED # 5. Otherwise, check each cardinal direction (North, south, east, and west). # We are checking one space in each direction, thus the plus or minus one below. found = searchFrom(maze, startRow - 1, startColumn) or \ searchFrom(maze, startRow + 1, startColumn) or \ searchFrom(maze, startRow, startColumn - 1) or \ searchFrom(maze, startRow, startColumn + 1) # 6. Mark the location as either part of the path or a dead end, # depending on whether or not an exit has been found. if found: maze[startRow][startColumn] = MAZE_PATH else: maze[startRow][startColumn] = MAZE_DEAD_END return found
As you look through the algorithm you will see that the first thing the code does (steps 1 and 2) is determine if the space should be visited. This is done by checking if the spot is an obstacle (MAZE_OBSTACLE
),
or has already been visited (MAZE_TRIED
). The algorithm then determines if it has found an exit (step 3). If none of these cases are true, it continues the search recursively.
You will notice that in the recursive step there are four recursive calls to searchFrom
. It is hard to predict how many of these recursive calls will be used since they are all
connected by or
statements. If the first call to searchFrom
returns True
then none of the last three calls would be needed. You can interpret this as meaning that a step to (row-1,column)
(or North if you want to think geographically) is on the
path leading out of the maze. If there is not a good path leading out of the maze to the North then the next recursive call is tried, this one to the South. If South fails then try West, and finally East. If all four recursive calls return false then
we have found a dead end. You should download or type in the whole program and experiment with it by changing the order of these calls.
The code for the Maze
class is shown in Listing 4, Listing 5,
and Listing 6. The __init__
method takes the name of a file as its only parameter. This file is a text file that represents a maze by
using “+” characters for walls, spaces for open squares, and the letter “S” to indicate the starting position. Figure 3 is an example of a maze data file. The internal
representation of the maze is a list of lists. Each row of the mazeList
instance variable is also a list. This secondary list contains one character per square using the characters
described above. For the data file in Figure 3 the internal representation looks like the following:
[['+','+','+','+',...,'+','+','+','+','+','+','+'],
['+',' ',' ',' ',...,' ',' ',' ','+',' ',' ',' '],
['+',' ','+',' ',...,'+','+',' ','+',' ','+','+'],
['+',' ','+',' ',...,' ',' ',' ','+',' ','+','+'],
['+','+','+',' ',...,'+','+',' ','+',' ',' ','+'],
['+',' ',' ',' ',...,'+','+',' ',' ',' ',' ','+'],
['+','+','+','+',...,'+','+','+','+','+',' ','+'],
['+',' ',' ',' ',...,'+','+',' ',' ','+',' ','+'],
['+',' ','+','+',...,' ',' ','+',' ',' ',' ','+'],
['+',' ',' ',' ',...,' ',' ','+',' ','+','+','+'],
['+','+','+','+',...,'+','+','+',' ','+','+','+']]
The searchFrom
method uses this internal representation to traverse throughout the maze.
Figure 3: An Example Maze Data File
++++++++++++++++++++++ + + ++ ++ + + + + +++ + ++ + + + ++ ++++ + ++ +++ ++++++ +++ + + + ++ ++ + +++++ ++++++ +++++ + + + +++++++ + + + +++++++ S + + + + +++ ++++++++++++++++++ +++
Finally, the isOnEdge
method uses our current position to test for an exit condition. An exit condition occurs whenever we have navigated to the edge of the maze, either row zero
or column zero, or the far right column or the bottom row.
Listing 4
MAZE_OBSTACLE = '+'
MAZE_START = 'S'
MAZE_PATH = 'O'
MAZE_DEAD_END = '-'
MAZE_TRIED = '.'
class Maze:
def __init__(self, mazeFileName):
# Initialize all of our default variables.
self.mazeList = []
self.totalRows = 0
self.totalColumns = 0
self.startRow = 0
self.startColumn = 0
# And read the maze file.
self.readMazeFile(mazeFileName)
def readMazeFile(self, mazeFileName):
# The maze list is a list of strings.
# Components of the maze are indicated by specific characters.
# These characters are listed at the top of the file.
# The line below says the following:
# For every line of text in our maze text file, add every single character to a list.
# The final result is a list of lists, where each element is a single character.
self.mazeList = [[char for char in line] for line in open(mazeFileName).read().split("\n")]
# The total number of rows is the total number of strings in the list.
self.totalRows = len(self.mazeList)
# The total number of columns is the length of a single line.
# We can assume all lines of text for the maze are the same length.
self.totalColumns = len(self.mazeList[0])
# Lastly, find the start position.
self.findStartPosition()
def findStartPosition(self):
# Iterate through every individual character in the maze list.
# If we come across the MAZE_START character ('S'),
# we save the row and column of where it was found, and stop looking.
# enumerate(...) is very much like using a typical list,
# except it gives you two pieces of information instead of one.
# It assumes the format of (index_of_item, item).
for (row, text) in enumerate(self.mazeList):
for(column, component) in enumerate(text):
if component == MAZE_START:
self.startRow = row
self.startColumn = column
return
def isOnEdge(self, row, column):
return (row == 0 or
row == self.totalRows - 1 or
column == 0 or
column == self.totalColumns - 1)
# This allows us to use the Maze class like a list, e.g, maze[index]
def __getitem__(self, index):
return self.mazeList[index]
The complete program is shown in ActiveCode 1. This program uses the data file maze1.txt
shown above. Feel free to also attempt to use maze2.txt
from up above. Note that it is a much more simple example file in that the exit is very close to the starting position.
++++++++++++++++++++++ + + ++ ++ + + ++++++++++ + + ++ ++++ +++ ++ + + + + ++ +++ + + ++ ++ + + +++++ + + ++ + + +++++ +++ + + ++ + + + + S+ + + +++++ + + + + + ++++++++++++++++++++++
xxxxxxxxxx
self.mazeList = [[char for char in line] for line in open(mazeFileName).read().split("\n")]
MAZE_OBSTACLE = '+'
MAZE_START = 'S'
MAZE_PATH = 'O'
MAZE_DEAD_END = '-'
MAZE_TRIED = '.'
class Maze:
def __init__(self, mazeFileName):
# Initialize all of our default variables.
self.mazeList = []
self.totalRows = 0
self.totalColumns = 0
self.startRow = 0
self.startColumn = 0
# And read the maze file.
self.readMazeFile(mazeFileName)
def readMazeFile(self, mazeFileName):
# The maze list is a list of strings.
# Components of the maze are indicated by specific characters.
# These characters are listed at the top of the file.
Self Check
Now that you're familiar with this simple maze exploring algorithm, use what you've learned about file handling, classes, and IO to implement this in C++! To visualize the exploration, print out the characters using cout
to create an ASCII representation of your cave. For example, your program should be able to read and operate from a file formatted as follows: You can also use CTurtle to visualize the traversal throughout the maze.
7 5
+++++++
+ + S+
+ + +
++
+++++++
12. Dynamic Programming
A classic example of an optimization problem involves making change using the fewest coins. Suppose you are a programmer for a vending machine manufacturer. Your company wants to streamline effort by giving out the fewest possible coins in change for each transaction. Suppose a customer puts in a dollar bill and purchases an item for 37 cents. What is the smallest number of coins you can use to make change? The answer is six coins: two quarters, one dime, and three pennies. How did we arrive at the answer of six coins? We start with the largest coin in our arsenal (a quarter) and use as many of those as possible, then we go to the next lowest coin value and use as many of those as possible. This first approach is called a greedy method because we try to solve as big a piece of the problem as possible right away.
The greedy method works fine when we are using U.S. coins, but suppose that your company decides to deploy its vending machines in Lower Elbonia where, in addition to the usual 1, 5, 10, and 25 cent coins they also have a 21 cent coin. In this instance our greedy method fails to find the optimal solution for 63 cents in change. With the addition of the 21 cent coin the greedy method would still find the solution to be six coins. However, the optimal answer is three 21 cent pieces.
Let's look at a method where we could be sure that we would find the optimal answer to the problem. Since this section is about recursion, you may have guessed that we will use a recursive solution. Let's start with identifying the base case. If we are trying to make change for the same amount as the value of one of our coins, the answer is easy, one coin.
If the amount does not match we have several options. What we want is the minimum of a penny plus the number of coins needed to make change for the original amount minus a penny, or a nickel plus the number of coins needed to make change for the original amount minus five cents, or a dime plus the number of coins needed to make change for the original amount minus ten cents, and so on. So the number of coins needed to make change for the original amount can be computed according to the following:
The algorithm for doing what we have just described is shown in Listing 7. In line 7 we are checking our base case; that is, we are trying to make change in the exact amount of one of our coins. If we do not have a coin equal to the amount of change, we make recursive calls for each different coin value less than the amount of change we are trying to make. Line 6 shows how we filter the list of coins to those less than the current value of change using a list comprehension. The recursive call also reduces the total amount of change we need to make by the value of the coin selected. The recursive call is made in line 14. Notice that on that same line we add 1 to our number of coins to account for the fact that we are using a coin. Just adding 1 is the same as if we had made a recursive call asking where we satisfy the base case condition immediately.
Listing 7
xxxxxxxxxx
//Recursive example of trying to get the least amount of coins to make up an amount of change.
using namespace std;
int recMC_greedy(vector<int> coinValueList, int change){
if (change == 0){ //base case if, change is 0, then the number of coins have been finalized
return 0;
}
else{
int cur_max =* max_element(coinValueList.begin(), coinValueList.end());//use the maximum in the list to see how many of these can be used to form the sum
int count=int(change/cur_max); //find how many of the max is needed to make the change so that the number of coins used is minimum
coinValueList.erase(std::remove(coinValueList.begin(), coinValueList.end(), cur_max), coinValueList.end()); //erasing the current max so that a different max can be
//used in next recursion and continue the greedy process
return count + recMC_greedy(coinValueList, change-cur_max * count); //returns the counts of the coins using recursion
}
}
int main() {
int arr2[] = {1, 5, 10, 21, 25};
vector<int> coinValueList(arr2, arr2 + (sizeof(arr2)/ sizeof(arr2[0]))); //Initializing vector
cout<<recMC_greedy(coinValueList, 63)<<endl; //using the greedy algorithm for the edge case 63 whose optimal solution is 3 coins of 21
xxxxxxxxxx
#Recursive example of trying to get the least amount of coins to make up an amount of change.
def recMC_greedy(coinValueList,change):
if change == 0: #base case if, change is 0, then the number of coins have been finalized
return 0
else:
cur_max = max(coinValueList) #use the maximum in the list to see how many of these can be used to form the sum
count = change//cur_max #find how many of the max is needed to make the change so that the number of coins used is minimum
index = coinValueList.index(cur_max)
del coinValueList[index] #erasing the current max so that a different max can be
#used in next recursion and continue the greedy process
return count + recMC_greedy(coinValueList, change-cur_max * count) #returns the counts of the coins using recursion
def main():
print(recMC_greedy([1, 5, 10, 21, 25], 63)) #using the greedy algorithm for the edge case 63 whose optimal solution is 3 coins of 21
#but greedy algorithm gives 6 coins which is not the most optimum solution
main()
The trouble with the algorithm in Listing 7 is that it is extremely inefficient. In fact, it takes 67,716,925 recursive calls to find the optimal solution to the 4 coins, 63 cents problem! To understand the fatal flaw in our approach look at Figure 5, which illustrates a small fraction of the 377 function calls needed to find the optimal set of coins to make change for 26 cents.
Each node in the graph corresponds to a call to recMC
. The label on the node indicates the amount of change for which we are computing the number of coins. The label on the
arrow indicates the coin that we just used. By following the graph we can see the combination of coins that got us to any point in the graph. The main problem is that we are re-doing too many calculations. For example, the graph shows that the
algorithm would recalculate the optimal number of coins to make change for 15 cents at least three times. Each of these computations to find the optimal number of coins for 15 cents itself takes 52 function calls. Clearly we are wasting a lot
of time and effort recalculating old results.
The key to cutting down on the amount of work we do is to remember some of the past results so we can avoid recomputing results we already know. A simple solution is to store the results for the minimum number of coins in a table when we find them. Then before we compute a new minimum, we first check the table to see if a result is already known. If there is already a result in the table, we use the value from the table rather than recomputing. This technique is called memoization and is a very useful method for speeding up frequent yet hardware-demanding function calls. ActiveCode 1 shows a modified algorithm to incorporate our table lookup scheme.
xxxxxxxxxx
//A different attempt at making the change algorithm.
using namespace std;
int recDC(vector<int> coinValueList, int change, int knownResults[]){
int minCoins, numCoins;
minCoins = change;
for (unsigned int i = 0; i< coinValueList.size(); i++){ //this loop contains the base case,
//as it returns items that are not
//returning a call to the recDC function.
if (coinValueList[i] == change){
knownResults[change] = 1;
return 1;
}
else if(knownResults[change] > 0){
return knownResults[change];
}
}
for (unsigned int y=0; y<coinValueList.size(); y++){
if (coinValueList[y] <= change){
numCoins = 1 + recDC(coinValueList, change - coinValueList[y], knownResults); //Recursive call
xxxxxxxxxx
#A different attempt at making the change algorithm.
def recDC(coinValueList, change, knownResults):
minCoins = change
if change in coinValueList: #base case
knownResults[change] = 1
return 1
elif knownResults[change] > 0: #base case
return knownResults[change]
else:
for i in [c for c in coinValueList if c <= change]:
numCoins = 1 + recDC(coinValueList, change - i, knownResults) #Recursive call.
if numCoins < minCoins:
minCoins = numCoins
knownResults[change] = minCoins
return minCoins
def main():
print(recDC([1, 5, 10, 21, 25], 63, [0]*64))
main()
Notice that in line 15 we have added a test to see if our table contains the minimum number of coins for a certain amount of change. If it does not, we compute the minimum recursively and store the computed minimum in the table. Using this modified algorithm reduces the number of recursive calls we need to make for the four coin, 63 cent problem to 221 calls!
Although the algorithm in AcitveCode 1 is correct, it looks and feels like a bit of a hack. Also, if we look at the knownResults
lists we can see that there are some holes in the table. In fact the term for what we have done is not dynamic programming but rather we have improved the performance of our program by using a technique known as "memoization," or more commonly
called "caching". Memoization uses what is sometimes called an opportunistic top-down approach. When you need the result of a computation, you check to see if you have already computed it, otherwise you do the new calculation and store the result.
A truly dynamic programming algorithm will take a more systematic bottom-up approach to the problem. Memoization and dynamic programming are both code optimization techniques that avoid recalculating duplicate work. Our dynamic programming solution is going to start with making change for one cent and systematically work its way up to the amount of change we require. This guarantees us that at each step of the algorithm we already know the minimum number of coins needed to make change for any smaller amount.
This is often called synamic programming with tabulation. Let's look at how we would fill in a table of minimum coins to use in making change for 11 cents. Figure 4 illustrates the process. We start with one cent. The only solution possible is one coin (a penny). The next row shows the minimum for one cent and two cents. Again, the only solution is two pennies. The fifth row is where things get interesting. Now we have two options to consider, five pennies or one nickel. How do we decide which is best? We consult the table and see that the number of coins needed to make change for four cents is four, plus one more penny to make five, equals five coins. Or we can look at zero cents plus one more nickel to make five cents equals 1 coin. Since the minimum of one and five is one we store 1 in the table. Fast forward again to the end of the table and consider 11 cents. Figure 5 shows the three options that we have to consider:
-
A penny plus the minimum number of coins to make change for cents (1)
-
A nickel plus the minimum number of coins to make change for cents (2)
-
A dime plus the minimum number of coins to make change for cent (1)
Either option 1 or 3 will give us a total of two coins which is the minimum number of coins for 11 cents.
Listing 8 is a dynamic programming algorithm to solve our change-making problem. dpMakeChange
takes three parameters: a list of valid coin values, the amount of change we want to make, and a list of the minimum number of coins needed to make each value. When the function is done minCoins
will contain the solution for all values from 0 to the value of change
.
Listing 8
xxxxxxxxxx
//Program that stores the solution for all possible amounts of change up to a given integer.
using namespace std;
int dpMakeChange(vector<int> coinValueList, int change, vector<int> minCoins){
for (int cents = 0 ; cents < change + 1; cents++){ //loop finds solution for all sets of change from 0 to int change.
int coinCount = cents;
for (int j : coinValueList){
if (j <= cents){
if (minCoins[cents-j] + 1 < coinCount){
coinCount = minCoins[cents-j] + 1; //assigns the number of coins that is used to make the change.
}
}
}
minCoins[cents] = coinCount;
}
return minCoins[change];
}
int main(){
vector<int> coinValueList = {1, 5, 10, 21, 25};
int change = 63;
xxxxxxxxxx
#Program that stores the solution for all possible amounts of change up to a given integer.
def dpMakeChange(coinValueList,change,minCoins):
for cents in range(change+1): #loops finds solution for all sets of change from 0 to change parameter.
coinCount = cents
for j in [c for c in coinValueList if c <= cents]:
if minCoins[cents-j] + 1 < coinCount:
coinCount = minCoins[cents-j] + 1 #assigns the number of coins that will be used to make the sum.
minCoins[cents] = coinCount
return minCoins[change]
def main():
print(dpMakeChange([1, 5, 10, 21, 25], 63, [0]*64))
main()
Note that dpMakeChange
is not a recursive function, even though we started with a recursive solution to this problem. It is important to realize that just because you can write
a recursive solution to a problem does not mean it is the best or most efficient solution. The bulk of the work in this function is done by the loop that starts on line 4. In this loop we consider using all possible coins to make change for
the amount specified by cents
. Like we did for the 11 cent example above, we remember the minimum value and store it in our
minCoins
list.
Although our making change algorithm does a good job of figuring out the minimum number of coins, it does not help us make change since we do not keep track of the coins we use. We can easily extend dpMakeChange
to keep track of the coins used by simply remembering the last coin we add for each entry in the minCoins
table. If we know the last coin added, we can simply subtract
the value of the coin to find a previous entry in the table that tells us the last coin we added to make that amount. We can keep tracing back through the table until we get to the beginning.
ActiveCode 2 shows the dpMakeChange
algorithm modified to keep track of the coins
used, along with a function
printCoins
that walks backward through the table to print out the value of each coin used. This shows the algorithm in action solving the problem for our friends in Lower
Elbonia. The first two lines of main
set the amount to be converted and create the list of coins used. The next two lines create the lists we need to store the results.
coinsUsed
is a list of the coins used to make change, and coinCount
is the minimum number of
coins used to make change for the amount corresponding to the position in the list.
Notice that the coins we print out come directly from the coinsUsed
array. For the first call we start at array position 63 and print 21. Then we take
and look at the 42nd element of the list. Once again we find a 21 stored there. Finally, element 21 of the array also contains 21, giving us the three 21 cent pieces.
xxxxxxxxxx
coinCount = minCoins[cents-j] + 1; //assigns the number of coins that have been used to make the sum.
//Addition to the precious program that finds the types of coins used and the process of doing it.
using namespace std;
int dpMakeChange(vector<int> coinValueList, int change, vector<int> minCoins, vector<int> coinsUsed){
//This function keeps track of the number of coins needed to create the change.
for (int cents = 0 ; cents < change+1; cents++){
int coinCount = cents;
int newCoin = 1;
for (int j : coinValueList){ //loop finds solution for all sets of change from 0 to int change.
if (j <= cents){
if (minCoins[cents-j] + 1 < coinCount){
coinCount = minCoins[cents-j] + 1; //assigns the number of coins used to make the sum.
newCoin = j; //assigns the type of coins that is used to find the sum.
}
}
}
minCoins[cents] = coinCount;
coinsUsed[cents] = newCoin;
}
xxxxxxxxxx
#Addition to the precious program that finds the types of coins used and the process of doing it.
def dpMakeChange(coinValueList,change,minCoins,coinsUsed):
for cents in range(change+1):
coinCount = cents
newCoin = 1
for j in [c for c in coinValueList if c <= cents]:
if minCoins[cents-j] + 1 < coinCount:
coinCount = minCoins[cents-j] + 1 #assigns the amount of coins used.
newCoin = j #assigns the type of coin used.
minCoins[cents] = coinCount
coinsUsed[cents] = newCoin
return minCoins[change]
def printCoins(coinsUsed,change):
coin = change
while coin > 0:
thisCoin = coinsUsed[coin]
print(thisCoin)
coin = coin - thisCoin
def main():
amnt = 63
clist = [1, 5, 10, 21, 25]
13. Summary
In this chapter we have looked at examples of several recursive algorithms. These algorithms were chosen to expose you to several different problems where recursion is an effective problem-solving technique. The key points to remember from this chapter are as follows:
All recursive algorithms must have a base case.
A recursive algorithm must change its state and make progress toward the base case.
A recursive algorithm must call itself (recursively).
Recursion can take the place of iteration in some cases.
Recursive algorithms often map very naturally to a formal expression of the problem you are trying to solve.
Recursion is not always the answer. Sometimes a recursive solution may be more computationally expensive than an alternative algorithm.
14. Discussion Questions
Draw a call stack for the Tower of Hanoi problem. Assume that you start with a stack of three disks.
Using the recursive rules as described, draw a Sierpinski triangle using paper and pencil.
Using the dynamic programming algorithm for making change, find the smallest number of coins that you can use to make 33 cents in change. In addition to the usual coins assume that you have an 8 cent coin.
15. Programming Exercises
-
Write a recursive function to compute the factorial of a number.
-
Write a recursive function to reverse a list.
-
Modify the recursive tree program using one or all of the following ideas:
-
Modify the thickness of the branches so that as the
branchLen
gets smaller, the line gets thinner. -
Modify the color of the branches so that as the
branchLen
gets very short it is colored like a leaf. -
Modify the angle used in turning the turtle so that at each branch point the angle is selected at random in some range. For example choose the angle between 15 and 45 degrees. Play around to see what looks good.
-
Modify the
branchLen
recursively so that instead of always subtracting the same amount you subtract a random amount in some range.
If you implement all of the above ideas you will have a very realistic looking tree.
-
-
Find or invent an algorithm for drawing a fractal mountain. Hint: One approach to this uses triangles again.
-
Write a recursive function to compute the Fibonacci sequence. How does the performance of the recursive function compare to that of an iterative version?
-
Implement a solution to the Tower of Hanoi using three stacks to keep track of the disks.
-
Using the turtle graphics module, write a recursive program to display a Hilbert curve.
-
Using the turtle graphics module, write a recursive program to display a Koch snowflake.
-
Write a program to solve the following problem: You have two jugs: a 4-gallon jug and a 3-gallon jug. Neither of the jugs have markings on them. There is a pump that can be used to fill the jugs with water. How can you get exactly two gallons of water in the 4-gallon jug?
-
Generalize the problem above so that the parameters to your solution include the sizes of each jug and the final amount of water to be left in the larger jug.
-
Write a program that solves the following problem: Three missionaries and three cannibals come to a river and find a boat that holds two people. Everyone must get across the river to continue on the journey. However, if the cannibals ever outnumber the missionaries on either bank, the missionaries will be eaten. Find a series of crossings that will get everyone safely to the other side of the river.
-
Modify the Tower of Hanoi program using turtle graphics to animate the movement of the disks. Hint: You can make multiple turtles and have them shaped like rectangles.
-
Pascal's triangle is a number triangle with numbers arranged in staggered rows such that
This equation is the equation for a binomial coefficient. You can build Pascal's triangle by adding the two numbers that are diagonally above a number in the triangle. An example of Pascal's triangle is shown below.
1 1 1 1 2 1 1 3 3 1 1 4 6 4 1
Write a program that prints out Pascal's triangle. Your program should accept a parameter that tells how many rows of the triangle to print.
-
Suppose you are a computer scientist/art thief who has broken into a major art gallery. All you have with you to haul out your stolen art is your knapsack which only holds pounds of art, but for every piece of art you know its value and its weight. Write a dynamic programming function to help you maximize your profit. Here is a sample problem for you to use to get started: Suppose your knapsack can hold a total weight of 20. You have 5 items as follows:
item weight value 1 2 3 2 3 4 3 4 8 4 5 8 5 9 10
-
This problem is called the string edit distance problem, and is quite useful in many areas of research. Suppose that you want to transform the word "algorithm" into the word "alligator". For each letter you can either copy the letter from one word to another at a cost of 5, you can delete a letter at cost of 20, or insert a letter at a cost of 20. The total cost to transform one word into another is used by spell check programs to provide suggestions for words that are close to one another. Use dynamic programming techniques to develop an algorithm that gives you the smallest edit distance between any two words.
16. Glossary
A branch of the conditional statement in a recursive function that does not give rise to further recursive calls.
An organization of data for the purpose of making it easier to use.
a way to solve complex problems by breaking it up, solving the smaller portions, and storing the results to avoid re-calculating them.
An error that occurs at runtime.
To prevent an exception from terminating a program by wrapping the block of code in a try
/ except
construct.
A data type which cannot be modified. Assignments to elements or slices of immutable types cause a runtime error.
A function that calls itself recursively without ever reaching the base case. Eventually, an infinite recursion causes a runtime error.
A data type which can be modified. All mutable types are compound types. Lists and dictionaries (see next chapter) are mutable data types; strings and tuples are not.
To cause an exception by using the raise
statement.
The process of calling the function that is already executing.
The statement that calls an already executing function. Recursion can even be indirect – function f can call g which calls h, and h could make a call back to f.
A definition which defines something in terms of itself. To be useful it must include base cases which are not recursive. In this way it differs from a circular definition. Recursive definitions often provide an elegant way to express complex data structures.
a stack that contains a "frame" or group of data. For a call stack, this would be a function and its arguments.
A data type that contains a sequence of elements of any type, like a list, but is immutable. Tuples can be used wherever an immutable type is required, such as a key in a dictionary (see next chapter).
An assignment to all of the elements in a tuple using a single assignment statement. Tuple assignment occurs in parallel rather than in sequence, making it useful for swapping values.