CS202 Study Guide
Site:  Saylor Academy 
Course:  CS202: Discrete Structures 
Book:  CS202 Study Guide 
Printed by:  Guest user 
Date:  Wednesday, November 30, 2022, 12:02 AM 
Navigating this Study Guide
Study Guide Structure
In this study guide, the sections in each unit (1a., 1b., etc.) are the learning outcomes of that unit.
Beneath each learning outcome are:
 questions for you to answer independently;
 a brief summary of the learning outcome topic;
 and resources related to the learning outcome.
At the end of each unit, there is also a list of suggested vocabulary words.
How to Use this Study Guide
 Review the entire course by reading the learning outcome summaries and suggested resources.
 Test your understanding of the course information by answering questions related to each unit learning outcome and defining and memorizing the vocabulary words at the end of each unit.
By clicking on the gear button on the top right of the screen, you can print the study guide. Then you can make notes, highlight, and underline as you work.
Through reviewing and completing the study guide, you should gain a deeper understanding of each learning outcome in the course and be better prepared for the final exam!
Unit 1: Sets, Set Relations, and Set Functions
1a. Define sets, operations on sets, and state important set properties
 Why bother with a formal notation in any field of work?
 In what way is a set different from a collection?
 How does one use setbuilder notation to translate equations into English to achieve general understanding?
German mathematician Georg Cantor (1845  1918) initiated set theory around 1870. He defined "set" as a collection of discrete distinguishable objects having a certain relationship. In some discussions, one states whether or not set members can be duplicated. In the course material, a "set" with repeating members is called a "collection". This distinction between sets and collections holds within computer languages. Since there are variations in definitions for different arenas, be careful to understand the difference within the current situation. Objects (elements, members) can be in more than one set at a time but they are entirely in a set or entirely not in a set.
As noted in the previous paragraph, formal definitions and notations are essential for communication. All parties have to agree on the meaning of words and symbols. Otherwise, confusion and nonunderstanding reign. Within discrete mathematics, setbuilder notation is often used as a shorthand for describing a set's universe, the relationship between members of the set. For instance, Q = {a/b  a,b ∈ Z, b≠0} is a settheory way of saying, "The membership of set Q is composed of elements that result from dividing a by b such that a and b are elements of set Z and b is not equal to zero".
Review Set Notation and Relations.
1b. Categorize sets into their various types (such as singleton, finite, infinite, equal, null, proper subset, and improper subset) and give a definition of each
 What is the relationship between terminology and symbology?
 What is the difference between null and empty?
 In what way do a universal set and a superset differ?
As discussed earlier, a commonly understood terminology and notation (symbology) must exist for a topic to be discussed. Terminology typically refers to words in natural language, English in this case. "Notation" is shorthand produced by the use of "symbolism", special characters that represent words in natural language. Words, notation, and symbolism are all ways of representing concepts and ideas that enter the human mind. We can consider our thoughts and ideas without having any kind of means of expressing them. But, to communicate them to others, and to have some way of having them understood by others, we need some common way of expressing them. Thus we have terminology, notation, and symbolism.
Sometimes, there is not a onetoone match between a given term and its notation. Onetomany is often seen. For instance, a null set is the same thing as an empty set. Such sets have various symbols, depending on the document ({ }, Ø, n(X) = 0, A = 0). Symbols can have different meanings, depending on the field of work at hand. For example, in other types of mathematics, A refers to the absolute value of A. In discrete mathematics, the topic we are studying, A refers to the number of elements in set A. The variations in terminology and symbols can lead to confusion so stay alert to the forms used.
Shades of meaning are also important. Definitions of terms and notations can appear to be the same at first but there is often a fine distinction. A good example is the universal set and superset. A superset contains all elements of a particular set but not just those elements. A universal set contains all elements and all sets under consideration. It is a set of all elements of all possible sets. The figure to the right illustrates this idea. Set F is the universal set. Set E is a superset.
Review Types of Sets.
1c. Describe set notation and how that notation is used to perform operations via symbol manipulation
 Given a certain operation on a group of sets, how can that operation be represented diagrammatically?
 Are there cases when an element cannot be a member of more than one set at a time? How so?
 Are partial set memberships allowed? How so?
Operations, characterized as equations, are found in all areas of mathematics. Set theory is no different since it allows for operations to be performed on sets and set members. These operations are strictly defined since there must be a common definition for useful discussion between practitioners.
As noted earlier, an element can be fully a member of more than one set at a time. This is true if we stick solely within the framework of set theory. However, remember that we have to create solutions to situations we find in the real world. Semantics, the true meaning of words within the context of the situation, matter in that world. Take for instance the words YES and NO. In many situations, there is no MAYBE and the answer cannot be YES and NO at the same time. The set "answer" cannot contain the elements "yes" and "no" at the same time. For instance, we cannot take an action and not take that action at the same time. For set theory to have any meaning in the world outside pure mathematics, it has to be interpreted within the context of realworld situations. This provides a means of transitioning theory to reality.
There are many ways to express operations in set theory. We can use equations, generalizations, and Venn diagrams. Venn diagrams use oval shapes to depict set operations. Many people learn and understand better pictorially. Plus, pictures are an aid to understanding what equations, and especially generalizations, are saying. The same is true, for instance, with equations that deal with data that arrives over time. A graph of how the equation behaves each time unit goes a long way to assist in understanding what the equation represents, its behavior over time.
Review:
1d. Apply set definitions, operations, and properties to demonstrate set membership within a specific context
 What are the symbols associated with various set operations and generalizations?
 How does one apply Venn diagrams to illustrate the results of set operations?
 What are important ways that set theory applies to reallife situations?
In other learning goals for this unit, we have seen how precise definitions of terms, symbols, operations, and generalizations make it possible for us to communicate on a given topic. This is also true with set theory. It is important to know the basic means of discussion. Definitions in common use for social conversation are woefully vague for this purpose.
Beyond precision in definition, there is understandability. If a long equation can be generalized into a shorter equation, that leads to better understanding. Another approach is to draw a picture, a Venn diagram in the case of set theory. Graphs of various kinds are often used in other fields of work. A picture that relates sets to the result of the operations between them can be very valuable for visualizing what equations are trying to convey.
If our learning is to have value, it has to make a positive difference in the real world. Knowing or developing theory, by itself, has little value unless it is applied to a worthwhile purpose. Here is an example: A great entrepreneurial opportunity is presented by the internet. One can easily set up a store that sells items, even if one never deals with the physical items themselves. (Beware. Running a store profitably is never easy. And, it is fraught with risk. Most stores fail.) Let's say the store carries electronic components, we could say that set S, the store, carries a finite number of components, {c_{1}, c_{2}, ..., c_{n}}. There may be several of each component available, {c_{11}, c_{12}, ... c_{1m}}, if c_{1} is used as an example. One thing that would help customers know what to buy for what purpose might be a series of welldocumented kits, {k_{1}, k_{2}, ..., k_{3}}, that put various components together. Set theory can be used to bring that idea forward. We could think of a generalized set equation such as K_{1} = {c_{12}, c_{31}, c_{1005}}. This a shorthand way of saying Kit #1 is composed of: two units of Component #1, one unit of Component #3, and five units of Component #100. The Venn diagram to the right illustrates this idea. There is no reason software could not be written to create such diagrams from inventories and parts lists, where the size of the circles represents the number of components available. Links embedded in the image could lead to all manner of detail. To get the number of parts in a kit via this generalization:
where:
 K_{in} refers to the number of parts (components) in kit I
 p refers to the part number
 y refers to the number of parts of that part number used in kit I
While this last equation may be too complex for the case at hand, it shows that complexity is available when needed.
Review:
Unit 1 Vocabulary
This vocabulary list includes terms you will need to know to successfully complete the final exam.
 collection
 elements
 empty set
 finite
 formal definitions
 formal notations
 members
 notation
 null set
 set
 setbuilder notation
 superset
 symbolism
 terminology
 universal set
 Venn diagram
Unit 2: Counting Theory
2a. Describe counting, permutation, and combination rules
 Why is it insufficient to only calculate the number of raw members of a given set?
 How is it possible to estimate the amount of work needed to account for a range of possibilities?
 How are counting techniques used to enumerate options? How so?
In Unit 1, we learned to calculate <set>, to count the number of members contained in a given set. While it is important to know how many members a set has, that is only the beginning. We can also enumerate options and evaluate the impact of those options on a given situation. For instance, let's take those members as the itemization of steps in a particular process. If each step can be taken independently of the other steps, it is possible to arrange their accomplishment in numerous ways (permutations), depending on the resources available at any given time. If there is only one stepperformance resource available, then only one step can be taken at a time. Yet, "independently" means that it does not matter which step happens at any given time, as long as each step happens only once. Thus, the sequence does not matter, as long as all steps are accomplished. The steps can be ranked according to timetoaccomplish and then do the quickest steps first in order to show rapid progress. If all steps take the same amount of time, then any order will do. It depends on the situation. Similar thinking applies to situations where more than one stepperformance resource is available.
Other performance criteria come into play in situations where order matters or when repetition is allowed. For example, consider combinations of steps and how the steps can be arranged if they are taken two at a time. The main point is that mathematics exists to calculate and enumerate the options. We can then study the options and decide which course of action to take. What is most needed in the beginning is to study carefully so that all members of the set, all factors impacting a domain, are known.
Review Introduction to the Rule of Products.
2b. Compute the number of permutations and combinations of the members of a given set
 What is the difference between the number of a set's subsets and the number of ways to order the members of that same set?
 The "key" to a database management system is a unique attribute value assigned to each record. How does that concept apply to the ordering of a set's members?
 Computing a factorial, N!, is often used in various forms of mathematics. How is it useful in set theory?
We continue our study of various ways to organize members of a set, and how to project the number of ways a set's members can be organized. We can count the members of a set, project the number of subsets of those members, and the number of ways those subset members can be arranged. We can also determine the method by which those members have to be arranged. Such is the benefit of set theory.
There is a certain unity in mathematics. For instance, taking the factorial of a given number, where 0! = 1, is used, for instance, in Taylor Series, a means of estimating the result of various equations. In set theory, factorials are used to calculate the number of ways the members of a set can be ordered, <set>!. This gives the number of unkeyed orderings. If we select a key, some criteria unique to each member, then there is only one ordering. But, if the key is not unique, then the number of possible orderings is ≤ <set>! Set theory provides for exact calculation but also allows for estimates, in this sense.
Review Ordering Elements of a Set.
2c. Determine the number of all possible outcomes of a collection of events
 What is the difference between counting groups of ordered elements and counting groups of unordered elements?
 What is an example of cost estimation when creating groups of unordered elements?
 Is it possible, using set theory, to estimate the number of members of a group under imperfect conditions? How so?
Remember it is possible to calculate the number of groupings of a set's members. However, we assumed that there was only one way to order each group. In our present area of learning, we assume that groups can be unordered, that a new group is formed if the order of membership of that group changes. First, determine the number of ordered groups and then determine the number of ways the members of each group can be arranged. (Note that some literature uses the term "subset" for "group").
The ability to list all groups and all options in a given situation can have serious implications for cost estimating. For example, let's take set members as a list of tasks, the sequence of which does not matter. If we list all unordered groups, we assume that each ordering has the same cost to carry out. However, each ordering can be assigned an estimated cost of carrying out that task sequence, if one task is performed at a time. We could, for instance, rank tasks in a subset by their ability to support other members of that subset. Then, of all the arrangements of the original ordered collection of subsets, we could select the one having the highestranked member as the first member. In this way, we evaluate all possibilities according to some clear metric. This moves our decision from subjectivity to objectivity, a very datacentric approach to things.
So much of "theory", including set theory, that inhibits its practical application is that theory assumes certain, sometimes unspoken or unrealized, conditions. An example from our readings is the flipping of a coin. For the examples to play out as written, one must assume the coin is evenly balanced so that there is always an equal chance of heads or tails being rolled. But what if the coin is not evenly balanced? What if there is a 75% chance of heads being rolled and 25% of tails. (We assume the coin is thin enough that it cannot land on its side.) We might multiply the count under perfect conditions by the count under imperfect conditions, as a start. One must consider all factors impacting the situation at hand.
Review:
Unit 2 Vocabulary
This vocabulary list includes terms you will need to know to successfully complete the final exam.
 combinations
 enumerate
 key
 factorials
 ordered
 permutations
 subset
 unkeyed orderings
 unordered
Unit 3: Mathematical Logic
3a. Describe declarative statements in math that have 100% "true" or 100% "false" values
 What is the difference between stating a proposition in propositional logic and in human language?
 What is the difference between "implication" and "consequence"?
 What is the worth of a truth table?
As it says in the readings, "Logic is the study of consequence". Mathematics is meant to be objective symbology and methods for determining facts and truth, given various circumstances. This topic, however, deals with facts that are either 100% true or 100% false.
Facts, when considered together, have consequences. Sometimes this is because of physics. Other times it is due to civil or criminal law. Other times, it is something all agreed upon. For instance, in color science, colors are named according to their mixture of pure Red, Green, and Blue, the color's RGB representation. To call anything other than RGB = (255,0,0) "RED" is incorrect, according to the tight definitions of color science. International tables of RGB triplets and their associated color names exist so that there is some agreement. Here is where a truth table comes in handy. A colorname truth table might appear such as shown here:
Red (R) 
Green (G) 
Blue (B) 
Color Name 
255 
0 
0 
RED 
. . . 
In a table of this type, a color name can only appear once in the farright column and a triplet of RGB values can only appear once. We might say, "If R =255, and is G = 0, and is B =0; then the color's name is 'RED'". Another way to view this is more akin to the truth tables we are studying::
RED = 255 
Green = 0 
Blue = 0 
R & G & B 
1 (true) 
0 (false) 
0 
1 
1 
0 
1 
0 
1 
1 
0 
0 
1 
1 
1 
0 
0 
0 
0 
0 
0 
0 
1 
0 
0 
1 
0 
0 
0 
1 
1 
0 
Using the second form of the table is much harder to represent and understand all the facts (the RGB values and associated color names). But, it underlies the first table. Clearly, in either case, a measurement of RGB values is needed if we are to determine the consequence, the name of the color. The facts must be known exactly. While that is not always possible in human situations, we must know what needs to be known, what facts affect a situation. Then we need to investigate so we can discover what the values of those facts are, what the complete truth is.
Notice that a consequence results from the facts. We determine consequences only from what is known and assume that everything that needs to be known is indeed known and that associated data is accurately determined. For instance: in the notation of our subject, the triplet (R, G, B) is measured data. We know that (R,G,B) implies some color name, (R,G,B) → (color name) or "Measurements of R, G, B imply a specific color name. According to the truth table, we know that this implication is a fact, (255,0,0) → (RED). Here we write it in a formal argument to show how data and facts lead to a conclusion (consequence).
Contrary to what this implies, we absolutely do care that the data is accurate. We care what is contained in every implication. If the data is not correct, the implication is not correct. We cannot assume that an implication, however boldly stated, is true. Further, we cannot accept implications drawn from conclusions or facts without verifying the implications are true. This is fundamental to propositional logic. As explained in our readings, we might find that certain implications and facts are known and correct. But, if other implications and facts are not part of the investigation, then the assumption cannot be taken as valid.
Review Propositional Logic.
3b. Classify sequences of logical statements as yielding true or false conclusions, given the sequence of statements and the values of individual statement parameters (arguments)
 Explain the truth table for the conditional proposition, "if p then q". How can the truth of that statement be true if p is false and q is true?
 If we propose that x → y, do we necessarily propose, by implication, that y → x? How so?
 How does the clarity of a definition affect the truth of a proposition?
Propositional logic offers a means of considering statements and determining if they are true or false. Our readings make this appear to be an easy thing because the examples have clear definitions for the terms in the proposition. But, clarity is not always easy to achieve. For example, we might say true or false is easy to determine for the statement, "The number 5 is an odd number". But, as we apply propositional logic to practical applications, what does "odd" mean? If we limit the discussion to pure mathematics, we know the definition of "odd" and can say the statement is true. But, if we make the statement in the domain of temperature control and are expecting a reading of 5, then that number would not be odd at all, in the sense of the domain and expectations for readings. Basically, the domain of discussion matters a great deal.
We also have to think clearly about what is said and what is not said as we determine whether or not a statement is true. Consider the situation described in the readings regarding the statement, "if p then q". Its truth table is repeated here:
If p and q are both false, then the statement is true since p is supposed to directly lead to q. If p is true and q is false, for the same reason, we know the statement to be false. And so it is also true when p and q are both true. But, how is it that if p is false and q is true that the statement is true? There is nothing that says q cannot be true if p is false. Put another way, just because product p has not been delivered does not mean product q cannot be produced. In this situation, product p may already be on hand in sufficient quantity. As discussed in the readings, other factors may come into play when making a decision. As a lawyer once said to this writer, any reasonably intelligent person can understand what is said. It takes a lawyer to understand what is not said. The same is true in computer programming and other fields of work. Definition, clarity, situation, and logic matter.
Another area of confusion is that many people, colloquially, assume that if x implies y then y also implies x. This is false and causes much confusion because correlation does not mean causation. Another way of looking at it is to say that, because a blue bottle is a good bottle, all good bottles are blue. An assembly line may be set up to produce bottles that filter out certain wavelengths of light. Blue might be the best filter, and so bottles are accepted in that color. But, lesser shades of blue might also be sufficient filtering, and the difference may not be visible to the human eye. Thus, not all good bottles are necessarily blue, in the strict sense of the definition of "blue".
Review Propositions and Logical Operators.
3c. Write a logic equation based on the naturallanguage statement of a problem
 As statements become more complex, is there a way to summarize them so that the summary is logically equivalent to the original statement?
 Why do we use specialized notation instead of just writing out what we want to say?
 In what way do logic operations compare to arithmetic operations?
Like arithmetic operations, logic operations have strict definitions. Each symbol (notation) has a certain meaning that is exactly specified. The specification is exact so that there is no confusion with colloquial languages. And, arithmetic and logic symbols tie their meaning to certain other symbols. For example, in both systems, arithmetic and logic, we can write, "A + B". Without knowing which system we are speaking of, we cannot determine the meaning of the variables A, B, nor the operator +. In arithmetic, we know that A, B refer to numbers and that the operator means we are to determine the new value when A, B are added. Numeric addition is exactly specified and we know how to carry out that operation. In logic, "A + B" means we are to take the OR of A,B. Since we are in the logic system, not the arithmetic system, we know that A, B are binary values, {0, 1}, {false, true). We know how to carry out the OR operation since OR has a precise definition. (Of course, there are other systems of logic that are not binary but discrete multivalued systems. Still, in our sense of the word, "logic" refers to binary).
Because of the exact specification of logic operations and variables, we can often translate colloquial languages into equations, if we can overcome the vagueness of colloquial words. This is important because it allows understanding to be achieved where there is only confusion. At times too, simplification can be achieved that enables more effective and efficient achievement in practical situations. Consider this situation:
Reviewing our complex project plan, our goal will be achieved if we reach milestones X and Y; or if we reach milestones Y and Z, while at the same time milestones Y or Z are reached.
Since complexity can lead to confusion, let's try to simplify things. Using the syntax (notation) of logic equations, the colloquial statement could be written as G = (X and Y) or ((Y and Z) and (Y or Z)).
Sometimes it is the case that complex logical statements can be summarized into an equivalent form. This leads to better understanding and easier transition to practical application. Here is an example:
Reviewing our project plan, our goal will be achieved if we reach milestones X and Y; or if we reach milestones Y and Z, while at the same time milestones Y or Z are reached.
Complexity can lead to confusion. Let's try to simplify things. Using the syntax of logic equations (logic notation), this could be written as G = (X and Y) or ((Y and Z) and (Y or Z)). Using a simplified version of the equation, we get G = XY + (YZ(Y + Z)). By the rules of manipulating logic equations, we can then write: XY + YYZ + YZZ, which becomes XY + YZ + YZ, which we easily reduce to XY + YZ, which reduces further to Y(X + Z). Looking back at the original colloquial statement, we see that if we accomplish milestone Y and either milestone X or Z, then the goal is accomplished. This reduces the complexity of the original project plan and improves our understanding of what must be accomplished. Such an approach can also reduce cost in various categories.
Objective logical thought can help transition plans to accomplishment. The same is true of transitioning theory to reality.
Review:
3d. Calculate logical expressions of multiple variables assigned values according to a given circumstance
 Why is it important for logical laws to be proposed and proven?
 What is the role of truth tables in the context of logical laws?
 Describe the importance of investigation.
We bring our study from pieces to a whole in this learning outcome. Laws of logic are proposed and proven so that it is not necessary to start from scratch when examining a complex statement or proving a more involved basic law. Truth tables can be a guide to this effort since they allow us to chart basic facts and show how they combine to ultimately yield true or false for the final statement. This is possible because facts, in binary logic (what we are studying), can only be true or false. Ultimately, complicated laws of logic are constructed from basic building blocks, less complex laws, and those from statements that are completely true or completely false.
Investigation is essential to the application of the laws of logic. Every law of logic depends on certain facts being true or false. But, is the proposed fact actually true, or is it actually false? Does a shade of colloquial meaning matter? For example, if we say that the color RED is composed of 255 units of RED visual frequency and 0 of any other frequency, (R,G,B) = (255,0,0), Is it still RED if (R,G,B) = (254,0,1)? Technically, the answer is no. But, is it RED enough to be called RED for the purpose at hand? That has to be determined via investigation and is not a matter of personal opinion.
Review:
Unit 3 Vocabulary
This vocabulary list includes terms you will need to know to successfully complete the final exam.
 assumption
 binary logic
 causation
 conclusion
 consequence
 correlation
 equations
 equivalent form
 formal argument
 implication
 logic operations
 proposition
 propositional logic
 simplification
 syntax
 truth table
 valid
Unit 4: Mathematical Induction and Proofs
4a. Discuss how one conclusion leads to another, in the mathematical sense, for a given situation
 What is the difference between a theorem and an axiom?
 Why is investigation important to mathematical systems?
 How do specific definitions impact a mathematical system?
Mathematical systems are evolved through investigation, what is termed "research" in the figure taken from the first reading in this unit. Yet, it is not always topdown. Sometimes, through intuition, imagination, or some other creative process, it is possible that a theory will be postulated. But then, research (investigation) must take place to prove or disprove that theory. Once proven, a theory becomes an axiom, something that can be used to prove more complex theories.
Underlying even research are clear and specific definitions. We must know exactly what we are talking about. If we say "straight line", what does that mean within the context of the mathematical system in which we are engaged? If we use the symbol "+", what exactly does that mean? If we say, "the color is RED", what do we mean by that? Context is important. And, we must be exact within that context so as to transition colloquial speech, which is vague, to technical speech, which is precise.
Review Mathematical Systems.
4b. Write the terms of a logical sequence as a generalized formula
 Why are truth tables not always the best approach to proving a statement to be true or false?
 How can we overcome the limits of truth tables?
 What are two approaches to generalized proofs?
Truth tables are a fine approach to logical proof when there are few variables (facts) involved. But, as the number of variables increases, the size of the truth table increases as does 2V, where V is the number of variables. The table will have a minimum of V columns and 2V rows. For most industrial applications, the use of such a table is intractable, given the time and machinery available. Thus, we need a different approach. This involves the use of Direct and Indirect Proofs.
Direct and Indirect Proofs follow a series of steps that begin with basic axioms and use those as building blocks to build to a final proof or disproof. Direct Proof seeks to demonstrate that a theorem is true. Indirect Proof seeks to prove the theorem is false. We followed the same basic approach when we built truth tables to prove the truth or falsehood of a statement. We started with variables and then added logical statements until the final statement was shown to be true or false under all possible circumstances. Direct and Indirect Proofs generalize this approach.
Review:
4c. Use mathematical induction to prove the validity of logical sequences
 Why is mathematical induction important?
 What are some examples of how induction can explain complex situations?
 Explain the difference between mathematical induction and strong induction?
Mathematical induction and its cousin, strong induction, allow us to overcome the possibility that a situation may contain an infinite number of variables. Strong induction extends mathematical induction in that it allows us to demonstrate additional assertions that are true if a given assertion is proven true by mathematical induction. Real life contains an unending collection of complex situations. Many of these contain variables that are based on previous versions of earlier variables in a neverending stream. Let's examine one of those.
Zeno's Paradox describes an archer shooting at a target that is some distance, D, away. Once the arrow, A, is set loose, it travels to the halfway point to the target. Then it flies to a point half of the remaining distance. This continues indefinitely, for each step, T. The arrow is always heading for the next halfway point. That being the case, how does the arrow ever reach the target? (We know that it does, eventually, if the archer is accurate.) Here is a solution modeled after the readings' examples. It is drawn from the problem description.
A(0) = D * (1/2^{0}) base case, distance to target prior to arrow being fired.
A(1) = D * (1/2^{1}) situational properties
A(2) = D * (1/2^{2})
...
A(10) = D * (1/2^{10})
A(T) → A(T + 1) inductive step
A(∞) = D * (1/2^{T})_{lim T }→ ∞ calculus
A(∞) = D * 0 = 0 final distance to target is 0, the arrow strikes the target
Notice that, in this philosophical context, infinite steps do not necessarily take infinite time. Mathematically, the solution is reasonable. But, is it reasonable from a physics perspective? Can we really take infinite steps in finite time? Calculus, by the way, offers integration as another approach. We can always integrate using this equation: . Thus we show that all the small distances add up to the whole distance. (Integration allows us to perform addition of an infinite number of values. (Recall from the readings that ∑ is the Greek letter for summation and S is the Latin letter for summation. So, this is where ∫comes from). In any case, we have two clear and provable indications that the arrow does indeed reach the target, in finite time, even though there appears to be an infinite number of steps in the process.
Review:
4d. employ inductive reasoning to situations where there is sufficient evidence to warrant generalization
 What is the end result of Peano Postulates?
 Give another name for "strong induction".
 What should we look for when we try to engage inductive reasoning?
To paraphrase the last section of this unit, inductive reasoning's foundation began to be laid in the 1800s with the work of Richard Dedekind. He developed a set of axioms that describe the positive integers. Later that same century, Giuseppe Peano refined these axioms, created a logical interpretation, and generalized them into the Peano Postulates. These are the cornerstone of mathematical induction.
Another name for "strong induction" is "courseofvalues induction". From that perspective, we can attempt to apply inductive reasoning by being alert for patterns in an evolving situation or a sequence of events. For instance, Zeno's Paradox is based on the sequence of milestones reached by the arrows. The progress of the situation can be identified by the event leading to each milestone, traveling the distance from the prior milestone to the present milestone, the arrow traveling from the prior halfway point to the present halfway point. We finally realize that A(T) → A(T + 1). From this general statement, we can write a general solution to the problem that does not depend on the specifics (distance to target or any milestone). We can even generalize to the extent that A(T) → A(T + n), where n is any step in the process.
The detection of patterns can be very important to problemsolving. But, so can the lack of discernible patterns. For instance, some computer languages allow memory references beyond predefined vectors and arrays, "memory overruns". This leads to erratic behavior in computer programs. It is rare that any pattern will be observed. That is one reason programmers will restart a program and run it several times under the exact same conditions. If no pattern occurs, they know to look for memory overruns.
Each type of behavior requires a different approach to solve. Apply the approach that leads to the best solution for the situation at hand.
Review:
Unit 4 Vocabulary
This vocabulary list includes terms you will need to know to successfully complete the final exam.
 axiom
 base case
 courseofvalues induction
 Direct Proof
 Indirect Proof
 mathematical induction
 memory overruns
 inductive reasoning
 integration
 Peano Postulates
 strong induction
 theorem
 variables
 Zeno's Paradox
Unit 5: Probability
5a. State the definitions of terms that apply to probability within the context of this course
 What is the essential goal of probability?
 Why is a tree of possibilities important?
 How does our ability to count play in probability?
 What are the steps of a basic analysis process?
Fundamentally, probability asks the question, "Out of all possible outcomes, what percentage of the time could a specific outcome, or group of outcomes, be expected to occur, under the specified circumstances?" This is an important consideration, for instance, when deciding on how to allocate resources as a team tries to account for what is "most likely" to happen. This consideration is critical because reaching infinite perfection in many situations requires infinite resources. Yet, resources are always finite. So, we try to rank possibilities by their likelihood of occurring and then account for those.
We are often faced with situations where all the variables are not known. But, we still must make the best choice possible within that situation, although it may turn out we make the wrong choice when all is said and done. (This is not our fault, just the way things turned out.) One approach to making the best choice possible is to build a decision tree based on what we know. Here is an example of such a tree:
Notice that not all paths through the tree are shown and not all paths are labeled. This avoids clutter while adding clarity. The tree shows how one event can lead to another with a certain probability. Player 1 makes a choice (the tree root), then Player 2 can make a choice until no other choices can be made. A player cannot make a choice that has already been made. If we make purely random choices, the tree shows the chance that a particular choice will be made. Once no further choices can be made (a leaf of the tree), we have reached the final event and can calculate the probability of that event. Looking at the tree, the final event D has a probability of . It is thus possible to calculate the probability of subsequent events at any tree node. Clearly, probabilities are not always uniform when considering all possible outcomes.
Being able to count possible combinations of items allows us to calculate the probability that a certain combination will occur. Specific combinations have limited opportunity to occur given a limited universe of items that can be combined in various ways. For example, what is the chance that we will draw four aces from a deck of 52 playing cards? Begin by recalling that refers to selecting groups of k objects from a base group of n objects, without replacement. That syntax is also written as C(n,k) in various books, papers, and articles. It is calculated as . Dealing four aces from a total of four aces can only be done in C(4,4) = 1 way. Dealing any nonace card from the rest of the 48 cards can be done by C(48,1) ways. Any five cards can be drawn from a deck of 52 cards in C(52,5) ways, the total number of different fivecard hands. Thus, the probability of dealing the four aces is .
Following a formal process of calculation allows us to overcome the limitations of human intuition. Here is a good general process:
 Identify all possible outcomes.
 Identify all desired outcomes.
 Calculate probability of each possible outcome.
 Calculate probability of each desired outcome.
Review Introduction to Discrete Probability.
5b. Calculate conditional probabilities (the probability of event Y occurring if event X has already occurred)
 What is the difference between dependent and independent events?
 Describe a decision tree and show its value.
 What does "conditional probability" mean?
 How does one calculate a conditional probability?
Conditional probability gives the probability that an event will occur after some other event has occurred. Decision trees offer a structure that allows us to compute probabilities for some chain of dependent events. Thus, as trials combine to create events, we can overcome the limits of human intuition using an objective process coupled with a clear framework.
Let's dig deeper into conditional probability by examining an elementary example. Imagine that there are four experiments that we want to conduct, given our research into a particular topic. Because of our limited budget, we can only conduct two of those experiments. We have to choose two of the four possibilities and then proceed to conduct those two. We decide that there is an equal chance of choosing any two of the four experiments (under a uniform distribution) and then an equal chance (under a uniform distribution) of one of the two experiments being successful. What is the chance that a given pair of experiments will be selected? What is the chance that a particular experiment will be successful?
A solution, for this very simple situation, can be visualized with a decision tree, as shown in this figure:
There are only six pairs of experiments that can be selected. Thus, the chance of selecting a given pair is 17%. For each pair, the chance of success is 50%. Thus, the conditional probability of a given experiment's success is 0.50 x 0.17 = 0.09 = 9%, given the situation described. Of course, in reality, scientific experiments may well all fail entirely. But, even then, value is attained since it is important to know what does not work.
Review Conditional Probability.
5c. Compute the probability of independent events
 How does the probability of an independent event affect the probability of a dependent event?
 What is the difference between P(A ∩ B) and P(A  B)?
 What does P(B) mean?
If we ask the value of P(A  B), the conditional probability of EventA occurring given that independent EventB has already occurred, we first have to know the probability of independent EventB, P(B). EventA is dependent upon EventB for this calculation. P(B) is calculated as {B} / {all possible events}.
As an example of an independent event, P(B): if there are 25 blue products out of 100 total products then the probability of a random (uniform distribution) product selection being blue equals 25/100 = 0.25, 25%.
P(A ∩ B) is the probability that EventA and EventB occur at the same time, the intersection of EventA and EventB. If EventA and EventB cannot occur at the same time, they are mutually exclusive,
so P(A ∩ B) = 0.
An example of a dependent event, P(A  B) = P(A ∩ B) / P(B): Assume there are 50 yellow products. Now we ask, what is the chance that a blue product is picked (EventA) from the total collection of 100 products, knowing that a blue or yellow product was
picked (EventB)? We first have to determine how many products overlap in the total collection of blue and yellow products,  A ∩ B . Clearly,
{blue} ∩ {blue, yellow} = {blue}. We already know that P(blue) = 0.25. P(blue U yellow) = (25 + 50)
/ 100 = 0.75. Thus we come to the result that P(A ∩ B) / P(A U B) = 0.25 / 0.75 = 1/3 = P(A  B).
In this way, we arrive at the probability of a dependent event happening as a result of an independent event. This bears greatly on applications such as calculating the probability of a fault occurring in a critical piece of equipment. We start all the independent failures that can occur and then calculate the probability that certain failures can occur at the same time. Clearly, we want to do this objectively and carefully, without worrying about whether or not we would like such things to occur or if we can afford for them to occur or afford to keep them from occurring. Rather, what is important is that we know failure rates and their impacts. If we cannot afford to deal with all possibilities safely, we should not undertake the task. A good contrary example is the report of a failed oildrilling platform causing death and billions of dollars in heavy pollution because funding was refused by "management" for maintenance and repairs.
Review:
5d. Estimate the chance of occurrence of a specific event within a collection of events.
 What do we mean by "a collection of events"?
 How does set theory aid us in calculating probabilities?
 What is the difference between discrete and continuous probability?
Discrete probability estimates the probability that a specific (discrete) event will occur. Discrete events are those with a finite number of outcomes. Examples are: rolling a specific number with standard dice or getting heads or tails after flipping a coin. When we flip a coin, there are only two possible outcomes: heads or tails. When we roll a sixsided die, we can only obtain one of six possible outcomes, 1, 2, 3, 4, 5, or 6.
Continuous probability deals with variables such as the height of a human, values which can vary over an infinitely continuous scale. Whereas discrete probability deals with countable standalone events, continuous probability deals with values on a continuous scale. This is much like dealing with integers, which can take on only certain values, vs. floatingpoint numbers, which can take on an infinite number of values between any two integer values. We do not address continuous probability in this course.
A collection of events is when we refer to some member(s), {M} of a given set of all possible events, {E}. (Note that {M} is a subset of {E}.) For example, the roll of a standard sixsided die can only result in six events, {E} = {1, 2, 3, 4, 5, 6}. We can ask the probability of some set of events occurring. An example is P(M) = P(1) so that {M} = {1}. We can also ask about the independent probability of an even value being rolled, P(M) = P(2, 4, 5) so that {M} = {2, 4, 6}. P(M) = {M} / {E}
Using set theory we can determine {M1} ∩ {M2} and {M3} U {M4}, both of which are needed to calculate dependent probabilities. P(A  B) = {A ∩ B} / {A U B}.
Review Sample Spaces, Events, and Their Probabilities.
Unit 5 Vocabulary
This vocabulary list includes terms you will need to know to successfully complete the final exam.
 collection of events
 conditional probability
 continuous probability
 decision tree
 dependent events
 discrete events
 discrete probability
 independent events
 leaf
 probability
 random
 root
 trials
 uniform
Unit 6: Recursion
6a. Define linear recurrence and give examples
 What is recursion? Refine that definition for linear recursion.
 When is recursion an appropriate implementation approach?
 What are the three drawbacks of recursion?
Recursion is a technique often used in computer science to define a function by itself. To do this, the recursive function first partially solves the given problem or divides it into several subproblems and then calls itself with these new subproblems. This continues until the problem is completely solved or the solution is broken down into trivial individual parts.
Recursive solutions are often more elegant, easier to find, easier to check for correctness (e.g. using mathematical induction), and easier to explain than iterative approaches. But, they are less efficient, from a computational perspective, due to the many function calls and attendant memory use, both of which are difficult to predict when recursion is employed. That is why, in the case of performancecritical programs, especially in hardreal time embedded systems having limited resources, it often pays to use purely iterative approaches rather than recursion. Since recursive and iterative functions are equally powerful, there is an iterative equivalent for each recursive implementation.
Linear recursions are recursions in which there is a maximum of one recursive call per recursion step. The calculation runs along a single chain of calls. Linear recursions can be resolved very easily. As the number of recursive calls per step increases, however, conversion to pureiterative becomes more difficult.
Be careful when choosing recursion vs. iteration. This writer has found that embedded systems should generally not use recursion, whereas nonembedded systems tend to have the resources to support recursion and not have strict timing constraints.
Review:
6b. Explain important recursive functions
 Where does one find recursion applied in a practical way?
 Within mathematics and computer science, how is recursion defined?
 What are the three requirements for recursion to be applied effectively?
Recursion appears in many fields of work, including art and literature, not just mathematics and computer science. One can see this in the general definition, "Recursion is the process of repeating something in a selfsimilar way". One usually sees recursion defined in terms of computer science as a function that calls itself. While that is true, in its own limited way, it does not embrace recursion's extensive application.
From the perspective of mathematics and computer science, "Recursion is the process of describing an action in terms of itself. Generally, one has functions that call themselves, an initialization case that describes conditions at the start, and a base case that gives a specific situation when the recursion will cease and a final result produced".
Review:
6c. Produce recursively defined sequences
 Why is practice important when learning to apply theory?
 Why is the transition of theory to application necessary?
 What is the essential nature of an expert?
Practice makes perfect. An old colloquial sentiment but very very true nonetheless. It is often the case that ancient advice is the best advice. This learning goal encourages us to consider how various equations can be cast in a recursive manner. By looking at a wide range of examples, the theory behind recursion will begin to "click", to come together in clarity and usefulness.
Clarity and usefulness are essential to the transition of theory to practice. Given the imperfect world in which we live, the infinite accuracy of mathematics cannot be fully implemented. For instance, a computer cannot even divide 1 by 3 accurately. Yet, we still need programs that reflect theory. We still need bridges to stand up to heavy use. We still have to make justifiable implementation decisions.
Review Explanation and Examples.
6d. Write recursive relations that are defined in terms of themselves at increasingly earlier moments in time
 In what way do data sequences relate to data over time?
 What are potential benefits of gathering data on a temporal basis?
 What are the limits of human observation of temporal data?
The acronym IoT stands for "Internet of Things". What that comes down to is the generation, transmission, analysis, and application of data over time within a specific situation. We have studied data sequences. In this case, we are talking about data
that arrives over time,
(D_{t0}, Dt_{1}, Dt_{2}, ...).
An example is determining machine faults via sound. This figure shows data derived over time during a machine's operation. The left graph is due to digitized sound. The right graph is an analysis that shows abnormal variations in that sound. Once variations are analyzed, it is possible to determine whether or not the machine needs maintenance or is about to fail.
It is rarely possible to have someone listening to a machine or watching an evolving graph with a long enough attention span to catch every variation. This unlikeliness increases as the number of machines increases. The same is true of complex systems with numerous data generators. An example is a modern aircraft and its plethora of internal sensors. A home is another example. For instance: smoke, fire, gas, and temperature sensors that can tell inhabitants well ahead of time that something is wrong.
We can easily see that data over time is important if we are going to be aware of important situations and take appropriate action.
Review Predicting Data Sample Values.
Unit 6 Vocabulary
This vocabulary list includes terms you will need to know to successfully complete the final exam.
 attendant memory use
 embedded systems
 function calls
 iterative approaches
 iterative equivalent
 linear recursions
 nonembedded systems
 recursion
 recursive function
 recursive implementation
 recursive solutions
 sequences
Unit 7: Graphs
7a. State the components of a graph
 What is the basic notion of a graph?
 What is the difference between graphs and trees?
 What is the difference between a directed and undirected graph?
 What are isomorphic graphs?
A basic graph consists of a nonempty set of vertices (nodes, elements), V, and a set of edges, E, which is a subset of the set V × V (every vertex connected to every other vertex). Note that although a set can be empty, a graph cannot exist without at least one vertex. However, a graph CAN exist without edges (links between nodes).
Graphs are a superset of trees and have applications in many domains, not just technical ones (for example, the management domain). We will talk more about trees in the next unit. For now, the difference between a graph and a tree is that (1) graphs have no unique root, while trees have a unique root; (2) graphs can have cycles, while trees cannot have cycles.
Directed and undirected graphs are the two most basic types. When an edge (path) exists between two nodes, a directed graph can only traverse the path in one direction. An undirected graph allows traversal in both directions.
Isomorphic graphs are bijective. There is a onetoone correspondence between vertices and between edges. There are three features of isomorphic graphs:
 The order of the graphs are the same, V = V'
 The size of the graphs are the same, E_{V} = E_{V}'
 They share the same degree of sequence, corresponding nodes have the same number of edges.
Review Basic Graph Theory and An Introduction to Graphs.
7b. Describe the two principal graph traversal paradigms
 What are the two main graph traversals discussed in this unit?
 Why are those two approaches important?
 What is one condition that must be met for the two main graph traversals to be accomplished?
 What is an ncube? Draw an example.
There are two important methods of moving through (traversing) a graph, Eulerian and Hamiltonian. Both refer to undirected graphs. Eulerian traversals touch every edge but only once each. Hamiltonian traversals touch every vertex but only once each. For either traversal type to be used, the graph has to be fully connected, all vertices have to be part of one network. Not every graph is capable of Eulerian or Hamiltonian traversal. So, one cannot depend only on those. One can start designing that way as a means of gaining efficiency. But, to gain effectiveness, traversing the entire graph, one sometimes has to depart from pure Eulerian or Hamiltonian.
In the readings, there is a good example of a graph that does not have an Eulerian traversal, the wellknown Koenigsberg Bridge Problem. A good example of a graph that does not have a Hamiltonian traversal is shown in Figure 1.
Figure 1. Example of graph for which there is no Hamiltonian traversal.
An ndimensional hypercube is also called an ncube. Basically, this is a graph that can be best visualized in multiple dimensions. The ncube is typically denoted Qn, where n represents the number of dimensions. Two examples are shown in Figure 2.
Figure 2. Examples of multidimensional graph representations.
A brief application can be found in analogtodigital conversion. In this case, the number of dimensions is 3, the number of bits used in the conversion. As the arrow moves across the dial of a gauge, it covers a special coating underneath. In so doing, it causes a binary string, a digital signal, to be sent to a computer. This string is interpreted within the computer as a way of converting the angle of the arrow into a 3bit binary number. An adjacency of readings can be represented by a graph. This approach is illustrated in Figure 3.
(inner ring ignored since it is always 0) 
Figure 3. Various representations of a 3cube, Q3, and its use in analogtodigital conversion.
Review Traversals: Eulerian and Hamiltonian Graphs.
7c. Explain the difference between directed and undirected graph
 What is a graph?
 Draw examples of directed and undirected graphs.
 What is a common notation for listing vertices and edges of graphs?
 What is "direction" an example of, relative to graphs?
As it says in our readings, "A graph is a formal mathematical representation of a network, which is itself a collection of objects connected in some fashion". There are two kinds of graphs, directed and undirected. Directed graphs have arrows to show possible transitions from one vertice (node) to another. If an arrow does not point from Node A to Node B, for instance, a transition from Node A to Node B is not allowed. On the other hand, an undirected graph has no directional arrows to indicate allowed transitions. The assumption is that transition is allowed along both directions of all edges. Figure 1 compares undirected and directed graphs. It also shows a common notation for listing a graph's vertices and edges.
Figure 1. Examples of undirected and directed graphs. Notice the common notation for Vertices (V) and Edges (E).
Direction of transition is one way of adding additional information to a graph. Another way is to show what is needed for a transition to take place in a system whose phases are represented by a graph. For instance, a number along an edge can be used to represent a variable's value that triggers a transition from one system phase (vertice) to a subsequent phase. This is illustrated in Figure 2.
Figure 2. Example of a graph being used to represent system states.
(Notice that there are two states to which the system could be initialized.)
Review Basic Graph Theory.
7d. Use graphs to solve applications involving associated events
 What is graph traversal optimization?
 Why do we care about graphtraversal optimization?
 What are some approaches to graphtraversal optimization?
 Is it reasonable to expect graph traversal to be perfectly optimized in practical situations?
Graph traversal refers to visiting each node in the network. Optimization refers to the cost or benefit of carrying out the traversal. All good planning aims for both efficiency and effectiveness. Efficiency leads to minimizing cost or maximizing benefit. Effectiveness refers to getting the task accomplished according to the task's requirements. It is insufficient to be efficient if effectiveness is not achieved.
Consider the undirected graph in Figure 1. It is but a simple example. Typical examples in reality are much larger, containing many more nodes and edges.
Figure 1. Weighted graph and its associated adjacency matrix.
W(X,Y) represents the weight assigned to the edge between nodes X and Y. Notice that, in this case,
W(X,Y) = W(Y,X), and that W(Y,Y) ≠ 0. The weight refers to the cost or benefit of moving from one node to another along that edge. In an example cost
minimization task, we have to visit all nodes, one at a time, at minimum , where e is the number of edges employed. For benefit optimization, we would want to maximize the summation. We can only visit a node by moving along an
edge from an adjacent node. What can make this a difficult task is an increase in the number of nodes or the graph being directed so that it is not possible to move in both directions from one node to another. Further, it may be the case that not
all nodes are connected to all other nodes. Bruteforce optimization is not always possible given the time available.
There are various heuristics for optimizing graph traversals. The easiest is Nearest/Furthest Neighbor. If one wants to minimize the summation of weights, one follows the lowestweighted edge to an adjacent unvisited node. To maximize, the highestweighted edge is followed. That is the simplest description, but it does not say what to do when there are unvisited nodes in the graph but there are no unvisited nodes adjacent to the presently occupied node. At that point, one has to begin visiting nodes twice in an attempt to visit the remaining nodes.
Review Graph Optimization.
Unit 7 Vocabulary
This vocabulary list includes terms you will need to know to successfully complete the final exam.
 adjacency matrix
 adjacent
 bijective
 directed graph
 edges
 elements
 Eulerian traversals
 graph
 graph traversal
 Hamiltonian traversals
 isomorphic graphs
 Koenigsberg Bridge Problem
 ncube
 nodes
 optimization
 path
 superset
 trees
 undirected graph
 vertex
 vertices
 weighted graph
Unit 8: Trees
8a. Define the basic components of a tree
 What is a tree?
 What are the components of a tree?
 What does acyclic refer to?
 What types of applications can be represented by trees?
A tree is an undirected graph in which any two vertices are connected by only one path. The best way to explain the basic components of a tree is to use a figure. Such a figure is shown in Figure 1.
Figure 1. Tree labeled with its basic components
There are other ways to form a tree, although all trees are, in a sense, hierarchical. In Figure 2, Graphs i, ii, and iii are all trees, while graphs iv, v, and vi are not trees.
Figure 2. Examples of trees and nottrees.
Notice that trees are acyclic, there are no cycles in a tree. However, also notice that trees are undirected graphs. So, it is possible to return to the root from a leaf if one follows the return path. This does not make for a cycle. A cycle is illustrated in Figure 3.
Figure 3. Transforming a tree into a cyclic graph.
Trees are quite useful in many applications. An example is providing an ability to undo prior actions. A tree can represent those actions as they progress so that justprior actions can be used to replace present actions. Recall what game trees are. These amount to showing what actions are possible given the present situation.
Review:
8b. Explain trees as a special case of graphs
 What is the difference between a graph and a tree?
 What is the maximum number of edges in a tree?
 Is a tree always undirected?
 What is meant by a single path from one vertex to another?
Trees are graphs but are a special case, a subset of the class "graph". Our readings give a good explanation of that subset.
A tree is an undirected graph, G, that satisfies any of the following equivalent conditions:
 G is connected and acyclic (contains no cycles).
 G is acyclic, and a simple cycle is formed if any edge is added to G.
 G is connected, but would become disconnected if any single edge is removed from G.
 G is connected and the 3vertex complete graph K3 is not a minor of G.
 Any two vertices in G can be connected by a unique simple path.
If G has finitely many vertices, say n of them, then the above statements are also equivalent to any of the following conditions:
 G is connected and has n − 1 edges.
 G is connected, every subgraph of G includes at least one vertex with zero or one incident edge.
 G has no simple cycles and has n − 1 edges.
Contrary to the way trees are typically drawn, trees can be directed or undirected. For instance, it is not always possible to follow an exact return path. If there is a tree (or other graph, for that matter) with some edges showing direction and some not, edges without direction notations are assumed to be bidirectional.
Trees can have only one path between nodes. A simple example of a graph that is not a tree because of dual paths between nodes is shown here:
Being special cases of graphs, trees can be manipulated and employed in ways that graphs cannot. While reading through the course material, be alert for those differences.
Review:
8c. Discriminate between binary and nonbinary trees
 What is a "binary tree"?
 What are standard ways to traverse a binary tree?
 What are ordered rooted trees?
 What are two important applications of binary trees?
A binary tree is a special case of rooted trees. A rooted tree has a labeled node called the root of the tree. This node generally has no ancestors. Each node of a binary tree has no more than two children. This type of tree is very much used in realistic problemsolving since so much data can be placed into a binary structure.
There are four standard ways to traverse a binary tree: Preorder, Inorder, Postorder. (these fall into the category of depthfirst traversals), and breadthfirst. The course material examines depthfirst in detail. BreadthFirst starts at the root and visits each node on each subsequent level before moving on to the next lower level.
When we look deeply at the various traversal methods, we see that they all depend on rooted ordered trees. A rooted ordered tree is a tree that has a root and whose subtrees are arranged in a specific order and which themselves are rooted ordered trees. A good example is shown in the figure. Note that, while both trees contain the same data on the same levels, with the same descendants and ancestors, they are not the same tree. And, the same traversal scheme applied to both trees will deliver different results.
Two important applications are discussed in the course material: sorting and the representation of numeric expressions. Take, for example, the game, "guess the number"? A range of numbers is given, a guess is made, and the response is either "high", "low", or "equal". That is exactly what a binary sort tree does with numbers as they arrive over time. It thereby builds a tree of numbers that can be easily searched to see if the number in question is present in the tree. If one takes "the number" as a unique key for a data record, the binary tree becomes an excellent means of creating a structured database for storing and searching those records.
For mathematical expressions, these are arranged in a way that facilitates certain types of processing. Infix notation is the notation commonly used in arithmetical and logical statements. It is characterized by the placement of operators between operands, "infixed operators", such as the plus sign in 2 + 2. We use this approach with most handcalculators. Reverse Polish notation, also known as Polish postfix notation or simply postfix notation, is a mathematical notation in which operators follow their operands. An example is 2 2 +. There are handcalculators that use this notation. Polish notation, also known as normal Polish notation, Łukasiewicz notation, Warsaw notation, Polish prefix notation or simply prefix notation, is a mathematical notation in which operators precede their operands. An example is + 2 2. Postfix notation is used in many stackoriented programming languages like PostScript and Forth. CoffeeScript syntax also allows functions to be called using prefix notation, while still supporting the unary postfix syntax common in other languages.
Review Binary Trees.
8d. Identify subgroups within a specific graph that includes all vertices and a minimum number of edges
 What is a spanning tree?
 Suggest applications of spanning trees?
 What is a method of counting spanning trees within a graph?
A spanning tree, T, of an undirected graph, G, is a subgraph that is a tree that includes all of vertices in G, using the minimum number of edges. Spanning trees can also be defined as a maximal set of edges of G that contains no cycle, or as a minimal set of edges that connect all vertices, without forming a cycle. A graph can contain more than one spanning tree. This definition has many implications regarding the optimization of tree structure and traversal.
"Optimization" and "traversal" are very situationally dependent. Spanning trees are especially useful if we want to minimize the number of "lines" between destinations. This problem occurs in networks of all kinds. For instance, establishing lines for power, telecommunications, internet connectivity, or piping between various destinations. In all of these, we want to avoid cycles, returning to somewhere already visited. We may even want to maximize the benefit of traveling along a given path while visiting each node only once.
That there may be more than one spanning tree available for consideration is important because one tree may be more expensive to implement, or yield less benefit than another. How do we know how many spanning trees exist in a given undirected graph? How many such options do we have for finding a minimum or maximum optimization, while satisfying all requirements? Our course material gives us a simple approach to that:
To compute t(G), the number of spanning trees within a graph, G, construct a square matrix in which the rows and columns are both indexed by the vertices in G. The entry in row i and column j is one of three values:
 The degree of vertex i, if i = j
 −1, if vertices i and j are connected
 0, if vertices i and j are different from each other but not connected
The resulting matrix is singular, so its determinant is zero. However, deleting the row and column for an arbitrarily chosen vertex leads to a smaller matrix whose determinant is exactly t(G).
Let's examine the simple example posed by this graph: 

If we build the matrix described above, we get this table. The determinant of this table is indeed zero. 
Node Names A B C A 2 1 1 B 1 1 0 C 1 0 1 
Here is the result after culling an arbitrary node from the table. The determinant of this table is one, exactly what we would expect from observation. 
Node Names B C B 1 0 C 0 1 
Following this same procedure with a fournode graph where all nodes are connected to all other nodes, we find that there are sixteen possible spanning trees. In the case of multiple spanning trees being available, we can eliminate all other edges, once a spanning tree has been selected. (Note: Calculating the determinant of a matrix can be rather involved. That calculation is outside our scope. However, those who are interested can usually find a function for that within their favorite spreadsheet. For instance, Excel uses the MDETERM builtin function).
What is wrong with "optimizing" in some situations, for instance, data network minimization? Reality may dictate criticality of connectivity. Reliability of communication may be necessary so that a lack of dataflow only lasts for a small amount of time. Thus, redundancy may be more important than data network minimization. This may require two spanning trees to be installed and the acceptance of cycles. Be sure to not optimize into nonperformance.
Review:
Unit 8 Vocabulary
This vocabulary list includes terms you will need to know to successfully complete the final exam.
 acyclic
 binary tree
 bidirectional
 game trees
 hierarchical
 infix notation
 leaf
 Polish notation
 postfix notation
 return path
 reverse Polish notation
 root
 rooted tree
 spanning tree
 traverse
 trees
Unit 9: FiniteState Automata
9a. Illustrate abstract machines that can be in exactly one of a finite number of states at any given time
 What is a machine "state"?
 What are applications of finitestate automata?
 What are approaches to visualizing such machines?
 What are the various terms used for "finitestate machine" that one finds in the literature?
As stated in the introduction to this unit: A finitestate machine (FSM), finitestate automaton (FSA), finite automaton, or state machine, is a model that describes an abstract machine. That machine can be in only one of a finite number of states at any point in time. The FSM can change from one state to another as it responds to data inputs that indicate when some condition is satisfied. The change from one state to another is called a transition. An FSM is defined by a list of its states, its initial state, and the conditions for each transition. Often, state machines are illustrated as graphs whose nodes are the states and whose links are the transition conditions.
A "state" is the behavior of a machine at a given moment in time. Machines are capable of only a finite set of behaviors. A state can also be described as the status of a system that is waiting to execute a transition. A transition is a set of actions (behaviors) to be executed when a condition is fulfilled or when an event is detected in the data received. The actions (machine behaviors) triggered by a transition continue until the next transition occurs.
Simple applications of finitestate automata are found in vending machines, which dispense products when the proper combination of coins is deposited; elevators, whose sequence of stops is determined by the floors requested by riders; traffic lights, which change sequence when cars are waiting; and combination locks, which require the input of a sequence of numbers in the proper order.
There are numerous ways to picture (or represent) a state machine: Unified Modeling Language, Specification and Description Language, state/event tables, graphs, etc. Choose the approach that is most understandable for the project at hand. Another choice basis is ready implementation of a model in an automated tool. Preferably, a tool is chosen whose depictions are readily understandable by project personnel.
Review Finite State Machine Overview.
9b. Analyze systems that recognize input patterns, accepting or rejecting an input depending on whether a given pattern occurs
 What is an important personal attribute during technical projects?
 If logical statements can be reduced to an equivalent, is that possible with state machines? How?
 Why is reduction of logical statements and state machines important?
 What are the components of a good state machine specification?
The deeper one goes into technical subjects, the more precise one's terminology must become, and the greater is the attention to detail that one must have. Otherwise, understanding and exposition become increasingly difficult, if not impossible. This is true of the material in this course, and for all technical topics.
We have already seen how logical statements can be reduced so that they are simpler, while yielding the same results. The same is true of state machines. There are two conditions that must hold: 1) each data input while the states are active must lead to equivalent state transitions, and 2) equivalent states will always have the same outputs once the transition occurs. Reduction of logic statements and state machines is important from an implementation, and therefore, cost perspective. A tradeoff with human understandability is made so that cost of construction and ownership can be reduced. However, there is no reason that baseline designs cannot be performed in nonreduced forms, with reduction being made prior to construction.
The course material talks about the components of a state machine specification that can be used for implementation. This author makes some additions from practical experience:
 A set of states, behaviors, the system has to exhibit.
 A set of expected system inputs that are clearly specified.
 A set of outputs the system is supposed to exhibit for each system state.
 Transition rules that cause the system to switch from one state to another.
 A state the system is in when first initialized.
 A clearly specified endstate, if there is one.
 States that exist to handle unexpected inputs, errors, at any given state.
Review Putting the Basics to Use.
9c. Construct discretestate systems
 What is the difference between Moore and Mealy machines?
 What is similar about Moore and Mealy machines?
 Are the reduction requirements for Moore and Mealy machines the same?
Moore machines have an output that is constant. Once the state is achieved, there can only be one output. However, the state in question can only be achieved when a certain system input is acquired while the system exists in a different specified state. Mealy machines, on the other hand, do not have a constant output. Instead, a state's output depends upon the system input received, while the system is in a prior specific state.
In a very real sense, Moore machines are a special case of Mealy machines. Moore machines accept only one input and deliver one output. Mealy machines accept multiple inputs and deliver multiple outputs. So, Mealy machines are the general case. Moore machines can be developed using the same techniques. Both Mealy and Moore machines can only be in one state at a time, thus the term discretestate systems.
Optimization for both is the same. One seeks states where: 1) each data input while the states are active must lead to equivalent state transitions, 2) equivalent states will always have the same outputs once the transition occurs. Those states can be combined.
Review Putting the Basics to Use.
9d. Predict how a given system will enter different states as new data is input to the system over time
 What must be true for a state machine design to be valid?
 What are two important ways to represent a state machine?
 What is a possible disconnect between state machine design and physical implementation?
 What is the rate of growth of a state machine's complexity as more bits are added to a binary system input sequence?
There are two ways that state machines are usually represented: graphs and transition tables. These are used for various purposes. The visualization of a graph is typical when human understanding is important. It gives a visual sense of what the system does at any point in time. Both graphs and tables are easily represented in a computer program. So, their analysis and manipulation are readily automated.
At a minimum, a state machine specification must have the following for the specification to be valid. This is not just a theoretical curiosity. There are many implications for practical application:
 An initial state.
 A separate state for each step of the proper data sequence received from the system input.
 The ability to handle all possible system input data sequences, including those unexpected.
 Mutual Exclusivity: Only one transition from a state can be had for any system input.
 Collective Exhaustivity: Each state must specify what is to be done with ALL possible system inputs.
There is often a disconnect between theory and reality, and between design "on paper" and implementation in reality. For example, let us assume that we are using binary numbers to represent system input data sequences. The digits (bits) making up the number get set according to what the system generates as input. That is the mathematical theory, numbers written in binary, numbers in Base 2, N_{2} instead of N_{10.} The binary number is called a "word" in digital electronics.
Digital electronics standardizes on certain word sizes (number of bits in a binary number). One finds word sizes in 4, 8, 16, 32, and 64 bits. Custom word sizes can be constructed but that is a lot more expensive. Most designtoimplementation uses standard word sizes. Let's do just that. Now, assume the state machine we want to implement uses only three bits. It is great to optimize a state machine design since that can lead to lower implementation cost. But, the smallest fourbits is the smallest word size we can purchase from standard components, and we want to stick to standard components. So, all our memory, and words will be fourbit oriented. Data flow will be four, not three, bits. Memory will be four bits. Memory addresses will be able to access ALL memory, not just that addressable by three bits. Be sure to accommodate memory address errors from unexpected words being generated as system input. Also be sure state transitions account for those too.
When just considering binary words, the complexity of a state machine can increase rapidly as a larger range of possible system inputs become possible. Just adding one bit to a word causes the system data input potential to double. Be aware of that and be careful to fully accommodate all possibilities.
Review State Transition Diagrams.
Unit 9 Vocabulary
This vocabulary list includes terms you will need to know to successfully complete the final exam.
 abstract machine
 discretestate system
 finite automaton
 finitestate automaton (FSA)
 finitestate machine (FSM)
 initial state
 Mealy machines
 Moore machines
 optimization
 state
 state machine
 state transitions
 transition
 transition tables