CS201 Study Guide

Site:	Saylor Academy
Course:	CS201: Elementary Data Structures
Book:	CS201 Study Guide

Printed by:	Guest user
Date:	Tuesday, 15 April 2025, 11:48 AM

Navigating the Study Guide
Unit 1: Abstract Data Types and Arrays in C++
Unit 2: Introduction to Stacks and Queues
Unit 3: Pointers and References in C++
Unit 4: Dynamic Memory Allocation
Unit 5: Linked Stacks, Queues, and Lists
Unit 6: Algorithm Efficiency
Unit 7: Searching and Sorting Algorithms
Unit 8: Hash Tables, Graphs, and Trees

Navigating the Study Guide

Study Guide Structure

In this study guide, the sections in each unit (1a., 1b., etc.) are the learning outcomes of that unit.

Beneath each learning outcome are:

questions for you to answer independently
and a brief summary of the learning outcome topic with linked resources.

At the end of each unit, there is also a list of suggested vocabulary words.

How to Use the Study Guide

Review the entire course by reading the learning outcome summaries and the linked resources.
Test your understanding of the course information by answering questions related to each unit learning outcome and defining and memorizing the vocabulary words at the end of each unit.

By clicking on the gear button on the top right of the screen, you can print the study guide. Then you can make notes, highlight, and underline as you work.

Through reviewing and completing the study guide, you should gain a deeper understanding of each learning outcome in the course and be better prepared for the final exam!

Unit 1: Abstract Data Types and Arrays in C++

1a. Solve program implementation requirements using multidimensional arrays

Describe the difference between contiguous and non-contiguous memory for any object.
Demonstrate various means of implementing arrays.
Show the difference between variable addresses and pointers.
What is a linked-list, and what are its components?
How does one create a linked-list?

Understanding how to arrange data in a computer's active and persistent memory is an important component of your study of computer science. The choice of arrangement depends very much on the application at hand; even so, it should never be so inflexible that it can't evolve with that application's requirements. Many "standards" in software engineering and software project management assume, if only implicitly, that a project's requirements are static. In the real world, that is never the case.

In this subunit, we focus on the most common data structure, arrays. These are native to most languages. Since we are dealing with C/C++, be sure you understand the rudiments of native array implementation in that language. Additionally, be sure you know the difference between contiguous and non-contiguous implementation. Although we do not cover it in this course, it is worth noting that "non-contiguous" can also refer to components of an array (or any data structure) that do not exist on the same physical computer that the access/manipulation software is running on.

Access to non-contiguous arrays and other data structures is facilitated by memory pointers. Memory pointers use memory addressing to locate the various components of a data structure.

Linked-lists are also part of the discussion on arrays since arrays can be implemented using nodes that point to other nodes. Each node can be interpreted as an array cell. As we have seen in the course, linked-lists are the basis for many other useful data structures. There are many approaches to the implementation of linked-lists; you should be familiar with those.

1b. Relate C/C++ structures and classes to ADTs

What is the Standard Template Library (STL), and why is it important?
Define "Abstract Data Type" (ADT).
How are classes and structures and arrays related to ADTs?
What are the important attributes of ADTs?

Abstract data types (ADTs) are the general case of specific types, such as arrays and linked-lists. Be sure you are familiar with the general concept of ADTs (in this video from 44:30). You should also understand the relationship between the general concept and C/C++ classes and structures.

The Standard Template Library (STL) is now part of the C++ language. Review the STL (in this video from 21:08-29:25) since it is applied regularly in professional practice.

1c. Apply the general concept of ADTs to lists and arrays accessed via pointers

Review author implementations of linked-lists. How could they be improved?
Review author implementations of arrays. How could they be improved?
Why is it important to release memory allocated at runtime after it is no longer needed?
Explain how memory can be available but unusable.

Use your favorite IDE to practice with arrays and linked-lists. Practice is essential. You need experience to be successful in your professional work. Programs used by various authors are presented as illustrations throughout the course. It is well worth your while to load these into your IDE and get them to work. As you do this, study them to understand their logic and how it translates to C/C++ syntax.

Unit 1 Vocabulary

This vocabulary list includes terms that might help you with the review items above and some terms you should be familiar with to be successful in completing the final exam for the course.

Try to think of the reason why each term is included. Understanding linked-lists, arrays, and how they relate to ADTs lays the foundation for much of what is done in the application of data structures. Here are some terms you should be familiar with:

Node
Cell
Array
Vector
Linked-list
Abstract data type
Contiguous
Non-contiguous
Standard Template Library
Pointer
Address

Unit 2: Introduction to Stacks and Queues

2a. Recognize and visualize stacks and queues

What is a stack?
What is a queue?
What is the essential difference between stacks and queues?
Recognize diagrams that illustrate stacks and queues.
Be able to modify a stack ADT so that it functions like a queue.

We mentioned lists earlier. That discussion continues in this unit; see pages 96-99 of this book. Lists form the foundation of stacks and queues. There is only a subtle difference between stacks and queues. Both add newly-arriving elements to the same end of the list. However, the stack processes the most recently-added element. It removes the element from the end of the list to which elements are added. Queues, on the other hand, remove elements from the other end of the list from which elements are added. In this way, stacks process the element that has been on the list the least amount of time (last-in/first-out, LIFO). Queues process the element that has been on the list the longest amount of time (first-in/first-out, FIFO). Thus, code that implements either abstract data type can be easily modified to implement the other type. Stacks can be easily modified to function as queues and vice versa.

2b. Solve software requirements that require stacks or queues

What are the various means of implementing stacks and queues?
How do an application's requirements lead to the selection of stack or queue ADTs?
What are the various means of implementing lists?
How can a vector be implemented as a linked-list?
What should one look for to tell if an ADT's code implements a stack or queue?

Stacks are most commonly implemented using linked-lists (as shown on pages 120-127 of this book), as are queues (pages 127-144). There are means of implementing stacks using arrays that can be readily modified to act as queues. Understanding which applications are suitable for stacks and which are suitable for queues will help you see the practical side of these two abstract data types.

Unit 2 Vocabulary

This vocabulary list includes terms that might help you with the review items above and some terms you should be familiar with to be successful in completing the final exam for the course.

Try to think of the reason why each term is included.

stack
queue
list
linked-list
array
push
pop
element
object
peek
buffer
rotate
head
tail
empty
insert
LIFO
FIFO
linear data structure
sequential access
top element
bottom element
circular list
list overflow

Unit 3: Pointers and References in C++

3a. Interpret native C/C++ variable declarations relative to data organization and access

How do bytes compose words in computer hardware?
What are various native data types and the amount of memory each occupies?
How can we extend the accuracy of numeric data types using extended declarations?
What are the means by which C/C++ programs can report the actual memory occupied by a given native variable type and by contiguous abstract data types?

Although this is not a language course, we presented a brief review. Understanding the syntax of C/C++ is required before studying memory pointers, and knowledge of pointers is essential to working with abstract data types and data structures. C/C++ has numerous native data types. These can be combined into more complex abstract data types, along with manipulation and access methods. Each native data type and each abstract data type occupies memory, which, along with runtime, is another resource that is consumed whenever a program is running.

3b. Choose approaches to using pointer variables for memory access within scalars, vectors, and arrays

How does one declare a pointer to a native data type or abstract data type?
What is necessary to be done to obtain the address of a variable?
To what does a memory pointer actually refer?
By what arithmetic process can the address of cells within contiguous-memory vectors and arrays and the elements of abstract data types be obtained?
Do vectors, arrays, and other abstract data types necessarily occupy contiguous memory? Explain.

Be certain that you understand memory pointers (watch this video from 35:51 until the end). These are fundamental to data structures. Review how a pointer's value is incremented and decremented and how pointers can be manipulated to obtain the value of data elements. For more detail, look into pointer arithmetic. This is a practical matter that applies to any contiguous-memory object and is also useful within objects occupying non-contiguous memory. Be sure to get plenty of practice since pointers and pointer arithmetic are usually not well understood by those just beginning their study of data structures.

Unit 3 Vocabulary

This vocabulary list includes terms that might help you with the review items above and some terms you should be familiar with to be successful in completing the final exam for the course.

Try to think of the reason why each term is included.

sizeof
pointer
address
extended data type
compound data type
word
byte
width
bit
initialization
type qualifier
pointer incrementing
pointer arithmetic
indirection
dereferencing

Unit 4: Dynamic Memory Allocation

4a. Construct and deconstruct memory-based CDTs

What are the various approaches to runtime memory allocations?
For each approach to runtime memory allocation, what is the appropriate approach to deallocation?
Why is it important to deallocate runtime-allocated memory as soon as it is no longer needed?
Is memory allocated at runtime for a given ADT always contiguous?
Is a single deallocation statement always sufficient to deallocate an ADT's runtime memory?
Is setting a pointer's value to NULL sufficient to deallocate the memory referenced by the pointer?

We often do not know how many variables or which variable types will be needed while an application is running. C/C++ provides various native types, and we can create compound data types (CDTs) as needed. The application will give us a global view of the limits on types. For instance, if the application never operates on an "unsigned long int", then we know that type will never be needed. But, if the application operates on "long int", then we know that type is needed – but not always how many. This is how dynamic (runtime) memory allocation and deallocation comes into play. We can allocate memory to variables of different data types as needed and deallocate that memory (that is, return it for other uses) when it is no longer needed. Memory allocation is introduced in slide 13 of these notes. This page extends that discussion to ADTs in general. Keep in mind that allocating memory is not sufficient; you have to also manage it. Pay special attention to the methods, terminology, and exercises on this page.

4b. Choose appropriate means of making the best use of available memory in dynamic applications

What are the two major issues that can arise when a program employs dynamic memory?
How is it possible for a program to run out of memory even though the OS reports plenty of memory?
How can a program fail a dynamic allocation when there is more than enough free memory available?
How can the memory of one data type be converted into the memory of another data type?
Why is it not a good idea to interleave allocations of large and small blocks of memory?
What is a good way of calculating the amount of memory to allocate? Why do we do that?

Memory fragmentation is an interesting problem, and is important to consider when employing dynamic memory. There are ways to deal with this issue that are described in this article. Some even go so far as to write the own allocation routines. All of this is part of managing dynamic memory, which is discussed as part of learning outcome 4a. Interestingly, it is possible to reuse memory without deallocation or reallocation. Variables in C/C++ can be recast as long as both variable types employ exactly the same amount of memory. This is an important approach and is not unsafe as long as you are careful.

Unit 4 Vocabulary

This vocabulary list includes terms that might help you with the review items above and some terms you should be familiar with to be successful in completing the final exam for the course.

Try to think of the reason why each term is included.

"unsafe" variable
casting
allocation
deallocation
fragmentation
orphaned memory
NULL pointer
ADT destructor
ADT constructor
scope
memory leak
dynamic memory
static memory
memory lifetime
memory reuse

Unit 5: Linked Stacks, Queues, and Lists

5a. Show how lists, stacks, and queues are used to implement specific requirements

What is a linked-list?
How does a linked-list differ from a vector list?
In what way does list implementation impact the list's abstract functionality?
What is the basic functionality that has to be added to a list to create a queue?
What is the basic functionality that has to be added to a list to create a stack?
What are requirements specifically suitable for stacks?
What are requirements specifically suitable for queues?

We continue our discussion of linked-lists on pages 103-120 of this book. Linked stacks and queues are built from linked-lists. Nodes (objects, elements) are the linked components that point to each other within linked-lists, as described in this article. Nodes are formed from structures or classes as objects. The objects generally contain pointers to other objects of the same class. However, it is not necessarily the case that all nodes of a linked-list are of the same class. Keep that generalization of the concept behind linked-lists in mind.

5b. Construct software to employ lists, stacks, and queues within larger applications

What are singly-linked-lists?
What are doubly-linked-lists?
How does a singly-linked-list differ from a doubly-linked-list?
If a stack or queue were implemented as a doubly-linked-list, what might its advantages be?
What are applications within which stacks could be employed?
What are applications within which queues could be employed?

Begin your review by exploring singly-linked and doubly-linked-lists. Selecting the particular implementation within applications is important since it is not unusual for the basic stack and queue functionalities to be insufficient. Be sure you understand linked stacks on pages 123-125 of this book, and their implementation, as described in this article. The same goes for linked queues (on page 133) and their implementation, as described here. Examine, build, and run these code examples using your favorite IDE.

Unit 5 Vocabulary

This vocabulary list includes terms that might help you with the review items above and some terms you should be familiar with to be successful in completing the final exam for the course.

Try to think of the reason why each term is included.

pointer
linked-list
singly-linked
doubly-linked
stack
queue
dynamic memory allocation
pointer to next
pointer to previous
head
current
tail
push
pop
insert
LIFO
FIFO

Unit 6: Algorithm Efficiency

6a. Examine algorithms relative to their resource utilization

What are the various resources consumed when a program is running?
Why is it important to balance resource utilization when examining program efficiency?
What is an effective empirical approach to resource analysis?
Why are considerations of algorithm efficiency are important?
What are the classical equations that describe the efficiency of various resource utilizations?
Why does one better concentrate on the underlying efficiency equations rather than their magnitude?

An understanding of algorithm efficiency is part of creating effective applications. Realizing their importance (as described in this video from 31:30 to the end), you can guide algorithm development to achieve the most satisfying results. Many applications depend on code modules running within a given length of time, even if the application only requires soft-real-time. A major topic within this domain is Big-O analysis. This is a means of discovering an algorithm implementation's efficiency measure. Most discussion in this arena focuses on runtime. However, you must also consider other resources. An example is the tradeoff between memory and time. For distributed systems, you must also consider the volume of data on the network.

Unit 6 Vocabulary

This vocabulary list includes terms that might help you with the review items above and some terms you should be familiar with to be successful in completing the final exam for the course.

Try to think of the reason why each term is included

Big-O analysis
cO(f(n))
efficiency vs. effectiveness
space/time tradeoff
constant, linear, logarithmic, quadratic resource consumption growth rates
difference between best, worst, average cases
resource utilization growth with increasing data loads
hardware/software tradeoffs
upper bound
lower bound
empirical analysis

Unit 7: Searching and Sorting Algorithms

7a. Understand common search methods

Which are the commonly-used search methods?
What is a process for determining which search method is best for a given application?
In what way does data search depend on data sorting?

We began our discussion of search by using vectors for illustration. Do not forget that linked-lists can be used as vectors. First, however, you should consider the algorithm's efficiency and effectiveness. In our study of search, we concentrated on linear, binary, and Fibonacci searches. While linear search is the least efficient in the Big-O sense, the data does not have to be pre-sorted. Plus, it is a very good approach when there is only a small volume of data.

7b. Understand common sort methods

Which are the commonly-used sorting methods?
What is a process for determining which sorting method is best for a given application?
In what way does search impact sorting efficiency?

The ability to sort data allows us to organize it for various uses and search for specific items. We spent a lot of time on divide-and-conquer methods, especially merge sort. Divide-and-conquer allows for the best use of multi-core and multi-node networked systems when large data volumes are involved. Besides merge sort, we also discussed insertion sort; bubble and selection sorts; and quicksort (review this video until 18:44).

7c. Plan efficient and effective search and sort algorithms

Why is it necessary to balance efficiency and effectiveness?
Under what circumstances do we see similar resource utilization for all search or sort methods?
Do the units of efficiency always have to do only with the computer itself? Explain.

Review the difference between recursive and nonrecursive implementations of quicksort. You will notice that the recursive version is much easier to understand. That does have value. However, it is difficult to determine a recursive implementation's memory requirements. For many modern systems, especially embedded systems, you cannot assume the availability of sufficient memory to accommodate whatever is required. From another perspective, cloud-memory is typically far more expensive than runtime. You must be able to calculate a cost-budget rather than assume whatever money is needed will be available. That simply will not work in commercial situations.

7d. Judge searching and sorting algorithms relative to their application efficiency

What are the average runtime O(f(n)) of merge, insertion, binary, bubble, selection, and quicksort?
What are the average runtime O(f(n)) of linear, binary, and Fibonacci search methods?
State means of carrying out Big-O analysis of various approaches to search and sort.

There are general approaches (described on pages 55-85) to analyzing an algorithm's efficiency. To see two good examples, look at Big-O analyses of merge sort and insertion sort.

7e. Rate search and sort algorithms relative to an application's needs

How would one balance ease of system integration with O(f(n)) for a given algorithm's implementation?
Why care about code warnings, memory leaks, and memory fragmentation if the implementation "works"?
For what reason might we not care about a resource's average O(f(n)) when choosing an algorithm?

Having studied search and sort algorithms to determine their average resource O(f(n)), it is time now to look at specific implementations. It is true that an algorithm is not the same as the algorithm's implementation, although many people will hand you a piece of code and refer to it as an algorithm. Always review the code carefully. I like to use static and dynamic review tools since manual approaches can miss subtle problems. Do not be fooled by people telling you that code warnings do not matter. Not only do they obscure problems, but they also have unrecognized interactions that can cause system failures. For this particular subunit, review the code-level expressions of merge, insertion, bubble, and selection sorting algorithms. You should also review the code for binary search.

Unit 7 Vocabulary

This vocabulary list includes terms that might help you with the review items above and some terms you should be familiar with to be successful in completing the final exam for the course.

Try to think of the reason why each term is included.

merge sort
insertion sort
bubble sort
selection sort
quicksort
linear search
binary search
Fibonacci search
O(f(n))
Big-O analysis
divide-and-conquer
recursive
nonrecursive
worst, average, best O(f(n)) case

Unit 8: Hash Tables, Graphs, and Trees

8a. Demonstrate effective hash tables

How does hashing determine the position of an object/element in the data storage space?
What are the means of handling collisions?
Is it necessary for the hash key to be unique across all key/data pairs?
In what way does the statistical distribution of key values affect hashing effectiveness?
How can data search be a part of a hashing capability?
How can data sorting be part of a hashing capability?

Hashing goes beyond searching and sorting to calculate the position of a key/value pair in a data storage space. No system is ever perfect. In hashing, it is possible that two different key/value pairs will try to occupy the same location in the data storage space. There are numerous ways to handle that situation, as discussed in this lecture and this follow-on lecture. Other approaches are described here on pages 324-345, with a focus on practical implementation.

8b. Demonstrate effective graphs

What is the difference between a graph and a tree?
How is the efficiency and effectiveness of a graph traversal algorithm related to a graph's topology?
In what ways may a graph be represented by a computer program?
Which are the two most common ways to visit all nodes in a graph?
Which are the two most commonly-required graph paths?

In computer science, the term "graph" refers to a set of vertices (points or nodes) and of edges (lines) that connect the vertices. The creation and traversal of graphs is an important topic when learning to solve complex problems that involve numerous relationships. For an excellent discussion on graphs and graph traversal, read Chapter 11 of this open-access text. As in all things, we need efficiency and effectiveness. Both are defined relative to the application at hand, although it is true that O(f(n)) can be estimated in general for various paradigms. We studied various graph traversals. Each is useful for different applications: finding non-cyclical paths, depth-first search, breadth-first search, lowest-cost path, minimal spanning tree.

8c. Demonstrate effective trees

What is the difference between binary and non-binary trees?
What is the difference between partial and complete trees?
How might one compare trees to graphs?
What are the basic elements of a tree?
What are the two major approaches to visiting every node of a tree?
How are infix and postfix notations related to tree-building and traversal?

Trees are a special case of graphs. A tree is a graph whose nodes do not have multiple references. Plus, tree traversal is only in one direction, downward, from root (top) to leaves (bottom). Review this page for basic definitions and illustrations. Many applications can be satisfied by binary trees (review chapter 5). However, do not be satisfied with being able to respond to common requests. A deeper understanding of trees enables solutions to a wider range of requirements. Continue your study of trees by reviewing non-binary trees (review chapter 6).

Unit 8 Vocabulary

This vocabulary list includes terms that might help you with the review items above and some terms you should be familiar with to be successful in completing the final exam for the course.

Try to think of the reason why each term is included.

Relative to Hashing

hashing
hash function
hash table
key/value pair
collision
chaining
open addressing
bucket
overflow
linear probing
load factor

Relative to Graphs

graph
digraph
labeled graph
weighted graph
vertex
path
depth-first
breadth-first
spanning tree
verticy
edge
undirected graph
adjacent
simple path
cycle
simple cycle
subgraph
acyclic graph
traversal
shortest path
minimal spanning tree
lowest-cost path

Relative to Trees

root
leaf
node
branch
parent
child
level
ancestor
descendent
depth
height
size
incomplete tree
complete tree
infix
postfix
red/black tree
binary search tree
Huffman coding tree
subtree
depth-first
width-first
post-order
pre-order

CS201 Study Guide

Table of contents

Navigating the Study Guide

Study Guide Structure

How to Use the Study Guide

Unit 1: Abstract Data Types and Arrays in C++

1a. Solve program implementation requirements using multidimensional arrays

1b. Relate C/C++ structures and classes to ADTs

1c. Apply the general concept of ADTs to lists and arrays accessed via pointers

Unit 1 Vocabulary

Unit 2: Introduction to Stacks and Queues

2a. Recognize and visualize stacks and queues

2b. Solve software requirements that require stacks or queues

Unit 2 Vocabulary

Unit 3: Pointers and References in C++

3a. Interpret native C/C++ variable declarations relative to data organization and access

3b. Choose approaches to using pointer variables for memory access within scalars, vectors, and arrays

Unit 3 Vocabulary

Unit 4: Dynamic Memory Allocation

4a. Construct and deconstruct memory-based CDTs

4b. Choose appropriate means of making the best use of available memory in dynamic applications

Unit 4 Vocabulary

Unit 5: Linked Stacks, Queues, and Lists

5a. Show how lists, stacks, and queues are used to implement specific requirements

5b. Construct software to employ lists, stacks, and queues within larger applications

Unit 5 Vocabulary

Unit 6: Algorithm Efficiency

6a. Examine algorithms relative to their resource utilization

Unit 6 Vocabulary

Unit 7: Searching and Sorting Algorithms

7a. Understand common search methods

7b. Understand common sort methods

7c. Plan efficient and effective search and sort algorithms

7d. Judge searching and sorting algorithms relative to their application efficiency

7e. Rate search and sort algorithms relative to an application's needs

Unit 7 Vocabulary

Unit 8: Hash Tables, Graphs, and Trees

8a. Demonstrate effective hash tables

8b. Demonstrate effective graphs

8c. Demonstrate effective trees

Unit 8 Vocabulary