Topic | Name | Description |
---|---|---|
Course Syllabus | ||
1.1: Characteristics of Databases | Before the Advent of Database Systems | Read chapter 1, which begins by discussing fundamental data concepts. The study of databases is an extension of the study of data in programming. The study of databases as a discipline was motivated by a variety of factors: the increasing size and complexity of software systems, the need to share data, and the need to be able to maintain and secure data effectively. Data is a collection of symbols used to represent numbers, text, pictures, videos, audio, and so on. How this data is represented gives the meaning (or semantics) for the symbols. Additional semantics are provided by the relationships that data has with other data. In programming, the semantics of data is provided by program documentation, as well as the programming language used to create the program. In databases, the semantics of the data is provided by a data model, which includes the representation of the data, relationships among the data, and metadata (which is data that defines other data). Semantics makes data useful, and we define useful data as information. Sometimes, the terms 'data' and 'information' appear as synonyms. However, they are different: all information is data, but not all data is information. The type and amount of semantics determines the usefulness of some given data. Usefulness is also relative to a given user. Data may not be useful to some users, but very useful to others. Identifying data and information and organizing them into a data model typically occurs in software requirements analysis and design. Current work in the field of databases addresses techniques that support building and storing data models, and using them to discover meaning (or information) in large volumes of data. |
Fundamental Concepts and Benefits of a Database | Read chapters 2 and 3. Chapter 2 explores the fundamental concepts of databases and their properties. Chapter 3 discusses the characteristics of databases that give them their unique advantages. From a database development perspective, these characteristics are known as "requirements". Be sure to complete the short exercises at the end of each chapter. |
|
Introduction to Database Systems | Read chapter 1, which begins with a description of database management systems (DBMS). DBMSes are software systems that enable and support the use of databases. |
|
1.2: Database Environments | Database Models | Read this article on the types of databases and their functions. The article introduces a data model, which is the conceptual design for a database from a development perspective. |
Types of Data Models and Data Modeling | Read chapters 4 and 5, which discuss the types of data models, their properties, and their levels of abstraction: external, conceptual, internal, and physical. We will later refer to these levels of abstraction as 'schema levels'. |
|
Elements of Database Systems | Read chapter 2, which continues the top-down discussion on DBMSes. It describes the elements of a DBMS, functional requirements and necessary characteristics (which corresponds to the external view of the 3-level schema architecture perspective), design (which corresponds to the conceptual model view), and implementation (which corresponds to the internal/physical model view). It also gives an overview of database and DBMS processes, including planning, development techniques and methods, roles and responsibilities, stakeholders, and database and DBMS maintenance and configuration control (which is how to control changes to the database or DBMS). |
|
1.3: Classifying Database Management Systems | Classification of Database Management Systems | Read chapter 6, which takes a bottom-up approach, going from data to databases to DBMSes. Previous chapters addressed data and databases, while this chapter discusses DBMSes and classifies them using several criteria. |
Comparison of NoSQL Database Management Systems and Models | This course primarily covers traditional databases, but this article gives an overview of some non-traditional databases and classifies non-SQL databases according to their operational model. Non-SQL databases are schema-less, and not based on a single data model. |
|
Classifying Database Management Systems: Regular and NoSQL | Read this article on how to classify traditional and non-traditional databases based on internal and implementation models. |
|
2.1: Databases and the Three-Schema Architecture | Three-Schema Architecture | This article presents the three-schema architecture, which aims to provide data independence. The three schemas are external, conceptual, and internal. Each provides a level of independence: application independence from the external schema, external schema independence from the conceptual schema, conceptual schema independence from the internal schema, and internal schema independence and thus application independence from the physical data itself. |
Data Modeling and a Framework for Database System Design | Read chapters 3 and 4. In chapter 3, the three ANSI/SPARC schemas are presented as phases of database development: the requirements phase, the high-level design phase, the detailed design phase, and the implementation phase. These correspond to the external model, the internal model, and the physical model, respectively. This chapter also describes entity-relationship modeling as a means of representing a conceptual model. Chapter 4 discusses conceptual modeling for hierarchical, network, and relational databases. |
|
2.2: Tables, Views, and Indexes | The Relational Data Model | Read chapter 7, which introduces the relational database model. Relational databases are one of the most commonly-used database models. The chapter includes a mathematical foundation for the model in relational algebra and relational calculus. These are also the basis of the standard database language SQL. Most current database systems support the relational model. |
Relational Databases | You have probably used spreadsheet software like Microsoft Excel or OpenOffice. If so, you have likely used tables, views, and indexes while manipulating your spreadsheet data. These are all concepts used in relational databases. This video explains the concepts of relational databases using Microsoft Excel. A "primary key" is a special type of index that is unique, and the main (or primary) value used to locate an entry in a database. |
|
Introduction to Relational Databases | This lecture explains the basic concepts of relational databases, including tables, columns, primary keys, foreign keys, and referential integrity. It then illustrates them using a customer order database. The latter part of the lecture uses Microsoft Access to implement a relational database. |
|
3.1: Before Computers | A Brief History of Databases | This article makes several key points, and notes how the evolution of database technology reflects the evolution of how we model the world around us and is driven by the need to address the complexities that result from a proliferation of data. The growing demand for data and better data accessibility has led to a surge in the amount and quality of data available to people and organizations, databases have become so common that organizations are structured to reflect the model of their data. |
History of Databases | Watch this lecture, which gives an overview of the evolution of both traditional and non-traditional databases. |
|
3.2: The Four Revelations | Database Models | This article defines explains some types of database models and describes data modeling in the context of the development of a database subsystem by focusing on its requirements, high level design, detailed design, and implementation. The first diagram depicts 5 categories of data models: flat files, early data models (network and tree), relational, and post-relational. |
4.1: Why Use an E-R Model? | Conceptual Designs | The lecture explains why entity-relationship (or E-R) models are used, and gives a high-level view of the development process for a database. It describes the database development process as logically parallel to the development process of an application, such as an information system, that uses a database. The development cycle for the application includes requirements analysis, design, and implementation. The development cycle for databases are similar, having a corresponding requirements analysis phase (called the conceptual schema) that is a technical specification of the database requirements. A popular representation of the conceptual schema is an E-R model. |
4.2: E-R Elements and Symbols | The Entity-Relationship Data Model | Read chapter 8, which describes an E-R model. An E-R model is conceptually similar to a relational model: E-R entities correspond to relational tables; entity attributes correspond to table columns. An E-R model can easily be translated to a relational model. |
Entity-Relationships | As you are studying databases, strive to read as many database examples as you can. This resource includes an example E-R model for a SQL movie database. |
|
Data Modeling: Entity-Relationship (E-R) | This short description of E-R models demonstrates them from the perspective of application types: OLTP (Online Transaction Processing) and DSS (Decision Support Systems). It also relates E-R notations to other notations. |
|
Data Models | Review these slides, which give an E-R example in the form of an aircraft database. The notes also give a list of rules for interpreting the model. |
|
Introduction to the Entity-Relationship Model | These short videos present an E-R model for a music database from the perspective of the database's requirements and design process. |
|
4.3: The Enhanced E-R Model | The Enhanced Entity-Relationship Model | For some complex applications, an E-R model may not be able to accurately represent the data requirements. To do so, extensions such as specialization, generalization, and inheritance have to be defined. The resulting models are called "enhanced E-R models". |
Advanced Topics in E-R Modeling | This tutorial goes into some detail on the enhanced E-R modeling concepts that will be helpful when we transition to relational modeling. |
|
5.1: The Relational Model at a Glance | The Relational Data Model | Review chapter 7, which we looked at previously in Unit 2. The relational model uses tables for relations, table columns for attributes, and table rows for records. Some references may refer to the product of data analysis as a logical model. Here we refer to the product of data analysis as a conceptual model and the product of database design as a logical model. In either case, the database design model addresses how data is used. It is more detailed than the data analysis model, which is concerned with the meaning of the data and its structure. |
The Relational Database Model | Read chapter 6, which supplements the chapter you just reviewed. |
|
5.2: Mapping an E-R Diagram to a Relation | ER Model to Relational Mapping | For software development, it is generally desirable to use methods and models for each phase that are compatible, which eases the transition from one phase to another. So too with databases. This video maps an E-R model to a relational model that supports the requirements analysis and design of a database, and covers the transition between the two. |
Integrity Rules and Constraints | Read chapter 9, which presents constraints on attributes or tables. A constraint is a rule for incorporating semantics of the application data into the relational model. There are constraints for database integrity, application semantics, business rules, |
|
6.1: Basic Operations | Relational Algebra | The elements of relational algebra are 'relations', which are tables. Operations on a table or set of tables result in a table. A relational algebra expression is a combination of operations that specify the order in which the operations are performed. In SQL terminology, this is the order in which basic SQL commands are performed to execute the query. This page explains the basic relational algebra operations. |
The Relational Database Model | Relational algebra is used to express a query as a sequence of basic operations which, when executed, will result in the relation that gives the query result. Relational algebra specifies 'how' to obtain the result. Relational calculus, on the other hand,
is used to specify the desired result. This is 'what' the result is. The distinction is a difference of perspectives: relational algebra focuses on operations for constructing sequences of operations to produce a desired table, and relational calculus focuses on a relational expression that defines the desired table. Review pages 6–7 to 6–12 and study the case examples. Tables for the examples are included at the end of the chapter. |
|
Relational Algebra | This lecture explains the division operation and demonstrates an implementation using fundamental operations. |
|
6.2: Derived Operations | Relational Algebra and Relational Calculus | The first 4 minutes of this video characterize relational algebra relative to relational calculus. Relational algebra has a set of basic operations, as well as derived operations, like Join, which can be written as a composition of the basic operations. Since these operations are used frequently, they are made into a single command, the derived operation. The last 10 minutes introduce set operations, which are used by relational calculus statements. The middle 20 minutes or so of the video work through several small examples of relational algebra. In this resource and other resources, the terms set, multi-set, and bag are used. A multi-set is simply a set that includes duplicates. A bag is just another name for a multi-set. |
SQL Data Manipulation Language | Read chapter 16, which explains SQL DML (data manipulation language), which is used for making queries to a database. Queries specify operations on relations that result in desired information. A combination of relational operations is called a derived operation. SQL DDL (data definition language) is used to create a database. A DML query is a SQL statement or command that is a high-level programming language representation for a relational calculus expression. The relational calculus expression is translated by the SQL optimizer (the query planner) into a composition, or a derived operation, of relational algebra operations that specify how the SQL statement is executed. |
|
6.3: Set Operations | The Relational Database Model | Review pages 6-12 to 6-19 and study the examples. These pages discuss relational calculus, which utilizes mathematical set operations. Basic relational algebra set operations are discussed earlier in the chapter, on pages 6-1 to 6-12. Tables for the examples are included at the end of the chapter. |
6.4: Aggregate Functions | The Structured Query Language | Read chapter 7, which discusses SQL statements. It also introduces aggregate functions, which return a single numeric value when given a single column of numeric values. Study the aggregate function section. |
7.1: Defining Normalization | Database Normalization | This article is a concise overview of normalization: what it is, why it's done, its pros and cons, its benefits and costs, the normalization process, and 2NF to 3NF transformation. |
Normalization | Chapter 12 explains normalization, gives examples showing dependencies that are removed by each level of normalization, gives a dependency diagram, and introduces BCNF normalization, which is a special case of the third normal form. |
|
Normalized Database Design | Chapter 5 continues the explanation of normalization and discusses various types of anomalies and dependencies, the decomposition of poor database designs, the six levels of normalization, and normalization side effects. |
|
Entity Relationship Diagrams | These videos work through a design example. Normalization is presented as part of the design. Beginning with an initial table for the design, rules are applied that transform the design from 0NF to 1NF to 2NF and then to 3NF. |
|
7.2: Anomalies | Data Anomalies | This short article discusses update, deletion, and insertion anomalies. |
7.3: Functional Dependencies | Functional Dependencies | Chapter 11 defines function dependency, illustrates dependencies using a sample table, and gives rules for inferring all functional dependencies. |
7.4: From 1NF to BCNF | Normalizing a Relation to BCNF Based on Functional Dependencies | This article uses functional dependencies to transform a set of relations to BCNF. |
8.1: What is SQL? | Introduction to SQL | This resource is a nice concise summary of SQL Data Definition, Data Manipulation, and Data Creation commands. |
Another Introduction to SQL | This presentation discusses querying a database. It is presented in the style of a workshop, and gives practice in SQL DML (Data Manipulation Language). It also includes exercises. |
|
8.2: Data Definition Language (DDL) | SQL Structured Query Language | Chapter 15 gives a good treatment of SQL DDL, incuding data types and constraints, and SQL DML, like the SELECT and JOIN statements. |
SQL Tutorial | SQL consists of a Data Definition Language (DDL), a Data Manipulation Language (DML), and a Data Control Language (DCL). Use this resource as a reference for these languages. For now, spend some time to review the following pages in this tutorial: SELECT, JOIN, CREATE/ALTER/DROP SCHEMA, CREATE/ALTER DATABASE, SQL DROP, and CREATE/DROP/ALTER TABLE. |
|
Drop Table | This video illustrates the DROP TABLE command. |
|
Alter Table | These videos illustrate the ALTER TABLE command. |
|
8.3: Column- and Table-Level Constraints | SQL Tutorial | Review the following pages in this tutorial: CONSTRAINTS, PRIMARY KEY, and FOREIGN KEY. Constraints are used in DDL commands to incorporate rules for a table. Constraints are defined with Create or Alter Table commands. The NOT NULL constraint defines a rule that a column cannot store a null value. Null and NOT NULL are also used in the Select command to find NULL or NOT NULL values in fields or columns. This rule specifies that the value in a field or column of a table is different for each record or row. Primary Key is a constraint that defines a field or column that uniquely identifies each row (is NOT NULL and is Unique). A foreign key is a column in a table that contains a value that is a primary key of another table. Thus, a foreign key links a referenced table to a referencing table. Default specifies a default value for a column. This constraint checks that a value in a column satisfies a specified condition. |
8.4: Data Manipulation Language (DML) | SQL Tutorial | Review the following pages in this tutorial: INSERT STATEMENT, DELETE STATEMENT, and SELECT STATEMENT. We are using the SQL 2003 ANSI Standard to illustrate the SQL commands; the other databases, MySQL and OpenBase for example, have similar commands. The Insert statement adds a record into a table. Delete removes records from a table. Select is the SQL command for querying a database. |
SQL Insert, Update, and Delete Commands | This video illustrates the INSERT, UPDATE, and DELETE commands. |
|
9.1: Basic Select Statements | SQL Tutorial | Review the following pages in this tutorial: SELECT with DISTINCT (under SQL SELECT STATEMENT), WHERE CLAUSE, and SQL AGGREGATE FUNCTIONS. DISTINCT is a clause of the Select statement, which specifies that the data retrieved from the database be distinct. Suppose a student database contained data on student computer science courses, and the desired data are the names of all the students taking a computer science course. If a student was taking 2 computer science courses, there would be 2 course records in the database, each having that students name. The DISTINCT clause specifies that the student name be returned just once, not twice. The SELECT statement can be used to find specific data in specific records. It can also be used to search or browse a database. The WHERE clause is used to specify conditions on the data that is queried. Fewer conditions result in more data found; more conditions result in less data being found. Logical operators, such as AND, NOT, are used to combine conditions to place more restrictions on the data desired, and, thus, filter out more undesired data. The SQL DML language manipulates data from a database. It is used to sort data, arrange data, or perform calculations on the data. Aggregate functions are built in DML functions used in the Select statement to summarize an amount of data, and return a calculated or summary value, instead of returning all of the data found. For example, COUNT, AVERAGE, SUM, MAX, and MIN are aggregate functions. Note that a function takes arguments, written in parentheses after the function name. |
PostgreSQL Tutorial: AND, OR, NOT | There are several SQL languages widely used in practice. PostgreSQL is one of those. This resource includes the PostgreSQL description of logical operators and their use with the SELECT statement. |
|
9.2: Special Operators | SQL LIKE Operator | LIKE is typically used to find all records that have a column value that matches a specific pattern. Special characters are used to specify the format of the pattern. |
SQL BETWEEN Operator | BETWEEN is used to find all records that have a specific column value in a specified range. |
|
SQL: Using WHERE Clause with NOT LIKE Operator and Escape Character | Sometimes, a column value may not have to be in a given range but rather in a given set of values. The IN statement is used to specify a set of values that a column value must be in. |
|
MySQL SELECT Statement | The TOP clause limits the number of rows returned by a Select query. TOP is not supported by all SQL languages. Others use LIMIT or ROWNUM. |
|
Select (SQL) | Read "Limiting result rows" to learn how various SQL implementations restrict the number of rows returned by the SELECT command. |
|
SQL ORDER BY Clause | The ORDER BY clause is used to sort the records resulting from a query according to the values of a column or set of columns. |
|
10.1: Introduction | INNER JOIN | This video introduces the JOIN clause using an example of a typical scenario. There are several types of joins, some of which are illustrated in this video. |
SQL Data Manipulation Language | Review chapter 16 for some more examples of the JOIN syntax. Take some time to try out the exercises at the end of the chapter. |
|
SQL JOINS | It is helpful to visualize the result (that is, the semantics) of SQL statements and clauses. Review this depiction of the results of the types of join clauses using set diagrams. |
|
Aggregates and JOINs | We use combinations of SQL commands, clauses, and functions to obtain information from a database. Typically, tables are combined to create temporary relations, and we apply select statements with clauses, constraints, and functions to obtain the desired information from a database. Watch this example of a query involving select, inner join, and aggregate. |
|
10.2: Inner Joins | Inner vs. Outer Joins in SQL | Another type of join in SQL is the outer join. This video explains and illustrates the difference between it and other types of joins. |
SQL INNER JOIN | Review this example of inner joins. |
|
10.3: Outer Joins | SQL OUTER JOIN | Review this example of an outer join. |
SQL RIGHT JOIN | Review this page, which describes and demonstrates a right join. |
|
SQL Full Outer Join | This page demonstrates how a full outer join works. It does so by comparing an outer join SQL query with another query that involves a left outer join, a right outer join, and a union. As you'll see, each of these produces the same result. |
|
10.4: The Group By Clause | GROUP BY | The GROUP BY clause is used with aggregate functions, such as MIN() and MAX(), and with the WHERE, HAVING, and ORDER BY clauses. The examples in this video use full column names, aliases, and column numbers. The HAVING clause filters the results of the GROUP BY and the aggregate functions. The clauses are specified in SQL in a specific order. |
SQL GROUP BY Clause | Read this page, which goes into more detail on the GROUP BY clause. |
|
10.5: Assessment | SQL Exercises | If you'd like more practice with SQL, attempt some of these exercises. |
Study Guide | CS403 Study Guide | |
Course Feedback Survey | Course Feedback Survey |