BUS206 Study Guide

Unit 3: Data and Databases

3a. Define metadata

Metadata is "data about data". When you look up a person's record in a database (birth date, social security number, student ID number, name, and so on), a field may hold the value 1990; the metadata about that value is the field name, "year of birth". A document file is another example: its file name, date modified, file type, and file size are all metadata.

  • Data dictionaries hold metadata. Data dictionaries define databases' structure and data fields.
  • If you were to create a works cited page for an essay in your College Writing course, the author name, publisher, year of publication, page numbers, volume number, etc. are metadata about the book or article in the works cited page.
  • List a few other examples of metadata.
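
The difference between data and metadata can be sketched in Python with the built-in sqlite3 module. This is only an illustration: the students table, its field names, and the stored values are made up for this example. The row holds the data, while the field names and types that SQLite reports are the metadata.

```python
import sqlite3

# In-memory database; the table and its fields are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (student_id INTEGER, name TEXT, year_of_birth INTEGER)"
)
conn.execute("INSERT INTO students VALUES (1, 'Ada', 1990)")

# Data: the values stored in a row.
data_row = conn.execute("SELECT * FROM students").fetchone()

# Metadata: field names and declared types, as reported by SQLite.
# PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
metadata = [(col[1], col[2]) for col in conn.execute("PRAGMA table_info(students)")]

print(data_row)
print(metadata)
conn.close()
```

The list of field names and types printed last is a very small version of what a data dictionary records.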

Prepare for the final exam by reading "Sidebar: What is Metadata?" in Data and Databases.


3b. Describe the differences between data, information, and knowledge

Data is raw facts, such as plain numbers or names, with no context. Information is data that has been given context. For instance, the numbers "1955, 1972, 1966, and 1960" are data; being told that they are the birth years of leaders in an organization makes them information. When information is analyzed and aggregated to make decisions, it produces knowledge.

  • Word processing programs, databases, and spreadsheets are all used to create and manipulate data.
  • Databases organize data in a collection where it is described and associated with other data.
  • Study the DIKW pyramid to understand the hierarchy between data and wisdom.
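
The birth-year example above can be walked through in a few lines of Python. This is a rough sketch, not a formal definition: the reference year of 2024 and the choice of "average age" as the analysis are assumptions made for illustration.

```python
from statistics import mean

# Data: raw values with no context.
data = [1955, 1972, 1966, 1960]

# Information: the same values with context attached.
information = {
    "description": "birth years of leaders in an organization",
    "values": data,
}

# Knowledge: analysis of the information that can support a decision.
reference_year = 2024  # assumed year, chosen only for this example
ages = [reference_year - year for year in information["values"]]
average_age = mean(ages)

print(ages)
print(average_age)
```

An organization might use a result like the average age when planning for leadership succession, which is the step from information to knowledge.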

Prepare for the final exam by reading the sections "Data, Information, and Knowledge", "Examples of Data", and "Databases" in Data and Databases for a better understanding of the differences between data, information, and knowledge.


3c. Define the term database and identify the steps to creating one

1. Databases organize related information and associate it with other data.

  • All database information should be related; unrelated information should be stored in separate databases. For example, a database that contains information about employees should not also hold information about company stock values.
  • When designing a database, consider the data you need to separate into tables and how those tables can relate to each other.
  • Design a set of four or five tables for a database used for an organization of your choosing. Consider creating a table for each department or function. Provide a list of data for each table.
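
As a starting point for the design exercise, here is a minimal sketch of two related tables using Python's built-in sqlite3 module. The departments and employees tables, their fields, and the sample rows are hypothetical; a full answer to the exercise would include more tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE departments (
    department_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE employees (
    employee_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    department_id INTEGER REFERENCES departments(department_id)
);
INSERT INTO departments VALUES (1, 'Sales'), (2, 'Accounting');
INSERT INTO employees VALUES (10, 'Kim', 1), (11, 'Lee', 2), (12, 'Pat', 1);
""")

# Relate the two tables through the shared department_id field.
rows = conn.execute("""
    SELECT e.name, d.name
    FROM employees AS e
    JOIN departments AS d ON d.department_id = e.department_id
    ORDER BY e.employee_id
""").fetchall()

print(rows)
conn.close()
```

Each table holds one kind of data, and the department_id field is what lets the tables relate to each other.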

Prepare for the final exam by reading the following sections in Data and Databases: "Databases", "Relational Databases", "Designing a database", and "Sidebar: The difference between a database and a spreadsheet". Focus on material that still seems difficult to you.


2. When creating relationships in databases, a primary key must be selected for each table. The key serves as the unique identifier for each record in the table and is often an automatically generated number unless you specify a different key.

  • A primary key value should be unique and should not change once it is assigned. Consider selecting data that will not change to serve as a primary key, for example, a social security number.
  • When designing tables within a database, make sure each table has an appropriate key so that records in one table can be related to records in other tables as required.
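
The behavior of a primary key can be seen in a short Python sketch using the built-in sqlite3 module. The members table and the names in it are hypothetical, and SQLite is used here only as a convenient stand-in; other database programs may generate default keys differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE members (member_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

# No key supplied: SQLite assigns the next integer automatically.
first = conn.execute("INSERT INTO members (name) VALUES ('Ana')").lastrowid
second = conn.execute("INSERT INTO members (name) VALUES ('Ben')").lastrowid

# Reusing an existing key value is rejected, so every row stays uniquely identifiable.
try:
    conn.execute("INSERT INTO members (member_id, name) VALUES (1, 'Cy')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(first, second, duplicate_rejected)
conn.close()
```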


3. The concept of normalization in databases means the design should reduce duplication of data between tables and add flexibility to the database.

  • Store each piece of data in one table and use a shared field, such as an ID, to relate tables to each other. If the same entry would otherwise be listed in multiple tables, keep it in one table and reference it from the others.
  • How would you apply normalization to a database with tables for a club? Consider creating a membership table and member information table.
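
One possible answer to the club question is sketched below in Python with the built-in sqlite3 module. The table layout, field names, and sample values are assumptions for illustration. Member details are stored once in a members table, and each row in a memberships table refers to a member by ID instead of repeating the details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE members (
    member_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    email TEXT NOT NULL
);
CREATE TABLE memberships (
    membership_id INTEGER PRIMARY KEY,
    member_id INTEGER NOT NULL REFERENCES members(member_id),
    membership_year INTEGER NOT NULL
);
INSERT INTO members VALUES (1, 'Ana', 'ana@example.com');
INSERT INTO memberships VALUES (100, 1, 2023), (101, 1, 2024);
""")

# The email is stored once, so a single update is seen by every related row.
conn.execute("UPDATE members SET email = 'ana.new@example.com' WHERE member_id = 1")
rows = conn.execute("""
    SELECT ms.membership_year, m.email
    FROM memberships AS ms
    JOIN members AS m ON m.member_id = ms.member_id
    ORDER BY ms.membership_year
""").fetchall()

print(rows)
conn.close()
```

If the email had been copied into every membership row, the update would have to be repeated for each year, which is the kind of duplication normalization is meant to reduce.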

Read the section "Normalization" in Data and Databases.


3d. Describe the purpose of a database management system

1. Database management systems help users create a database, change a database structure, and perform analysis within a database. Database management systems provide the interface to view and change a database's design, create queries, and generate reports.

  • Microsoft Access is a database application where users can create, modify, and analyze data.
  • In relational databases, tables are made up of columns and rows: each column defines a field and its data type, and each row is a record holding the data for a single item.
  • ACID properties of a relational database help ensure the consistency of data transactions.
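
The following Python sketch illustrates one part of ACID, the all-or-nothing behavior of a transaction, using the built-in sqlite3 module. The accounts table, the balances, and the rule that a balance may not go below zero are all assumptions made for this example; it does not demonstrate every ACID property.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('A', 100), ('B', 50)")
conn.commit()

# Try to move 150 from A to B. Used as a context manager, the connection
# commits if the block succeeds and rolls back if it raises an exception.
try:
    with conn:
        conn.execute("UPDATE accounts SET balance = balance + 150 WHERE name = 'B'")
        conn.execute("UPDATE accounts SET balance = balance - 150 WHERE name = 'A'")
except sqlite3.IntegrityError:
    pass  # the CHECK constraint failed, so neither update is kept

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)
conn.close()
```

Because the second update would leave account A with a negative balance, the whole transfer is undone, and account B does not keep the deposit it briefly received.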

If you feel less confident in this area, watch Relational vs. NoSQL Databases to prepare for the final exam. Read the "Database Management Systems" section in Data and Databases.


2. Enterprise databases are large-scale databases that many users can access at the same time, sometimes millions of people over the Internet. These databases may be stored on a single computer or distributed across multiple servers.

  • A desktop relational database application like Microsoft Access meets the needs of organizations with smaller amounts of data to manage, but this kind of database is not as effective with large datasets.
  • NoSQL is a model often used for large-scale database solutions. Oracle Coherence is an example of a NoSQL database designed to offer reliability and scalability to organizations needing to store large amounts of data.

To prepare for the final exam, read the section "Enterprise Databases" in Data and Databases.


3e. Describe the characteristics of a data warehouse

Data warehouses serve as storage for an organization's historical data. Analyzing large amounts of data collected over time is not feasible on the databases that run an organization's daily operations, so the data is copied into a data warehouse for analysis.

  • Data warehouses should use non-operational, time-variant, and standardized data extracted from active databases within an organization.
  • Bottom-up and top-down are the two main approaches when designing a data warehouse.
  • Create a data warehouse process using both top-down and bottom-up approaches.
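
The three characteristics listed above (non-operational, time-variant, and standardized) can be sketched in plain Python. Everything here is hypothetical: the two source systems, their field names, the amounts, and the snapshot date are invented for illustration, and a real warehouse would use dedicated tools.

```python
from datetime import date

# Hypothetical rows from two operational systems with inconsistent formats.
store_sales = [
    {"sku": "egg-12", "amount": "4.50"},
    {"sku": "MILK-1", "amount": "2.25"},
]
web_sales = [{"product": "EGG-12", "total_cents": 450}]


def standardize(sku, amount_cents):
    """Return a row in one agreed format: upper-case SKU, amount in cents."""
    return {"sku": sku.upper(), "amount_cents": amount_cents}


def load_snapshot(snapshot_date):
    """Copy standardized rows and stamp them so the warehouse keeps history."""
    rows = [
        standardize(r["sku"], round(float(r["amount"]) * 100)) for r in store_sales
    ]
    rows += [standardize(r["product"], r["total_cents"]) for r in web_sales]
    return [{**row, "snapshot_date": snapshot_date.isoformat()} for row in rows]


# The warehouse holds a copy, so analysis does not touch the operational data.
warehouse = load_snapshot(date(2024, 3, 31))
for row in warehouse:
    print(row)
```

The copy makes the data non-operational, the snapshot date makes it time-variant, and the shared format makes it standardized.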

Read the "Warehouse" and "Benefits of Data Warehouses" sections from Data and Databases.


3f. Define data mining and describe its role in an organization

Data mining involves analyzing data to find trends, patterns, and associations that support decision making. It relies on automated analysis of large datasets, such as those in a data warehouse.

  • Retailers may analyze how often a certain product is purchased at a certain time or day of the year. A grocery store could determine the need to stock more eggs the week of Easter through data mining analysis. This is an example of multidimensional sales analysis.
  • Meaningful data and patterns are extracted from aggregate data in a data warehouse.
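
The egg example can be reduced to a toy calculation in Python. The weekly sales figures are invented, week 14 is assumed to be the week of Easter, and the "50% above average" rule is an arbitrary threshold chosen for illustration; real data mining applies automated techniques to far larger datasets.

```python
from statistics import mean

# Hypothetical weekly egg sales in cartons; week 14 is assumed to contain Easter.
weekly_egg_sales = {11: 410, 12: 395, 13: 420, 14: 780, 15: 405, 16: 400}

baseline = mean(weekly_egg_sales.values())

# Flag weeks selling at least 50% above the average as candidates for extra stock.
peak_weeks = [
    week for week, sold in weekly_egg_sales.items() if sold >= 1.5 * baseline
]

print(peak_weeks)
```

A pattern like this, found across several years of warehouse data, would support the decision to stock more eggs before Easter.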

Read the sections "Example – Facebook" and "Retail Industry" in Data Warehouses and Data Mining, and "Data Mining" from Data and Databases to strengthen your understanding of these concepts.


3g. List the components of knowledge management

Knowledge management formalizes the capture, indexing, and storage of a company's knowledge. Knowledge management helps a company benefit from insights gleaned from the data it has collected over the course of its existence.

  • Companies can benefit from the vast amount of knowledge that has been accumulated over the course of their existence.
  • How can knowledge management help a company make decisions in the future?

Review the section "Knowledge Management" from Data and Databases.


Unit 3 Vocabulary

This vocabulary list includes terms that may help you answer the review items above, along with terms you should know to complete the final exam for the course successfully.

  • ACID
  • Data
  • Data mining
  • Data warehouse
  • Database management system
  • Information
  • Knowledge
  • Knowledge management
  • Metadata
  • Normalization
  • Primary key
  • Relational database