### Unit 6: Linear Regression

In this unit, we will discuss situations in which the mean of a population, treated as a variable, depends on the value of another variable. One of the main reasons we conduct such analyses is to understand how two variables are related to each other, and the most common type of relationship is a linear one. You may want to know what happens to one variable when you increase or decrease the other, answering questions such as, "Does one variable increase as the other increases, or does it decrease?" For example, you may want to determine how the mean reaction time of rats depends on the amount of a drug in the bloodstream.

In this unit, you will also learn to measure the degree of a relationship between two or more variables. Both correlation and regression describe how variables relate to one another. Correlation quantifies the strength of a linear relationship between two variables and is a measure of existing data. Regression, on the other hand, models the linear relationship between an independent variable and a dependent variable, and can be used to predict the value of the dependent variable when the value of the independent variable is known.

**Completing this unit should take you approximately 12 hours.**

Upon successful completion of this unit, you will be able to:

- discuss and apply basic ideas of linear regression and correlation;
- identify the assumptions that inferential statistics in regression are based on;
- compute the standard error of a slope;
- test a slope for significance;
- construct a confidence interval on a slope; and
- calculate and interpret the coefficient of determination and the correlation coefficient.

### 6.1: The Regression Model

### 6.1.1: Scatter Plot of Two Variables and Regression Line

Read section 2 from Chapter 14. Also, answer the questions at the end. Section 2 defines simple linear regression, introduces scatter plots as a way to reveal linear patterns, and then discusses prediction error. This section also explains how to compute the regression line by minimizing the sum of squared errors.
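As a companion to the reading, here is a minimal sketch of fitting a regression line by least squares. The data values and the helper name `fit_line` are illustrative assumptions, not taken from the reading. The slope is the sum of cross-products of deviations divided by the sum of squared deviations of x, and the intercept makes the line pass through the point of means.

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least squares line for paired data."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations and sum of squared x deviations.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative data only (an assumption for this sketch).
x = [1, 2, 3, 4, 5]
y = [1.00, 2.00, 1.30, 3.75, 2.25]

slope, intercept = fit_line(x, y)

# Predicted values and errors of prediction (observed minus predicted).
predicted = [intercept + slope * xi for xi in x]
errors = [yi - pi for yi, pi in zip(y, predicted)]

# The least squares line minimizes this sum of squared errors.
sse = sum(e ** 2 for e in errors)

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, SSE = {sse:.3f}")
```

Try changing the slope or intercept by hand and recomputing the sum of squared errors: any other line through these points should give a larger value.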

Read these sections on linear regression. Linear regression, the simplest form of regression, is used to obtain a linear relationship between two variables.

Be sure to click "next" and read each section.

### 6.1.2: Correlation Coefficient

Read these sections on correlation. You will learn the interpretation and calculation of the correlation coefficient, how to test its significance, and the relation between correlation and causation.

Be sure to click "next" and read each section.

Read section 2 of Chapter 10 for a discussion on linear correlation. You will learn what the linear correlation coefficient is, how to compute it, and what it tells us about the relationship between two variables x and y.
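To make the calculation concrete, the following sketch computes the linear correlation coefficient r from its deviation-score definition. The data and the function name `pearson_r` are illustrative assumptions; the readings may present an algebraically equivalent computational formula instead.

```python
import math


def pearson_r(x, y):
    """Linear correlation coefficient computed from deviation scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Illustrative data only (an assumption for this sketch).
x = [1, 2, 3, 4, 5]
y = [1.00, 2.00, 1.30, 3.75, 2.25]

r = pearson_r(x, y)
print(f"r = {r:.3f}")

# r always lies between -1 and 1; the sign gives the direction of the
# linear trend and the magnitude gives its strength.
```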

### 6.1.3: Sums of Squares

Read section 4 from Chapter 14. Also, answer the questions at the end of the section. Section 4 further discusses the sums of squares, including how the total sum of squares is partitioned into the sum of squares predicted and the sum of squares error.
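The partition described in this reading can be checked numerically. The sketch below uses illustrative data (an assumption) and verifies that the total sum of squares equals the sum of squares predicted plus the sum of squares error. The variable names are hypothetical. It also computes the coefficient of determination as the proportion of the total that is explained.

```python
# Illustrative data only (an assumption for this sketch).
x = [1, 2, 3, 4, 5]
y = [1.00, 2.00, 1.30, 3.75, 2.25]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least squares slope and intercept.
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * xi for xi in x]

# Total variation of y around its mean.
ss_total = sum((yi - mean_y) ** 2 for yi in y)
# Variation of the predicted values around the mean of y.
ss_predicted = sum((pi - mean_y) ** 2 for pi in predicted)
# Variation of the observed values around the regression line.
ss_error = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))

# Proportion of variation explained (coefficient of determination).
r_squared = ss_predicted / ss_total

print(f"SS total     = {ss_total:.3f}")
print(f"SS predicted = {ss_predicted:.3f}")
print(f"SS error     = {ss_error:.3f}")
print(f"r squared    = {r_squared:.3f}")
```

For simple linear regression, the coefficient of determination equals the square of the correlation coefficient between x and y.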

Watch these videos, which discuss the regression line.

### 6.2: Fitting the Model

### 6.2.1: Standard Errors of the Least Squares Estimates

Read section 5 from Chapter 14. Also, answer the questions at the end. Section 5 discusses how to compute the standard error of the estimate based on errors of prediction as well as how to compute the standard error of the estimate based on a sample.
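As a rough illustration, the sketch below computes the standard error of the estimate in two ways. It assumes the common convention of dividing the sum of squared errors by N when the data are treated as a whole population and by N - 2 when they are treated as a sample; check the reading for the exact formulas it uses. The data are illustrative.

```python
import math

# Illustrative data only (an assumption for this sketch).
x = [1, 2, 3, 4, 5]
y = [1.00, 2.00, 1.30, 3.75, 2.25]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least squares fit.
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Sum of squared errors of prediction.
sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

# Assumed convention: divide by N for a population ...
sigma_est = math.sqrt(sse / n)
# ... and by N - 2 for a sample, because two parameters
# (slope and intercept) were estimated from the data.
s_est = math.sqrt(sse / (n - 2))

print(f"population form: {sigma_est:.3f}")
print(f"sample form:     {s_est:.3f}")
```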

### 6.2.2: Statistical Inference for the Slope and Correlation

Read section 6 from Chapter 14. Also, answer the questions at the end. Section 6 starts with assumptions on the errors that are necessary for statistical inference. Then, this reading shows an example of a significance test for the slope. This section also talks about constructing confidence intervals for the slope. Then, it closes with a significance test for the correlation.

Read section 5 from Chapter 10. This section further details two types of inferences on the slope parameter, considering both confidence intervals and hypothesis testing. Complete the odd-numbered exercises at the end of the section before checking your answers.
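The following sketch walks through a significance test and a 95% confidence interval for the slope on illustrative data (an assumption). The critical value 3.182 is the two-tailed 0.05 value of the t distribution with 3 degrees of freedom, copied from a standard t table rather than computed. It applies only to this sample size, so you would look up a different value for other data.

```python
import math

# Illustrative data only (an assumption for this sketch).
x = [1, 2, 3, 4, 5]
y = [1.00, 2.00, 1.30, 3.75, 2.25]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least squares fit.
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Standard error of the estimate (sample form, N - 2 degrees of freedom).
sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
s_est = math.sqrt(sse / (n - 2))

# Standard error of the slope.
se_slope = s_est / math.sqrt(sxx)

# Test statistic for the null hypothesis that the population slope is 0.
t_stat = slope / se_slope

# Two-tailed 0.05 critical value for df = n - 2 = 3, from a t table.
T_CRIT_DF3 = 3.182
significant = abs(t_stat) > T_CRIT_DF3

# 95% confidence interval on the slope.
ci_lower = slope - T_CRIT_DF3 * se_slope
ci_upper = slope + T_CRIT_DF3 * se_slope

print(f"slope = {slope:.3f}, standard error = {se_slope:.3f}, t = {t_stat:.2f}")
print(f"significant at the 0.05 level: {significant}")
print(f"95% CI: ({ci_lower:.2f}, {ci_upper:.2f})")
```

Notice that the confidence interval contains 0 exactly when the two-tailed test at the matching level fails to reject the null hypothesis, so the two procedures always agree.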

### 6.2.3: Influential Observations

Read section 7 from Chapter 14. Also, answer the questions at the end. Section 7 discusses the notion of influence and describes what makes a point influential. It further introduces the concepts of leverage and distance, which are useful to detect influential observations.
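As an illustration of leverage, the sketch below uses the standard formula for simple linear regression, h = 1/n + (x - mean of x)^2 / (sum of squared x deviations), on made-up data in which one observation has an extreme x value. The reading may define leverage on a different (for example, standardized) scale, so treat this as one common convention. The sketch also refits the line without that observation to show how much the slope moves, which is a simple, informal way to see influence.

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Illustrative data only (an assumption); the last point has an extreme x.
x = [1, 2, 3, 4, 10]
y = [1.1, 1.9, 3.2, 3.9, 4.0]

n = len(x)
mean_x = sum(x) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)

# Leverage depends only on how far each x value is from the mean of x.
leverages = [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]
most_extreme = leverages.index(max(leverages))

# Refit without the highest-leverage point to see how the slope changes.
slope_all, _ = fit_line(x, y)
x_rest = x[:most_extreme] + x[most_extreme + 1:]
y_rest = y[:most_extreme] + y[most_extreme + 1:]
slope_without, _ = fit_line(x_rest, y_rest)

for xi, h in zip(x, leverages):
    print(f"x = {xi:>2}: leverage = {h:.2f}")
print(f"slope with all points:       {slope_all:.3f}")
print(f"slope without extreme point: {slope_without:.3f}")
```

High leverage alone does not make a point influential. A point changes the fitted line substantially only when it also lies far from the line that the remaining points suggest.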

Read section 8 from Chapter 10. This section presents a complete example of linear regression: it starts by presenting the data, proceeds to a scatter plot to identify the linear pattern, and then fits a linear model using least squares estimation. This reading also addresses some statistical inferences on both the correlation coefficient and the slope parameter. Complete the odd-numbered exercises at the end of the section before checking the answers.

### 6.3: ANOVA

This optional subunit will teach you about "Analysis of Variance" (abbreviated ANOVA), which is used for hypothesis tests involving more than two averages. ANOVA is about examining the amount of variability in the y variable and trying to see where that variability is coming from. You will study the simplest form of ANOVA, called single factor or one-way ANOVA. Finally, you will briefly study the F distribution, used for ANOVA, and the test of two variances.
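To see where the variability in a one-way ANOVA comes from, here is a minimal sketch that computes the F statistic for three small groups. The data and variable names are illustrative assumptions. The sketch splits the total variability into a between-groups part and a within-groups part, then compares their mean squares.

```python
# Illustrative data only (an assumption): three groups of three observations.
groups = [
    [2, 3, 4],
    [4, 5, 6],
    [7, 8, 9],
]

all_values = [value for group in groups for value in group]
grand_mean = sum(all_values) / len(all_values)
group_means = [sum(group) / len(group) for group in groups]

# Between-groups sum of squares: variability of group means around the grand mean.
ss_between = sum(
    len(group) * (mean - grand_mean) ** 2
    for group, mean in zip(groups, group_means)
)

# Within-groups sum of squares: variability of observations around their own group mean.
ss_within = sum(
    (value - mean) ** 2
    for group, mean in zip(groups, group_means)
    for value in group
)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

ms_between = ss_between / df_between
ms_within = ss_within / df_within

# A large F suggests the group means differ by more than chance alone would produce.
f_stat = ms_between / ms_within

print(f"SS between = {ss_between:.2f}, SS within = {ss_within:.2f}")
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

To finish the test, you would compare the F statistic with a critical value from an F table using the two degrees of freedom shown.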

Watch these videos, which discuss each of the steps in ANOVA. While these videos are optional, studying ANOVA may help you if you are interested in taking the credit-aligned exam that is linked with this course.

Read this chapter and complete the questions at the end of each section. While these sections are optional, studying ANOVA may help you if you are interested in taking the credit-aligned exam that is linked with this course.

### Unit 6 Assessment

Take this assessment to see how well you understood this unit.

- This assessment **does not count towards your grade**. It is just for practice!
- You will see the correct answers when you submit your answers. Use this to help you study for the final exam!
- You can take this assessment as many times as you want, whenever you want.
