Simple Linear Regression

Now that data mining algorithms and, in particular, supervised learning concepts have been covered, it is time to address the construction of statistical models. Linear regression has been mentioned only briefly at several points throughout the course; in this unit, we examine the technique in more depth. In its simplest form, the goal is to find the slope and intercept that best fit empirical data assumed to depend linearly on a single independent variable. Linear regression is a statistical supervised learning technique because training values of the independent variable are paired with observed values of the dependent variable. Once the linear model has been fit, it can produce estimates for inputs not contained in the training set. Make sure you understand the examples and associated calculations in the video, such as residuals, the correlation coefficient, and the coefficient of determination. If necessary, review hypothesis testing and tests for significance introduced in the statistics unit. After this video, you will learn how to implement this technique using scikit-learn; nevertheless, as a programming exercise, you should feel confident writing code that implements the regression equations directly, as sketched below.
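The following is a minimal sketch, not part of the video, showing one way the regression equations could be coded by hand and then checked against scikit-learn. The toy data and variable names (x, y) are hypothetical; it assumes NumPy and scikit-learn are installed.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training data: y is assumed to depend linearly on x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Least-squares estimates from the regression equations:
#   slope     b1 = sum((x - x_mean) * (y - y_mean)) / sum((x - x_mean)^2)
#   intercept b0 = y_mean - b1 * x_mean
x_mean, y_mean = x.mean(), y.mean()
b1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
b0 = y_mean - b1 * x_mean

# Residuals and the coefficient of determination (R^2).
y_hat = b0 + b1 * x
residuals = y - y_hat
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y - y_mean) ** 2)
r_squared = 1 - ss_res / ss_tot
print(f"manual fit:  intercept={b0:.3f}, slope={b1:.3f}, R^2={r_squared:.3f}")

# The same model fit with scikit-learn for comparison.
model = LinearRegression().fit(x.reshape(-1, 1), y)
print(f"sklearn fit: intercept={model.intercept_:.3f}, slope={model.coef_[0]:.3f}, "
      f"R^2={model.score(x.reshape(-1, 1), y):.3f}")
```

Both approaches should report the same intercept, slope, and coefficient of determination, since scikit-learn's LinearRegression also uses ordinary least squares.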



Source: Linda Weiser Friedman, https://www.youtube.com/watch?v=g94Q6fKmavA
This work is licensed under a Creative Commons Attribution 3.0 License.
