PRDV420: Practice: Model Basics | Saylor Academy

Exercises

One downside of the linear model is that it is sensitive to unusual values because the distance incorporates a squared term. Fit a linear model to the simulated data below, and visualise the results. Rerun a few times to generate different simulated datasets. What do you notice about the model?
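```
library(tibble)

# Noise is drawn from a Student t distribution with 2 degrees of freedom,
# which has heavy tails and so occasionally produces extreme values.
sim1a <- tibble(
  x = rep(1:10, each = 3),
  y = x * 1.5 + 6 + rt(length(x), df = 2)
)
```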
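
As a possible starting point for this exercise, the sketch below fits a least-squares line with lm() and overlays it on the points. The helper simulate_sim1a() and its seed argument are not part of the original exercise; they are assumptions added here so that each simulated dataset can be reproduced.

```r
library(tibble)
library(ggplot2)

# Hypothetical helper: simulate one dataset as in the exercise.
# The seed argument is an assumption added for reproducibility.
simulate_sim1a <- function(seed) {
  set.seed(seed)
  tibble(
    x = rep(1:10, each = 3),
    y = x * 1.5 + 6 + rt(length(x), df = 2)
  )
}

sim1a <- simulate_sim1a(1)

# Fit an ordinary least-squares line and inspect its coefficients
fit <- lm(y ~ x, data = sim1a)
coef(fit)

# Plot the points with the fitted line overlaid
p <- ggplot(sim1a, aes(x, y)) +
  geom_point() +
  geom_abline(
    intercept = coef(fit)[[1]],
    slope = coef(fit)[[2]],
    colour = "red"
  )
print(p)
```

Passing a different seed to simulate_sim1a() gives a different dataset each time, which is one way to do the "rerun a few times" step while keeping each run repeatable.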

One way to make linear models more robust is to use a different distance measure. For example, instead of root-mean-squared distance, you could use mean-absolute distance:

```
measure_distance <- function(mod, data) {
  diff <- data$y - model1(mod, data)
  mean(abs(diff))
}
```

Use optim() to fit this model to the simulated data above and compare it to the linear model.
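
A minimal sketch of how this fit could be set up is shown here. It assumes the two-parameter form of model1() (intercept a[1], slope a[2]) from the source chapter, since this page does not define that version. It also uses a base data.frame and an arbitrary seed so the block runs on its own.

```r
# Two-parameter linear model: a[1] is the intercept, a[2] the slope.
# Assumed here; this page uses model1() without defining this version.
model1 <- function(a, data) {
  a[1] + data$x * a[2]
}

# Mean-absolute distance between predictions and observed y
measure_distance <- function(mod, data) {
  diff <- data$y - model1(mod, data)
  mean(abs(diff))
}

set.seed(1)  # arbitrary seed, only for reproducibility
sim1a <- data.frame(x = rep(1:10, each = 3))
sim1a$y <- sim1a$x * 1.5 + 6 + rt(nrow(sim1a), df = 2)

# Minimise the mean-absolute distance, starting from intercept 0, slope 0.
# Extra arguments to optim() (here, data) are passed on to the function.
best <- optim(c(0, 0), measure_distance, data = sim1a)
best$par

# Least-squares coefficients for comparison
coef(lm(y ~ x, data = sim1a))
```

Printing best$par next to the lm() coefficients for several seeds is one way to make the comparison the exercise asks for.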

One challenge with performing numerical optimisation is that it's only guaranteed to find one local optimum. What's the problem with optimising a three-parameter model like this?
```
model1 <- function(a, data) {
  a[1] + data$x * a[2] + a[3]
}
```
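
One way to explore this question empirically is to run optim() from several starting points and compare the fitted parameters. The sketch below does that under some assumptions: the distance function is root-mean-squared distance, the data frame sim is simulated as in the first exercise with an arbitrary seed, and the starting points are hypothetical values chosen only for illustration.

```r
model1 <- function(a, data) {
  a[1] + data$x * a[2] + a[3]
}

# Root-mean-squared distance between predictions and observed y
# (an assumed choice of distance for this sketch)
measure_distance <- function(mod, data) {
  diff <- data$y - model1(mod, data)
  sqrt(mean(diff^2))
}

set.seed(1)  # arbitrary seed, only for reproducibility
sim <- data.frame(x = rep(1:10, each = 3))
sim$y <- sim$x * 1.5 + 6 + rt(nrow(sim), df = 2)

# Hypothetical starting points, chosen only for illustration
starts <- list(c(0, 0, 0), c(5, 0, -5), c(-10, 1, 10))

fits <- lapply(starts, function(start) {
  optim(start, measure_distance, data = sim)
})

# One row per starting point: fitted parameters and achieved distance
results <- t(sapply(fits, function(f) c(f$par, value = f$value)))
colnames(results) <- c("a1", "a2", "a3", "value")
results
```

Comparing the rows of results, and in particular how a1 and a3 vary relative to the value column, is a reasonable place to start answering the question.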

Source: H. Wickham and G. Grolemund, https://r4ds.had.co.nz/model-basics.html
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.

Last modified: Sunday, 13 November 2022, 3:58 PM
