• Course Introduction

        • Time: 13 hours
        • Free Certificate
        Data analysis is essential for discovering trends and correlations and for making informed decisions. The R programming language and software environment offer a free, ever-growing collection of resources for data analysis and visualization. Today, R is used across government, academia, and industry for everything from forecasting sales and evaluating the impact of a marketing campaign to studying new health treatments.

        This course offers hands-on practice with the basics of R, an open-source language for statistical computing. It is intended for new R users and does not require prior programming experience. You will start with the foundations (how to install R and load data into it) and continue with data manipulation, visualization, and the application of standard statistical functions. By the end of the course, you will be able to find relevant R resources (packages), read R code, and write your own code to visualize and analyze your data.

        • Course Syllabus

          First, read the course syllabus. Then, enroll in the course by clicking "Enroll me". Click Unit 1 to read its introduction and learning outcomes. You will then see the learning materials and instructions on how to use them.

        • Unit 1: Introduction to R and RStudio

          R is a language for statistical computing and free, open-source software provided by The R Foundation for Statistical Computing. The software comes with a free R editor, an interface for accessing R functionality and for writing and executing R code. However, many other code editors and integrated development environments (IDEs), both free and commercial, extend the standard editor's functionality and are often more convenient. In this unit, we start exploring R and the RStudio IDE and introduce basic practices for coding and organizing your files.

          Completing this unit should take you approximately 2 hours.

        • Unit 2: Basic Object Types and Operations in R

          Data surround us in every shape and form in our daily lives: not just as the numbers we see in a weather report but also as sounds, text, and images. Once data are represented in a standard way in R, we can apply a wide range of R functions to analyze them. This unit introduces commonly used data types and explains how data can be organized as objects in our coding environment. We also discuss how to subset objects, join several objects together, and convert an object from one type to another.

          Completing this unit should take you approximately 3 hours.
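
          As a taste of what Unit 2 covers, the operations described above (creating objects of different types, subsetting, joining, and type conversion) might look roughly like this in base R. This is only a minimal sketch: the variable names and the small weather data set are made up for illustration and are not taken from the course materials.

```r
# Atomic vectors of three common types
temps  <- c(21.5, 19.0, 23.2)            # numeric (double)
cities <- c("Oslo", "Lima", "Cairo")     # character
rainy  <- c(TRUE, FALSE, FALSE)          # logical

# Organize the vectors as one object: a data frame with one row per city
weather <- data.frame(city = cities, temp = temps, rainy = rainy)

# Subset: keep the rows where it is not raining, and only two columns
dry <- weather[!weather$rainy, c("city", "temp")]

# Join: append the rows of another data frame with the same columns
more <- data.frame(city = "Pune", temp = 30.1, rainy = TRUE)
all_weather <- rbind(weather, more)

# Change the object type: character to numeric, logical to integer
as_number <- as.numeric("42")
rainy_int <- as.integer(weather$rainy)
```

          Data frames are used here because they are the usual container for tabular data in R; vectors, lists, and matrices follow similar subsetting rules.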

        • Unit 3: Data Import and Export

          Data for analysis can be created or simulated within R or loaded from an external file. R can generate regular sequences and samples from probability distributions (random numbers), which are often used in simulation-based inference. However, most applied tasks require loading existing data into R from an external file or a database. R has several built-in functions for loading data; additional packages expand this functionality and allow us to load data saved in special formats such as Excel, MATLAB, or Network Common Data Form (NetCDF). Besides loading data of different types, this unit demonstrates ways to save R outputs in formats such as CSV or RDS.

          Completing this unit should take you approximately 2 hours.
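
          The workflow described in Unit 3 (simulate data, save it, and load it back) could be sketched as follows using only built-in functions. The object names are hypothetical, and temporary files are used here purely so that the example does not write into your working directory.

```r
set.seed(1)                              # make the random draws reproducible

# A regular sequence and a sample from a normal distribution
x <- seq(from = 0, to = 1, by = 0.25)    # 0.00 0.25 0.50 0.75 1.00
y <- rnorm(length(x), mean = 0, sd = 1)
sim <- data.frame(x = x, y = y)

# Save the data frame as CSV (plain text) and as RDS (R's own binary format)
csv_file <- tempfile(fileext = ".csv")
rds_file <- tempfile(fileext = ".rds")
write.csv(sim, csv_file, row.names = FALSE)
saveRDS(sim, rds_file)

# Load the data back with built-in functions
from_csv <- read.csv(csv_file)
from_rds <- readRDS(rds_file)
```

          CSV files can be opened by almost any other program, whereas RDS files preserve R-specific details such as column types exactly. Formats such as Excel or NetCDF require additional packages and are not shown here.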

        • Unit 4: Data Visualization

          Visualization is essential for storytelling with data and for communicating the results of your analysis. Graphs help us efficiently assess prominent features in the data (trends, variability), irregularities (changepoints, outliers), and relationships between variables, and compare them across different samples. R has powerful tools for creating scientific graphs. Commonly used tools fall into two groups: built-in R functions (base R) and functions from the ggplot2 package, which follows a somewhat different approach based on the grammar of graphics. This unit introduces the syntax of both approaches and shows how to export a publication-quality graph from R.

          Completing this unit should take you approximately 3 hours.
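
          To illustrate the two approaches named in Unit 4, the sketch below draws the same scatterplot with base R and, if the package happens to be installed, with ggplot2, exporting each to a PDF file. It assumes the built-in `mtcars` data set as example data (an illustrative choice, not necessarily the data used in the course), and the file names and figure sizes are arbitrary.

```r
# Base R: open a PDF device, draw the plot, then close the device
base_file <- tempfile(fileext = ".pdf")
pdf(base_file, width = 6, height = 4)    # size in inches
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     main = "Fuel economy vs. weight", pch = 19)
invisible(dev.off())                     # closing the device writes the file

# ggplot2: build the plot as an object from layers, then save it.
# This part runs only when ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  gg_file <- tempfile(fileext = ".pdf")
  p <- ggplot2::ggplot(mtcars, ggplot2::aes(x = wt, y = mpg)) +
    ggplot2::geom_point() +
    ggplot2::labs(x = "Weight (1000 lbs)", y = "Miles per gallon",
                  title = "Fuel economy vs. weight")
  ggplot2::ggsave(gg_file, plot = p, width = 6, height = 4)
}
```

          PDF is a vector format, so the exported figure stays sharp at any size; raster formats such as PNG are an alternative when a journal or website requires them.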

        • Unit 5: Common Statistical Functions

          Probably the most significant advantage of R is its wide range of functions for statistical analysis. The ultimate goal of most R courses is to give learners access to this toolbox. Data and the inferences derived from them help shape our decisions; hence, it is imperative to do the analysis right. This unit introduces built-in R functions for statistical analysis, from summarizing data and applying simple statistical tests to regression analysis. We will also see how to find additional R functions (packages) for specific types of analysis.

          Completing this unit should take you approximately 3 hours.
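
          The three kinds of analysis mentioned in Unit 5 (summaries, simple tests, and regression) can each be run with a single built-in function. The sketch below again assumes the built-in `mtcars` data set as example data; the choice of variables is illustrative only.

```r
# Summarize a numeric variable: minimum, quartiles, mean, and maximum
mpg_summary <- summary(mtcars$mpg)

# Two-sample t-test: fuel economy of automatic (am = 0) vs. manual (am = 1) cars
tt <- t.test(mpg ~ am, data = mtcars)

# Simple linear regression of fuel economy on car weight
fit <- lm(mpg ~ wt, data = mtcars)
slope <- coef(fit)[["wt"]]               # estimated change in mpg per 1000 lbs

# Search the documentation of installed packages for a topic
# (left commented out because it opens a help window):
# help.search("regression")
```

          Printing `tt` or calling `summary(fit)` shows the full test and model output, including p-values and confidence intervals.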

        • Course Feedback Survey

          Please take a few minutes to give us feedback about this course. We appreciate your input, whether you completed the whole course or just a few resources. Your responses help us improve our courses, and we review them each time we make updates.

          If you come across any urgent problems, email contact@saylor.org or post in our discussion forum.

        • Certificate Final Exam

          Take this exam if you want to earn a free Course Completion Certificate.

          To receive a free Course Completion Certificate, you will need to earn a grade of 70% or higher on this final exam. Your grade for the exam will be calculated as soon as you complete it. If you do not pass the exam on your first try, you can take it again as many times as you want, with a 7-day waiting period between attempts.

          Once you pass this final exam, you will be awarded a free Course Completion Certificate.