• Unit 3: Data Import and Export

    Data for analysis can be created or simulated within R or loaded from an external file. R can generate regular sequences and draw random samples from probability distributions, both of which are often used in simulation-based inference. However, most applied tasks require loading existing data into R from an external file or a database. R has several built-in functions for loading data; additional packages extend R and allow us to load data saved in special formats like Excel, MATLAB, or Network Common Data Form (NetCDF). Besides loading data of different types, this unit demonstrates ways to save R outputs in formats like CSV or RDS.

    Completing this unit should take you approximately 2 hours.

    • 3.1: Data Input via Keyboard or Number Generation

      This section presents tools for creating data within R without relying on external sources. These options are quick and convenient, and you have already used some of them to create your first R objects.
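
      As a minimal sketch of what creating data within R can look like, the example below types values in with c(), builds regular sequences, and draws random numbers. The object names (heights, grid, ids, z, rolls) and the seed value are illustrative assumptions, not part of the course materials.

      ```r
      # Keyboard input: combine typed values into a numeric vector
      heights <- c(1.62, 1.75, 1.80)

      # Regular sequences
      grid <- seq(from = 0, to = 1, by = 0.25)  # 0.00, 0.25, 0.50, 0.75, 1.00
      ids  <- 1:5                               # integers 1 through 5

      # Random numbers: fix the seed so the draws are reproducible
      set.seed(123)                             # arbitrary seed chosen for illustration
      z     <- rnorm(100, mean = 0, sd = 1)     # 100 draws from a standard normal
      rolls <- sample(1:6, size = 10, replace = TRUE)  # 10 simulated die rolls

      summary(z)
      ```

      Setting the seed is optional, but it makes a simulation repeatable from one run to the next.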

    • 3.2: Loading External Files

      Loading the data is usually the first step in an analysis. The simplest case is a dataset saved in a clean plain-text format like CSV, but R can handle many other formats. This section demonstrates tools from base R and user-contributed packages for loading files of commonly used types.
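
      The sketch below illustrates the simplest case using base R's read.csv(). To keep it self-contained, it first writes a tiny CSV to a temporary file; the file contents and the names csv_path and obs are made up for illustration. The commented lines point to package functions for other formats and assume those packages are installed and that files with those hypothetical names exist.

      ```r
      # Create a small CSV file to read (stands in for a real external file)
      csv_path <- tempfile(fileext = ".csv")
      writeLines(c("site,temp_c", "A,12.5", "B,14.1", "C,"), csv_path)

      # Load it with base R; the empty numeric field in row C becomes NA
      obs <- read.csv(csv_path, header = TRUE, stringsAsFactors = FALSE)
      str(obs)

      # Other formats need contributed packages (not run here):
      # xl  <- readxl::read_excel("data.xlsx")   # Excel
      # mat <- R.matlab::readMat("data.mat")     # MATLAB MAT files
      # nc  <- ncdf4::nc_open("data.nc")         # NetCDF
      ```

      Checking the result with str() right after loading is a quick way to confirm that each column was read with the type you expect.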

    • 3.3: Data Export and Reusing R Data

      We might need to save our data to take a break from coding or to share the data with others. This section teaches how to save and reload data in R's own formats and in plain-text formats like CSV. Packages for loading data in other formats often contain functions for saving in those formats; see the help pages for the specific package. For example, the package R.matlab has functions for both reading and writing MAT files.
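
      As a rough sketch of saving and reloading, the example below writes a small data frame to CSV and to RDS in temporary files, then reads both back. The data frame scores and the path names are hypothetical and exist only for this illustration.

      ```r
      # A small data frame to export (made-up values)
      scores <- data.frame(id = 1:3, score = c(90.5, 82, 77.25))

      csv_path <- tempfile(fileext = ".csv")
      rds_path <- tempfile(fileext = ".rds")

      # Plain-text export: readable by other software
      write.csv(scores, csv_path, row.names = FALSE)

      # R's native single-object format: preserves column types and attributes
      saveRDS(scores, rds_path)

      # Reload both copies
      from_csv <- read.csv(csv_path)
      from_rds <- readRDS(rds_path)

      identical(scores, from_rds)  # compare the RDS round trip with the original
      ```

      CSV is usually the better choice for sharing data with people who do not use R, while RDS is convenient for picking up your own work later because the object comes back exactly as it was saved.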

    • Unit 3 Assessment

      • Receive a grade