R Projects and Files in a Project

First, watch the video, then read about projects and R working directories. The video demonstrates one way to manage the files in a project efficiently. The file structure it describes works in many cases, but may need to be revised when the data are large and it is impossible or impractical to move them into the local data folders. The video also assumes that each script file (for data loading, cleaning, plotting, etc.) is relatively large, so it makes sense to keep the code in separate files, where it is more manageable. If each file (for data loading, cleaning, visualizing, and statistical analysis) would contain just a few lines, it may be more practical to keep the code together in a single script. You are free to decide based on the needs and size of your project.
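As a rough illustration of the multi-script option, the sketch below builds a small throwaway project in a temporary folder and runs its steps in order from one master loop. The folder name demo_project, the script names 01_load.R and 02_clean.R, and the objects raw and clean are all hypothetical; they are not taken from the video.

```r
# Hypothetical project layout, built in a temporary folder for illustration:
#   demo_project/scripts/01_load.R
#   demo_project/scripts/02_clean.R
scripts_dir <- file.path(tempdir(), "demo_project", "scripts")
dir.create(scripts_dir, recursive = TRUE, showWarnings = FALSE)

# Each step lives in its own small script (mtcars ships with base R).
writeLines("raw <- mtcars", file.path(scripts_dir, "01_load.R"))
writeLines("clean <- raw[raw$mpg > 20, ]", file.path(scripts_dir, "02_clean.R"))

# A master loop runs the steps in order; the numeric prefixes fix the order.
steps <- sort(list.files(scripts_dir, pattern = "\\.R$", full.names = TRUE))
for (step in steps) {
  source(step)
}

nrow(clean)  # number of cars with mpg above 20
```

With only two one-line steps like these, a single script would be just as readable; the split pays off once each step grows.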

Workflow: projects

What is real?

As a beginning R user, it's OK to consider your environment (i.e., the objects listed in the environment pane) "real." However, in the long run, you'll be much better off if you consider your R scripts as "real."

You can recreate the environment with your R scripts (and your data files). It's much harder to recreate your R scripts from your environment! You'll either have to retype a lot of code from memory (making mistakes all the way), or you'll have to mine your R history carefully.

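To foster this behavior, we highly recommend that you instruct RStudio not to preserve your workspace between sessions. In Tools > Global Options, uncheck "Restore .RData into workspace at startup" and set "Save workspace to .RData on exit" to "Never".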

This will cause you some short-term pain because now when you restart RStudio, it will not remember the results of the code that you ran last time. But this short-term pain will save you long-term agony because it forces you to capture all important interactions in your code. There's nothing worse than discovering three months after the fact that you've only stored the results of an important calculation in your workspace, not the calculation itself in your code.
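A minimal sketch of what "capturing the calculation in code" can look like, assuming the built-in mtcars data set and a hypothetical output file name: the script holds the calculation itself and writes the result to disk explicitly, so nothing depends on a saved workspace.

```r
# Illustrative only: mtcars ships with base R; the file name is hypothetical.
# The calculation lives in the script, so a fresh session can rebuild the result.
mean_mpg_by_cyl <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)

# Save the result to a file instead of relying on the saved workspace (.RData).
out_file <- file.path(tempdir(), "mean_mpg_by_cyl.rds")
saveRDS(mean_mpg_by_cyl, out_file)

# A later session can reload the stored result, or simply rerun this script.
restored <- readRDS(out_file)
identical(restored, mean_mpg_by_cyl)
```

In a real project the file would go in the project folder rather than in tempdir(), which is used here only to keep the example self-contained.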

There is a great pair of keyboard shortcuts that will work together to make sure you've captured the important parts of your code in the editor:

  1. Press Cmd/Ctrl + Shift + F10 to restart the R session.
  2. Press Cmd/Ctrl + Shift + S to rerun the current script.

I use this pattern hundreds of times a week.