Factors are the way categorical variables are stored in R. For example,
treatment levels in ANOVA (analysis of variance) are considered factors;
months or quarters of the year can be represented as factors for
modeling seasonality. You should learn how to create factors and how to
rename and reorder factor levels, both for convenience and for correct
analysis (for example, the control treatment should usually be the first
level of a factor because, by default, linear models compare the other
levels to the first one).
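As a minimal sketch of those three tasks in base R, the example below creates a factor, moves the control group to the first level, and renames the levels. The treatment labels ("placebo", "drugA", "drugB") are invented for illustration and do not come from any real study.

```r
# Hypothetical treatment labels, invented for illustration only
treatment <- c("drugA", "placebo", "drugB", "placebo", "drugA")

# Create a factor; by default the levels are sorted alphabetically,
# so "placebo" ends up last rather than first
f <- factor(treatment)
levels(f)

# Reorder: make the control group the first (reference) level
f <- relevel(f, ref = "placebo")
levels(f)

# Rename: assign new labels in the same order as the current levels
levels(f) <- c("Placebo", "Drug A", "Drug B")
f
```

With "Placebo" as the first level, a model fitted with a call such as lm() would, under R's default treatment contrasts, report the other levels relative to it.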
Introduction
In R, factors are used to work with categorical variables, variables that have a fixed and known set of possible values. They are also useful when you want to display character vectors in a non-alphabetical order.
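To illustrate both points, here is a small base R sketch that uses month abbreviations as the fixed set of values. The input vectors are made up for the example; month.abb is a constant built into R.

```r
# Made-up month data stored as plain character strings
x <- c("Dec", "Apr", "Jan", "Mar")

# Character vectors sort alphabetically, which is not useful for months
sort(x)

# Declare the known set of values, in calendar order, as the levels
y <- factor(x, levels = month.abb)

# The factor now sorts in level (calendar) order
sort(y)

# Values outside the known set (here the typo "Jam") become NA
z <- factor(c("Jan", "Jam"), levels = month.abb)
z
```

Fixing the levels up front has a side benefit: misspelled values become NA instead of quietly forming a category of their own.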
Historically, factors were much easier to work with than characters. As a result, many of the functions in base R automatically convert characters to factors. This means that factors often crop up in places that are not helpful. Fortunately, you don't need to worry about that in the tidyverse, and can focus on situations where factors are genuinely useful.
Prerequisites
To work with factors, we'll use the forcats package, which is part of the core tidyverse. It provides a wide range of helpers for working with categorical variables (and its name is an anagram of factors!).
library(tidyverse)
Source: H. Wickham and G. Grolemund, https://r4ds.had.co.nz/factors.html This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.