Basic Data Types and Data Structures in R

This section covers the data types and data structures we encounter in everyday work in R. You should learn how they differ and how to access their basic attributes. Most often, we need to check whether the data are in the correct format (for example, numeric rather than character) and how large an R object is (use the functions length(x) and dim(x) for that).

Understanding Basic Data Types and Data Structures in R

To make the best of the R language, you'll need a strong understanding of the basic data types and data structures and how to operate on them.

Data structures are very important to understand because these are the objects you will manipulate on a day-to-day basis in R. Dealing with object conversions is one of the most common sources of frustration for beginners.
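
As a minimal sketch of the kind of conversion problem mentioned above, assume some numbers arrived as text. The vector names and values here are made up for illustration.

```r
# Numbers stored as text are character, not numeric (hypothetical values)
readings <- c("1.5", "2", "10")
typeof(readings)      # "character"
is.numeric(readings)  # FALSE

# as.numeric() converts text that looks like a number
values <- as.numeric(readings)
typeof(values)        # "double"
sum(values)           # 13.5

# Text that cannot be parsed becomes NA; R also emits a warning,
# which is suppressed here only to keep the example quiet
mixed <- suppressWarnings(as.numeric(c("3", "n/a")))
is.na(mixed)          # FALSE TRUE
```

Checking the type before doing arithmetic is often the quickest way to find out why a calculation fails or returns NA.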

Everything in R is an object.

R has six basic data types. (In addition to the five listed below, there is also raw, which will not be discussed in this workshop.)

  • character
  • numeric (real or decimal)
  • integer
  • logical
  • complex

Elements of these data types may be combined to form data structures, such as atomic vectors. When we call a vector atomic, we mean that the vector only holds data of a single data type. Below are examples of atomic character vectors, numeric vectors, integer vectors, etc.

  • character: "a", "swc"
  • numeric: 2, 15.5
  • integer: 2L (the L tells R to store this as an integer)
  • logical: TRUE, FALSE
  • complex: 1+4i (complex numbers with real and imaginary parts)
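
The sketch below builds one atomic vector of each type with c() and then shows what happens when types are mixed. The variable names are illustrative only. Because an atomic vector holds a single type, R converts mixed input to the most general type present.

```r
# c() combines values into an atomic vector
chr <- c("a", "swc")
num <- c(2, 15.5)
int <- c(2L, 4L)
lgl <- c(TRUE, FALSE)
cpl <- c(1+4i, 2-1i)

# typeof() reports the underlying data type of each vector
types <- sapply(list(chr, num, int, lgl, cpl), typeof)
types  # "character" "double" "integer" "logical" "complex"

# Mixing types does not fail: R coerces everything to one common type
typeof(c(TRUE, 2L))        # "integer"
typeof(c(1L, 2.5))         # "double"
typeof(c(1L, 2.5, "a"))    # "character"
```

Note that typeof() reports "double" for numeric vectors, which is the name of R's storage type for real numbers.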

R provides many functions to examine features of vectors and other objects, for example:

  • class() - what kind of object is it (high-level)?
  • typeof() - what is the object's data type (low-level)?
  • length() - how long is it? What about two-dimensional objects?
  • attributes() - does it have any metadata?

# Example (lines starting with #> show the output)
x <- "dataset"
typeof(x)
#> [1] "character"
attributes(x)
#> NULL

# 1:10 creates an integer sequence
y <- 1:10
y
#>  [1]  1  2  3  4  5  6  7  8  9 10
typeof(y)
#> [1] "integer"
length(y)
#> [1] 10

# as.numeric() converts the integers to double
z <- as.numeric(y)
z
#>  [1]  1  2  3  4  5  6  7  8  9 10
typeof(z)
#> [1] "double"
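
The question about two-dimensional objects raised in the list above can be explored with a small matrix. This is a sketch using a made-up 2 x 3 matrix. The exact result of class() depends on the R version, as noted in the comments.

```r
# A hypothetical 2 x 3 integer matrix
m <- matrix(1:6, nrow = 2, ncol = 3)

typeof(m)      # "integer": the type of the elements
is.matrix(m)   # TRUE
class(m)       # "matrix" "array" in R 4.0.0 or later; "matrix" in older versions

# length() counts every element; dim() gives rows and columns
length(m)      # 6
dim(m)         # 2 3
nrow(m)        # 2
ncol(m)        # 3

# The dimensions are stored as metadata on the object
attributes(m)  # a list with one element, dim, equal to 2 3

# A plain vector has no dim attribute
dim(1:10)      # NULL
```

For a vector, length() is usually the size you want. For a matrix, dim() tends to be more informative.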


R has many data structures. These include:

  • atomic vector
  • list
  • matrix
  • data frame
  • factor
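
As a rough illustration of the structures listed above, the sketch below creates one small object of each kind and inspects it with the functions introduced earlier. All object names and values are invented for the example.

```r
vec <- c(1.5, 2, 10)                       # atomic vector: one type only
lst <- list(1L, "a", TRUE)                 # list: elements may differ in type
mat <- matrix(1:6, nrow = 2)               # matrix: atomic vector with dimensions
df  <- data.frame(id = 1:3,
                  name = c("a", "b", "c")) # data frame: columns of equal length
fac <- factor(c("low", "high", "low"))     # factor: categorical data

typeof(lst)   # "list"

# A data frame is stored as a list of columns
typeof(df)    # "list"
class(df)     # "data.frame"
length(df)    # 2 (the number of columns)
dim(df)       # 3 2 (rows, columns)

# A factor is stored as integer codes plus a set of levels
typeof(fac)   # "integer"
class(fac)    # "factor"
levels(fac)   # "high" "low"
```

The difference between typeof() and class() is easiest to see here: a data frame and a factor have their own classes even though they are built on a list and an integer vector respectively.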


Source: The Carpentries, https://swcarpentry.github.io/r-novice-inflammation/13-supp-data-structures/index.html
This work is licensed under a Creative Commons Attribution 4.0 License.