Vectors and Simple Manipulations

This section introduces the basic operations on vectors, most of which are done element-wise. Please pay attention to the recycling of vectors (usually, recycling doesn't generate an error or a warning, so it is easy to miss if it was unintended), missing values (NA), and logical vectors often used for data subsetting.

Missing values

In some cases the components of a vector may not be completely known. When an element or value is "not available" or a "missing value" in the statistical sense, a place within a vector may be reserved for it by assigning it the special value NA. In general any operation on an NA becomes an NA. The motivation for this rule is simply that if the specification of an operation is incomplete, the result cannot be known and hence is not available.

The function is.na(x) gives a logical vector of the same size as x with value TRUE if and only if the corresponding element in x is NA.

> z <- c(1:3,NA);  ind <- is.na(z)

Notice that the logical expression x == NA is quite different from is.na(x) since NA is not really a value but a marker for a quantity that is not available. Thus x == NA is a vector of the same length as x all of whose values are NA as the logical expression itself is incomplete and hence undecidable.

Note that there is a second kind of "missing" values which are produced by numerical computation, the so-called Not a Number, NaN, values. Examples are

> 0/0

or

> Inf - Inf

which both give NaN since the result cannot be defined sensibly.

In summary, is.na(xx) is TRUE both for NA and NaN values. To differentiate these, is.nan(xx) is only TRUE for NaNs.

Missing values are sometimes printed as <NA> when character vectors are printed without quotes.