Vectors and Simple Manipulations
This section introduces the basic operations on vectors, most of which are done element-wise. Pay particular attention to three topics: the recycling of vectors (recycling is silent when the longer length is a whole multiple of the shorter one, so unintended recycling is easy to miss), missing values (NA), and logical vectors, which are often used for data subsetting.
Missing values
In some cases the components of a vector may not be completely known. When an element or value is "not available" or a "missing value" in the statistical sense, a place within a vector may be reserved for it by assigning it the special value NA.
In general any operation on an NA becomes an NA. The motivation for this rule is simply that if the specification of an operation is incomplete, the result cannot be known and hence is not available.
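As a brief illustration of this rule, the following sketch applies arithmetic and a summary function to a vector containing NA. The vector name v is chosen only for this example, and the na.rm argument of sum() is not discussed in the text above; it is shown as one common way to skip missing values.

```r
# A vector with one missing element (the name v is illustrative)
v <- c(1, 2, NA)

v + 1                  # element-wise arithmetic: the NA element stays NA
sum(v)                 # the total depends on the unknown element, so it is NA
sum(v, na.rm = TRUE)   # na.rm = TRUE drops NAs before summing, giving 3

# "In general" has exceptions: here the result does not depend on the
# unknown value, so it is known
NA & FALSE             # FALSE
```

The last line shows why the rule is stated as "in general": when the outcome is the same whatever the missing value might be, R can return a definite result.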
The function is.na(x) gives a logical vector of the same size as x with value TRUE if and only if the corresponding element in x is NA.
> z <- c(1:3,NA); ind <- is.na(z)
Notice that the logical expression x == NA is quite different from is.na(x), since NA is not really a value but a marker for a quantity that is not available. Thus x == NA is a vector of the same length as x, all of whose values are NA, as the logical expression itself is incomplete and hence undecidable.
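The difference can be seen by running both expressions on the same vector. This is a minimal sketch; the vector x and the final subsetting step are illustrative additions, not part of the original text.

```r
x <- c(1:3, NA)

x == NA        # every comparison with NA is undecidable: NA NA NA NA
is.na(x)       # FALSE FALSE FALSE  TRUE

# Using the logical vector as an index keeps only the known elements
x[!is.na(x)]   # 1 2 3
```

Because x == NA never yields TRUE, using it as a test or an index cannot pick out the missing elements; is.na(x) is the expression to use for that purpose.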
Note that there is a second kind of "missing" value, produced by numerical computation: the so-called Not a Number (NaN) values. Examples are
> 0/0
or
> Inf - Inf
which both give NaN since the result cannot be defined sensibly.
In summary, is.na(xx) is TRUE both for NA and NaN values. To differentiate these, is.nan(xx) is only TRUE for NaNs.
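A short sketch of the distinction, using an illustrative vector xx that holds an ordinary number, an NA, and a NaN. The last expression, which combines the two tests to pick out NA but not NaN, is an addition for illustration.

```r
xx <- c(1, NA, 0/0)

is.na(xx)                 # FALSE  TRUE  TRUE
is.nan(xx)                # FALSE FALSE  TRUE
is.na(xx) & !is.nan(xx)   # FALSE  TRUE FALSE: NA only, excluding NaN
```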
Missing values are sometimes printed as <NA> when character vectors are printed without quotes.
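The two printed forms can be compared with a small character vector. This sketch assumes the quote argument of print() as the way to suppress quotes; the vector name s is illustrative.

```r
s <- c("a", NA)

print(s)                  # quoted strings; the missing element appears as NA
print(s, quote = FALSE)   # unquoted strings; the missing element appears as <NA>
```

The angle brackets make it possible to tell a missing value apart from the two-letter string "NA" once the quotes are no longer shown.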