## Time Series Basics

A time series is a set of points ordered in time. Each time point is usually assigned an integer index to indicate its position within the series. For example, you can construct a time series by recording the average daily temperature each day. When the outcome of the next point in a time series is unknown, the time series is said to be random or "stochastic" in nature. A simple example would be creating a time series from a sequential set of coin flips with outcomes of either heads or tails. A more practical example is the time series of prices of a given stock.

When the unconditional joint probability distribution of the series does not change with time (it is time-invariant), the stochastic process generating the time series is said to be stationary. Under these circumstances, parameters such as the mean and standard deviation do not change over time. Assuming the same coin for each flip, the coin flip is an example of a stationary process. On the other hand, stock price data is not a stationary process.
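These two cases can be illustrated with a short simulation (a minimal sketch in Python, assuming NumPy is available; the series lengths and seed are arbitrary). Segment means of a coin-flip series stay near the same value, while a random walk, a rough stand-in for a price-like series, drifts over time:

```python
import numpy as np

rng = np.random.default_rng(0)

# Coin flips coded as 0/1: every point has the same distribution -> stationary.
flips = rng.integers(0, 2, size=10_000)

# A random walk (cumulative sum of shocks): its variance grows with time -> nonstationary.
walk = np.cumsum(rng.normal(size=10_000))

# Compare the mean of the first and second halves of each series.
print("coin-flip half means: ", flips[:5000].mean(), flips[5000:].mean())
print("random-walk half means:", walk[:5000].mean(), walk[5000:].mean())
```

For the coin flips both half-means are close to 0.5; for the random walk they generally differ, reflecting a mean that is not constant in time.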

This unit aims to use your knowledge of statistics to model time series data for random processes. Even though the outcome of the next time point is unknown, given the statistics of the series, it should be possible to make inferences once you have a model. The concept of a stationary random process is central to statistical model building. Since nonstationary processes require a bit more sophistication than stationary processes, it is important to understand what type of time series is being modeled. Our first step in this direction is to introduce the autoregressive (AR) model. This linear model can be used to estimate current time series values based on known past time series values. Read through this article, which introduces the idea behind AR models and additionally explains the autocorrelation function (ACF).

### Sample ACF and Properties of AR(1) Model

### Stationary Series

As a preliminary, we define an important concept: that of a stationary series. For an ACF to make sense, the series must be a *weakly stationary* series. This means that the autocorrelation for any particular lag is the same regardless of where we are in time.

##### (Weakly) Stationary Series

A series $x_t$ is said to be **(weakly) stationary** if it satisfies the following properties:

- The mean $E(x_t)$ is the same for all $t$.
- The variance of $x_t$ is the same for all $t$.
- The covariance (and also correlation) between $x_t$ and $x_{t-h}$ is the same for all $t$ at each lag $h$ = 1, 2, 3, etc.

##### Autocorrelation Function (ACF)

The **autocorrelation function (ACF)** gives the correlation between the series and its lagged values. For a stationary series, the autocorrelation between $x_t$ and $x_{t-h}$ is

$$\rho_h = \frac{\text{Cov}(x_t, x_{t-h})}{\text{Std.Dev.}(x_t)\,\text{Std.Dev.}(x_{t-h})} = \frac{\text{Cov}(x_t, x_{t-h})}{\text{Var}(x_t)}$$

The denominator in the second formula occurs because the standard deviation of a stationary series is the same at all times.

The last property of a weakly stationary series says that the theoretical value of the autocorrelation at a particular lag is the same across the whole series. An interesting property of a stationary series is that, theoretically, it has the same structure forwards as it does backwards.
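The sample version of the ACF can be computed directly from the definition. A minimal sketch in Python with NumPy follows (library routines such as `statsmodels.tsa.stattools.acf` do the same with more options):

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelations r_1..r_max_lag, using the overall
    sample mean and variance (valid under stationarity)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xbar = x.mean()
    denom = np.sum((x - xbar) ** 2)  # n times the sample variance
    return np.array([
        np.sum((x[:n - h] - xbar) * (x[h:] - xbar)) / denom
        for h in range(1, max_lag + 1)
    ])

# White noise should have autocorrelations near 0 at every lag.
rng = np.random.default_rng(1)
print(sample_acf(rng.normal(size=2000), max_lag=5))
```

The denominator uses the full-series variance at every lag, which mirrors the stationarity argument above: the standard deviation is the same at all times.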

### The First-order Autoregression Model

The first-order autoregression model, AR(1), is

$$x_t = \delta + \phi_1 x_{t-1} + w_t$$

##### Assumptions

- $w_t \overset{iid}{\sim} N(0, \sigma_w^2)$, meaning that the errors are independently distributed with a normal distribution that has mean 0 and constant variance.
- The properties of the errors $w_t$ are independent of $x_t$.
- The series $x_1, x_2, \ldots$ is (weakly) stationary. A requirement for a stationary AR(1) is that $|\phi_1| < 1$. We'll see why below.

##### Properties of the AR(1)

Formulas for the mean, variance, and ACF for a time series process with an AR(1) model follow.

- The (theoretical) **mean** of $x_t$ is
  $$E(x_t) = \mu = \frac{\delta}{1-\phi_1}$$
- The **variance** of $x_t$ is
  $$\text{Var}(x_t) = \frac{\sigma_w^2}{1-\phi_1^2}$$
- The **correlation** between observations $h$ time periods apart is
  $$\rho_h = \phi_1^h$$
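These three formulas can be checked by simulation. The sketch below (in Python; the parameter values $\delta = 2$, $\phi_1 = 0.6$, $\sigma_w = 1$ are arbitrary choices for illustration) simulates a long AR(1) series and compares sample statistics with the theoretical values:

```python
import numpy as np

delta, phi1, sigma_w = 2.0, 0.6, 1.0
n = 100_000
rng = np.random.default_rng(42)

# Simulate x_t = delta + phi1 * x_{t-1} + w_t, starting at the theoretical mean.
x = np.empty(n)
x[0] = delta / (1 - phi1)
w = rng.normal(scale=sigma_w, size=n)
for t in range(1, n):
    x[t] = delta + phi1 * x[t - 1] + w[t]

print("mean:      simulated %.3f  vs  theory %.3f" % (x.mean(), delta / (1 - phi1)))
print("variance:  simulated %.3f  vs  theory %.3f" % (x.var(), sigma_w**2 / (1 - phi1**2)))
rho1 = np.corrcoef(x[:-1], x[1:])[0, 1]
print("lag-1 ACF: simulated %.3f  vs  theory %.3f" % (rho1, phi1))
```

With these parameters the theory gives a mean of $2/(1-0.6) = 5$, a variance of $1/(1-0.36) = 1.5625$, and a lag-1 autocorrelation of $0.6$; the simulated values should land close to each.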

##### Note!

$\phi_1$ is the slope in the AR(1) model, and we now see that it is also the lag 1 autocorrelation.

##### Pattern of ACF for AR(1) Model

The ACF property defines a distinct pattern for the autocorrelations. For a positive value of $\phi_1$, the ACF exponentially decreases to 0 as the lag $h$ increases. For negative $\phi_1$, the ACF also exponentially decays to 0 as the lag increases, but the algebraic signs for the autocorrelations alternate between positive and negative.

*Note!* The tapering pattern (figure omitted).

*Note!* The alternating and tapering pattern (figure omitted).
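Since the AR(1) lag-$h$ autocorrelation is $\phi_1^h$, both patterns can be printed directly (a small Python sketch; the values $\pm 0.6$ are arbitrary):

```python
import numpy as np

lags = np.arange(1, 7)
for phi1 in (0.6, -0.6):
    acf = phi1 ** lags
    print(f"phi1 = {phi1:+.1f}:", np.round(acf, 4))
# For phi1 = +0.6 the values taper toward 0 and stay positive; for
# phi1 = -0.6 they taper in magnitude while alternating in sign.
```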

##### Example 1-3

In Example 1 of Lesson 1.1, we used an AR(1) model for annual earthquakes in the world with a seismic magnitude greater than 7. Here's the sample ACF of the series:

| Lag | ACF |
|---|---|
| 1 | 0.541733 |
| 2 | 0.418884 |
| 3 | 0.397955 |
| 4 | 0.324047 |
| 5 | 0.237164 |
| 6 | 0.171794 |
| 7 | 0.190228 |
| 8 | 0.061202 |
| 9 | -0.048505 |
| 10 | -0.106730 |
| 11 | -0.043271 |
| 12 | -0.072305 |

The sample autocorrelations taper, although not as fast as they should for an AR(1). For instance, theoretically, the lag 2 autocorrelation for an AR(1) equals the squared value of the lag 1 autocorrelation. Here, the observed lag 2 autocorrelation is 0.418884. That's somewhat greater than the squared value of the first lag autocorrelation ($0.541733^2 = 0.293$). But we managed to do okay (in Lesson 1.1) with an AR(1) model for the data. For instance, the residuals looked okay. This brings up an important point: the sample ACF will rarely fit a perfect theoretical pattern. A lot of the time, you just have to try a few models to see what fits.
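The comparison above can be reproduced numerically (a Python sketch; the values are taken from the table):

```python
r1, r2 = 0.541733, 0.418884  # sample ACF at lags 1 and 2, from the table

# For an exact AR(1), rho_2 = rho_1 ** 2.
implied_r2 = r1 ** 2
print(f"observed lag-2 ACF:          {r2:.3f}")
print(f"AR(1)-implied value rho_1^2: {implied_r2:.3f}")
print(f"discrepancy:                 {r2 - implied_r2:.3f}")
```

The observed value exceeds the AR(1)-implied one by about 0.13, which is the departure from the theoretical pattern discussed above.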

We'll study the ACF patterns of other ARIMA models during the next three weeks. Each model has a different pattern for its ACF, but in practice, the interpretation of a sample ACF is not always so clear-cut.

**Reminder**: Residuals are theoretically assumed to have an ACF with correlation 0 at all lags.

##### Example 1-4

Here's a time series of the daily cardiovascular mortality rate in Los Angeles County, 1970-1979 (plot omitted). There is a slight downward trend, so the series may not be stationary. To create a (possibly) stationary series, we'll examine the **first differences** $x_t - x_{t-1}$. This is a common time series method for creating a de-trended series and, thus, potentially a stationary series. Think about a straight line: there are constant differences in average $y$ for each change of 1 unit in $x$.
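The idea that differencing removes a linear trend can be sketched in Python (the series below is synthetic, not the mortality data): the first differences of a trending series have a roughly constant mean equal to the slope, with the trend gone.

```python
import numpy as np

rng = np.random.default_rng(7)
t = np.arange(500)

# A series with a linear downward trend plus stationary noise.
x = 10.0 - 0.02 * t + rng.normal(scale=0.5, size=500)

# First differences: z_t = x_t - x_{t-1}.
z = np.diff(x)

print("trend in x:  first-half mean %.2f, second-half mean %.2f" % (x[:250].mean(), x[250:].mean()))
print("differences: first-half mean %.3f, second-half mean %.3f" % (z[:250].mean(), z[250:].mean()))
```

The two half-means of `x` differ noticeably because of the trend, while both half-means of the differenced series sit near the slope, -0.02.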

The time series plot of the first differences is the following:

The following plot is the sample estimate of the autocorrelation function of the first differences:

| Lag | ACF |
|---|---|
| 1 | -0.506029 |
| 2 | 0.205100 |
| 3 | -0.126110 |
| 4 | 0.062476 |
| 5 | -0.015190 |

*Note!*

The lag 2 correlation is roughly equal to the squared value of the lag 1 correlation. The lag 3 correlation is nearly exactly equal to the cubed value of the lag 1 correlation, and the lag 4 correlation nearly equals the fourth power of the lag 1 correlation. Thus an AR(1) model may be a suitable model for the first differences $x_t - x_{t-1}$.
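That pattern can be checked against the table values (a Python sketch):

```python
# Sample ACF of the first differences, from the table above.
r = {1: -0.506029, 2: 0.205100, 3: -0.126110, 4: 0.062476}

# Under an AR(1), rho_h = rho_1 ** h; compare observed vs implied.
for h in (2, 3, 4):
    implied = r[1] ** h
    print(f"lag {h}: observed {r[h]:+.4f}, rho_1^{h} = {implied:+.4f}")
```

The implied values (0.2561, -0.1296, 0.0656) track the observed autocorrelations closely, which is why an AR(1) looks suitable here.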

Let $z_t$ denote the first differences, so that $z_t = x_t - x_{t-1}$ and $z_{t-1} = x_{t-1} - x_{t-2}$. We can write this AR(1) model as

$$z_t = \delta + \phi_1 z_{t-1} + w_t$$

### Appendix: Derivations of Properties of AR(1)

Generally, you won't be responsible for reproducing theoretical derivations, but interested students may want to see the derivations for the theoretical properties of an AR(1). The algebraic expression of the model is as follows:

$$x_t = \delta + \phi_1 x_{t-1} + w_t$$

##### Assumptions

- $w_t \overset{iid}{\sim} N(0, \sigma_w^2)$, meaning that the errors are independently distributed with a normal distribution that has mean 0 and constant variance.
- The properties of the errors $w_t$ are independent of $x_t$.
- The series $x_1, x_2, \ldots$ is (weakly) stationary. A requirement for a stationary AR(1) is that $|\phi_1| < 1$. We'll see why below.

##### Mean

$$E(x_t) = E(\delta + \phi_1 x_{t-1} + w_t) = \delta + \phi_1 E(x_{t-1}) + 0$$

With the stationarity assumption, $E(x_t) = E(x_{t-1}) = \mu$, so

$$\mu = \delta + \phi_1 \mu \quad\Rightarrow\quad \mu = \frac{\delta}{1-\phi_1}$$

##### Variance

Because $\delta$ is constant and the errors are independent of past values of the series,

$$\text{Var}(x_t) = \phi_1^2\,\text{Var}(x_{t-1}) + \sigma_w^2$$

With the stationarity assumption, $\text{Var}(x_t) = \text{Var}(x_{t-1})$, so

$$\text{Var}(x_t)\,(1-\phi_1^2) = \sigma_w^2 \quad\Rightarrow\quad \text{Var}(x_t) = \frac{\sigma_w^2}{1-\phi_1^2}$$

Notice that the variance is positive and finite only when $|\phi_1| < 1$; this is the stationarity requirement mentioned above.

### Autocorrelation Function (ACF)

To start, assume the data have mean 0, which happens when $\delta = 0$, so that $x_t = \phi_1 x_{t-1} + w_t$. In practice, this isn't necessary, but it simplifies matters. Values of variances, covariances, and correlations are not affected by the specific value of the mean.

Let $\gamma_h = E(x_t x_{t+h})$, the covariance of observations $h$ time periods apart (when the mean = 0). Let $\rho_h$ denote the correlation between observations that are $h$ time periods apart.

*Covariance and correlation between observations one time period apart*

$$\gamma_1 = E(x_t x_{t+1}) = E\big(x_t(\phi_1 x_t + w_{t+1})\big) = \phi_1 E(x_t^2) = \phi_1 \gamma_0$$

$$\rho_1 = \frac{\gamma_1}{\gamma_0} = \phi_1$$

*Covariance and correlation between observations $h$ time periods apart*

Substituting repeatedly, $x_{t+h} = \phi_1^h x_t + \phi_1^{h-1} w_{t+1} + \cdots + w_{t+h}$, and the error terms are independent of $x_t$, so

$$\gamma_h = E(x_t x_{t+h}) = \phi_1^h E(x_t^2) = \phi_1^h \gamma_0, \qquad \rho_h = \frac{\gamma_h}{\gamma_0} = \phi_1^h$$