Forecasting Daily Demand in Cash Supply Chains

Read this article. It is important because seasonal demand is addressed as the authors attempt to successfully predict demand. In your experience, what are some seasonal products or services that you purchase?

MATERIALS AND METHODS

Data: The data for this study is obtained from a midsize European bank with total customer deposits in 2009 of approximately EUR 25 billion and contains cash dispense records of a randomly selected region with 20 ATMs serving approximately 30000 people. Each time series comprises 759 daily cash withdrawals covering the period March 21st 2007 to April 17th 2009. The size of the region provides a reasonable tradeoff between tractability and data requirements. In comparison, the same number of ATMs serves on average 15000 people in Spain, 29000 people in France, 32500 in Germany, 64000 people in Finland or 26000 people in the euro area.

The data is analyzed first using a visual inspection of the time series plots. The data displays no long-run trend, but clearly a seasonal pattern and cyclical behavior in daily cash withdrawals. Most noticeable in all sample time series is that daily cash withdrawals seem to increase in December and drop in the months January and February. Daily, weekly and monthly patterns appear to be stable over time. However, each series seems to be affected by local events like festivals, promotions and technical failures of dispense devices, which may explain some of the unusual spikes and drops not related to calendar effects. These events alter the flow of people and consequently impact cash withdrawals. Figure 1 depicts daily cash withdrawals of one ATM in the data set and is exemplary for the other time series considered in this study.

Further evidence on seasonality and calendar effects in daily cash withdrawals is presented by a grouped box plot analysis. The box plot provides a graphical representation of summary statistics: the minimum, the 1st, 5th and 25th percentiles, the median, the mean, the 75th, 95th and 99th percentiles and the maximum. Figure 2 shows day-of-the-week, month-of-the-year and holiday effects. The box plot displays larger daily cash withdrawals during the months March, October and December. Cash withdrawals are lower for the months January, February and April as well as on public holidays.

Fig. 1: Daily cash withdrawals. Notes: The ATM cash dispense record contains daily cash withdrawals for the period March 21st, 2007 to April 17th, 2009


Fig. 2: Day-of-the-week, month-of-the-year and holiday effect


Fig. 3: Day-of-the-month effect

In addition, the box plot reveals variations in daily cash withdrawals over the course of a week. Daily cash withdrawals tend to peak on Thursdays and reach their low on Mondays, with only small fluctuations occurring between Saturday and Tuesday. Furthermore, Figure 3 indicates a day-of-the-month effect with relatively more cash being dispensed towards the end of the month.
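The grouped summary statistics behind such a box plot can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `weekday_summary` is hypothetical, and the withdrawal amounts are synthetic placeholders, since the bank's records are not reproduced here.

```python
import random
import statistics
from collections import defaultdict
from datetime import date, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def weekday_summary(start, values):
    """Group daily values by weekday and return box-plot style statistics."""
    groups = defaultdict(list)
    for offset, value in enumerate(values):
        day = start + timedelta(days=offset)
        groups[WEEKDAYS[day.weekday()]].append(value)
    summary = {}
    for name, vals in groups.items():
        # quantiles(n=100) returns the 99 percentile cut points.
        pct = statistics.quantiles(vals, n=100, method="inclusive")
        summary[name] = {
            "n": len(vals),
            "min": min(vals), "p1": pct[0], "p5": pct[4], "p25": pct[24],
            "median": statistics.median(vals),
            "mean": statistics.fmean(vals),
            "p75": pct[74], "p95": pct[94], "p99": pct[98], "max": max(vals),
        }
    return summary

# Synthetic placeholder data (NOT the bank's records), 759 daily values.
random.seed(0)
start = date(2007, 3, 21)
values = [random.uniform(10_000, 50_000) for _ in range(759)]
summary = weekday_summary(start, values)
```

The same grouping key could be swapped for the month or the day of the month to obtain the other panels.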

The findings seem to confirm earlier reports of seasonality and calendar effects with respect to daily demand for cash. Cabrero et al. identify weekly, monthly and annual seasonal patterns, which reflect a certain regularity in payments and behavior. Similarly, the trading day effect results in an increase in the amount of banknotes in circulation just before the weekend that reverses after the weekend. Likewise, the number of banknotes in circulation decreases before the middle of the month and increases towards the end of the month. The authors further report a strong impact of holidays on the demand for cash.

Brentnall et al. reach similar conclusions for the occurrence of cash withdrawals for individuals and report a weekly cyclical pattern as well as holiday effects. Although the occurrence rate of cash withdrawals seems to differ largely among individuals, the long-term rate is fairly constant. However, no information regarding the amounts per withdrawal is provided.

In order to forecast demand in cash supply chains, a framework is needed that captures seasonal and calendar effects of the individual time series as well as possible co-variability among the time series. Next, two approaches are contrasted and presented: a seasonal ARIMA model and a vector time series model.

Seasonal ARIMA model: The seasonal Box-Jenkins model considered in this study generalizes the ARIMA model to time series containing stochastic seasonal periodic components. The multiplicative seasonal ARIMA model is said to be of the type SARIMA (p,d,q)×(P,D,Q)s:

\phi(\mathrm{B}) \Phi(\mathrm{B}) \nabla^{\mathrm{d}} \nabla_{\mathrm{s}}^{\mathrm{D}} \mathrm{y}_{\mathrm{t}}=\theta(\mathrm{B}) \Theta(\mathrm{B}) \mathrm{a}_{\mathrm{t}}                                                                                             (1)

Where:

\phi(\mathrm{B}) = The regular autoregressive polynomial of order p

\Phi(\mathrm{B}) = The seasonal autoregressive polynomial of order P

\theta(\mathrm{B}) = The regular moving average polynomial of order q

\Theta(B) = The seasonal moving average polynomial of order Q

The differencing operator \nabla^{d} and the seasonal differencing operator \nabla_{\mathrm{s}}^{\mathrm{D}} eliminate non-seasonal and seasonal non-stationarity. The term a_{t} follows a white noise process and s defines the seasonal period.

\phi(B)=1-\sum_{i=1}^{p} \phi_{i} B^{i}                                                                                                              (2)

\Phi(B)=1-\sum_{i=1}^{P} \Phi_{i} B^{s i}                                                                                                           (3)

\theta(B)=1-\sum_{i=1}^{q} \theta_{i} B^{i}                                                                                                              (4)

\Theta(B)=1-\sum_{i=1}^{Q} \Theta_{i} B^{s i}                                                                                                          (5)

\nabla^{d}=(1-B)^{d}                                                                                                                           (6)

\nabla_{\mathrm{s}}^{\mathrm{D}}=\left(1-\mathrm{B}^{\mathrm{s}}\right)^{\mathrm{D}}                                                                                                                        (7)
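As a small illustration of Eq. 6 and 7, the two differencing operators can be applied to a list of observations as sketched below. The helper name `difference` and the toy series are assumptions made for this example only.

```python
def difference(series, lag=1, order=1):
    """Apply (1 - B^lag)^order to a list of observations.

    lag=1 gives the regular operator of Eq. 6; lag=s gives the
    seasonal operator of Eq. 7. Each pass drops the first `lag` values.
    """
    out = list(series)
    for _ in range(order):
        out = [out[t] - out[t - lag] for t in range(lag, len(out))]
    return out

# Toy series with an exact weekly pattern (illustrative numbers only).
weekly = [10, 12, 15, 20, 18, 9, 7] * 4
seasonal_diff = difference(weekly, lag=7, order=1)  # D = 1, s = 7

# Toy series with a linear trend.
trend = list(range(10))
regular_diff = difference(trend, lag=1, order=1)    # d = 1
```

Seasonal differencing with s = 7 removes the repeating weekly pattern entirely in the toy series, while regular differencing turns the linear trend into a constant.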

The SARIMA model can be further generalized to incorporate exogenous variables. Such a model, often referred to as a Box-Tiao ARIMA model, combines an intervention function and a seasonal ARIMA noise model:

\phi(\mathrm{B}) \Phi(\mathrm{B}) \nabla^{\mathrm{d}} \nabla_{\mathrm{s}}^{\mathrm{D}}\left[\mathrm{y}_{\mathrm{t}}-\left(\mathrm{w}_{0}+\sum_{\mathrm{i}=1}^{\mathrm{r}} \mathrm{w}_{\mathrm{i}}(\mathrm{B}) \mathrm{x}_{\mathrm{it}}\right)\right]=\theta(\mathrm{B}) \Theta(\mathrm{B}) \mathrm{a}_{\mathrm{t}}                                  (8)

where the coefficients \mathrm{w}_{\mathrm{i}}(\mathrm{B}) of the intervention function capture the deterministic effects of the r exogenous variables x on y. Hence, the intervention function makes it possible to model calendar effects that result from a shift in the date of occurrence from year to year. Easter Sunday, which falls on a Sunday between March 22nd and April 25th, is such an example. Another calendar effect results from the change of the relative position of a particular date from year to year. For example, December 24th falls on different days of the week depending on the year.
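Both kinds of calendar effect can be turned into exogenous variables x with a few lines of code. The sketch below assumes the Western (Gregorian) date of Easter and uses the widely published anonymous Gregorian computus. The function names are hypothetical, and the source does not say how its calendar variables were actually constructed.

```python
from datetime import date

def easter_sunday(year):
    """Gregorian Easter Sunday (anonymous Gregorian algorithm)."""
    a = year % 19
    b, c = divmod(year, 100)
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day + 1)

def easter_dummy(day):
    """1 on Easter Sunday, 0 otherwise (shift in occurrence from year to year)."""
    return 1 if day == easter_sunday(day.year) else 0

def dec24_weekday(year):
    """Weekday of December 24th, Monday = 0 (shift in relative position)."""
    return date(year, 12, 24).weekday()

# Easter dates and the weekday of December 24th for the sample years.
easter_dates = {year: easter_sunday(year) for year in (2007, 2008, 2009)}
dec24_days = {year: dec24_weekday(year) for year in (2007, 2008)}
```

Within the sample period, Easter Sunday moves between late March and mid April, and December 24th falls on a Monday in 2007 but on a Wednesday in 2008.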

Vector time series model: A system of dynamic simultaneous equations describes the joint data generation process represented by a VAR(p) model:

\mathrm{Y}_{\mathrm{t}}=\mathrm{W}_{0}+\sum_{\mathrm{i}=1}^{\mathrm{p}} \mathrm{A}_{\mathrm{i}} \mathrm{Y}_{\mathrm{t}-\mathrm{i}}+\mathrm{U}_{\mathrm{t}}                                                                                            (9)

Where:

\mathrm{A}_{\mathrm{i}} = (K×K) coefficient matrices

\mathrm{W}_{\mathrm{0}} = K-dimensional vector of intercepts

\mathrm{Y}_{\mathrm{t}} = K-dimensional vector of endogenous variables

\mathrm{U}_{\mathrm{t}} = K-dimensional white noise error vector

The joint data generation process may additionally capture variables determined outside the system, such as calendar effects. Accordingly, the VAR(p) model with exogenous variables is referred to as VARX(p,q):

\mathrm{Y}_{\mathrm{t}}=\mathrm{W}_{0}+\sum_{\mathrm{i}=1}^{\mathrm{p}} \mathrm{A}_{\mathrm{i}} \mathrm{Y}_{\mathrm{t}-\mathrm{i}}+\sum_{\mathrm{j}=0}^{\mathrm{q}} \mathrm{B}_{\mathrm{j}} \mathrm{X}_{\mathrm{t}-\mathrm{j}}+\mathrm{U}_{\mathrm{t}}                                                                (10)

Where:

\mathrm{B}_{\mathrm{j}} = (K×M) coefficient matrices

\mathrm{X}_{\mathrm{t}} = M-dimensional vector of exogenous variables

The order of the VARX model concerning endogenous and exogenous variables is determined by the parameters p and q.
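To show how the terms of the VARX(p,q) equation combine, the following sketch evaluates its right-hand side with the error term set to zero, for a toy system with K = 2 endogenous and M = 1 exogenous variables. All coefficient values and helper names are invented for illustration. They are not estimates from the study.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def varx_one_step(w0, a_list, y_lags, b_list, x_lags):
    """W0 + sum_i A_i Y_{t-i} + sum_j B_j X_{t-j}, with U_t set to zero.

    a_list[i] pairs with y_lags[i] (Y_{t-1}, Y_{t-2}, ...);
    b_list[j] pairs with x_lags[j] (X_t, X_{t-1}, ...).
    """
    pred = list(w0)
    for a, y in zip(a_list, y_lags):
        pred = [p + c for p, c in zip(pred, mat_vec(a, y))]
    for b, x in zip(b_list, x_lags):
        pred = [p + c for p, c in zip(pred, mat_vec(b, x))]
    return pred

# Toy VARX(1,0): K = 2 series, M = 1 calendar dummy (made-up numbers).
w0 = [1.0, 2.0]
a1 = [[0.5, 0.1], [0.0, 0.4]]   # (K x K)
b0 = [[3.0], [-1.0]]            # (K x M)
y_prev = [10.0, 20.0]           # Y_{t-1}
x_now = [1.0]                   # X_t, e.g. a holiday flag
forecast = varx_one_step(w0, [a1], [y_prev], [b0], [x_now])
```

With these numbers the first component is 1 + (0.5·10 + 0.1·20) + 3 = 11 and the second is 2 + 0.4·20 − 1 = 9. The off-diagonal entry 0.1 in A_1 is what lets one series help predict another.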

Next, the steps involved in specifying, estimating and testing the models outlined above are described. Adequate models are identified using the first 731 observations t=\{1, \ldots, 731\} of each time series. Consequently, the holdout sample covers the last 28 observations of each time series, preserving two full years of data for in-sample analysis and model estimation.
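The split can be checked against the calendar with a short sketch. The variable names are illustrative, and the only inputs are the sample dates and the in-sample length stated above.

```python
from datetime import date, timedelta

start, end = date(2007, 3, 21), date(2009, 4, 17)
n_obs = (end - start).days + 1          # both end points are included
days = [start + timedelta(days=i) for i in range(n_obs)]

IN_SAMPLE = 731                         # first 731 observations
in_sample, holdout = days[:IN_SAMPLE], days[IN_SAMPLE:]
```

Because 2008 is a leap year, two full years from March 21st, 2007 span 365 + 366 = 731 days, which leaves 28 days, or four full weeks, for the holdout sample.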

Seasonal ARIMA model specification: Specification of the SARIMA (p,d,q)×(P,D,Q)s model is based on enumerative search following the procedure presented in Wei. The enumerative search in this study considers models with \mathrm{p} \in\{0,1,2\}, \mathrm{q} \in\{0,1,2\}, \mathrm{P} \in\{0,1,2\}, \mathrm{Q} \in\{0,1,2\}, \mathrm{d} \in\{0,1,2\}, \mathrm{D} \in\{0,1,2\} and the seasonal period s = 7. The order of integration is selected using non-seasonal and seasonal unit root tests. The model choice among the class of adequate models for each time series is based on the 1-step ahead forecasting error, measured by the Mean Absolute Percentage Error (MAPE).
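A minimal sketch of the search space and the selection criterion is given below. It enumerates the full grid only. In the study, d and D are fixed beforehand by unit root tests and inadequate models are discarded, and neither step is reproduced here. The names `mape`, `candidates` and `select_best`, and the callable `one_step_forecasts`, are hypothetical placeholders for whatever estimation routine is used.

```python
from itertools import product

def mape(actual, forecast):
    """Mean Absolute Percentage Error in percent; assumes no zero actuals."""
    errors = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)

ORDERS = (0, 1, 2)
SEASON = 7

# Every (p,d,q) x (P,D,Q)s combination in the search space: 3**6 models.
candidates = [((p, d, q), (P, D, Q, SEASON))
              for p, d, q, P, D, Q in product(ORDERS, repeat=6)]

def select_best(candidates, one_step_forecasts, actual):
    """Return the specification with the lowest 1-step-ahead MAPE.

    one_step_forecasts is a hypothetical callable mapping a
    specification to its list of 1-step-ahead forecasts.
    """
    return min(candidates,
               key=lambda spec: mape(actual, one_step_forecasts(spec)))
```

The full grid contains 729 specifications per ATM, so fixing the integration orders first reduces the number of models that have to be estimated considerably.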

Table 1 depicts the single best performing SARIMA model for each of the 20 ATMs for the in-sample period t=\{1, \ldots, 731\}. All series except ATM19 are seasonally integrated and require seasonal differencing of order one. In fact, the SARIMA (0,0,0)×(0,1,1)7 model is the single best performing model for 14 out of 20 ATMs. The selected models for the remaining series additionally include non-seasonal differencing of order one and moving average and autoregressive polynomials. The SARIMA (0,1,1)×(0,1,1)7 model, which resembles the well-known airline model, is selected for ATM1 and ATM16.
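To make the most frequently selected specification concrete, the SARIMA (0,0,0)×(0,1,1)7 model can be written out from Eq. 1, 5 and 7 with D = 1, Q = 1 and s = 7. This expansion is a worked illustration and not part of the original study. The symbol \hat{y}_{t}(1) is notation introduced here for the 1-step ahead forecast made at time t:

\left(1-B^{7}\right) y_{t}=\left(1-\Theta_{1} B^{7}\right) a_{t}

y_{t}=y_{t-7}+a_{t}-\Theta_{1} a_{t-7}

\hat{y}_{t}(1)=y_{t-6}-\Theta_{1} a_{t-6}

In words, the forecast for tomorrow is the value observed on the same weekday one week earlier, corrected by a fraction of the forecast error made for that day. Under the same conventions, the SARIMA (0,1,1)×(0,1,1)7 model expands to:

(1-B)\left(1-B^{7}\right) y_{t}=\left(1-\theta_{1} B\right)\left(1-\Theta_{1} B^{7}\right) a_{t}

y_{t}=y_{t-1}+y_{t-7}-y_{t-8}+a_{t}-\theta_{1} a_{t-1}-\Theta_{1} a_{t-7}+\theta_{1} \Theta_{1} a_{t-8}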

Extensions of the SARIMA model include an intervention function and concern the impact of calendar effects on the demand for cash. Adequate models contain only parameters that are statistically significant (p<0.05).

Table 1: Best performing SARIMA models

SARIMA model (p,d,q)×(P,D,Q)s    Count (out of 20 ATMs)
(0,1,1)×(0,1,1)7                 2
(0,0,0)×(0,1,1)7                 14
(0,1,1)×(1,0,0)7                 1
(0,1,2)×(0,1,1)7                 2
(2,1,0)×(0,1,1)7                 1


Selection of the best performing model is again based on the MAPE. Exogenous variables for day of the week and month of the year make it possible to capture calendar effects that result from the change of the relative position from year to year, which otherwise would not be taken into account. Other calendar effects captured by exogenous variables identify public or bank holidays, the day before a public or bank holiday and the end of the month.

Vector time series model specification: Specification of the vector time series model involves the selection of the VAR order. The true order p of the observed data generation process is unknown. An intuitive choice of the VAR order p for the given empirical data with a weekly cyclical behavior is p = 7. The literature suggests a range of criteria to avoid fitting VAR models with unnecessarily large orders. The Hannan-Quinn Information Criterion (HQIC) and the Schwarz-Bayes Information Criterion (SBIC) provide consistent order selection criteria. Both the HQIC and the SBIC suggest a VAR model of order p = 1. The Akaike Information Criterion (AIC) and the Final Prediction Error (FPE) criterion minimize the forecast MSE. However, the latter two suggest a VAR model of order p = 2. Both AIC and FPE are known to asymptotically overestimate the true order. Hence, VAR models of order p = 1 are considered for further evaluation. The whiteness of the error terms is confirmed by a portmanteau test and an LM test.
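Why the consistent criteria favor the smaller order can be illustrated with their penalty terms. In the common textbook form, each criterion adds a penalty to the log determinant of the residual covariance matrix: 2/T per coefficient for AIC, 2 ln(ln T)/T for HQIC and ln(T)/T for SBIC. The sketch below is an illustration under assumptions. The source does not state which variant of the criteria was used, and K = 20 series with T = 731 observations is taken from the data description without adjusting for presample values or exogenous terms.

```python
import math

def penalty_per_lag(K, T):
    """Penalty added by one extra lag matrix A_i under each criterion."""
    n_params = K * K  # one (K x K) coefficient matrix per additional lag
    return {
        "AIC": 2.0 * n_params / T,
        "HQIC": 2.0 * math.log(math.log(T)) * n_params / T,
        "SBIC": math.log(T) * n_params / T,
    }

# Assumed dimensions: 20 ATMs, 731 in-sample days.
pen = penalty_per_lag(K=20, T=731)
```

For these dimensions the SBIC penalty per additional lag is more than three times the AIC penalty, so an extra lag has to reduce the residual covariance far more to be accepted by SBIC or HQIC than by AIC.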

VARX models additionally consider the same calendar effects as specified for the SARIMA model with exogenous variables. Coupling effects, such as between the day before a public or bank holiday and the actual holiday, are captured using separate calendar effects. Hence, a VARX(p,q) model with order q = 0 is considered, which reduces to a VAR(p) model with exogenous variables, sometimes referred to as a VARX(p) model.

Subject to exogenous variables, three alternative models are considered to gain further insight into the impact of calendar effects on forecasting accuracy. The first model represents a model with all calendar effects. The second model only considers the day-of-the-week and month-of-the-year effects, while the third model is restricted to the day-of-the-week effect.
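The three sets of exogenous variables can be sketched as nested dummy-variable rows. Several details below are assumptions, because the source does not specify them: Monday and January are dropped as baseline categories, "end of the month" is read as the last calendar day of the month, and the holiday set is a placeholder. The function name and the labels "all", "dow_month" and "dow" are hypothetical.

```python
import calendar
from datetime import date, timedelta

def calendar_row(day, holidays, spec):
    """Dummy regressors for one day.

    spec: "dow" (third model), "dow_month" (second model)
    or "all" (first model).
    """
    # Day of the week: Tuesday..Sunday, Monday as baseline (assumption).
    row = [1 if day.weekday() == k else 0 for k in range(1, 7)]
    if spec in ("all", "dow_month"):
        # Month of the year: February..December, January as baseline.
        row += [1 if day.month == m else 0 for m in range(2, 13)]
    if spec == "all":
        last_day = calendar.monthrange(day.year, day.month)[1]
        row += [
            1 if day in holidays else 0,                      # holiday
            1 if day + timedelta(days=1) in holidays else 0,  # day before
            1 if day.day == last_day else 0,                  # end of month
        ]
    return row

# Placeholder holiday set; the real list depends on the country.
holidays = {date(2008, 12, 25)}
row_all = calendar_row(date(2008, 12, 24), holidays, "all")
```

Keeping the holiday and the day before the holiday as separate columns mirrors the treatment of coupling effects through separate calendar effects.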