ARIMA in Python

Use this tutorial to implement an ARIMA model and make forecasts. The tutorial refers to a generic data set, so to work with real data you must obtain your own CSV file. Kaggle is a good source for data scientists. With your current expertise, you should be able to search for and download a .csv file of stock price data that is not too large (under 50 MB). As illustrated in the tutorial, you can then use pandas to extract a single column of data.

The ARIMA (AutoRegressive Integrated Moving Average) model allows us to forecast a time series using the series' own past values.

A time series is a collection of data points collected at constant time intervals. Time series are used to forecast future values based on previous values.

A stationary time series is one whose statistical properties (mean, variance, autocorrelation, etc.) are all constant over time. A non-stationary series is one whose statistical properties change over time.
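To make the distinction concrete, here is a minimal sketch that uses only the Python standard library. It compares the mean of the first and second halves of two made-up series: one with a steady upward trend and one that fluctuates around a fixed level. The series values and the helper name half_means are assumptions introduced for illustration; they are not part of the tutorial's data set, and comparing half-means is only an informal check, not a formal stationarity test.

```python
from statistics import mean


def half_means(series):
    """Return the mean of the first half and of the second half of a series."""
    mid = len(series) // 2
    return mean(series[:mid]), mean(series[mid:])


# Hypothetical series with a steady upward trend (its mean changes over time).
trending = [10 + 2 * t for t in range(20)]
# Hypothetical series that alternates around a constant level of 50.
level = [50 + (1 if t % 2 == 0 else -1) for t in range(20)]

trend_first, trend_second = half_means(trending)
level_first, level_second = half_means(level)

print("trending:", trend_first, trend_second)
print("level:   ", level_first, level_second)
```

The trending series has clearly different means in its two halves, which is a sign of non-stationarity, while the level series has the same mean in both halves.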

ARIMA model

An ARIMA model is characterized by three terms: p, d, and q.

• p is the order of the AR (autoregressive) term.
• q is the order of the MA (moving average) term.
• d is the number of times the series must be differenced to become stationary.
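Differencing, the operation counted by d, replaces each value with its change from the previous value. The sketch below shows the idea in plain Python under stated assumptions: the helper name difference and both example series are hypothetical, chosen so the effect is easy to verify by hand.

```python
def difference(series, d=1):
    """Apply first-order differencing d times: y'[t] = y[t] - y[t-1]."""
    result = list(series)
    for _ in range(d):
        # Each pass shortens the series by one element.
        result = [curr - prev for prev, curr in zip(result, result[1:])]
    return result


# Hypothetical series with a linear trend: one pass removes the trend.
linear = [3 * t + 5 for t in range(8)]
linear_d1 = difference(linear, d=1)

# Hypothetical series with a quadratic trend: two passes are needed.
quadratic = [t * t for t in range(8)]
quadratic_d1 = difference(quadratic, d=1)
quadratic_d2 = difference(quadratic, d=2)

print(linear_d1)
print(quadratic_d1)
print(quadratic_d2)
```

In this toy case, d = 1 is enough for the linear series and d = 2 for the quadratic one, because those are the first passes after which the values stop drifting.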

ARIMA model in Python

In this example, we will predict the next 10 days of stock prices from 100 days of historical data.

Step 1

Import the relevant libraries to perform time series forecasting:
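import numpy as np
import pandas as pd
import statsmodels.tsa.stattools as ts
# statsmodels.tsa.arima_model was removed in statsmodels 0.13; ARIMA now lives in statsmodels.tsa.arima.model
from statsmodels.tsa.arima.model import ARIMA
import matplotlib.pyplot as plt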

Step 2
Load the dataset using the pandas.read_csv() method:

df = pd.read_csv("data.csv")
# prices is a field in the .csv file containing all stock prices.
stock_price = df['prices']

You can view the data in stock_price using the plt.plot() method:
plt.plot(stock_price)

The complete code at the end of this tutorial plots the variation in stock price over the 100 days. It is accompanied by a data.csv file with sample stock prices.

Step 3
Initialize the ARIMA model and set the values of p, d, and q to 1, 1, and 2:
model = ARIMA(stock_price, order=(1, 1, 2))
# fit() in statsmodels.tsa.arima.model (statsmodels 0.13+) takes no disp argument
model_fit = model.fit()
# summary provides a detailed summary of the time series model
print(model_fit.summary())
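For reference, the order (1, 1, 2) used in this step corresponds to the following textbook form of the model, given here as a sketch. The symbols are assumptions introduced for illustration: y_t is the stock price on day t, y'_t is its first difference, ε_t is a white-noise error term, φ_1 is the AR coefficient, θ_1 and θ_2 are the MA coefficients, and c is an optional constant that a given library may or may not include by default.

```latex
\begin{aligned}
y'_t &= y_t - y_{t-1}
  && \text{(one round of differencing, } d = 1\text{)} \\
y'_t &= c + \phi_1 y'_{t-1} + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \varepsilon_t
  && \text{(} p = 1,\ q = 2\text{)}
\end{aligned}
```

The fitted values of these coefficients are what the model summary reports.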

Step 4
Let's predict the next 10 values and plot them on a graph:
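# 100-109 refers to the next 10 values after the value at the 99th index.
# With statsmodels.tsa.arima.model.ARIMA, predictions are already on the original price scale, so typ='levels' is not needed.
pred = model_fit.predict(100, 109)
# newarr combines the observed and predicted stock prices in one list
newarr = list(stock_price) + list(pred)
plt.plot(newarr)
plt.show()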
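The start and end positions 100 and 109 follow from the length of the series and the number of steps to forecast. As a small sketch of that arithmetic, the helper below (forecast_index_range, a hypothetical name not found in any library) computes the range for a zero-indexed series, assuming the end position is inclusive as it is in the call above.

```python
def forecast_index_range(n_observations, n_steps):
    """Return (start, end) positions for forecasting n_steps values beyond
    a zero-indexed series of n_observations points; end is inclusive."""
    if n_steps < 1:
        raise ValueError("n_steps must be at least 1")
    start = n_observations            # first position after the last observation
    end = n_observations + n_steps - 1
    return start, end


# 100 observed days (positions 0-99) and a 10-day forecast horizon.
start, end = forecast_index_range(100, 10)
print(start, end)
```

With a different data set, passing len(stock_price) as the first argument avoids hard-coding the positions.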

Complete code

 main.py
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.tsa.stattools as ts
# statsmodels.tsa.arima_model was removed in statsmodels 0.13; ARIMA now lives in statsmodels.tsa.arima.model
from statsmodels.tsa.arima.model import ARIMA

df = pd.read_csv("data.csv")
# prices is a field in the .csv file containing all stock prices.
stock_price = df['prices']
plt.plot(stock_price)

# ARIMA model
model = ARIMA(stock_price, order=(1, 1, 2))
model_fit = model.fit()
# summary provides a detailed summary of the time series model
print(model_fit.summary())

# Predicting values
# 100-109 refers to the next 10 values after the value at the 99th index.
pred = model_fit.predict(100, 109)

# newarr combines the observed and predicted stock prices in one list
newarr = list(stock_price) + list(pred)
plt.plot(newarr)
plt.show()

 data.csv

"prices"
88
84
85
85
84
85
83
85
88
89
91
99
104
112
126
138
146
151
150
148
147
149
143
132
131
139
147
150
148
145
140
134
131
131
129
126
126
132
137
140
142
150
159
167
170
171
172
172
174
175
172
172
174
174
169
165
156
142
131
121
112
104
102
99
99
95
88
84
84
87
89
88
85
86
89
91
91
94
101
110
121
135
145
149
156
165
171
175
177
182
193
204
208
210
215
222
228
226
222
220