
ARIMA FORECASTING TUTORIAL (Part 1)

# Download the CSV file here:

DOWNLOAD FILE

> weather <- read.csv("M4_not_cleaned.csv", header=TRUE)

> names(weather)

> head(weather, n=10)

> str(weather)
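
# To see how many missing values each column holds before cleaning, here is a small sketch on made-up data. The CSV text and the AirTemperature_C column are assumptions for illustration, not the contents of M4_not_cleaned.csv:

```r
# Hypothetical three-row CSV; "NaN" entries stand in for missing readings
csv_text <- "Date,AtmosphericPressure_mb,AirTemperature_C
2015-01-01,1012.3,8.1
2015-01-02,NaN,7.9
2015-01-03,1009.8,NaN"

toy <- read.csv(text = csv_text, header = TRUE)

names(toy)                      # column names
str(toy)                        # column types
na.counts <- colSums(is.na(toy))  # is.na() is TRUE for both NA and NaN
na.counts
```

# If a column you expect to be numeric shows up as character in str(), it usually contains text that R could not parse as a number.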

# Remove the rows that contain NaN values

> weather.clean<-weather[complete.cases(weather), ]

# The cleaned data frame keeps the original row numbers as its row names, so the numbering now has gaps. Reset it to a fresh 1..n sequence by setting row.names to NULL:

> row.names(weather.clean) <- NULL
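
# A quick sketch on a made-up data frame (the values are invented) showing what complete.cases() keeps and what resetting row.names does:

```r
# Toy data: rows 2 and 3 each have one missing value
toy <- data.frame(temp     = c(21.5, NA, 19.8, 20.1),
                  pressure = c(1012.3, 1011.8, NaN, 1013.0))

# Keep only the rows with no NA/NaN in any column
toy.clean <- toy[complete.cases(toy), ]
rows.before <- rownames(toy.clean)   # original row numbers survive: "1" "4"

# Renumber the rows sequentially
row.names(toy.clean) <- NULL
rows.after <- rownames(toy.clean)    # now "1" "2"
```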

# I suggest installing the "tseries" and "forecast" packages, if they are not already installed on your system. Pass the package names as one character vector:

> install.packages(c("tseries", "forecast"), dependencies=TRUE)

# Once installed, load the packages into the workspace (library() loads one package per call):

> library(forecast)
> library(tseries)

#plot only Atmospheric Pressure Data

> atm.prs<- weather.clean$AtmosphericPressure_mb

> plot(atm.prs)

# Store one column (say, atmospheric pressure) as a daily time series with a weekly seasonal period:

> atm.prs <- ts(weather.clean$AtmosphericPressure_mb, frequency=7)
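
# As a small illustration of what frequency=7 means, here is a sketch on a dummy series of 14 values (not the weather data). Each observation gets a position from 1 to 7 within its week:

```r
# Two "weeks" of dummy daily values
x <- ts(1:14, frequency = 7)

freq <- frequency(x)         # observations per seasonal cycle
pos  <- as.integer(cycle(x)) # position of each observation within its cycle
pos
```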

## When the time series is long enough to span more than a year, it may be necessary
## to allow for annual seasonality as well as weekly seasonality.
## In that case, store it as a multi-seasonal (msts) object, which a multiple seasonal model such as TBATS can use.

> atm.prs<- msts(weather.clean$AtmosphericPressure_mb, seasonal.periods=c(7,365.25))

#then you can plot time series of that data:

> plot.ts(atm.prs)

# Since someone asked me about seasonal decomposition, let me explain why we do it:
# if a time series consists of a trend component, a seasonal component and an irregular component,
# decomposition separates them so that each can be examined on its own.
# In the plot of this series I saw no irregularity in the pattern, so I would suggest not decomposing it. I will still do it later in this tutorial so that you can see how decomposition works.

# Run auto.arima() from the "forecast" package on the time series to select an ARIMA model automatically:

> auto.arima(atm.prs)

# Alternatively, you can fit a TBATS model, but this can make R very slow and unresponsive until the calculation is complete:

> atm.fit <- tbats(atm.prs)

# Forecast from the fitted model:

> atm.fc <- forecast(atm.fit)

#plot the forecast

> plot(atm.fc)

# For seasonal decomposition, use the decompose() function in R:

> atm.prs.dc <- decompose(atm.prs)

# to print the first 10 estimated values of the seasonal component, use the following command:

> head(atm.prs.dc$seasonal, n=10)

#plot the decomposed series

> plot(atm.prs.dc)

# Now it is up to you to analyse the decomposed series and plots.

# Further, if you want, you can seasonally adjust the original series by subtracting the seasonal component, then plot the result:

> atm.prs.dc.sa <- atm.prs - atm.prs.dc$seasonal
> plot(atm.prs.dc.sa)
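
# To see what decompose() does, here is a sketch on a synthetic weekly series (the trend, weekly pattern and noise are all invented). With the default additive model, trend + seasonal + random adds back up to the original wherever the trend is defined:

```r
set.seed(1)
n <- 70                                                  # ten weeks of daily values
trend    <- seq(1000, 1010, length.out = n)              # slow upward drift
seasonal <- rep(c(2, 1, 0, -1, -2, -1, 1), times = n / 7) # weekly pattern, sums to zero
y <- ts(trend + seasonal + rnorm(n, sd = 0.3), frequency = 7)

dc <- decompose(y)                 # additive decomposition by default

# The moving-average trend is NA at both ends, so compare only where it exists
ok    <- !is.na(dc$trend)
recon <- dc$trend + dc$seasonal + dc$random
all.equal(as.numeric(recon[ok]), as.numeric(y[ok]))

# Seasonally adjusted series: original minus the estimated seasonal component
y.sa <- y - dc$seasonal
plot(y.sa)
```

# dc$figure holds the seven estimated weekly effects, which you can compare with the pattern used to build the series.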

# auto.arima() has already given us the values of the (p,d,q) terms, here (1,1,2):

# Series: atm.prs
# ARIMA(1,1,2)

# Coefficients:
#          ar1      ma1     ma2
#       0.8278  -0.6008  0.1077
# s.e.  0.0062   0.0082  0.0064

# sigma^2 estimated as 0.5137: log likelihood=-33131.92
# AIC=66271.83 AICc=66271.83 BIC=66305.14
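
# If you want to fit this (1,1,2) order yourself, base R's arima() and predict() can do it without auto.arima(). A minimal sketch, assuming a simulated series in place of atm.prs (the simulation reuses the coefficients printed above; method="ML" and the 7-step horizon are my choices here, not something auto.arima() reported):

```r
set.seed(42)

# Simulate an ARIMA(1,1,2) series as a stand-in for the real data
sim <- arima.sim(model = list(order = c(1, 1, 2),
                              ar = 0.8278,
                              ma = c(-0.6008, 0.1077)),
                 n = 500)

# Fit the same order by maximum likelihood
fit <- arima(sim, order = c(1, 1, 2), method = "ML")
coef(fit)                        # estimates named ar1, ma1, ma2

# Forecast 7 steps ahead; $pred holds point forecasts, $se their standard errors
fc <- predict(fit, n.ahead = 7)
fc$pred
fc$se
```

# On simulated data the estimates will only be close to, not equal to, the coefficients used to generate the series.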

# In the next tutorial I will explain how to go further with ARIMA forecasting and what the rules are for getting the best prediction.
