In a previous tutorial, we discussed the basics of time series and time series analysis. We looked at how to convert data into time series data and analyze it in R. In this tutorial, we’ll go into more depth and look at time series decomposition.
We’ll first recap the components of a time series and then discuss the moving average concept. After that we’ll focus on two time series decomposition methods: a simple method based on moving averages, and the local regression method.
You can download the data files for this tutorial here.
Components of Time Series
We know that there are four time series components, of which trend and seasonality are the main ones. We can assume one of two models for a time series: the additive model or the multiplicative model. Under the additive model, the data at any period t, that is ‘Yt’, is the sum of the trend ‘Tt’, seasonal ‘St’ and error ‘Rt’ components at period t. Under the multiplicative model, Yt is instead the product of the components Tt, St and Rt. When the magnitude of the seasonal fluctuations, or the variation around the trend-cycle, does not vary with the level of the time series, the additive model is more appropriate than the multiplicative model. The additive model is written as Yt = St + Tt + Rt, where
Yt : Time series value at period t
St : Seasonal component at period t
Tt : Trend-cycle component at period t
Rt : Remainder (or irregular or error) component at period t
Alternatively, a multiplicative model would be written as Yt = St × Tt × Rt.
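To make the difference between the two models concrete, here is a minimal sketch in R. It builds one additive and one multiplicative series from a made-up linear trend and a made-up seasonal pattern (the remainder is left out for clarity). All names and numbers are hypothetical and are not taken from the tutorial’s data.

```r
# Hypothetical components for 4 years of monthly data
t_idx     <- 1:48
trend     <- 100 + 2 * t_idx                        # made-up linear trend
seas_add  <- 10 * sin(2 * pi * t_idx / 12)          # fixed-size seasonal swing
seas_mult <- 1 + 0.1 * sin(2 * pi * t_idx / 12)     # proportional seasonal factor

y_add  <- ts(trend + seas_add,  frequency = 12)     # additive series
y_mult <- ts(trend * seas_mult, frequency = 12)     # multiplicative series

# Size of the seasonal swing around the trend within a given year
swing <- function(y, tr, months) diff(range((y - tr)[months]))

add_first  <- swing(y_add,  trend, 1:12)
add_last   <- swing(y_add,  trend, 37:48)
mult_first <- swing(y_mult, trend, 1:12)
mult_last  <- swing(y_mult, trend, 37:48)

# Additive: the swing stays the same; multiplicative: it grows with the level
print(c(add_first = add_first, add_last = add_last))
print(c(mult_first = mult_first, mult_last = mult_last))
```

If the swing in your own data grows as the series rises, that points towards the multiplicative model.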
Understanding Moving Averages
In time series analysis, the moving average method is a common approach for estimating the trend in a time series. Let us understand how moving averages are calculated. Moving averages are averages calculated over consecutive, overlapping subgroups of the data, each of a fixed length. They smooth the time series by filtering out random fluctuations. The period of the moving average depends on the type of data. For non-seasonal data, a shorter length, typically a 3-period or a 5-period moving average, is used. For seasonal data, the length equals the number of observations in a season: 12 for monthly data, 4 for quarterly data, and so on. When calculating a moving average of period 3, the first 2 moving averages cannot be calculated because fewer than 3 values are available. The moving average for day 3 is the average of the values at days 1, 2 and 3. The moving average for day 4 is the average of the values at days 2, 3 and 4. Similarly, for period 5, the first four moving averages are not calculated.
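The calculation described above can be sketched in R with the filter function from the stats package. The values in x are made up purely for illustration; sides = 1 makes each average use the current value and the ones before it.

```r
# Hypothetical daily values, used only to illustrate the calculation
x <- c(10, 12, 14, 16, 18, 20, 22)

# Period-3 moving average: current value and the two before it
ma3 <- stats::filter(x, rep(1 / 3, 3), sides = 1)

# Period-5 moving average: current value and the four before it
ma5 <- stats::filter(x, rep(1 / 5, 5), sides = 1)

print(ma3)  # the first 2 entries are NA
print(ma5)  # the first 4 entries are NA
```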
Time Series Decomposition – Simple Method
Let us now try to understand the first technique of time series decomposition. Decomposition is a statistical method that breaks a time series down into its components. The three basic steps to decompose a time series using the simple method are 1) estimating the trend, 2) eliminating the trend, and 3) estimating the seasonality. To find the trend, we obtain moving averages covering one season. We then eliminate the trend component from the original time series by calculating Yt minus Tt, where Tt is the trend value. Lastly, to estimate the seasonal component for a given time period, we average the de-trended values for that time period. We then adjust these seasonal indexes to ensure that they add to zero. The remainder component is calculated by subtracting the estimated seasonal and trend-cycle components from the original series.
Let’s consider an example. Suppose we have monthly time series data for three years: 2014, 2015 and 2016. First, calculate the moving average. We consider 13 values for capturing the trend in the yearly data; that is, we use the previous 6 months, the following 6 months, and the current month to calculate the moving average for the current month. In R’s decompose function the two end values of this window get half weight, so the window covers exactly one year. This gives us the trend component. After doing that, we remove the trend component Tt from the original time series Yt. Finally, the seasonal index for July is the average of all the de-trended July values in the data. Because the centred window leaves the first six and last six months of the series without a trend value, here that means July 2014 and July 2015; July 2016 has no de-trended value. Note that this moving average approach is slightly different from what we discussed earlier, as it uses values both before and after a given period.
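The three steps can be sketched by hand in R and checked against decompose. This is a minimal illustration on a synthetic series (the series y and every number in it are made up), assuming the additive model and monthly data that start in January.

```r
# Synthetic monthly series for 2014-2016 (hypothetical data)
set.seed(1)
m <- 1:36
y <- ts(100 + 2 * m + 10 * sin(2 * pi * m / 12) + rnorm(36),
        start = c(2014, 1), frequency = 12)

# Step 1: estimate the trend with a centred moving average over 13 values,
# giving the two end values half weight so the window spans one year
w <- c(0.5, rep(1, 11), 0.5) / 12
trend <- stats::filter(y, w, sides = 2)

# Step 2: eliminate the trend
detrended <- y - trend

# Step 3: average the de-trended values month by month (skipping NAs),
# then shift the indexes so that they add to zero
idx <- tapply(as.numeric(detrended), as.integer(cycle(y)), mean, na.rm = TRUE)
idx <- as.numeric(idx) - mean(idx)

seasonal  <- ts(rep(idx, length.out = length(y)),
                start = start(y), frequency = 12)
remainder <- y - trend - seasonal

# Compare the hand-made seasonal indexes with those from decompose()
print(all.equal(idx, as.numeric(decompose(y)$figure)))
```

The trend has no value for the first six and last six months, which is why the monthly averages in step 3 skip missing values.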
Let’s consider a case study of monthly sales data for three years from 2013 to 2015. The objective of the study is to apply decomposition methods and analyze each component of the time series separately. We have 36 records with year, month, and sales as the variables of the study.
Here is a snapshot of the data. We have three columns representing three variables. Year and month are time variables, whereas sales is our time series of interest.
Time Series Decomposition in R – Simple Method
Let’s import our data using the read.csv function. As discussed in the previous tutorial, we’ll use the ‘ts’ function in R to convert a variable from a data frame to a time series object. We specify the x-axis scale, that is the year and month in our data, through the start and end arguments. frequency=12 tells R that we have monthly data. Once we have converted the sales column to a time series object, we perform a classical seasonal decomposition through moving averages by using the decompose function. We then plot the decomposed data using the plot function in R.
# Time Series Decomposition
salesdata <- read.csv("Sales Data for 3 Years.csv", header = TRUE)
salesseries <- ts(salesdata$Sales, start = c(2013, 1), end = c(2015, 12), frequency = 12)
decomp <- decompose(salesseries)
plot(decomp)
ts() converts a column of a data frame to a simple time series object.
start= and end= arguments specify the x-axis scale (year and month in this case).
frequency=12 tells R that we have monthly data.
decompose() performs classical seasonal decomposition through moving averages.
plot() of a decompose object gives a four-panel plot of the observed series and its trend, seasonal and random components.
R uses the additive time series model by default to decompose the data. To use the multiplicative model, we specify type = "multiplicative" in the decompose function. The trend is not calculated for the first and last few values (six at each end for monthly data). The seasonal component repeats from year to year.
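As a small sketch of the type argument, the example below decomposes R’s built-in AirPassengers series both ways. The dataset is only a stand-in for illustration and is not the sales data used in this tutorial.

```r
# AirPassengers ships with R; it is used here only as an example series
add_fit  <- decompose(AirPassengers)                            # additive (default)
mult_fit <- decompose(AirPassengers, type = "multiplicative")   # multiplicative

# Additive seasonal indexes are adjusted to add to zero
print(sum(add_fit$figure))

# Multiplicative seasonal factors are adjusted to average to one
print(mean(mult_fit$figure))
```

Calling plot(mult_fit) gives the same four-panel view as for the additive fit.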
Here we can see a very nice clean visualization showing our original time series, trend, seasonal and random components.
We can view each component of the time series separately by using the object name and the $ operator. Since the trend has not been estimated for the first few and last few values, we see NA values in the output. These carry over to the random component as well.
# Analysing the decompose() object. Each component can be separately viewed by using the $ operator
decomp$trend
decomp$seasonal
decomp$random
By subtracting the seasonal component from the original time series, we obtain a seasonally adjusted time series, which we can then plot.
# Doing Seasonal Adjustment
seasadj <- salesseries - decomp$seasonal
plot(seasadj)
Time Series Decomposition – Local Regression Method (LOESS)
Let us now discuss the second technique of time series decomposition, the local regression method, abbreviated as LOESS. It is a non-parametric generalization of ordinary least squares regression. The fitting technique does not require a prior specification of the relationship between the dependent and independent variables. Seasonal and trend decomposition using LOESS, abbreviated as STL, works by iterating through the smoothing of the seasonal and trend components. STL is a procedure for regular time series, so the design points of the smoothing operation are equally spaced.
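To illustrate what “no prior specification of the relationship” means, here is a minimal sketch that fits a straight line and a LOESS curve to made-up noisy data. The data, the span value of 0.5 and the helper name rmse are all assumptions chosen for this illustration.

```r
# Hypothetical noisy data with a non-linear underlying shape
set.seed(42)
x <- seq(0, 2 * pi, length.out = 100)
y <- sin(x) + rnorm(100, sd = 0.2)

# Ordinary least squares needs the form of the relationship up front
fit_lm <- lm(y ~ x)

# LOESS fits local regressions and needs no global formula for the curve
fit_lo <- loess(y ~ x, span = 0.5)

# Error of each fit against the true (noise-free) curve
rmse <- function(fit) sqrt(mean((predict(fit) - sin(x))^2))
errors <- c(linear = rmse(fit_lm), loess = rmse(fit_lo))
print(errors)
```

The LOESS fit follows the curve without being told its shape, whereas the straight line cannot.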
Time Series Decomposition in R – LOESS Method
In R, the stl function is used to carry out decomposition by the LOESS method. The s.window argument specifies the seasonal window, using either the character string "periodic" or the span (in lags) of the loess window for seasonal extraction. The t.window argument specifies the trend window for trend extraction, which should be an odd number or left at its default. Here, we have not specified t.window, so R will use the default value. We set s.window to "periodic" and then plot the decomposed time series using the plot function.
# Local Regression Method for Seasonal Decomposition
plot(stl(salesseries, s.window = "periodic"))
stl() is used to carry out decomposition by the loess method.
t.window= specifies trend window. It takes the span (in lags) of the loess window for trend extraction, which should be odd. If NULL, the default, nextodd(ceiling((1.5*period) / (1-(1.5/s.window)))) is taken
s.window= specifies seasonal window. It uses either the character string “periodic” or the span (in lags) of the loess window for seasonal extraction
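As a sketch of how the two window arguments behave, the example below runs stl on R’s built-in nottem series (monthly temperatures, used here only as a stand-in for the sales data). The numeric values 13 and 21 are arbitrary odd spans chosen for illustration.

```r
# Seasonal pattern forced to be identical in every year
fit_periodic <- stl(nottem, s.window = "periodic")

# Seasonal pattern allowed to change over time; trend window set explicitly
fit_windowed <- stl(nottem, s.window = 13, t.window = 21)

seas_p <- as.numeric(fit_periodic$time.series[, "seasonal"])
seas_w <- as.numeric(fit_windowed$time.series[, "seasonal"])

# Largest difference between the first and second year's seasonal values
diff_periodic <- max(abs(seas_p[1:12] - seas_p[13:24]))
diff_windowed <- max(abs(seas_w[1:12] - seas_w[13:24]))
print(c(periodic = diff_periodic, windowed = diff_windowed))
```

With "periodic" the year-to-year difference is zero, while a numeric s.window generally lets the seasonal shape drift.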
The plot shows the trend and seasonality components of the time series. Unlike the simple method, the trend values are estimated for all periods.
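The point about the trend can be checked directly. This minimal sketch again uses the built-in nottem series as an assumed stand-in, and counts the missing trend values from each method.

```r
dec_fit <- decompose(nottem)                     # simple moving average method
stl_fit <- stl(nottem, s.window = "periodic")    # LOESS method

# Components of an stl object live in the time.series matrix
stl_parts <- stl_fit$time.series

na_decompose <- sum(is.na(dec_fit$trend))        # six at each end for monthly data
na_stl       <- sum(is.na(stl_parts[, "trend"])) # none

print(c(decompose = na_decompose, stl = na_stl))

# The three STL components add back up to the original series
rebuilt <- rowSums(stl_parts)
print(all.equal(as.numeric(rebuilt), as.numeric(nottem)))
```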
Let us recap what we learnt. In today’s lesson, we learnt how to decompose a time series using the simple method and the LOESS method. LOESS is used for estimating non-linear relationships. The decompose function carries out simple seasonal decomposition, whereas the stl function carries out the LOESS decomposition.