Sunday, 18 August 2024

How to Decompose Time Series Data into Trend and Seasonality

 Time series decomposition involves thinking of a series as a combination of level, trend, seasonality, and noise components.

Decomposition provides a useful abstract model for thinking about time series generally and for better understanding problems during time series analysis and forecasting.

In this tutorial, you will discover time series decomposition and how to automatically split a time series into its components with Python.

After completing this tutorial, you will know:

  • The time series decomposition method of analysis and how it can help with forecasting.
  • How to automatically decompose time series data in Python.
  • How to decompose additive and multiplicative time series problems and plot the results.

    Time Series Components

    A useful abstraction for selecting forecasting methods is to break a time series down into systematic and unsystematic components.

    • Systematic: Components of the time series that have consistency or recurrence and can be described and modeled.
    • Non-Systematic: Components of the time series that cannot be directly modeled.

    A given time series is thought to consist of three systematic components (level, trend, and seasonality) and one non-systematic component called noise.

    These components are defined as follows:

    • Level: The average value in the series.
    • Trend: The increasing or decreasing value in the series.
    • Seasonality: The repeating short-term cycle in the series.
    • Noise: The random variation in the series.

      Combining Time Series Components

      A series is thought to be an aggregate or combination of these four components.

      All series have a level and noise. The trend and seasonality components are optional.

      It is helpful to think of the components as combining either additively or multiplicatively.

      Additive Model

      An additive model suggests that the components are added together as follows:

      y(t) = Level + Trend + Seasonality + Noise

      An additive model is linear where changes over time are consistently made by the same amount.

      A linear trend is a straight line.

      A linear seasonality has the same frequency (width of cycles) and amplitude (height of cycles).

      Multiplicative Model

      A multiplicative model suggests that the components are multiplied together as follows:

      y(t) = Level * Trend * Seasonality * Noise

      A multiplicative model is nonlinear, such as quadratic or exponential. Changes increase or decrease over time.

      A nonlinear trend is a curved line.

      A nonlinear seasonality has an increasing or decreasing frequency and/or amplitude over time.
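To make the two models concrete, the sketch below builds one additive and one multiplicative series from hand-made components using NumPy. The component values (a level of 100, a 12-step sine cycle, the noise scales) are illustrative assumptions and do not come from any dataset in this tutorial.

```python
import numpy as np

# Illustrative assumptions: 48 time steps and a seasonal cycle of 12 steps.
t = np.arange(1, 49)
rng = np.random.default_rng(1)

# Additive components, all expressed in the units of the series.
level = 100.0
trend = 2.0 * t                                  # same increase at every step
seasonal = 10.0 * np.sin(2 * np.pi * t / 12)     # fixed-height cycle
noise = rng.normal(0.0, 1.0, size=t.size)
additive = level + trend + seasonal + noise

# Multiplicative components, expressed as factors around 1.0.
trend_factor = 1.0 + 0.02 * t
seasonal_factor = 1.0 + 0.1 * np.sin(2 * np.pi * t / 12)
noise_factor = 1.0 + rng.normal(0.0, 0.01, size=t.size)
multiplicative = level * trend_factor * seasonal_factor * noise_factor

print(additive[:5])
print(multiplicative[:5])
```

In the additive series the seasonal swing keeps the same height throughout, while in the multiplicative series it grows as the trend factor grows.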

      Decomposition as a Tool

      This is a useful abstraction.

      Decomposition is primarily used for time series analysis, and as an analysis tool it can be used to inform forecasting models on your problem.

      It provides a structured way of thinking about a time series forecasting problem, both generally in terms of modeling complexity and specifically in terms of how to best capture each of these components in a given model.

      Each of these components is something you may need to think about and address during data preparation, model selection, and model tuning. You may address a component explicitly, for example by modeling the trend and subtracting it from your data, or implicitly, by providing enough history for an algorithm to model a trend if one exists.

      You may or may not be able to cleanly or perfectly break down your specific time series as an additive or multiplicative model.

      Real-world problems are messy and noisy. There may be additive and multiplicative components. There may be an increasing trend followed by a decreasing trend. There may be non-repeating cycles mixed in with the repeating seasonality components.

      Nevertheless, these abstract models provide a simple framework that you can use to analyze your data and explore ways to think about and forecast your problem.

      Automatic Time Series Decomposition

      There are methods to automatically decompose a time series.

      The statsmodels library provides an implementation of the naive, or classical, decomposition method in a function called seasonal_decompose(). It requires that you specify whether the model is additive or multiplicative.

      Either setting will produce a result, so you must be critical when interpreting it. Reviewing a plot of the time series and some summary statistics is often a good way to get an idea of whether your time series problem looks additive or multiplicative.

      The seasonal_decompose() function returns a result object. The result object contains arrays to access four pieces of data from the decomposition.

      For example, the snippet below shows how to decompose a series into trend, seasonal, and residual components assuming an additive model.

      The result object provides access to the trend and seasonal series as arrays. It also provides access to the residuals, which are the time series after the trend and seasonal components are removed. Finally, the original or observed data is also stored.

      These four time series can be plotted directly from the result object by calling the plot() function. For example:

      Let’s look at some examples.

      Additive Decomposition

      We can create a time series composed of a linearly increasing trend from 1 to 99 plus some random noise, and decompose it as an additive model.

      Because the time series was contrived and was provided as an array of numbers, we must specify the frequency of the observations (the period=1 argument). If a Pandas Series with a regularly spaced DatetimeIndex is provided, the period can be inferred and this argument is not required.

      Running the example creates the series, performs the decomposition, and plots the 4 resulting series.

      We can see that the entire series was taken as the trend component and that there was no seasonality.

      Additive Model Decomposition Plot

      We can also see that the residual plot shows zero. This is a good example where the naive, or classical, decomposition was not able to separate the noise that we added from the linear trend.

      The naive decomposition method is a simple one, and there are more advanced decompositions available, such as Seasonal and Trend decomposition using Loess (STL).

      Caution and healthy skepticism are needed when using automated decomposition methods.

      Multiplicative Decomposition

      We can contrive a quadratic time series as a square of the time step from 1 to 99, and then decompose it assuming a multiplicative model.

      Running the example, we can see that, as in the additive case, the trend is easily extracted and wholly characterizes the time series.

      Multiplicative Model Decomposition Plot

      Exponential changes can be made linear by data transforms. In this case, a quadratic trend can be made linear by taking the square root. An exponential growth in seasonality may be made linear by taking the natural logarithm.
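The two transforms can be checked numerically. In the sketch below, the quadratic series matches the contrived example, while the trend and seasonal factors in the second half are illustrative assumptions.

```python
import numpy as np

t = np.arange(1, 100, dtype=float)

# Square root: a quadratic trend becomes a straight line in t.
quadratic = t ** 2
linearized = np.sqrt(quadratic)
print(np.diff(linearized)[:5])   # constant step size means a linear trend

# Natural log: a product of components becomes a sum of components.
trend = 1.0 + 0.05 * t                                   # assumed trend factor
seasonal = 1.0 + 0.2 * np.sin(2 * np.pi * t / 12)        # assumed seasonal factor
product = trend * seasonal
logged = np.log(product)
print(np.allclose(logged, np.log(trend) + np.log(seasonal)))
```

After the log transform, a series that looked multiplicative can be decomposed with an additive model.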

      Again, it is important to treat decomposition as a potentially useful analysis tool, but to consider exploring the many different ways it could be applied for your problem, such as on data after it has been transformed or on residual model errors.

      Let’s look at a real world dataset.

      Airline Passengers Dataset

      The Airline Passengers dataset describes the total number of airline passengers over a period of time.

      The units are a count of the number of airline passengers in thousands. There are 144 monthly observations from 1949 to 1960.

      Download the dataset to your current working directory with the filename “airline-passengers.csv”.

      First, let’s graph the raw observations.
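A sketch of loading and plotting the file is given below. It assumes the first column of the CSV holds the month and the second holds the passenger count. The load_airline() helper is a hypothetical name introduced here, and the guard simply skips the plot if the file has not been downloaded yet.

```python
from pathlib import Path

import pandas as pd
from matplotlib import pyplot


def load_airline(path="airline-passengers.csv"):
    """Load the CSV, using the first column as a date index (assumed layout)."""
    frame = pd.read_csv(path, header=0, index_col=0, parse_dates=True)
    # Return the single data column as a Series.
    return frame.iloc[:, 0]


if Path("airline-passengers.csv").exists():
    series = load_airline()
    series.plot()
    pyplot.show()
```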

      The line plot suggests that there may be a linear trend, but it is hard to be sure by eye. There is also seasonality, and the amplitude (height) of the cycles appears to be increasing, which suggests that it is multiplicative.

      Plot of the Airline Passengers Dataset

      We will assume a multiplicative model.

      The example below decomposes the airline passengers dataset as a multiplicative model.

      Running the example plots the observed, trend, seasonal, and residual time series.

      We can see that the trend and seasonality information extracted from the series does seem reasonable. The residuals are also interesting, showing periods of high variability in the early and later years of the series.

      Multiplicative Decomposition of Airline Passenger Dataset

      Summary

      In this tutorial, you discovered time series decomposition and how to decompose time series data with Python.

      Specifically, you learned:

      • The structure of decomposing time series into level, trend, seasonality, and noise.
      • How to automatically decompose a time series dataset with Python.
      • How to decompose an additive or multiplicative model and plot the results.

      Do you have any questions about time series decomposition, or about this tutorial?
      Ask your questions in the comments below and I will do my best to answer.
