Visit with Dr. James G. Scott

Since Jasper is a student, we took advantage of that and visited one of the statistics professors at UT Austin, Dr. James G. Scott, who generously offered to meet with us during his office hours. To give a brief intro: he is an assistant professor in the Department of Information, Risk, and Operations Management at the McCombs School of Business, and his research focuses on the core issues of model choice, multiple testing, variable selection, and latent-feature extraction.

He suggested that we start with a simple linear function with dummy variables. The dummy variables would capture different signals in the data – such as day of the week, month of the year, weather, sports games, festivals, and other local events that may affect the revenue performance of restaurants. Once we built a simple table of coefficients for each dummy variable, the calculation would be very easy and quick, unlike Holt-Winters, which requires heavy computational cycles.

Efficient way to find the best fitting Alpha, Beta and Gamma?

So now that I have the initial values for the three components – level, trend and seasonality – I am onto the most challenging part of the Holt-Winters algorithm: calculating the $latex \alpha$, $latex \beta$, and $latex \gamma$ that yield the least RMSE. With my limited knowledge of programming (if anyone who happens to stumble upon this blog knows a better way, please kindly suggest it in the comment section), I decided to exhaustively search a grid for the best $latex \alpha$, $latex \beta$, and $latex \gamma$. From the get-go, this does not seem very efficient: even if I do not want any granularity below the nearest one percent, I will have to iterate 1,030,301 times ($latex 101^3$). But if speed is not an issue (though it was for us), this does the job.

***You can gain more efficiency by using multi-threading and the ladder and jar algorithm, but it still is not in a practical range of speed for a web service. Seth Hoenig, our super genius developer, converted my Python code into C, and that's when it became almost instantaneous.

In order to calculate $latex \alpha$, $latex \beta$, and $latex \gamma$, I first have to set up algorithms that calculate subsequent values of level, trend and seasonality and smoothed values.

  1. Level: $latex s_t = \alpha \frac{x_t}{c_{t-L}} + (1 - \alpha)(s_{t-1} + b_{t-1})$
  2. Trend: $latex b_t = \beta (s_t - s_{t-1}) + (1 - \beta) b_{t-1}$
  3. Seasonality: $latex c_t = \gamma \frac{x_t}{s_t} + (1 - \gamma) c_{t-L}$
  4. Smoothed values: $latex F_{t+m} = (s_t + m b_t) c_t$

Function to calculate level, trend, seasonality and smoothed values

def calculate_smoothed_values(data, cycle, level_init_value, trend_init_value, season_list, alpha, beta, gamma):
 
    smoothed_value_list = []
    previous_level = level_init_value
    previous_trend = trend_init_value
 
    # Start from the initial seasonal indices; each newly computed index is
    # appended, so new_season_list[i] is always the index from one full
    # cycle back (c_{t-L}).
    new_season_list = list(season_list)
 
    for i in xrange(len(data) - cycle):
        # Level: s_t = alpha * x_t / c_{t-L} + (1 - alpha) * (s_{t-1} + b_{t-1})
        level       = (alpha * data[i + cycle] / new_season_list[i]) + ((1 - alpha) * (previous_level + previous_trend))
 
        # Trend: b_t = beta * (s_t - s_{t-1}) + (1 - beta) * b_{t-1}
        trend       = (beta * (level - previous_level)) + ((1 - beta) * previous_trend)
 
        # Seasonality: c_t = gamma * x_t / s_t + (1 - gamma) * c_{t-L}
        seasonality = (gamma * data[i + cycle] / level) + ((1 - gamma) * new_season_list[i])
 
        # Smoothed value: F = (s_t + b_t) * c_t
        smoothed_value = (level + trend) * seasonality
 
        previous_level = level
        previous_trend = trend
 
        new_season_list.append(seasonality)
        smoothed_value_list.append(smoothed_value)
 
    return smoothed_value_list

Then I call the above function inside a brute-force triple loop over different alpha, beta and gamma combinations and calculate the RMSE for each case. (The rmse function used below is a simple helper that calculates the root mean squared error between two series.)

def least_rmse_combo(data, cycle, level_init_value, trend_init_value, season_init_list):
    '''
    Try different combinations of alpha, beta and gamma and return the combination whose smoothed values of Level, Trend and Seasonality have the least RMSE against the sample data
    '''
    least_rmse = float("inf")
 
    lrc = [0, 0, 0]
    for a in xrange(101):
        alpha = a/100.0
        for b in xrange(101):
            beta = b / 100.0
            for g in xrange(101):
                gamma = g / 100.0
                smoothed_values = calculate_smoothed_values(data, cycle, level_init_value, trend_init_value, season_init_list, alpha, beta, gamma)
                rmse_value = rmse(smoothed_values, data[cycle:])
 
                if rmse_value < least_rmse:
                    lrc = [alpha, beta, gamma]
                    least_rmse = rmse_value
    return lrc
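On the question of a better way than brute force: a bounded numerical optimizer such as SciPy's `minimize` with the `L-BFGS-B` method can search the unit cube directly instead of visiting all $latex 101^3$ grid points. A hedged Python 3 sketch, assuming SciPy is available; the bowl-shaped `objective` here is a stand-in for the real one, which would run `calculate_smoothed_values` and return the RMSE:

```python
from scipy.optimize import minimize

# Stand-in objective: in the real code this would compute the RMSE of the
# smoothed values for a given (alpha, beta, gamma). This toy function has
# its minimum at (0.3, 0.1, 0.2) so we can see the optimizer find it.
def objective(params):
    alpha, beta, gamma = params
    return (alpha - 0.3) ** 2 + (beta - 0.1) ** 2 + (gamma - 0.2) ** 2

result = minimize(
    objective,
    x0=[0.5, 0.5, 0.5],          # starting guess
    method="L-BFGS-B",
    bounds=[(0.0, 1.0)] * 3,     # keep each parameter inside [0, 1]
)
alpha, beta, gamma = result.x
```

One caveat: the real RMSE surface is not guaranteed to be convex, so restarting from a few different initial guesses is a sensible safeguard.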

Exponential smoothing. (n.d.). In Wikipedia. Retrieved Mar 30, 2013, from
http://en.wikipedia.org/wiki/Exponential_smoothing

Future ready!

Now all we have left is the last step in this process – calculating the forecast values. The formula is rather simple.

$latex F_{t+m} = (s_t + m b_t) c_{t - L + ((m - 1) \bmod L)}$

In Python,

def calculate_forecast_value(last_level, last_trend, last_season_cycle, forecast_period):
    '''
    Given the last values of level and trend and the last full cycle of seasonality indices, calculates forecast data for the given period
    '''
    forecast_list = []
    cycle = len(last_season_cycle)
    for i in xrange(forecast_period):
        # Forecasting m = i + 1 steps ahead:
        # F_{t+m} = (s_t + m * b_t) * c_{t - L + ((m - 1) mod L)}
        m = i + 1
        forecast = (last_level + (m * last_trend)) * last_season_cycle[i % cycle]
        forecast_list.append(forecast)
 
    return forecast_list
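A quick worked check of the formula with made-up state values ($latex s_t = 100$, $latex b_t = 2$, and a four-period seasonal cycle):

```python
last_level, last_trend = 100.0, 2.0
last_season_cycle = [0.9, 1.1, 1.0, 1.0]
cycle = len(last_season_cycle)

# F_{t+m} = (s_t + m * b_t) * c[(m - 1) % L] for m = 1..8;
# e.g. the first forecast is (100 + 2) * 0.9 = 91.8
forecasts = [(last_level + m * last_trend) * last_season_cycle[(m - 1) % cycle]
             for m in range(1, 9)]
```

Note how the trend keeps accumulating (m grows without bound) while the seasonal index simply wraps around the last observed cycle.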


The first impression matters!

It is very important to set the initial values of level, trend and seasonality correctly, as they play a critical role in calculating all the subsequent values. There are multiple ways to set them; I used the most common ones.

The initial value of level is usually set equal to the first value of the sample data: $latex s_0 = x_0$.

def get_initial_level(data):
    level_init_value    = data[0]
    return level_init_value

So far so good! But the initial value of trend is a bit more involved: $latex b_0 = \frac{1}{L} \left( \frac{x_{L+1} - x_1}{L} + \frac{x_{L+2} - x_2}{L} + \cdots + \frac{x_{L+L} - x_L}{L} \right)$, with the cycle length denoted as L. If your data is monthly, a complete cycle would be 12 months and thus L would be 12. The trend can be thought of as the slope of the equation; a positive trend means gradual growth over time and a negative one means decline.

def get_initial_trend(data, cycle):
    # b_0 averages the per-period change across one full cycle, so each
    # difference is divided by cycle and so is the sum of the terms --
    # hence the division by cycle squared
    trend_init_value    = sum([data[i + cycle] - data[i] for i in xrange(cycle)]) / float(cycle ** 2)
    return trend_init_value

Last but not least, the mathematical expression for the initial seasonality indices is as follows: $latex c_i = \frac{1}{N} \sum_{j=1}^N \frac{x_{L(j-1)+i}}{A_j}$, where $latex A_j$ is the average value of x in the jth cycle of the data and N is the number of complete cycles (of length L) in the data set. If you have 36 months of data, your L value would be 12 and your N value 3.

def get_initial_seasonality(data, cycle):
    n_cycles = len(data) // cycle
    # A_j: the average value of the data in each complete cycle
    cycle_averages = [sum(data[j * cycle:(j + 1) * cycle]) / float(cycle) for j in xrange(n_cycles)]
    # c_i: the ratio x / A_j at position i, averaged over all complete cycles
    season_init_list = [sum(data[j * cycle + i] / cycle_averages[j] for j in xrange(n_cycles)) / float(n_cycles) for i in xrange(cycle)]
    return season_init_list
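A quick worked example of the formula in Python 3, with made-up quarterly figures (L = 4, N = 2):

```python
data = [10., 20., 30., 40., 12., 24., 36., 48.]  # two complete cycles, L = 4
cycle = 4
n_cycles = len(data) // cycle                    # N = 2

# A_j: the average of each complete cycle -> [25.0, 30.0]
cycle_averages = [sum(data[j * cycle:(j + 1) * cycle]) / cycle
                  for j in range(n_cycles)]

# c_i: the ratio x / A_j at position i, averaged over the N cycles.
# Here both cycles have the same shape, so c = [0.4, 0.8, 1.2, 1.6].
season_init = [sum(data[j * cycle + i] / cycle_averages[j]
                   for j in range(n_cycles)) / n_cycles
               for i in range(cycle)]
```

As a sanity check, the seasonal indices of one full cycle should average to 1 (here they sum to 4), since they are relative weights around the cycle average.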


How accurate can it be?

As sad as it sounds, I am the main data scientist of our brainchild – Overlotus.com. I dug up the CFA Level II books to refresh my memory on simple and multiple regression, the coefficient of determination, the significance of regression coefficients, time series, and all sorts of other good stuff.

My main goal is to “predict the next 2 weeks of daily revenue for a restaurant based on historical data, weather, sports game, holiday, and other relevant external factors.”

Our friend who runs a local Japanese restaurant generously gave us the past year's revenue data, which we will use to develop the prediction model. Though one data set cannot validate the accuracy of our algorithm, it is a start. The first formula we decided to experiment with is Holt-Winters triple exponential smoothing. By the way, there are two basic types of time-series forecasting techniques, autoregressive and moving average, and the mixture of the two is ARMA (Autoregressive Moving Average). Holt-Winters takes into account level, trend and seasonality and is thus called "triple exponential smoothing." Exponential smoothing is a type of moving average that assigns exponentially decreasing weights to older observations, so the most recent data counts the most.
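To make the idea of exponentially decaying weights concrete, here is a small Python 3 sketch of simple (single) exponential smoothing, the one-component building block that Holt-Winters extends; the series is invented:

```python
def simple_exponential_smoothing(data, alpha):
    # Each smoothed value mixes the newest observation with the previous
    # smoothed value, so an observation k steps back carries a weight
    # proportional to (1 - alpha) ** k.
    smoothed = [data[0]]
    for x in data[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

series = [10., 12., 11., 13., 30., 12., 11.]
smooth = simple_exponential_smoothing(series, alpha=0.3)
```

Notice how the one-off spike of 30 is damped in the smoothed output; adding a trend equation and a seasonality equation on top of this recursion is exactly what makes Holt-Winters "triple."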

There are mainly four steps involved in forecasting with the Holt-Winters method.

  1. Initialize level, trend and seasonality
  2. Find $latex \alpha$, $latex \beta$, and $latex \gamma$ values that minimize the RMSE (Root Mean Square Error) between smoothed values and the sample data
  3. Calculate subsequent values of level, trend and seasonality using $latex \alpha$, $latex \beta$, and $latex \gamma$ above
  4. Forecast beyond sample data

I am going to go through each step in an individual post and show how I converted the complex mathematical formulae into Python code.
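To preview how the steps fit together, here is a hedged, self-contained Python 3 sketch of the whole pipeline with fixed smoothing parameters (step 2, the parameter search, is left out for brevity; the function name, parameters and synthetic data are my own illustrations, not from the posts):

```python
def holt_winters_forecast(data, cycle, alpha, beta, gamma, horizon):
    # Step 1: initialize level, trend and seasonal indices.
    n_cycles = len(data) // cycle
    level = float(data[0])
    trend = sum(data[i + cycle] - data[i] for i in range(cycle)) / float(cycle ** 2)
    cycle_avgs = [sum(data[j * cycle:(j + 1) * cycle]) / float(cycle)
                  for j in range(n_cycles)]
    seasonals = [sum(data[j * cycle + i] / cycle_avgs[j]
                     for j in range(n_cycles)) / n_cycles
                 for i in range(cycle)]

    # Step 3: sweep through the sample, updating level, trend and
    # seasonality; seasonals[i - cycle] is always the index c_{t-L}.
    for i in range(cycle, len(data)):
        last_level = level
        level = alpha * data[i] / seasonals[i - cycle] + (1 - alpha) * (level + trend)
        trend = beta * (level - last_level) + (1 - beta) * trend
        seasonals.append(gamma * data[i] / level + (1 - gamma) * seasonals[i - cycle])

    # Step 4: forecast beyond the sample using the last full seasonal cycle.
    last_cycle = seasonals[-cycle:]
    return [(level + (m + 1) * trend) * last_cycle[m % cycle]
            for m in range(horizon)]

# Synthetic series: constant level 10 with a repeating 4-period pattern.
pattern = [0.8, 1.2, 1.0, 1.0]
data = [10.0 * pattern[i % 4] for i in range(20)]
forecast = holt_winters_forecast(data, cycle=4, alpha=0.5, beta=0.5,
                                 gamma=0.5, horizon=4)
```

On this clean synthetic series the forecasts land close to the true seasonal pattern around level 10; real restaurant revenue is far noisier, which is exactly why the parameter search in step 2 matters.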