Holt-Winters forecast time range

Asked by Wenna

Hi there,

I'm trying to produce forecast data based on historical data, but I'm a bit confused about the time range of the Holt-Winters forecast functions, even after reviewing the source code.

For example, if the input series runs from -5days until -1days and I call holtWintersForecast, it returns a series that also runs from -5days to -1days. I know that internally it also uses data from the previous week. But from the code, I think the prediction is also based on the current series from -5days to -1days. Does that mean we use the series itself, plus the previous week's data, to predict that same series? It doesn't make sense to predict data whose values we already know!

Please correct me if I'm wrong. If the training data actually ran from one week before the series up to the start time of the series, and produced this output, it would be much more reasonable.

Thanks,
Wenna

Question information

Language: English
Status: Expired
For: Graphite
Assignee: No assignee

Denis Zhdanov (deniszhdanov) said :
#1

Hello Wenna,

Training data is indeed taken from one week before the start time of the series up to that start time. The current master and the upcoming 1.0.2 release have a separate parameter for setting the training interval.

Wenna (daiw) said :
#2

Hello Denis,

Thanks for the response. If the training data ended at the start time of the series, that would be great. But maybe I'm missing something: from the code, I still see the training data including the predicted part of the series. The series in previewList are sent to holtWintersAnalysis. The start time of each series has been moved to one week before the original start time, but the end time remains the same.

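# Copy the request context so the original request is left untouched
newContext = requestContext.copy()
# Only startTime is moved back by previewSeconds; endTime is not changed
newContext['startTime'] = requestContext['startTime'] - timedelta(seconds=previewSeconds)
previewList = evaluateTokens(newContext, requestContext['args'][0])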

The prediction results look very good; they correctly track even slight trends in the real data. So I really want to make sure that testing data are not included in training data. Could you please help me check this, or explain it a little?

Thanks very much!
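
To make the overlap concrete, the window arithmetic in the snippet above can be sketched as follows. This is a minimal sketch, not Graphite code: the function name `fetch_window` and the example timestamps are hypothetical, and the 7-day preview length is assumed from the "one previous week" mentioned in this thread.

```python
from datetime import datetime, timedelta

PREVIEW_SECONDS = 7 * 86400  # assumption: one week of extra history


def fetch_window(start_time, end_time, preview_seconds=PREVIEW_SECONDS):
    """Return the (start, end) window fetched for the analysis.

    Mirrors the quoted snippet: only the start is moved back,
    the end stays where the request put it.
    """
    return start_time - timedelta(seconds=preview_seconds), end_time


# Hypothetical request: from -5 days until -1 day relative to "now".
now = datetime(2017, 8, 10, 12, 0, 0)
requested = (now - timedelta(days=5), now - timedelta(days=1))
fetched = fetch_window(*requested)

# True when the requested window lies entirely inside the fetched window.
overlaps = fetched[0] <= requested[0] and requested[1] <= fetched[1]
print(fetched[0], fetched[1], overlaps)
```

Under these assumptions the fetched window starts 12 days before "now" and ends 1 day before "now", so it fully contains the requested series.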

Wenna (daiw) said :
#3

Also, I have another question. It's indeed good to compare the prediction result with the actual data we have, for model tuning and development, but it actually loses the total point of forecasting, if future forecasting is not supported. And it's actually very easy to extend. Would you consider adding this feature?

For example, run holtWintersForecast on the data from the current week and forecast the next day's data. If plotting future data on the chart is hard, simply outputting the forecast values would also be very useful.

You mentioned that the training interval can be changed in the newest release. But the training interval still lies before the start time, which is not flexible enough for this use.

Thanks in advance for explaining!
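
The kind of future forecast described here (fit on one week, predict the next day) can be sketched with a plain additive Holt-Winters model. This is a minimal sketch under stated assumptions, not Graphite's implementation: the function name `holt_winters_forecast`, the initialisation scheme, the smoothing constants, and the synthetic hourly data are all chosen for illustration.

```python
import math


def holt_winters_forecast(values, season_len, horizon,
                          alpha=0.1, beta=0.0035, gamma=0.1):
    """Additive Holt-Winters: fit on `values`, return `horizon` future points."""
    if len(values) < 2 * season_len:
        raise ValueError("need at least two full seasons of history")

    first = values[:season_len]
    second = values[season_len:2 * season_len]
    # Initial level: mean of the first season.
    level = sum(first) / season_len
    # Initial trend: average per-step change between the first two seasons.
    trend = (sum(second) - sum(first)) / season_len ** 2
    # Initial seasonal offsets: first season relative to its mean.
    seasonals = [v - level for v in first]

    # Fold the remaining history into the state, one point at a time.
    for i in range(season_len, len(values)):
        y = values[i]
        k = i % season_len
        s = seasonals[k]
        prev_level = level
        level = alpha * (y - s) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonals[k] = gamma * (y - level) + (1 - gamma) * s

    # Extrapolate: h steps past the last observed index (n - 1).
    n = len(values)
    return [level + h * trend + seasonals[(n + h - 1) % season_len]
            for h in range(1, horizon + 1)]


# Synthetic history: one week of hourly points with a daily cycle.
history = [10 + 5 * math.sin(2 * math.pi * (i % 24) / 24) for i in range(7 * 24)]
next_day = holt_winters_forecast(history, season_len=24, horizon=24)
print(len(next_day))
```

Here the model only ever sees `history`, so every value in `next_day` is a genuine out-of-sample forecast. Plotting or storing such points inside Graphite is a separate problem that this sketch does not address.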

Denis Zhdanov (deniszhdanov) said :
#4

>So I really want to make sure that testing data are not included in training data.
Sorry, I didn't understand your question correctly. Yes, start time of each series has been changed to one week prior to the start time.
And indeed, end time remains the same, so, your series are included in training set. Why shouldn't it?

>it actually loses the total point of forecasting, if future forecasting is not supported. And it's actually very easy to extend.
Indeed, that's why HW is not really popular in Graphite. Unfortunately, Graphite doesn't support future datapoints, and it's quite hard to implement that.

Wenna (daiw) said :
#5

>Yes, start time of each series has been changed to one week prior to the start time.
>And indeed, end time remains the same, so, your series are included in training set. Why shouldn't it?

Because we are actually trying to predict the series. In Graphite we plot the HW prediction and the data together to compare them. The future prediction is not supported as you said, so the prediction result is indeed within the same period of time as series.

Again an example, we input data from -24hours till now, and call hw on this series. Then in Graphite it would include data from -24hours-1week till now as training data, and output prediction from -24hours to now. That's not prediction because what we want to predict is the input of the prediction.

It would make more sense if the training data ran from -24hours-1week till -24hours. In that case, even though we already know the data from -24hours till now, we would still be forecasting it.

Denis Zhdanov (deniszhdanov) said :
#6

>The future prediction is not supported as you said, so the prediction result is indeed within the same period of time as series.
Yes.

>Again an example, we input data from -24hours till now, and call hw on this series. Then in Graphite it would include data from -24hours-1week till now as training data, and output prediction from -24hours to now.
Yes.

>That's not prediction because what we want to predict is the input of the prediction.
That's a little counter-intuitive, but more data generally makes the prediction more precise (though that's not true in 100% of cases, even for HW). As I said before, Graphite doesn't support future datapoints, so HW is only useful for a kind of anomaly detection, i.e. predicting whether your current (or almost current) data is OK or not. In that case, including current data in the training set makes perfect sense to me.
But I'm not a statistician, and my knowledge of HW is quite limited. If you are sure that shifting the end of the training set would be useful, you can implement shifting of the upper bound of the bootstrap interval, similar to my patch - https://github.com/graphite-project/graphite-web/pull/1977 - e.g. by introducing a new parameter to trigger the new behavior.
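
The anomaly-detection use case mentioned here can be sketched as a one-step-ahead check: each point is compared with a prediction built only from earlier points, and is folded into the model state afterwards. This is a minimal sketch and not Graphite's implementation; the function name `one_step_ahead_anomalies`, the `delta` and `min_band` parameters, the smoothing constants, and the synthetic data are all assumptions made for illustration.

```python
import math


def one_step_ahead_anomalies(values, season_len, delta=3.0, min_band=2.0,
                             alpha=0.1, beta=0.0035, gamma=0.1):
    """Return indices whose value falls outside prediction +/- band.

    The band is delta times a smoothed absolute error per seasonal slot,
    with a floor of min_band (a hypothetical guard against a zero band).
    """
    first = values[:season_len]
    level = sum(first) / season_len
    trend = 0.0
    seasonals = [v - level for v in first]
    deviations = [0.0] * season_len
    anomalies = []

    for i in range(season_len, len(values)):
        y = values[i]
        k = i % season_len
        # Prediction for point i uses state built from points before i only.
        prediction = level + trend + seasonals[k]
        band = max(delta * deviations[k], min_band)
        if abs(y - prediction) > band:
            anomalies.append(i)
        # Only now is the observed value folded into the state.
        prev_level = level
        level = alpha * (y - seasonals[k]) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonals[k] = gamma * (y - level) + (1 - gamma) * seasonals[k]
        deviations[k] = gamma * abs(y - prediction) + (1 - gamma) * deviations[k]
    return anomalies


# Synthetic week of hourly data with a daily cycle and one injected spike.
series = [10 + 5 * math.sin(2 * math.pi * (i % 24) / 24) for i in range(7 * 24)]
series[100] += 10
anomalies = one_step_ahead_anomalies(series, season_len=24)
print(anomalies)
```

In a one-step-ahead scheme like this one, a point lying inside the analysed window does not by itself mean it was used to predict itself. Whether Graphite's code behaves the same way should be checked against its source.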

Wenna (daiw) said :
#7

Thanks very much for your explanation and patience, Denis.
Actually, my use case is to predict future datapoints. Aside from shifting the end of the training set to the start point of the series, I would still need to implement that. Do you think it's possible, or is it too hard within the Graphite framework?
Thanks.

Launchpad Janitor (janitor) said :
#8

This question was expired because it remained in the 'Open' state without activity for the last 15 days.