Can you clarify the retentions setting restrictions in the docs?

Asked by mikelehen on 2012-08-07

The docs (http://graphite.readthedocs.org/en/latest/whisper.html#archives-retention-and-precision) say:

To support accurate aggregation from higher to lower resolution archives, the number of points in a longer retention archive must be divisible by its next lower retention archive. For example, an archive with 1 data points every 60 seconds and retention of 120 points (2 hours worth of data) can have a lower-resolution archive following it with a resolution of 1 data point every 300 seconds for 1200 points, while the same resolution but for only 1000 points would be invalid since 1000 is not evenly divisible by 120.

So this is saying "1m:2h,5m:6000m" is valid, but "1m:2h,5m:5000m" is not valid? I haven't dug into how whisper aggregation works, but this surprises me. It doesn't seem like the number of data points should matter at all; the resolution of the archives is what seems important to me. That is, since 5 minutes is a multiple of 1 minute, whisper should be able to take 5 data points from the first archive and turn them into a single data point in the second archive, without loss of precision. Why does the total number of data points matter?
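For reference, the point counts behind that reading can be worked out with a few lines of Python. This is only a sketch of the arithmetic, not whisper's own parser: the helper names `to_seconds` and `point_counts` are made up for illustration, and only the unit suffixes used in this thread are handled.

```python
# Seconds per unit suffix. Only the suffixes that appear in this thread are
# covered; this table is an assumption of the sketch, not whisper's parser.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "y": 31536000}


def to_seconds(token):
    """Convert a token such as '5m' or '2h' into a number of seconds."""
    return int(token[:-1]) * UNIT_SECONDS[token[-1]]


def point_counts(retentions):
    """Return the number of points held by each archive in a retentions string."""
    counts = []
    for archive in retentions.split(","):
        precision, duration = archive.split(":")
        counts.append(to_seconds(duration) // to_seconds(precision))
    return counts


print(point_counts("1m:2h,5m:6000m"))  # [120, 1200]
print(point_counts("1m:2h,5m:5000m"))  # [120, 1000]
print(1200 % 120, 1000 % 120)          # 0 40
```

So under the rule as the docs state it, the first policy passes (1200 is a multiple of 120) and the second fails (1000 leaves a remainder of 40), even though both use the same 1m and 5m resolutions.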

Questions:
1. Are the docs wrong or am I missing something?
2. If "1m:2h,5m:5000m" is invalid, why doesn't carbon complain if I use it? Is this a bug? (carbon-cache starts successfully and I don't see any errors in console.log)

Suggestion: It sure would be handy to have some example retentions policies to choose from. I feel like this is probably an important / hard-to-change setting, but I don't have a good intuition about what a good policy might look like.

Thanks!
-Michael

Question information

Language: English
Status: Answered
For: Graphite
Assignee: No assignee
Last query: 2012-08-07
Last reply: 2012-08-17

Michael Leinartas (mleinartas) said : #1

Yeah... I'm not sure what I was thinking when I wrote that. The docs are indeed incorrect on that point. The divisibility rule that *does* matter is that the precision (seconds per point) of each lower-precision archive must be evenly divisible by the precision of the higher-precision archive before it. That is, you can't have 1m:1y,3m:2y,10m:3y, because the 3-minute buckets in the second retention can't be aggregated cleanly into the 10-minute buckets of the third (10 is not a multiple of 3).

I'll get the docs corrected to show this properly.
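
The precision rule described in this reply can be checked with a short script. This is a minimal sketch under stated assumptions, not whisper's or carbon's actual validation code: `to_seconds`, `precisions` and `precisions_nest` are hypothetical helpers, only the unit suffixes seen in this thread are supported, and the check covers only the precision rule and nothing else a retentions policy may need to satisfy.

```python
# Seconds per unit suffix. Only the suffixes that appear in this thread are
# covered; this table is an assumption of the sketch, not whisper's parser.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "y": 31536000}


def to_seconds(token):
    """Convert a token such as '10m' or '1y' into a number of seconds."""
    return int(token[:-1]) * UNIT_SECONDS[token[-1]]


def precisions(retentions):
    """Return the seconds-per-point of each archive, in the order listed."""
    return [to_seconds(archive.split(":")[0]) for archive in retentions.split(",")]


def precisions_nest(retentions):
    """True if every archive's precision is a whole multiple of the one before it."""
    secs = precisions(retentions)
    return all(coarse % fine == 0 for fine, coarse in zip(secs, secs[1:]))


print(precisions_nest("1m:1y,3m:2y,10m:3y"))  # False: 600 s is not a multiple of 180 s
print(precisions_nest("1m:2h,5m:5000m"))      # True: 300 s is a multiple of 60 s
```

By this check, the "1m:2h,5m:5000m" policy from the question is fine, which matches the correction above: what matters is that 5 minutes is a multiple of 1 minute, not how many points each archive holds.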

Jeff Blaine (jblaine-kickflop) said : #2

"Suggestion: It sure would be handy to have some example retentions policies to choose from. I feel like this is probably an important / hard-to-change setting, but I don't have a good intuition about what a good policy might look like."

Seconded. I've been scratching my head at a few screw-ups lately of my own.

I also second that carbon-cache should at least warn about improper retention policies.
