Summing up data points in the first retention, not just in lower-precision retentions

Asked by Elad Rosenheim

Hi,
I'm trying to count events (logins, etc.) - data that should be summed up in buckets, not averaged.

It seems that aggregationMethod only takes effect when data is moved to lower-precision buckets; the buckets of the first (highest-precision) retention always seem to hold averaged data if I send multiple stats during one bucket's time interval - or is the data actually being replaced?

If, for example, my schema is set to 10s:10m,1m:1y and aggregationMethod is sum, the 10s buckets would still always contain averaged data.
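For reference, that schema would correspond to a storage-schemas.conf entry roughly like the following; the section name and pattern below are just placeholders for illustration, not from my actual config:

    [login_counts]
    pattern = ^stats\.logins\.
    retentions = 10s:10m,1m:1y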

This behavior is in line with the docs, but IMHO it's a somewhat inconsistent state, as some buckets hold averaged data and others hold summed data.
Also, in practical terms I'm not sure what the best way is to handle such counts.

Please advise...
Thanks,
Elad

Question information

Language: English
Status: Solved
For: Graphite
Assignee: No assignee
Solved by: Nicholas Leskiw
Best Nicholas Leskiw (nleskiw) said:
#1

You will need to aggregate the data before sending it to Graphite.

E.g., if you have a 1 min. bucket, send one value with the number of logins in the past minute.
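As a rough illustration, a minimal sketch of pre-aggregating a count locally and sending one value per interval over Carbon's plaintext protocol could look like this; the host, port, and metric name are placeholders, not anything specific to your setup:

    import socket
    import time

    # Placeholders for illustration only.
    CARBON_HOST = "127.0.0.1"       # carbon-cache line receiver
    CARBON_PORT = 2003              # default plaintext port
    METRIC = "stats.myapp.logins"   # hypothetical metric name
    INTERVAL = 60                   # seconds; one value per 1-minute bucket

    login_count = 0  # in a real app, increment this on every login event

    def flush(count):
        # Carbon's plaintext protocol: "metric value timestamp\n"
        line = "%s %d %d\n" % (METRIC, count, int(time.time()))
        with socket.create_connection((CARBON_HOST, CARBON_PORT)) as sock:
            sock.sendall(line.encode("ascii"))

    while True:
        time.sleep(INTERVAL)
        flush(login_count)
        login_count = 0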

Nicholas Leskiw (nleskiw) said:
#2

You will also want to make sure storage-aggregation.conf is configured to sum and not avg.
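For example, an entry roughly like the following (the section name and pattern are placeholders) tells Whisper to sum values when rolling them up into the lower-precision archives; xFilesFactor = 0 means a rollup is still written even if only some of the higher-precision slots were filled:

    [login_counts]
    pattern = ^stats\.logins\.
    xFilesFactor = 0
    aggregationMethod = sum

Keep in mind this is applied when a Whisper file is first created; files that already exist keep their original aggregation method unless you change them with the whisper command-line utilities.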

Nicholas Leskiw (nleskiw) said:
#3

"You will need..." in first message...damn autocorrect...

Elad Rosenheim (elad-rosenheim) said:
#4

Thanks Nicholas Leskiw, that solved my question.

Elad Rosenheim (elad-rosenheim) said:
#5

OK, so here's my understanding; I hope it will benefit others as well:

As I'm using StatsD to aggregate counts, I need to make sure that its flushInterval is >= the highest-resolution retention I've defined, so it never flushes twice into the same bucket.
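For instance, with the 10s highest-precision retention above, a StatsD config along these lines (the host is a placeholder) keeps the flush interval at the bucket size; note that flushInterval is given in milliseconds:

    {
      graphiteHost: "127.0.0.1",
      graphitePort: 2003,
      flushInterval: 10000
    }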

With multiple client machines, each running its own local StatsD, I should prefix metrics with the hostname (so two StatsD instances don't end up reporting into the same bucket), and I may use carbon-aggregator for efficient retrieval of 'all hosts' metrics.
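If you go the carbon-aggregator route, an aggregation-rules.conf rule roughly like this one (the metric names are placeholders) would sum the per-host counts into a single 'all hosts' series every 60 seconds:

    stats.all.logins (60) = sum stats.*.logins

The per-host StatsD instances would then send to the aggregator's line receiver (port 2023 by default) instead of straight to carbon-cache, and the aggregator forwards the metrics on to carbon-cache.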

Elad Rosenheim (elad-rosenheim) said:
#6

And we've all seen far worse autocorrect fiascos, so thanks and no worries ;-)