Added consolidate() function to webapp/graphite/render/functions.py

Bug #822560 reported by Dani
This bug affects 1 person
Affects: Graphite
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned
Milestone: (not set)

Bug Description

As per question #166935 I've added a function called consolidate that works like summarize but instead of summing datapoints it averages them.

If you do a lot of rendering on the browser side, consolidated data should help improve browser performance.

Tags: webapp
Aman Gupta (tmm1) wrote :

"Consolidate" is a very generic term; it does not mean sum or average. I would rather see this new function be called average.

Rather than duplicating all of summarize, it might make more sense to add a new optional parameter to the summarize function to change what function is used.

Dani (dsolsona) wrote :

So you're suggesting something like this:

&target=summarize(my.metric, "5min", "avg")

&target=summarize(my.metric, "5min") (defaults to sum)

Or rename consolidate to average.
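
For illustration, the two target forms above could be assembled into a render request like this. This is only a sketch: the host name and the helper `build_render_url` are hypothetical, and the `"avg"` argument assumes a Graphite build in which summarize() accepts the optional third parameter discussed in this thread.

```python
from urllib.parse import urlencode

# Hypothetical Graphite host; replace with your own installation.
GRAPHITE_URL = "http://graphite.example.com/render"


def build_render_url(targets, base_url=GRAPHITE_URL, fmt="json"):
    """Build a render URL with one 'target' parameter per expression.

    urlencode() takes care of escaping the quotes, commas and
    parentheses inside each target expression.
    """
    params = [("target", t) for t in targets] + [("format", fmt)]
    return base_url + "?" + urlencode(params)


url = build_render_url([
    'summarize(my.metric, "5min", "avg")',  # mean of each 5-minute bucket
    'summarize(my.metric, "5min")',         # default: sum of each bucket
])
print(url)
```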

Aman Gupta (tmm1) wrote :

This patch on trunk adds the new flag and some basic docs:

diff --git a/webapp/graphite/render/functions.py b/webapp/graphite/render/functions.py
index fb979a3..76bda6a 100644
--- a/webapp/graphite/render/functions.py
+++ b/webapp/graphite/render/functions.py
@@ -1196,16 +1196,24 @@ def exclude(requestContext, seriesList, pattern):
   return [s for s in seriesList if not regex.search(s.name)]

-def summarize(requestContext, seriesList, intervalString):
+def summarize(requestContext, seriesList, intervalString, func='sum'):
   """
   Summarize the data into interval buckets of a certain size.

+  By default, the contents of each interval bucket are summed together. This is
+  useful for counters where each increment represents a discrete event and
+  retrieving a "per X" value requires summing all the events in that interval.
+
+  Specifying 'avg' instead will return the mean for each bucket, which can be more
+  useful when the value is a gauge that represents a certain value in time.
+
   Example:

   .. code-block:: none

-    &target=summarize(counter.errors, "1hour") # errors per hour
+    &target=summarize(counter.errors, "1hour") # total errors per hour
     &target=summarize(nonNegativeDerivative(gauge.num_users), "1week") # new users per week
+    &target=summarize(queue.size, "1hour", "avg") # average queue size per hour
   """
   results = []
   delta = parseTimeOffset(intervalString)
@@ -1235,7 +1243,10 @@ def summarize(requestContext, seriesList, intervalString):
       bucket = buckets.get(bucketInterval, [])

       if bucket:
-        newValues.append( sum(bucket) )
+        if func == 'avg':
+          newValues.append( float(sum(bucket)) / float(len(bucket)) )
+        else:
+          newValues.append( sum(bucket) )
       else:
         newValues.append( None )
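
To see the bucketing behaviour in isolation, here is a minimal standalone sketch in Python 3. It is not the Graphite implementation: `summarize_points` and its `(timestamp, value)` input format are assumptions made for illustration, and only the sum/avg switch mirrors the patch above.

```python
from collections import defaultdict


def summarize_points(points, interval, func="sum"):
    """Group (timestamp, value) pairs into fixed-size buckets.

    Hypothetical helper, not part of Graphite. Returns a list of
    (bucket_start, aggregate) pairs covering every interval between
    the first and last timestamp; empty buckets yield None.
    """
    if not points:
        return []

    buckets = defaultdict(list)
    for timestamp, value in points:
        if value is None:
            continue  # missing datapoints do not contribute to a bucket
        buckets[timestamp - (timestamp % interval)].append(value)

    first = min(ts for ts, _ in points)
    last = max(ts for ts, _ in points)
    start = first - (first % interval)

    result = []
    for bucket_start in range(start, last + 1, interval):
        bucket = buckets.get(bucket_start, [])
        if not bucket:
            result.append((bucket_start, None))
        elif func == "avg":
            result.append((bucket_start, sum(bucket) / len(bucket)))
        else:
            result.append((bucket_start, sum(bucket)))
    return result


samples = [(0, 1), (60, 3), (120, None), (300, 10), (900, 4)]
print(summarize_points(samples, 300))         # sums per 5-minute bucket
print(summarize_points(samples, 300, "avg"))  # means per 5-minute bucket
```

Keeping one function with a selector avoids duplicating the bucketing logic, which is the point raised earlier in the thread.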

Aman Gupta (tmm1) wrote :

Sorry for the triple post.

Note that my patch requires graphite trunk, where I just made another fix to summarize(): http://bazaar.launchpad.net/~graphite-dev/graphite/main/revision/510

Your copied consolidate function would require the same fixes.

Dani (dsolsona) wrote :

Thanks, that looks much cleaner.

chrismd (chrismd) wrote :

summarize() in trunk now has all these changes plus a few more.

Changed in graphite:
status: New → Fix Committed
Michael Leinartas (mleinartas) wrote :

Released in 0.9.9

Changed in graphite:
status: Fix Committed → Fix Released