carbon-aggregator.py not aggregating.

Asked by vimau on 2012-06-07

I have written an aggregation rule to sum the incoming metrics from different hosts, but it is not working. I am quite a newbie, so please forgive me if the question is stupid :-) Any help would be really great.

Do I need to have carbon-relay running to do aggregation? Who forwards the metrics from carbon-cache to carbon-aggregator? Who should be connecting to ports 2023/2024?

[vimau conf]$ cat aggregation-rules.conf
stats.gauges.qps.widget.all.nginx.access (60) = sum stats.gauges.qps.widget.*.nginx.access
[vimau conf]$

My metric name is correct, because I can see the whisper files at the expected paths (formed by replacing the '.'s with '/'):
../storage/whisper/stats/gauges/qps/widget/domU-x.x.x.x_compute-1_internal/nginx/access.wsp
../storage/whisper/stats/gauges/qps/widget/ip-x.x.x.x_internal/nginx/access.wsp

[vimau conf]$ grep -v ^# carbon.conf | grep -v '^$'
[cache]
GRAPHITE_ROOT = /opt/graphite
STORAGE_DIR = /opt/graphite/storage/
LOCAL_DATA_DIR = /opt/graphite/storage/whisper/
WHITELISTS_DIR = /opt/graphite/storage/lists/
CONF_DIR = /opt/graphite/storage/conf/
LOG_DIR = /opt/graphite/storage/log/
PID_DIR = /opt/graphite/storage/
LOCAL_DATA_DIR = /opt/graphite/storage/whisper/
USER =
MAX_CACHE_SIZE = inf
MAX_UPDATES_PER_SECOND = 500
MAX_CREATES_PER_MINUTE = 50
LINE_RECEIVER_INTERFACE = 0.0.0.0
LINE_RECEIVER_PORT = 2003
ENABLE_UDP_LISTENER = False
UDP_RECEIVER_INTERFACE = 0.0.0.0
UDP_RECEIVER_PORT = 2003
PICKLE_RECEIVER_INTERFACE = 0.0.0.0
PICKLE_RECEIVER_PORT = 2004
USE_INSECURE_UNPICKLER = False
CACHE_QUERY_INTERFACE = 0.0.0.0
CACHE_QUERY_PORT = 7002
USE_FLOW_CONTROL = True
LOG_UPDATES = False
WHISPER_AUTOFLUSH = False
[relay]
LINE_RECEIVER_INTERFACE = 0.0.0.0
LINE_RECEIVER_PORT = 2013
PICKLE_RECEIVER_INTERFACE = 0.0.0.0
PICKLE_RECEIVER_PORT = 2014
RELAY_METHOD = rules
REPLICATION_FACTOR = 1
DESTINATIONS = 127.0.0.1:2004
MAX_DATAPOINTS_PER_MESSAGE = 500
MAX_QUEUE_SIZE = 10000
USE_FLOW_CONTROL = True
[aggregator]
LINE_RECEIVER_INTERFACE = 0.0.0.0
LINE_RECEIVER_PORT = 2023
PICKLE_RECEIVER_INTERFACE = 0.0.0.0
PICKLE_RECEIVER_PORT = 2024
DESTINATIONS = 127.0.0.1:2004
REPLICATION_FACTOR = 1
MAX_QUEUE_SIZE = 10000
USE_FLOW_CONTROL = True
MAX_DATAPOINTS_PER_MESSAGE = 500
MAX_AGGREGATION_INTERVALS = 5
[vimau conf]$

Question information

Language: English
Status: Solved
For: Graphite
Assignee: No assignee
Solved by: Michael Leinartas
Solved: 2012-06-14
Last query: 2012-06-14
Last reply: 2012-06-14

Michael Leinartas (mleinartas) said : #1

You'll want any metrics that need to be aggregated to be sent directly to carbon-aggregator rather than carbon-cache. Carbon-aggregator will then relay to your carbon-cache instance. FYI, any metrics that don't match aggregation rules will also pass through to carbon-cache, so it's safe to switch everything to go through carbon-aggregator.

Your config looks ok to me so if you start pointing your nginx metrics to Graphite on port 2023 you should see the aggregated metric appear.
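
To check the aggregator path by hand, a minimal sketch like the following could send one datapoint to port 2023 using carbon's plaintext line format (`<metric path> <value> <timestamp>`). The helper names, the default host and the `host1` path segment are illustrative assumptions, not part of the setup above.

```python
import socket
import time


def format_plaintext(metric, value, timestamp=None):
    """Build one plaintext line: '<metric path> <value> <timestamp>\\n'."""
    if timestamp is None:
        timestamp = int(time.time())
    return "%s %s %d\n" % (metric, value, timestamp)


def send_metric(metric, value, host="127.0.0.1", port=2023, timestamp=None):
    """Send a single datapoint to the aggregator's line receiver.

    Port 2023 is LINE_RECEIVER_PORT from the [aggregator] section above;
    the host default is an assumption for a local test.
    """
    line = format_plaintext(metric, value, timestamp)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))
    return line


# Example call (hypothetical host segment 'host1'):
# send_metric("stats.gauges.qps.widget.host1.nginx.access", 42)
```

If datapoints sent this way for two or more hosts show up individually but the `all` metric does not, the rule or the aggregator itself is the place to look rather than the sender.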

vimau (vigith-m) said : #2

How can I configure it so that I get both the aggregated and non-aggregated metrics?

i.e., I need all the individual metrics for each host and the total sum too.

Michael Leinartas (mleinartas) said : #3

Carbon-aggregator will always pass through the non-aggregated metric as well, as long as it doesn't match the aggregated metric name (in which case the aggregated metric takes precedence), so this doesn't need to be specially configured - you can just point your stream to the aggregator port.

vimau (vigith-m) said : #4

Hi Michael,

Sorry for the confusion, but my question is: if I want both the aggregated and non-aggregated metrics of the same name, how can I configure that?

eg,
a.b.c.SERVER1.d
a.b.c.SERVER2.d
...
a.b.c.SERVERn.d

are the incoming metrics.

Now I want the total of the metrics of type 'd' from all the servers AND ALSO the individual metrics.

i.e., my aggregation rule should be "a.b.c.all.d (60) = sum a.b.c.*.d" to get the metrics aggregated. Since this pattern matches all the above metrics, aggregation will happen, but the individual metrics will never pass through (as you explained in comment #3). My requirement is that I need both the individual metrics and the aggregated metric of the same name.

One workaround I am using now is to skip carbon-aggregator, pass all the individual metrics as-is to carbon-cache, and have 'graphite' do a sum(a.b.c.*.d) and plot it. Is this the right approach?

Sorry for the trouble :-)

-vm
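
For reference, the Graphite-side workaround from #4 amounts to asking the render API to sum the series at query time. A small sketch of building such a request URL follows; the base URL `http://graphite.example.com` is a placeholder assumption, and `sumSeries` is used as the Graphite function for summing a wildcard.

```python
from urllib.parse import urlencode


def render_url(base_url, target, from_time="-1h", fmt="json"):
    """Build a Graphite render API URL for a single target expression."""
    query = urlencode({"target": target, "from": from_time, "format": fmt})
    return base_url.rstrip("/") + "/render?" + query


# Sum the per-server series at query time instead of in carbon-aggregator.
url = render_url("http://graphite.example.com", "sumSeries(a.b.c.*.d)")
print(url)
```

The trade-off is that the sum is recomputed on every render and nothing is stored under a.b.c.all.d, whereas the aggregator writes the total as its own metric.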

Best Michael Leinartas (mleinartas) said : #5

No, in that example the unaggregated metrics *will* pass through. The one case where the unaggregated metrics won't pass through is when they would conflict with the aggregated metrics. Generally this happens when you have rules where both sides are identical:

a.b.c.SERVER1.d (60) = sum a.b.c.SERVER1.d

As long as your aggregated metrics are all composed of multiple metrics and have a separate name (as you have above with 'all' in the aggregated metric), all of the unaggregated metrics will pass through.
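
The behaviour described here can be illustrated with a simplified simulation. This is not carbon's actual code: it assumes '*' matches exactly one dotted path segment, handles only the `sum` method, and processes a single aggregation interval. The function names are made up for the illustration.

```python
import re


def parse_rule(line):
    """Parse 'output (frequency) = method input_pattern'."""
    left, right = line.split("=", 1)
    output, freq = left.split()
    method, pattern = right.split()
    return output, int(freq.strip("()")), method, pattern


def pattern_to_regex(pattern):
    """Turn a dotted pattern into a regex; '*' matches one path segment."""
    parts = ["[^.]+" if p == "*" else re.escape(p) for p in pattern.split(".")]
    return re.compile("^" + r"\.".join(parts) + "$")


def process(rule_line, datapoints):
    """Return (passed_through, aggregated) for one aggregation interval."""
    output, _frequency, method, pattern = parse_rule(rule_line)
    if method != "sum":
        raise ValueError("this sketch only implements 'sum'")
    regex = pattern_to_regex(pattern)
    matched = [value for metric, value in datapoints if regex.match(metric)]
    aggregated = {output: sum(matched)} if matched else {}
    # Inputs are forwarded unchanged unless they share the aggregated name.
    passed_through = [(m, v) for m, v in datapoints if m not in aggregated]
    return passed_through, aggregated


points = [("a.b.c.SERVER1.d", 3), ("a.b.c.SERVER2.d", 4)]
print(process("a.b.c.all.d (60) = sum a.b.c.*.d", points))
print(process("a.b.c.SERVER1.d (60) = sum a.b.c.SERVER1.d", points[:1]))
```

With the 'all' rule, both per-server datapoints are forwarded alongside the total; with the rule whose two sides are identical, only the aggregated value remains.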

vimau (vigith-m) said : #6

Thanks Michael Leinartas, that solved my question.

I have the same problem you faced, vimau. My aggregated metric is not the sum of the metrics received from all servers, but is equal to the value from just one of them. Can you please tell me how you solved the above problem?

Lance Norskog (lance-norskog) said : #8

Hi-

This is not mentioned in the docs: "Carbon-aggregator will always pass through the non-aggregated metric as well as long as it doesn't match the aggregated metric name (in which case the aggregated metric takes precedence)".

I had a similar problem. Finally realized:

1. You need to point your statsd to the carbon-aggregator.py port on 2023, NOT carbon-cache.py on 2003.

     graphitePort: 2023 /*this is carbon-aggregator, carbon-cache is 2003*/
     , graphiteHost: "172.16.78.131"

2. You need to restart statsd. Although the log says it dynamically re-read the config file, it does not seem to actually use the new graphitePort:
   21 Jun 10:17:32 - reading config file: ./mystatsdconf.js <=== do not trust that this uses the new graphitePort