Clustered Graphite webapps render graphs that are inconsistent between instances

Asked by Suresh Avadhanula

I have 4 instances in a Graphite cluster.
carbon-relay uses the consistent-hashing relay method.

Here is carbon.conf from one of the instances.

The other instances are the same except for the interface IPs.

[cache]
LOCAL_DATA_DIR = /opt/graphite/storage/whisper/
USER=
MAX_CACHE_SIZE = inf
MAX_UPDATES_PER_SECOND = 500
MAX_CREATES_PER_MINUTE = 50
LINE_RECEIVER_INTERFACE = 172.20.17.135
LINE_RECEIVER_PORT = 2003
ENABLE_UDP_LISTENER = False
UDP_RECEIVER_INTERFACE = 0.0.0.0
UDP_RECEIVER_PORT = 2003
PICKLE_RECEIVER_INTERFACE = 172.20.17.135
PICKLE_RECEIVER_PORT = 2004
USE_INSECURE_UNPICKLER = False

# Cache Query
CACHE_QUERY_INTERFACE = 172.20.17.135
CACHE_QUERY_PORT = 7002
USE_FLOW_CONTROL = True
LOG_UPDATES = True
WHISPER_AUTOFLUSH = False

[relay]
LINE_RECEIVER_INTERFACE = 172.20.17.135
LINE_RECEIVER_PORT = 2013
PICKLE_RECEIVER_INTERFACE = 172.20.17.135
PICKLE_RECEIVER_PORT = 2014
RELAY_METHOD = consistent-hashing
REPLICATION_FACTOR = 2
DESTINATIONS = 172.20.17.134:2004,172.20.17.132:2004,172.20.17.133:2004,172.20.17.135:2004
MAX_DATAPOINTS_PER_MESSAGE = 500
MAX_QUEUE_SIZE = 10000
USE_FLOW_CONTROL = True

[aggregator]
LINE_RECEIVER_INTERFACE = 0.0.0.0
LINE_RECEIVER_PORT = 2023
PICKLE_RECEIVER_INTERFACE = 0.0.0.0
PICKLE_RECEIVER_PORT = 2024
DESTINATIONS = 127.0.0.1:2004
REPLICATION_FACTOR = 2
MAX_QUEUE_SIZE = 10000
USE_FLOW_CONTROL = True
MAX_DATAPOINTS_PER_MESSAGE = 500
MAX_AGGREGATION_INTERVALS = 5
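
As a rough illustration of what RELAY_METHOD = consistent-hashing with REPLICATION_FACTOR = 2 implies, here is a simplified hash ring in Python. It is only a sketch, not carbon's actual implementation: the HashRing class, its points_per_node parameter, the hashing details and the metric name are assumptions made for this example, so the nodes it picks will not necessarily match the ones carbon-relay picks. What it shows is that a given metric name always maps to the same two destinations.

```python
import bisect
import hashlib

# Destinations copied from the [relay] section above.
DESTINATIONS = [
    "172.20.17.134:2004",
    "172.20.17.132:2004",
    "172.20.17.133:2004",
    "172.20.17.135:2004",
]


class HashRing:
    """Simplified consistent-hash ring (illustrative, not carbon's code)."""

    def __init__(self, nodes, points_per_node=100):
        # Place several points per node on the ring to spread the load.
        self._ring = sorted(
            (self._position(f"{node}:{i}"), node)
            for node in nodes
            for i in range(points_per_node)
        )

    @staticmethod
    def _position(key):
        # Map a string to an integer position on the ring.
        return int(hashlib.md5(key.encode("utf-8")).hexdigest()[:8], 16)

    def get_nodes(self, metric, count):
        """Return `count` distinct nodes, walking the ring from the metric's position."""
        start = bisect.bisect_left(self._ring, (self._position(metric), ""))
        chosen = []
        for offset in range(len(self._ring)):
            node = self._ring[(start + offset) % len(self._ring)][1]
            if node not in chosen:
                chosen.append(node)
            if len(chosen) == count:
                break
        return chosen


ring = HashRing(DESTINATIONS)
# "test.metric.a" is a hypothetical metric name; 2 mirrors REPLICATION_FACTOR.
print(ring.get_nodes("test.metric.a", 2))
```

Running the last two lines repeatedly prints the same pair of destinations each time, which is the property the relay depends on.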

relay-rules.conf

[default]
default = true
destinations = 172.20.17.134:2004,172.20.17.132:2004,172.20.17.133:2004,172.20.17.135:2004

local_settings.py

TIME_ZONE = 'America/Los_Angeles'
DEBUG = False
MEMCACHE_HOSTS = []
DATA_DIRS = ['/opt/graphite/storage/whisper/']
CLUSTER_SERVERS= [ "172.20.17.134:80","172.20.17.132:80","172.20.17.133:80","172.20.17.135:80" ]
CARBONLINK_HOSTS = [ "172.20.17.134:7002","172.20.17.132:7002","172.20.17.133:7002","172.20.17.135:7002" ]

Metrics are sent to the relay port (2013). I do not use carbon-aggregator. For testing purposes I am sending 3 metrics periodically (every 10 seconds). Because the relay method is consistent hashing, a given metric always ends up on the same carbon-cache instance.
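
For reference, a test sender of this kind could look roughly like the Python sketch below. It assumes the plaintext line protocol ("path value timestamp" followed by a newline) on the relay's LINE_RECEIVER_PORT. The metric names and the helpers format_line, send_metrics and send_periodically are made up for this example and are not the script actually used here.

```python
import socket
import time

RELAY_HOST = "172.20.17.135"  # LINE_RECEIVER_INTERFACE from the [relay] section
RELAY_PORT = 2013             # LINE_RECEIVER_PORT from the [relay] section


def format_line(path, value, timestamp):
    """Build one plaintext-protocol line: '<path> <value> <timestamp>\\n'."""
    return f"{path} {value} {int(timestamp)}\n"


def send_metrics(metrics, host=RELAY_HOST, port=RELAY_PORT):
    """Send a dict of {metric_path: value} to the relay over one TCP connection."""
    now = time.time()
    payload = "".join(format_line(path, value, now) for path, value in metrics.items())
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload.encode("ascii"))


def send_periodically(metrics, interval=10):
    """Send the same set of metrics every `interval` seconds until interrupted."""
    while True:
        send_metrics(metrics)
        time.sleep(interval)
```

Calling send_periodically({"test.metric.a": 1, "test.metric.b": 2, "test.metric.c": 3}) would reproduce the "3 metrics every 10 seconds" test load; those three names are hypothetical.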

When I access the webapp on each instance
(172.20.17.132/dashboard, 172.20.17.133/dashboard, 172.20.17.134/dashboard, 172.20.17.135/dashboard), each one shows different graphs.

For example, one webapp instance shows all three metrics, one shows only two of the three, and one shows none of them.
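
One way to make this comparison less dependent on eyeballing dashboards is to ask every webapp for the same target through the render API with format=json and count the datapoints each one returns. The sketch below assumes the webapps listen on port 80 as in CLUSTER_SERVERS. The helper names and the example target are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Webapp hosts taken from the dashboard URLs above.
WEBAPPS = ["172.20.17.132", "172.20.17.133", "172.20.17.134", "172.20.17.135"]


def render_url(host, target, since="-10min"):
    """Build a render API URL that returns JSON for the given target."""
    query = urllib.parse.urlencode({"target": target, "from": since, "format": "json"})
    return f"http://{host}/render?{query}"


def count_datapoints(series_list):
    """Map each returned series name to its number of non-null datapoints."""
    return {
        series["target"]: sum(1 for value, _ts in series["datapoints"] if value is not None)
        for series in series_list
    }


def fetch_summary(host, target):
    """Query one webapp and summarise what it returned."""
    with urllib.request.urlopen(render_url(host, target), timeout=10) as resp:
        return count_datapoints(json.load(resp))


def compare(target, hosts=WEBAPPS):
    """Print the per-series datapoint counts reported by each webapp."""
    for host in hosts:
        try:
            summary = fetch_summary(host, target)
        except (OSError, ValueError) as exc:
            summary = f"error: {exc}"
        print(host, summary)
```

For instance, compare("test.metric.*") (a hypothetical target) would print one line per webapp, so a host that returns fewer series or fewer datapoints than the others stands out.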

Question information

Language: English
Status: Solved
For: Graphite
Assignee: No assignee
Solved by: Suresh Avadhanula
Suresh Avadhanula (suresh-avadhanula) said :
#1

The problem was that some of the carbon-cache instances still had metric data from previous runs (during setup and testing). The webapp used the response from the first server in the list of cluster servers that returned one, so the graph depended on which instance answered first.

I cleaned up the whisper directory and restarted carbon-cache. Now all webapps show the same result.
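
Before deleting anything, it can help to list which whisper files look like leftovers. The sketch below walks LOCAL_DATA_DIR and reports .wsp files that have not been modified recently. Using the file modification time as a sign of stale data is an assumption, as are the helper names and the 10-minute threshold. It only lists files and never deletes them.

```python
import os
import time

WHISPER_DIR = "/opt/graphite/storage/whisper/"  # LOCAL_DATA_DIR from carbon.conf


def metric_name(root, path):
    """Turn a .wsp file path into the dotted metric name it stores."""
    relative = os.path.relpath(path, root)
    return os.path.splitext(relative)[0].replace(os.sep, ".")


def stale_metrics(root=WHISPER_DIR, max_age_seconds=600, now=None):
    """Return (metric, age_in_seconds) for .wsp files not modified recently."""
    now = time.time() if now is None else now
    stale = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(".wsp"):
                continue
            path = os.path.join(dirpath, name)
            age = now - os.path.getmtime(path)
            if age > max_age_seconds:
                stale.append((metric_name(root, path), int(age)))
    return sorted(stale)


for metric, age in stale_metrics():
    print(f"{metric}: last written {age} seconds ago")
```

Run on each instance, this gives a quick view of metrics that are no longer being updated there and are therefore candidates for cleanup.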