Latest datapoints missing in graph

Asked by Zdeněk Pižl

Greetings,

I am seeing strange behaviour in my Graphite infrastructure: the latest datapoints are missing from graphs.

My infrastructure consists of:
a) 1x dedicated graphite-web host serving as the single access point for any dashboard tool
b) 2x relay hosts running carbon-relay only
c) 2x storage hosts running only carbon-cache and graphite-web

The intention of this deployment design is as follows:

- each node sends its datapoints to one of the defined relay hosts,
- the relay hosts know about all (currently 2) storage nodes and distribute datapoints among them using consistent hashing,
- each storage node, configured with 'MAX_UPDATES_PER_SECOND = 50' and 'MAX_CACHE_SIZE = inf', saves the datapoints to its whisper database.
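
For readers unfamiliar with the relay step, here is a minimal sketch of how consistent hashing can map metric names to storage nodes. It is a toy ring written for illustration, not carbon-relay's actual implementation; the class name `HashRing`, the replica count and the host names are all assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring (illustrative, not carbon-relay's code)."""

    def __init__(self, nodes, replicas=100):
        # Each node is placed on the ring many times to even out the load.
        self._ring = []
        for node in nodes:
            for i in range(replicas):
                self._ring.append((self._position(f"{node}:{i}"), node))
        self._ring.sort()
        self._positions = [pos for pos, _ in self._ring]

    @staticmethod
    def _position(key):
        # Map a string to an integer position on the ring.
        return int(hashlib.md5(key.encode("utf-8")).hexdigest()[:8], 16)

    def get_node(self, metric):
        # Pick the first ring entry at or after the metric's position,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_left(self._positions, self._position(metric))
        return self._ring[idx % len(self._ring)][1]


# Hypothetical storage hosts and metric names.
ring = HashRing(["storage-1.example.com", "storage-2.example.com"])
for metric in ("servers.web01.load", "servers.web02.load", "servers.db01.load"):
    print(metric, "->", ring.get_node(metric))
```

The point relevant to this question is that a given metric name always lands on the same storage node, so each metric's newest datapoints sit in the cache of exactly one carbon-cache.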

The issue is that in almost every graph the latest datapoints are missing for some metrics. After some time and multiple reloads of the graph (probably once the missing datapoints have finally been saved to whisper) those datapoints appear, but then the newest ones are missing again, and so on ...

My dashboarding tool (Grafana) queries the graphite-web node (a), which is aware of all carbon-cache (storage) nodes defined in the CLUSTER_SERVERS directive of local_settings.py.
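
As a rough illustration of the settings involved, a local_settings.py for this kind of layout might look like the fragment below. The host names, ports and paths are placeholders assumed for the example, not values taken from this deployment.

```python
# --- On the front-end graphite-web node (a) ---
# Storage nodes whose graphite-web instances are queried for data.
# Host names and ports are hypothetical.
CLUSTER_SERVERS = [
    "storage-1.example.com:80",
    "storage-2.example.com:80",
]

# --- On each storage node (c) ---
# Where the local carbon-cache writes its whisper files; assumed here to be
# the default install location. It must match the carbon-cache data directory.
WHISPER_DIR = "/opt/graphite/storage/whisper"

# Cache query endpoint of the local carbon-cache, used to fetch datapoints
# that have not been flushed to disk yet (7002 is the usual default port).
CARBONLINK_HOSTS = ["127.0.0.1:7002"]
```

The two groups of settings live in different files on different hosts; they are shown together only for comparison.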

What could be the reason for the missing data? I would expect all metrics to have datapoints up to roughly the same point in time, rather than ending at times spread across, e.g., 5 minutes.

Thank you for your ideas and help, regards .zp.

Question information

Language:
English
Status:
Expired
For:
Graphite
Assignee:
No assignee
Nitin (nitinpasari) said :
#1

I think that's because those points haven't been written to the whisper files yet. I had a similar situation where I couldn't see the most recent points because they were still in memory. Look at the metric carbon.agents.[agent-name].cache.size; it may be huge if the datapoints are lagging by that much.

Now, if I remember correctly, I looked at the code and the documentation when this happened to me, and it seems the web app is supposed to query the cache as well when fetching datapoints, but evidently it wasn't doing so for me, or maybe I had my configuration wrong somewhere.

I suspect you are having the same issue. Could you check the cache size of the carbon agents and see whether it makes sense?
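
One way to run that check from a script is sketched below, using the render API's JSON output. The base URL is a placeholder, and the helper names `render_url`, `latest_values` and `fetch_cache_sizes` are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def render_url(base_url, target, window="-10min"):
    """Build a render API URL that returns JSON for the given target."""
    query = urlencode({"target": target, "from": window, "format": "json"})
    return f"{base_url.rstrip('/')}/render?{query}"


def latest_values(series_list):
    """Return {target: most recent non-null value} from render JSON output."""
    result = {}
    for series in series_list:
        # Each datapoint is a [value, timestamp] pair; value may be null.
        values = [v for v, _ts in series["datapoints"] if v is not None]
        result[series["target"]] = values[-1] if values else None
    return result


def fetch_cache_sizes(base_url):
    """Query the cache size reported by every carbon agent."""
    url = render_url(base_url, "carbon.agents.*.cache.size")
    with urlopen(url) as response:
        return latest_values(json.load(response))
```

Calling `fetch_cache_sizes("http://graphite.example.com")` against a reachable graphite-web would return one entry per carbon agent; a value that keeps growing suggests the caches are not being flushed to disk fast enough.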

Launchpad Janitor (janitor) said :
#2

This question was expired because it remained in the 'Open' state without activity for the last 15 days.

Zdeněk Pižl (zdenek-pizl) said :
#3

Well, this whole issue was caused by a misconfiguration of graphite-web. There was a wrong prefix in WHISPER_DIR, so every query to the cache missed the datapoints there. Therefore only the datapoints still held in the cache were missing ...
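
To make the symptom concrete, here is a simplified sketch of overlaying cached datapoints on a series read from disk. It is not graphite-web's actual merge code; the function name `merge_cached` and the sample numbers are invented for illustration.

```python
def merge_cached(start, step, values, cached_points):
    """Overlay (timestamp, value) pairs from the cache onto a fetched series."""
    merged = list(values)
    for timestamp, value in cached_points:
        # Work out which time slot of the series this cached point falls in.
        slot = (timestamp - start) // step
        if 0 <= slot < len(merged):
            merged[slot] = value
    return merged


# Whisper holds only the first three of five 60-second slots on disk.
on_disk = [1.0, 2.0, 3.0, None, None]
cached = [(1180, 4.0), (1240, 5.0)]

print(merge_cached(1000, 60, on_disk, cached))  # cache lookup succeeds
print(merge_cached(1000, 60, on_disk, []))      # cache lookup finds nothing
```

When the cache lookup returns nothing, the tail of the series stays empty until carbon-cache flushes those points to whisper, which matches the behaviour described in the question.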