What would cause this broken graph?

Asked by Corry Haines

We seem to have an odd issue in our graphite instance where a specific set of graphs are being rendered very incorrectly.

If we ask for a normal graph, it renders a blank graph (that is correctly scaled).

If we ask for the graph with the lines stacked, we get the following: http://i.imgur.com/oQAMj.png

The raw output is here (if it helps): https://gist.github.com/c8e232ae55d408bacad1

We are running off of development releases at the moment, so this could be something in 0.9.9-preX.

I cannot find anything in any logs, so I am primarily looking for some help as to what could cause the strange rendering in that graph. Specific help would be appreciated, though.

Thanks!

Question information

Language: English
Status: Answered
For: Graphite
Assignee: No assignee

Nicholas Leskiw (nleskiw) said :
#1

There are some NaN values in the data. This should not be happening; they should be None if there was no data. I'll poke around and see if I can figure out why that's not being handled properly.

Are you sending pickle or plaintext data to your carbon-cache.py listener?

Corry Haines (tabletcorry) said :
#2

Hmm, the data is coming in over plaintext via a proxy on top of Collectd.

Could this be the proxy's fault, or should Graphite be rejecting the bad data? I wrote a fair bit of the proxy, so I can tweak it if need be.

Corry Haines (tabletcorry) said :
#3

In case it's useful, my copy of Graphite is coming off of the following commit: https://github.com/tabletcorry/graphite/commit/3cfd72be941921e98eccab22be8496cd190ee050

It's a week or two old.

Nicholas Leskiw (nleskiw) said :
#4

I think that carbon-cache.py should reject any data that is not an int or float. If you send

hosts.hostname.ping.thing-foobar nan 1317717700\n

it should either become None or be dropped.
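
As a rough illustration of that kind of check (this is not carbon's actual code; `parse_plaintext_line` is a hypothetical helper name), a receiver or a proxy in front of it could parse each plaintext line and drop anything whose value is not a finite number:

```python
import math


def parse_plaintext_line(line):
    """Parse 'path value timestamp'; return a tuple, or None to drop the line.

    Hypothetical helper for illustration only, not part of carbon.
    """
    parts = line.strip().split()
    if len(parts) != 3:
        return None
    path, raw_value, raw_timestamp = parts
    try:
        value = float(raw_value)
        timestamp = int(float(raw_timestamp))
    except (ValueError, OverflowError):
        # Not numeric at all, or a NaN/inf timestamp.
        return None
    # float("nan") and float("inf") parse successfully, so check explicitly.
    if math.isnan(value) or math.isinf(value):
        return None
    return path, value, timestamp
```

With this sketch, the example line above would be dropped, while a line carrying an ordinary numeric value would pass through unchanged.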

chrismd (chrismd) said :
#5

I just fixed the bug with NaNs only getting dropped by the pickle receiver (now they will get dropped no matter how metrics are sent). Can you test and verify whether this problem only occurs when there's a NaN in the data?

Nicholas Leskiw (nleskiw) said :
#6

I have this data parsed out and ready for testing. I'll upgrade to trunk and test.

Nicholas Leskiw (nleskiw) said :
#7

So it's mostly fixed.
Any NEW NaN values will be dropped in trunk.
However, the old data already stored in your .wsp file is still there.
To fix your existing graphs, you'll need to either a) delete the file, or b) find the NaN values, calculate the epoch time for each of them, and send a None or 0 value to Graphite to overwrite each NaN.
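
A minimal sketch of option b), under some assumptions: the raw render output is a line of the form `target,start,end,step|v1,v2,...`, the target is a plain metric path rather than a function expression, and the timestamps line up with the archive's slots. The helper names `nan_timestamps` and `overwrite_lines` are made up for this example.

```python
import math


def nan_timestamps(raw_line):
    """Return (target, [epoch, ...]) for every NaN in one raw render line.

    Assumes the 'target,start,end,step|v1,v2,...' layout.
    """
    header, _, body = raw_line.strip().partition("|")
    # Split from the right so a target containing commas stays intact.
    target, start, _end, step = header.rsplit(",", 3)
    start, step = int(start), int(step)
    stamps = []
    for i, token in enumerate(body.split(",")):
        try:
            value = float(token)
        except ValueError:
            continue  # e.g. "None" for a missing point
        if math.isnan(value):
            stamps.append(start + i * step)
    return target, stamps


def overwrite_lines(target, stamps, replacement=0):
    """Build plaintext-protocol lines that overwrite each NaN slot."""
    return ["%s %s %d\n" % (target, replacement, ts) for ts in stamps]
```

The resulting lines could then be written to the carbon-cache.py plaintext listener (port 2003 in a default install), the same way the original data was sent.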
