Data hidden when viewed across retention boundaries?

Asked by Lance Reed

My question is about Whisper retention and render behavior: is data hidden when viewed across retention boundaries?

I am using the default retention policy.

[everything-1_min_30days-5_min_1_year-15_mins-2years]
priority = 100
pattern = .*
retentions = 60:43200,300:105120,900:70080
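
As a quick sanity check on the retention line above, each `seconds_per_point:points` pair can be multiplied out to see how long an archive covers. This is only an arithmetic sketch; the helper name `archive_span_days` is made up for illustration.

```python
# Retention pairs copied from the config above: (seconds_per_point, points).
RETENTIONS = [(60, 43200), (300, 105120), (900, 70080)]


def archive_span_days(seconds_per_point, points):
    """Return how many days a single archive covers."""
    return seconds_per_point * points / 86400


for seconds_per_point, points in RETENTIONS:
    days = archive_span_days(seconds_per_point, points)
    print(f"{seconds_per_point:>4}s per point x {points:>6} points = {days:g} days")
```

The three archives work out to 30, 365 and 730 days, which matches the section name (1 min for 30 days, 5 min for 1 year, 15 min for 2 years).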

I recently started recording deployment events, which occur 1-4 times per day (rendered with drawAsInfinite). I noticed that if I extend the time range beyond the first part of the retention policy (1 min for 30 days), the events that fall within that first part no longer render. I can, however, render data that has the exact same policy but is effectively continuous (e.g. HTTP response rates). Note that the value saved for each deployment event is 0.1, because I found that using something like 1 squashed my Y-axis when graphing alongside values smaller than 1, such as HTTP response rates.

I also used whisper-fetch.py to inspect the data in the .wsp files and found that using the --from= option to query further back in time changed the resolution of the data returned.

Example:

Last 24 hours gives 1 min resolution:
whisper-fetch.py --pretty ./blah_test.wsp | tail -3
Mon Aug 15 14:37:00 2011 None
Mon Aug 15 14:38:00 2011 None
Mon Aug 15 14:39:00 2011 None

More than 24 hours back gives 5 min resolution:
whisper-fetch.py --from=1310000000 --pretty ./blah_test.wsp | tail -3
Mon Aug 15 14:25:00 2011 None
Mon Aug 15 14:30:00 2011 None
Mon Aug 15 14:35:00 2011 None

It appears, then, that when a query is made against a .wsp file, the data is returned at the resolution of the "oldest" part of the requested range.
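
The observation above is consistent with how a fetch chooses one archive for the whole request: it uses the finest archive whose retention still reaches back to the start time. The sketch below models that selection under stated assumptions; `select_archive` is a hypothetical name, not a Whisper function, and the "now" timestamp assumes the sample output was in UTC.

```python
# Retention pairs from the config: (seconds_per_point, points), finest first.
RETENTIONS = [(60, 43200), (300, 105120), (900, 70080)]


def select_archive(retentions, now, from_time):
    """Return seconds-per-point of the finest archive that covers from_time."""
    age = now - from_time
    for seconds_per_point, points in retentions:
        if seconds_per_point * points >= age:
            return seconds_per_point
    # Request reaches back further than every archive: use the coarsest one.
    return retentions[-1][0]


# Assumed "now": Mon Aug 15 14:39:00 2011, interpreted as UTC for illustration.
NOW = 1313419140

print(select_archive(RETENTIONS, NOW, NOW - 86400))  # default 24-hour window
print(select_archive(RETENTIONS, NOW, 1310000000))   # --from=1310000000
```

With these inputs the first call selects the 60-second archive and the second selects the 300-second archive, because 1310000000 lies more than 30 days before the assumed "now".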

I am wondering what other behavior might be going on here that is causing data to be hidden when compared to "continuous" data?

Any details, thoughts, suggestions would be welcome.
Should I not be using a small value like "0.1"? Is this getting rolled up / discarded in some way?
Should I use a retention policy that is static across a longer time span, say 60 min for 1 year?

It would be good to keep at least 1 min resolution further back, so I can compare the impact of releases on other parts of the system.

Thanks in advance for any ideas.

Question information

Language: English
Status: Answered
For: Graphite
Assignee: No assignee

Nicholas Leskiw (nleskiw) said :
#1

1.) By default, Graphite averages your data as it rolls it up into the coarser archives, and a query that crosses a retention boundary is served from the coarser archive. So if you cross a boundary, you'll get an average.

2.) Can you show the data from whisper-fetch during which you expect to see a build event? I'd like to see what value is actually being stored. (Preferably an event that was sent, then rolled up by Graphite.)

3.) You can try sending a very large value for a build event (say 10000) then use the &yMax= parameter to shrink the y-axis so you can once again see the very small HTTP Response times. Example: &yMax=0.1
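
Suggestion 3 could look something like the following when building a render URL. This is a sketch only: the host and both metric names are hypothetical placeholders, and `render_url` is a helper invented for the example; `drawAsInfinite` and `yMax` are the function and parameter mentioned in this thread.

```python
from urllib.parse import urlencode


def render_url(base, targets, **params):
    """Build a render URL with one target= entry per series."""
    query = [("target", t) for t in targets] + sorted(params.items())
    return f"{base}/render?{urlencode(query)}"


url = render_url(
    "http://graphite.example.com",       # hypothetical host
    [
        "drawAsInfinite(deploys.myapp)",  # hypothetical deployment-event metric
        "http.response_time",             # hypothetical continuous metric
    ],
    yMax="0.1",                           # cap the y-axis so small values stay visible
)
print(url)
```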

Aman Gupta (tmm1) said :
#2

Rollups into older retentions only happen when 50% or more of the values are present. You can either use a middleware like carbon-aggregator or statsd to write 0s regularly when there are no deployments, or use whisper-resize to change the xFilesFactor on the whisper file to something other than the default 0.5.
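
The 50% rule can be sketched in a few lines. This is a simplified model of the check described above, not Whisper's actual code; the function name `rollup` and the sample values are illustrative, and averaging is assumed as the aggregation method.

```python
def rollup(values, x_files_factor=0.5):
    """Average the finer-grained points that feed one coarser slot.

    Returns None (nothing is written) when the fraction of known points
    is below x_files_factor, or when no points are known at all.
    """
    known = [v for v in values if v is not None]
    if not known:
        return None
    if len(known) / len(values) < x_files_factor:
        return None
    return sum(known) / len(known)


# Five 1-minute points feeding one 5-minute slot, with a single deployment event.
sparse = [None, None, 0.1, None, None]

print(rollup(sparse))                      # default 0.5: only 1 of 5 points known
print(rollup(sparse, x_files_factor=0.0))  # any known point is enough
```

With the default factor the sparse slot is dropped, which is why a once-a-day event disappears from the 5-minute archive while a continuous metric survives. Lowering the factor keeps the event.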

Lance Reed (reed-r-lance) said :
#3

Thanks for these suggestions. I will confirm which options work once I have a chance, but now the behavior makes sense.

Thanks!
