Creating average for 60 second network byte counter

Asked by Matt Snow

I've been playing around with the different functions in Graphite but still can't quite get what I'm looking for, so I'm here to ask for help.

I have a script running on a number of Linux hosts to track different stats in /proc/, including network bytes TX/RX per interface. The samples are taken every 60 seconds and then sent to Graphite. I use derivative() to show the delta between stored samples (the storage schema is set for 60 seconds as well), but this gives me the total bytes transferred in that 60-second period. I would like to divide the metric by 60 (seconds) and show the result as a crude bytes-per-second average.

Does this make sense, and is it possible, or do I need to add code to my script to do this?

Thanks!

Question information

Language: English
Status: Solved
For: Graphite
Assignee: No assignee
Solved by: Matt Snow
Revision history for this message
Jason Dixon (jason-dixongroup) said :
#1

You can use the scale() function to divide. I also recommend nonNegativeDerivative() for network statistics. From one of mine:

scale(nonNegativeDerivative(snmp.192_168_150_8.IF-MIB::ifInOctets.*),0.125)
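For the 60-second byte counters in the question, the same pattern would be (the metric path here is just a placeholder, and 0.0166667 is 1/60, for bytes per second):

scale(nonNegativeDerivative(servers.host1.net.eth0.rx_bytes),0.0166667)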

Revision history for this message
Kevin Blackham (thekev.) said :
#2

This will give you bytes per 1/8th of the storage interval. It should be scale(8.0) for bits per interval. If you want bits per second, divide that by the interval.

For example, the scale factor for bits per second at 300-second resolution is 0.02666667.
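(That is, 8 bits per byte divided by the 300-second interval: 8 / 300 ≈ 0.0266667.)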

Revision history for this message
Matt Snow (matt-mattsn0w) said :
#3

Thanks, Jason. I use nonNegativeDerivative(), but I'm fairly certain the byte counters in the Linux kernel do not roll over. There are actually a lot of gripes about this from Solaris admins, since you can reset the network interface counters in Solaris.

I guess I hadn't really thought much about how to get the numbers looking the way I want them.

The storage retention for my network data is 60:259200,300:103680,3600:8760. Each metric sample is collected every 60 seconds and sent to Graphite. So if 1 / 60 is 0.0166, then that should be my scale, correct?

Revision history for this message
Kevin Blackham (thekev.) said :
#4

1/60 gives you bytes per second if your dataset range is within the 259200-second bucket. You want to multiply by 8 for bits: scale by 0.1333333.

When I get some time, I'll be adding a patch for auto-determining the scale factor based on the storage interval used for the source values. For example, if you go beyond the 259200-second range, you are now dipping into the 300-second bucket and the scale is 0.0266667, and there is no way to know that automatically other than noticing your graph is totally out of whack. :) (Or fetch with &rawData=1; the interval is in the leading columns of the CSV.)
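For illustration, each line of a rawData response carries the step in its header fields, something like this (metric path and values are made up):

servers.host1.net.eth0.rx_bytes,1303886400,1303890000,60|1520000.0,1498000.0,None,1510000.0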

Revision history for this message
Nicholas Leskiw (nleskiw) said :
#5

You could also do that math in your script. Have it read /proc once a second, calculate a bytes/sec average once a minute, and send that to a second Graphite metric like Server.iface.eth0.bytesPerSecond.

-Nick
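
A minimal sketch of that approach, assuming the Graphite host, port, and metric path shown here (all three are placeholders):

#!/usr/bin/env python
# Sketch: sample eth0's RX byte counter from /proc/net/dev, average
# over a minute, and send bytes/sec to Graphite's plaintext listener.
# GRAPHITE_HOST and METRIC below are placeholders, not real endpoints.
import socket
import time

GRAPHITE_HOST = 'graphite.example.com'
GRAPHITE_PORT = 2003  # Carbon's default plaintext port
METRIC = 'Server.iface.eth0.bytesPerSecond'

def read_rx_bytes(iface='eth0'):
    with open('/proc/net/dev') as f:
        for line in f:
            if line.strip().startswith(iface + ':'):
                # First field after 'eth0:' is the RX byte counter.
                return int(line.split(':', 1)[1].split()[0])
    raise ValueError('interface %s not found' % iface)

while True:
    first = read_rx_bytes()
    time.sleep(60)  # for a monotonic counter, averaging 60 one-second
                    # deltas collapses to (last - first) / 60 anyway
    rate = (read_rx_bytes() - first) / 60.0  # crude bytes/sec average
    sock = socket.create_connection((GRAPHITE_HOST, GRAPHITE_PORT))
    sock.sendall(('%s %f %d\n' % (METRIC, rate, time.time())).encode('ascii'))
    sock.close()

With the rate precomputed like this, the dashboard target is just the metric itself, with no derivative or scale needed.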

Revision history for this message
Matt Snow (matt-mattsn0w) said :
#6

Kevin - All I care about is bytes, since the interfaces are 1 Gbps or more. I am familiar with the bit/octet-to-byte ratio. :)
I was wondering about the second range and how the data is stored once I go past the 180-day mark. In my environment, 6 months is sufficient retention for granular data.

Nick - I've thought about it and am still considering that route. I'm doing a POC at work to prove the tool is useful in this environment; then I'll hopefully get to write some more code and maybe even release it. :)

Revision history for this message
Dave Rawks (drawks) said :
#7

Kevin, I don't think a patch is needed. If you use summarize() as the innermost function, it /should/ normalize the data to the summary period, which /should/ then be used reliably by your derivative and scale functions.
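For example, something along these lines (the metric path is a placeholder; "max" keeps the highest counter value in each 60-second bucket, which for an ever-increasing counter is the latest reading):

scale(nonNegativeDerivative(summarize(servers.host1.net.eth0.rx_bytes,"60s","max")),0.0166667)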