Send pickled metrics data to graphite

Asked by Yousef F on 2017-05-30

I am generating multiple text files, each containing thousands of metrics, and I want to send them to Graphite as pickled data instead of looping line by line, to save time. I am not familiar with Python, so I would appreciate an example of what such a script should look like. I am flexible about the text file format and can generate it in whatever format Python and Graphite require.
I appreciate the help.

Ref.
http://graphite.readthedocs.io/en/latest/feeding-carbon.html#the-pickle-protocol

Question information

Language: English
Status: Solved
For: Graphite
Assignee: No assignee
Solved by: Yousef F
Solved: 2017-06-12
Last query: 2017-06-12
Last reply: 2017-05-30

Piotr Popieluch (piotr1212) said : #1

Here you can find an example pickle client:
https://github.com/graphite-project/carbon/blob/master/examples/example-pickle-client.py

There are many ways to process files in Python; most likely you will do something like:
with open('path', 'r') as f:
    for line in f:
        do_something(line)

Don't expect much of a performance gain from pickling; the bottleneck will most likely not be the protocol itself. I've successfully run the line protocol with millions of metrics per second, and the protocol was the least of my worries.
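
For reference, here is a minimal sketch of what sending a text file over the pickle protocol could look like, following the format described in the documentation linked in the question: a pickled list of (path, (timestamp, value)) tuples, prefixed with a 4-byte length header. The function names, the batch size of 500, and the input format ("path value timestamp" per line) are assumptions for illustration; port 2004 is assumed to be the pickle receiver port of the carbon instance.

```python
import pickle
import socket
import struct


def parse_line(line):
    """Turn 'metric.path value timestamp' into (path, (timestamp, value))."""
    path, value, timestamp = line.split()
    return (path, (int(timestamp), float(value)))


def build_pickle_message(lines):
    """Pickle a batch of metric lines and prefix it with a 4-byte length header."""
    metrics = [parse_line(line) for line in lines if line.strip()]
    payload = pickle.dumps(metrics, protocol=2)
    header = struct.pack("!L", len(payload))  # unsigned 32-bit, network byte order
    return header + payload


def send_file(path, host="127.0.0.1", port=2004, batch_size=500):
    """Send a metrics file to carbon's pickle receiver in batches (hypothetical helper)."""
    with open(path, "r") as f:
        lines = f.readlines()
    with socket.create_connection((host, port)) as sock:
        for start in range(0, len(lines), batch_size):
            sock.sendall(build_pickle_message(lines[start:start + batch_size]))
```

Calling send_file('metrics') would then send the whole file in a handful of messages rather than one write per line.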

Yousef F (yousef.fattal) said : #2

@Piotr Popieluch
I really appreciate the response.

You said you managed to send millions of metrics per second; what approach are you using? Can you post an example script, please? I now have a file with almost 3,000 metrics (one per line). Currently I read the file into separate arrays, then loop through the arrays and send the metrics individually. This file takes 15+ seconds to process, which is too long for us, as we need at least 10,000 metrics per file and sometimes far more.

Any suggestions are really appreciated.

Piotr Popieluch (piotr1212) said : #3

IIRC I've sent about 10 million random metrics in two seconds over the line protocol with:
https://github.com/Civil/graphite_perf_test_go
There was no file processing involved; I just wanted to indicate that you don't need the pickle protocol to send many metrics.

You will have to find out what the bottleneck in your script is and optimize that: is it the file processing, or does carbon block?
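
One rough way to tell the two apart is to time the file reading and the sending separately. This is only a sketch: time_stages is a hypothetical helper, and send can be any callable that accepts bytes, for example the sendall method of an already connected socket.

```python
import time


def time_stages(path, send):
    """Return (read_seconds, send_seconds) for one metrics file."""
    start = time.perf_counter()
    with open(path, "r") as f:
        lines = f.readlines()
    read_seconds = time.perf_counter() - start

    start = time.perf_counter()
    for line in lines:
        send(line.encode("ascii"))
    send_seconds = time.perf_counter() - start

    return read_seconds, send_seconds
```

If send_seconds dominates, the time is going into the network or carbon side; if read_seconds dominates, it is going into the file handling.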

I did a simple test; the following script runs on my laptop in 0.09 seconds (wall time):

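#!/usr/bin/python3

import socket

# Generate a test file with 10,000 metrics in line-protocol format:
# "<metric path> <value> <timestamp>"
with open('metrics', 'w') as f:
    for i in range(10000):
        f.write("metricname.{} 42 1496179248\n".format(i))

# Open a single TCP connection to carbon's line receiver (port 2003)
# and reuse it for every line.
sock = socket.socket()
sock.connect(('127.0.0.1', 2003))
with open('metrics', 'r') as f:
    for line in f:
        sock.sendall(line.encode('ascii'))
sock.close()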
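
If the metrics are already held in memory as arrays rather than in a file, the same idea might be sketched as follows: build one line-protocol payload and send it over a single connection. The helper names and the (path, value, timestamp) tuple layout are assumptions for illustration, not part of the script above.

```python
import socket


def format_lines(metrics):
    """Render (path, value, timestamp) tuples as one line-protocol payload."""
    return "".join(
        "{} {} {}\n".format(path, value, timestamp)
        for path, value, timestamp in metrics
    ).encode("ascii")


def send_metrics(metrics, host="127.0.0.1", port=2003):
    """Send all metrics over one TCP connection with a single sendall call."""
    payload = format_lines(metrics)
    with socket.create_connection((host, port)) as sock:
        sock.sendall(payload)
```

The point of the design is to open the connection once and avoid per-metric setup work inside the loop.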

Yousef F (yousef.fattal) said : #4

I really appreciate your response. My bottleneck was actually the looping; using the Python script reduced the time dramatically, sending 8,000+ lines in less than a second.
Thank you again.