Federated Storage and Mixing Data Points Choices

Asked by bhardy

Does a given metric need to be consistently sent to the same carbon-cache for there not to be any breaks in the graph? I ask because of the answer given in the Launchpad thread below. It states that Graphite will NOT mix datapoints from two servers that have the same metric.

https://answers.launchpad.net/graphite/+question/108505

Even when a datapoint does reside on more than one server, my testing shows that the web UI switches between the servers it retrieves the data from in no particular pattern.

Our setup: four Linux servers running Graphite, each with its own separate SAN mount. CLUSTER_SERVERS is properly set up in local_settings.py.

Is the only workaround for this to use relay-rules to send to a specified server, or to hard-code producers of metrics to point to a particular server? My primary goal is to gather options for rendering unified data from the four servers.

Question information

Language: English
Status: Solved
For: Graphite
Assignee: No assignee
Solved by: bhardy
Nicholas Leskiw (nleskiw) said :
#1

I'm pretty sure they don't mix.
That's how I understand the answer to 108505.

chrismd (chrismd) said :
#2

That's correct. When a metric lives on a particular Graphite server, *all* datapoints for that metric need to be sent to that server. If you want the metric to be on 2 servers, then all datapoints for it need to go to both servers. When you request a graph from a webapp, it first checks whether it has the metric data locally. If it does not, it queries the other servers in the cluster, and the first one to respond that it has the metric then gets queried for whatever datapoints it has. So if you're sending some datapoints to server A and some to server B, your graph may contain the data from either server but not both. Hence both servers need to have the complete set of datapoints.
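
To make the lookup order above concrete, here is a rough Python sketch. It is not Graphite's actual code: the `fetch` function, the dict-based `Store` type, and the metric name `app.requests` are all made up for illustration, and list order merely stands in for "first server to respond".

```python
from typing import Dict, List, Optional, Tuple

Datapoint = Tuple[int, float]        # (timestamp, value)
Store = Dict[str, List[Datapoint]]   # metric name -> datapoints held by one server


def fetch(metric: str, local: Store, remotes: List[Store]) -> Optional[List[Datapoint]]:
    """Return datapoints from exactly one server; results are never merged."""
    # 1. The webapp checks its own local data first.
    if metric in local:
        return local[metric]
    # 2. Otherwise it asks the cluster; list order stands in for response order.
    for remote in remotes:
        if metric in remote:
            return remote[metric]
    return None


# Hypothetical data: the same metric split across two servers.
server_a: Store = {"app.requests": [(0, 1.0), (60, 2.0)]}
server_b: Store = {"app.requests": [(120, 3.0), (180, 4.0)]}
webapp_local: Store = {}

# Whichever server answers first supplies the whole graph.
a_first = fetch("app.requests", webapp_local, [server_a, server_b])
b_first = fetch("app.requests", webapp_local, [server_b, server_a])
```

In this sketch `a_first` holds only server A's two datapoints and `b_first` only server B's, which is the "either server but not both" behaviour described above.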

The way to ensure your datapoints all end up on the right servers is by using carbon-relay.py and defining rules that consistently map metrics to the servers. All clients sending data to graphite should send their data to the carbon-relay and not to carbon-cache. The carbon-relay will apply the rules and then forward the datapoints on to the appropriate carbon-cache instances. An example relay-rules.conf is shown on the question page you referenced. Let me know if you need more clarification about configuring the relay.
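
As a rough illustration of what "rules that consistently map metrics to the servers" means, the Python sketch below routes a metric name to a fixed list of destinations using first-match regex rules. This is not carbon-relay's implementation; the `route` function, the patterns, and host names such as `graphite1:2004` are hypothetical.

```python
import re
from typing import List, Tuple

# Hypothetical rules: (regex on the metric name, destination carbon-cache servers).
RULES: List[Tuple[str, List[str]]] = [
    (r"^web\.", ["graphite1:2004"]),
    (r"^db\.", ["graphite2:2004", "graphite3:2004"]),  # sent to both servers
]
# Fallback for metrics that match no rule (also an assumption of this sketch).
DEFAULT_SERVERS: List[str] = ["graphite4:2004"]


def route(metric: str) -> List[str]:
    """Return the servers for a metric; the first matching rule wins."""
    for pattern, servers in RULES:
        if re.search(pattern, metric):
            return servers
    return DEFAULT_SERVERS


web_targets = route("web.host1.cpu")
db_targets = route("db.primary.queries")
other_targets = route("misc.uptime")
```

Because the mapping depends only on the metric name, every datapoint for a given metric lands on the same server or set of servers, no matter which client sent it.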

bhardy (bhardy) said :
#3

Chris, as someone else has said, "I'm impressed!" I thank you for the follow-up as that is the information I needed.

I will watch for the functionality to mix datapoints because as soon as it is available we will need to implement it. Thanks!