Multiple webapps render slowly

Asked by Simonov Boris

Hello!
I've tested a single Graphite node and it works very fast, so I decided to build a fail-safe cluster.
I started testing with two nodes in the cluster (config listed below) and ran into slow rendering when both webapps are running. A fetch takes about 4 seconds, but when I shut down one replica a fetch takes about 250 ms (the same latency as before clustering). What is wrong?
Config of each node:
- Relay
   LINE_RECEIVER 0.0.0.0:2003
   PICKLE_RECEIVER 0.0.0.0:2004
   RELAY_METHOD consistent-hashing
   DESTINATIONS 192.168.0.10:2104:a,192.168.0.10:2204:b,192.168.0.11:2104:a,192.168.0.11:2204:b
- Cache:a
   PICKLE_RECEIVER 0.0.0.0:2104
   CACHE_QUERY 0.0.0.0:7102
- Cache:b
   PICKLE_RECEIVER 0.0.0.0:2204
   CACHE_QUERY 0.0.0.0:7202
- WebApp
   CARBONLINK_HOSTS = ["127.0.0.1:7102:a","127.0.0.1:7202:b"]
   CLUSTER_SERVERS = ["192.168.0.11:8000"] # and ["192.168.0.10:8000"] on other node
- Nginx
   upstream servers: 192.168.0.10:8000, 192.168.0.11:8000
All data was wiped before creating the cluster.
The nodes are connected by gigabit links.

Question information

Language: English
Status: Solved
For: Graphite
Solved by: Simonov Boris

Jason Dixon (jason-dixongroup) said :
#1

There may be a bug affecting multiple webapps on a single node. Frankly, unless you're running some sort of unique configuration to partition the namespace for users (or, in at least one case I've seen, by timezone), there's very little benefit in running multiple webapps like this.

Possibly related:
https://github.com/graphite-project/graphite-web/issues/312
https://github.com/graphite-project/graphite-web/pull/492 (fixed in master)

Denis Zhdanov (deniszhdanov) said :
#2

Hello Boris,

IMO your setup is somewhat wrong, and maybe you hit a bug because of that, as Jason said.
If you want a properly working cache (which I have found to be pretty much mandatory for a heavily loaded Graphite), you need to make sure that the relay's DESTINATIONS exactly match CARBONLINK_HOSTS (I mean IP and instance name, not port). Then the consistent hash works correctly and the webapp picks the right carbon daemon for a metric without querying the other carbon instances and possibly wasting time.
So you need two sets of relays: one distributes load across nodes, and another one, on each node, distributes load across the local carbon caches. You also need to add both nodes to CLUSTER_SERVERS.
Something like this:
[relay:a]
LINE_RECEIVER 0.0.0.0:2003
PICKLE_RECEIVER 0.0.0.0:2004
RELAY_METHOD consistent-hashing
DESTINATIONS 192.168.0.10:2014,192.168.0.11:2014 # relays forward in pickle format, so this is relay:b's pickle port
[relay:b]
LINE_RECEIVER 0.0.0.0:2013
PICKLE_RECEIVER 0.0.0.0:2014
RELAY_METHOD consistent-hashing
DESTINATIONS 127.0.0.1:2104:a,127.0.0.1:2204:b # pickle ports of cache:a and cache:b
[cache:a]
PICKLE_RECEIVER 0.0.0.0:2104
CACHE_QUERY 0.0.0.0:7102
[cache:b]
PICKLE_RECEIVER 0.0.0.0:2204
CACHE_QUERY 0.0.0.0:7202

WebApp
CARBONLINK_HOSTS = ["127.0.0.1:7102:a","127.0.0.1:7202:b"]
CLUSTER_SERVERS = ["192.168.0.10:8000", "192.168.0.11:8000"]

I have a similar setup and it works quite well.
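
To illustrate why the (IP, instance) pairs have to match, here is a minimal Python sketch of a consistent-hash ring. It is a simplification and not the actual carbon or graphite-web code: HashRing, ring_nodes and parse_destination are hypothetical names, and the replica count and hash layout are assumptions made for the example. It only shows that two rings built from the same (IP, instance) pairs always agree on which cache owns a metric, regardless of the ports.

```python
import bisect
import hashlib


def parse_destination(entry):
    """Split 'host:port:instance' into (host, port, instance)."""
    host, port, instance = entry.split(":")
    return host, int(port), instance


def ring_nodes(entries):
    """Return the ring identities: (host, instance) pairs, port ignored."""
    return sorted({(h, inst) for h, _port, inst in map(parse_destination, entries)})


class HashRing:
    """Simplified consistent-hash ring (illustrative only, an assumption)."""

    def __init__(self, nodes, replicas=100):
        # Place each node on the ring several times to even out the load.
        self.ring = sorted(
            (self._position(f"{node}:{i}"), node)
            for node in nodes
            for i in range(replicas)
        )
        self._positions = [pos for pos, _node in self.ring]

    @staticmethod
    def _position(key):
        # Map a string to an integer position on the ring.
        return int(hashlib.md5(key.encode()).hexdigest()[:8], 16)

    def get_node(self, metric):
        # First ring entry at or after the metric's position, wrapping around.
        idx = bisect.bisect_left(self._positions, self._position(metric))
        return self.ring[idx % len(self.ring)][1]


# Per-node relay and webapp lists from the suggested setup (ports differ).
relay_destinations = ["127.0.0.1:2104:a", "127.0.0.1:2204:b"]
carbonlink_hosts = ["127.0.0.1:7102:a", "127.0.0.1:7202:b"]

relay_ring = HashRing(ring_nodes(relay_destinations))
webapp_ring = HashRing(ring_nodes(carbonlink_hosts))

# Hypothetical metric names, used only to exercise the rings.
metrics = [f"servers.web{i}.cpu.load" for i in range(1000)]
mismatches = [m for m in metrics if relay_ring.get_node(m) != webapp_ring.get_node(m)]
print(len(mismatches))  # 0: both sides pick the same cache for every metric

# The relay list from the original question uses different identities.
question_destinations = [
    "192.168.0.10:2104:a", "192.168.0.10:2204:b",
    "192.168.0.11:2104:a", "192.168.0.11:2204:b",
]
print(ring_nodes(question_destinations) == ring_nodes(carbonlink_hosts))  # False
```

In the sketch, the relay list from the original question produces a different set of ring identities than CARBONLINK_HOSTS, so the webapp side cannot reproduce the relay's choice of cache. That is the mismatch described above.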

Simonov Boris (simonov-bs) said :
#3

Thanks!
The suggested scheme works well. I probably misunderstood the manuals I had read before.