graphite-web with two backends - Recommended settings for local_settings.py

Asked by Tobias

Hi,

I am running a Graphite cluster with one frontend reading from two backends via CLUSTER_SERVERS. These are the relevant settings from local_settings.py:

MEMCACHE_HOSTS = ['graphite-frontend-a:11211','graphite-frontend-b:11211']
DEFAULT_CACHE_DURATION = 60
DEFAULT_CACHE_POLICY = [(0, 60), # default is 60 seconds
                        (7200, 120), # >= 2 hour queries are cached 2 minutes
                        (21600, 180), # >= 6 hour queries are cached 3 minutes
                        (43200, 360), # >= 12 hour queries are cached 6 minutes
                        (86400, 600)] # >= 24 hour queries are cached 10 minutes

REMOTE_FIND_TIMEOUT = 60 # Timeout for metric find requests
REMOTE_FETCH_TIMEOUT = 60 # Timeout to fetch series data
CLUSTER_SERVERS = ["graphite-backend-a-1:80", "graphite-backend-a-2:80"]
REMOTE_PREFETCH_DATA = True

Write throughput is really good for this cluster (using carbon-c-relay), but queries tend to get slow during peak hours and sometimes even time out.

The frontend machine has:
8GB memory
2 x Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz

The backend machines (i3 instances) have:
32 GB memory
4 x Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
dedicated SSD disk

Do you have any tips on how I could change the settings to make it faster?

Question information

Language:
English
Status:
Expired
For:
Graphite
Assignee:
No assignee
Denis Zhdanov (deniszhdanov) said:
#1

Clustering in Graphite has been improved in the latest releases, but I would recommend getting rid of the frontend server and moving rendering back to the backends; it should be a bit better. I.e. just point your users to the backend servers and put the cluster config there.
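A minimal sketch of what the cluster part of local_settings.py could look like on graphite-backend-a-1 once rendering moves there (hostnames assumed from your question; mirror it on graphite-backend-a-2 with the other host):

# Illustrative only: list the *other* webapp node(s); the local node
# reads its own whisper files directly, so it does not list itself.
CLUSTER_SERVERS = ["http://graphite-backend-a-2:80"]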
You can also use `msgpack` as the transport protocol, e.g.:
CLUSTER_SERVERS = ["http://graphite-backend-a-1:80?format=msgpack", "http://graphite-backend-a-2:80?format=msgpack"]
But please install the `msgpack` Python module then; otherwise it will fall back to a pure-Python implementation, which is slow.
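As a quick sanity check (assuming a pip-based install, e.g. `pip install msgpack`), you can verify that the module is importable from the same Python environment that serves graphite-web:

# Run with the Python/virtualenv that serves graphite-web.
import msgpack
print(msgpack.version)  # prints a version tuple, e.g. (1, 0, 5)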
Also, you can play with the number of workers and worker threads in your Apache / Nginx / uwsgi config.

Tobias (lindqt01) said:
#2

Thanks a lot for the feedback. I will definitely follow your recommendations.

Launchpad Janitor (janitor) said:
#3

This question was expired because it remained in the 'Open' state without activity for the last 15 days.