Verifying data sent from relay to cache.

Asked by Chris

I inherited a semi-working Graphite setup and I'm trying to track down an issue where no data is displayed or written to Whisper. My understanding of the architecture is incomplete, so please let me know if I'm wrong about any of the conceptual data flow.

I can restart all services (statsd, carbon-cache, relay, aggregator) and get one or two data points written to the Whisper DB.

My understanding is that the data flow is:

StatsD -> Relay -> Cache, which then writes to the Whisper DB. I'm a little unsure what the aggregator does.

At this point I'm attempting to verify that data is going from the relay to the cache by using tcpdump to listen on ports 2013/2014, but I'm not capturing anything. What's the best way to verify the data flow, and can I post any logs to make this easier to understand?

Question information

Language: English
Status: Expired
For: Graphite
Assignee: No assignee

Chris (cahc98) said :
#1

For reference, here is our carbon.conf:

[cache]
USER = carbon
MAX_CACHE_SIZE = 500000
MAX_UPDATES_PER_SECOND = 1000
MAX_CREATES_PER_MINUTE = 1000
LOG_UPDATES = false
LINE_RECEIVER_INTERFACE = 0.0.0.0
LINE_RECEIVER_PORT = 2013
PICKLE_RECEIVER_INTERFACE = 0.0.0.0
PICKLE_RECEIVER_PORT = 2014
CACHE_QUERY_INTERFACE = 127.0.0.1
CACHE_QUERY_PORT = 7002
USE_INSECURE_UNPICKLER = false

[relay]
USER = carbon
LINE_RECEIVER_INTERFACE = 0.0.0.0
LINE_RECEIVER_PORT = 2003
PICKLE_RECEIVER_INTERFACE = 0.0.0.0
PICKLE_RECEIVER_PORT = 2004
RELAY_METHOD = consistent-hashing
REPLICATION_FACTOR = 1
DESTINATIONS = 127.0.0.1:2014

[aggregator]
USER = carbon
DESTINATIONS = 127.0.0.1:2014
LINE_RECEIVER_PORT = 2023
PICKLE_RECEIVER_PORT = 2024
RULES = []

Denis Zhdanov (deniszhdanov) said :
#2

So, according to the config above, the setup is:
1. The relay listens for metrics on port 2003 (line protocol) and 2004 (pickle protocol).
2. The relay sends metrics to 127.0.0.1:2014.
3. The cache listens for metrics on port 2013 (line protocol) and 2014 (pickle protocol), so it should accept metrics from the relay and write them to whisper files.
4. The aggregator listens for metrics on port 2023 (line protocol) and 2024 (pickle protocol) and also sends them to the cache (127.0.0.1:2014).

So you need to check:
1. Is anything arriving on port 2003?
2. Is anything arriving on port 2014?
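
One way to run these two checks without StatsD in the loop is to probe the ports and push a single hand-made metric into the relay's line receiver. The sketch below is only an illustration under the assumptions of the carbon.conf posted above (everything on 127.0.0.1, relay line port 2003); the metric name `test.relay.check` and the helper names are made up for this example. If you capture with tcpdump at the same time, keep in mind that relay-to-cache traffic on 127.0.0.1 travels over the loopback interface, so the capture may need to target that interface explicitly.

```python
import socket
import time


def format_line(path, value, timestamp=None):
    """Build one plaintext-protocol line: '<path> <value> <timestamp>\\n'."""
    if timestamp is None:
        timestamp = int(time.time())
    return "%s %s %d\n" % (path, value, int(timestamp))


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def send_line(host, port, line, timeout=2.0):
    """Send one plaintext line to a carbon line receiver."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(line.encode("ascii"))


if __name__ == "__main__":
    # Ports taken from the carbon.conf in this thread (assumption: same host).
    for port in (2003, 2004, 2013, 2014, 7002):
        state = "listening" if is_listening("127.0.0.1", port) else "closed"
        print(port, state)
    if is_listening("127.0.0.1", 2003):
        # Hypothetical metric name; it should show up under test/relay/.
        send_line("127.0.0.1", 2003, format_line("test.relay.check", 1))
```

If the test metric reaches a whisper file when sent to 2013 but not when sent to 2003, the relay hop is the likely place to look.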

Chris (cahc98) said :
#3

After looking into this some more and learning the architecture as well:

I've tried sending data directly from StatsD to the cache, on both the line receiver port and the pickle port (I'm not 100% sure where it should go).
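
On the question of which port to use: the two receivers expect different wire formats. A line receiver (2013 for the cache in the config above) takes plaintext `path value timestamp` lines, which is what StatsD's Graphite backend emits, while a pickle receiver (2014) expects a pickled list of `(path, (timestamp, value))` tuples prefixed with a 4-byte big-endian length header. The following is a minimal sketch of that framing, with hypothetical helper names, to show why plaintext sent to a pickle port would not be understood.

```python
import pickle
import struct


def encode_pickle_batch(datapoints):
    """Frame [(path, (timestamp, value)), ...] for a carbon pickle receiver.

    The message is a 4-byte big-endian length header followed by the
    pickled list (protocol 2 is used here as a conservative choice).
    """
    payload = pickle.dumps(datapoints, protocol=2)
    header = struct.pack("!L", len(payload))
    return header + payload


def decode_pickle_batch(message):
    """Inverse of encode_pickle_batch, for local inspection only."""
    size = struct.unpack("!L", message[:4])[0]
    return pickle.loads(message[4:4 + size])


if __name__ == "__main__":
    # Hypothetical metric name and values, for illustration.
    batch = [("test.pickle.check", (1400000000, 1.0))]
    message = encode_pickle_batch(batch)
    print(len(message), decode_pickle_batch(message))
```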

The whisper file is created, but after an initial metric or two only None values appear in it.

I'm struggling to find out why, with connectivity in place, Carbon just stops writing to the db file. I'm following the retention settings listed in the example in the carbon git repository:

[carbon]
pattern = ^carbon\.
retentions = 60:90d

[default_1min_for_1day]
pattern = .*
retentions = 60s:1d

And my XFILESFACTOR = .1

If anything, I feel like writes of 0.00 should be coming through, not Null/None, which leads me to believe something is wrong with Carbon/Whisper, or that it's getting hammered too hard to catch up?
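
For what it's worth, the retention definitions above can be worked through by hand. The sketch below is a simplified re-implementation of the arithmetic and does not use the whisper library; the helper names are hypothetical, and it assumes the usual convention that a bare number in the first field means seconds and a bare number in the second field means a point count. A slot that never received a datapoint reads back as None rather than 0, so long runs of None suggest that nothing reached the file for those intervals.

```python
# Seconds per unit, keyed by the unit's first letter (simplified).
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400,
                "w": 604800, "y": 31536000}


def to_seconds(token):
    """Convert '60', '60s' or '1d' to a number of seconds."""
    if token.isdigit():
        return int(token)
    return int(token[:-1]) * UNIT_SECONDS[token[-1]]


def parse_retention(definition):
    """Return (seconds_per_point, points) for 'precision:duration'."""
    precision, duration = definition.split(":")
    seconds_per_point = to_seconds(precision)
    if duration.isdigit():
        points = int(duration)  # bare number: already a point count
    else:
        points = to_seconds(duration) // seconds_per_point
    return seconds_per_point, points


def slot_start(timestamp, seconds_per_point):
    """Start of the interval (slot) that a timestamp falls into."""
    return timestamp - (timestamp % seconds_per_point)


if __name__ == "__main__":
    for definition in ("60s:1d", "60:90d"):
        print(definition, parse_retention(definition))
```

With `60s:1d` there is one slot per minute, so several datapoints arriving within the same minute share a slot and only the last one written is kept.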

Denis Zhdanov (deniszhdanov) said :
#4

Graphite holds incoming metrics in an in-memory cache, so they can be flushed to disk much later, but they should still be accessible in the web interface.

Chris (cahc98) said :
#5

Once the processes are restarted, the whisper file is written to instantly; for the first two writes it behaves normally, and the data shows up in Graphite.

After those few writes to the whisper file, nothing is written besides None, and the graphs never display this information.

Denis Zhdanov (deniszhdanov) said :
#6

Is your graphite-web using a carbonlink connection (127.0.0.1:7002)? Could you please show your local_settings.py config?

Chris (cahc98) said :
#7

No for carbonlink, and local_settings.py is as follows:

TIME_ZONE = 'America/Chicago'

DATABASES = {
    'default': {
        'NAME': 'graphite',
        'ENGINE': 'django.db.backends.mysql',
        'USER': 'xxxxxxx',
        'PASSWORD': 'xxxxxxxx',
        'HOST': 'localhost',
        'PORT': '3306'
    }
}

CLUSTER_SERVERS = ''

MEMCACHE_HOSTS = ''

Denis Zhdanov (deniszhdanov) said :
#8

It should be the default, but could you please add `CARBONLINK_HOSTS=['127.0.0.1:7002']` and restart Apache?

Chris (cahc98) said :
#9

Added and restarted; there doesn't seem to be any change in either the whisper files or the graph.

Chris (cahc98) said :
#10

An update to this:

I edited the StatsD config to point to carbon-cache instead of the relay, and everything seems to be working fine. I'm not sure what made the relay stop functioning or what is wrong with it, but at least I know which part of the flow is broken.

Launchpad Janitor (janitor) said :
#11

This question was expired because it remained in the 'Open' state without activity for the last 15 days.