exceptions.TypeError: must be encoded string without NULL bytes, not str

Asked by Mike Wylie

Hi,

I get the same exception (shown below) every second in the console.log file. Could somebody point me in the right direction to track down the problem?

I send about 300 metrics every 20 mins at present and they are not all getting recorded, so I think this error may be the cause of my problems.

Many thanks

14/08/2012 09:52:02 :: Sorted 1012 cache queues in 0.000527 seconds
14/08/2012 09:52:02 :: Unhandled Error
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 504, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/local/lib/python2.7/dist-packages/twisted/python/threadpool.py", line 167, in _worker
    result = context.call(ctx, function, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/twisted/python/context.py", line 118, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/usr/local/lib/python2.7/dist-packages/twisted/python/context.py", line 81, in callWithContext
    return func(*args,**kw)
--- <exception caught here> ---
  File "/opt/graphite/lib/carbon/writer.py", line 158, in writeForever
    writeCachedDataPoints()
  File "/opt/graphite/lib/carbon/writer.py", line 91, in writeCachedDataPoints
    for (metric, datapoints, dbFilePath, dbFileExists) in optimalWriteOrder():
  File "/opt/graphite/lib/carbon/writer.py", line 54, in optimalWriteOrder
    dbFileExists = exists(dbFilePath)
  File "/usr/lib/python2.7/genericpath.py", line 18, in exists
    os.stat(path)
exceptions.TypeError: must be encoded string without NULL bytes, not str

Question information

Language: English
Status: Solved
For: Graphite
Assignee: No assignee
Solved by: Mike Wylie

Michael Leinartas (mleinartas) said :
#1

Carbon converts the metric name into a filesystem path with just a substitution:
  join(settings.LOCAL_DATA_DIR, metric.replace('.','/')) + '.wsp'

os.path.exists() is complaining about a null byte, so one of your metric names must be arriving with a null byte in it.
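
To illustrate the point above, here is a minimal sketch of the same substitution. The LOCAL_DATA_DIR value and the helper names metric_to_path and stat_rejects are made up for illustration and are not carbon's actual code. On Python 2.7 os.stat() raises the TypeError shown in the traceback; on Python 3 the same condition surfaces as a ValueError, so the sketch catches both.

```python
import os
import posixpath

# Assumed storage root, for illustration only; carbon reads this from settings.
LOCAL_DATA_DIR = "/opt/graphite/storage/whisper"


def metric_to_path(metric):
    """Mirror the substitution quoted above: dots become directory separators."""
    return posixpath.join(LOCAL_DATA_DIR, metric.replace(".", "/")) + ".wsp"


def stat_rejects(metric):
    """Return True if os.stat() refuses the path outright (e.g. embedded NUL)."""
    try:
        os.stat(metric_to_path(metric))
    except (TypeError, ValueError):
        # Python 2 raises TypeError, Python 3 raises ValueError for a NUL byte.
        return True
    except OSError:
        # The file is merely missing or unreadable; the path itself is valid.
        return False
    return False
```

A clean name such as "uk.test.wlst.jvm.AdminServer.usedjvm" should simply be reported as missing or present, while any name with an embedded "\x00" should be rejected before the filesystem is even consulted.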

Mike Wylie (madcow-uk) said :
#2

Hi, thank you for the advice; it sounds sensible.

I altered the script that sends the metrics so that it also logs them all to a file. I have about 2.5k metrics, and I've gone through them with grep and Excel, but so far I can't find any problems with the metrics in the file.

I stopped sending metrics when I created this question (around 2 days ago), but the console.log file is still filling with one exception per second. Could it be stuck trying to do something?

Many thanks

PS I'm off on a 2 week holiday tomorrow, so my apologies if my reply is delayed

Mike Wylie (madcow-uk) said :
#3

I had a look at the code in the exception trace and had a thought: would I be taking my life in my hands if I altered the file /opt/graphite/lib/carbon/writer.py to write the metric name to the log file just before it looks up the path? This may give me an idea of what it's failing to work with.

Would I need to restart something for the code to take effect?

I'm not a real Python programmer, but I can understand the basics. I was thinking of something along the lines of:

def optimalWriteOrder():
  "Generates metrics with the most cached values first and applies a soft rate limit on new metrics"
  global lastCreateInterval
  global createCount
  metrics = MetricCache.counts()

  t = time.time()
  metrics.sort(key=lambda item: item[1], reverse=True) # by queue size, descending
  log.msg("Sorted %d cache queues in %.6f seconds" % (len(metrics), time.time() - t))

  for metric, queueSize in metrics:
    if state.cacheTooFull and MetricCache.size < CACHE_SIZE_LOW_WATERMARK:
      events.cacheSpaceAvailable()
    # ------------------------------------------------
    # my new piece of code
    log.msg("reviewing metric:"+metric)
    # ------------------------------------------------
    dbFilePath = getFilesystemPath(metric)
    dbFileExists = exists(dbFilePath)

Michael Leinartas (mleinartas) said :
#4

Null characters won't show up with grep. Try viewing the file with cat -e instead, which displays null characters as ^@.

The log statement is fine. You may want to add another log message after the dbFileExists statement below it, since that's the line that will blow up if there are nulls.
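
As a rough alternative to cat -e, a few lines of Python can flag any logged line that contains bytes outside printable ASCII. This is only a sketch: the function name find_suspect_lines is hypothetical, and it assumes every metric line is expected to be plain printable ASCII (so tabs and other control characters are flagged too).

```python
def find_suspect_lines(path):
    """Return (line_number, raw_bytes) for each line with non-printable bytes."""
    suspects = []
    with open(path, "rb") as handle:  # binary mode so NUL bytes survive intact
        for number, raw in enumerate(handle, start=1):
            body = raw.rstrip(b"\r\n")
            # bytearray yields integers on both Python 2 and Python 3.
            if any(byte < 0x20 or byte > 0x7E for byte in bytearray(body)):
                suspects.append((number, body))
    return suspects
```

Printing repr() of each returned line makes hidden bytes such as \x00 visible. Note that this only checks what was written to the file, which may differ from what actually went over the socket.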

Mike Wylie (madcow-uk) said :
#5

I'm back from my holiday and working on this problem again.

I implemented the above logging which gave me some confusing results. The log message clearly shows some bad characters in the directory name:
¬ítúuk.test.wlst.jvm.AdminServer.usedjvm
The text should start with "uk". However, I had shut down carbon, restarted it and left it running for a while, then sent some values, but not the metric above.

Basically the log files were clean for a few hours, then started getting this error for a metric I hadn't sent.

There was also an error for a metric I had sent, but I'm not sure if that's related:
¬ítQàuk.test.wlst.jdbc.com_bluemartini_dbpool_OneTouch_search.WaitingForConnectionCurrentCount

I also log all the metrics to a test file at the same time as sending them to Graphite, and these bad characters do not appear in that file.

So I have two questions:
1) Is there a file of cached values I need to clear out before running my tests?
2) Can anyone see how these bad characters are getting into the metric names?

Many thanks

Mike Wylie (madcow-uk) said :
#6

I have a solution. To be clear, there was no fault with the Graphite tools.

My code to send data to Graphite is written in Jython. There were some problems with the standard Python code, so the guy setting it up for me resorted to Java and used an ObjectOutputStream to send the data.

It's taken me a while to figure it out, but ObjectOutputStream sends the text as a serialized object and must add some binary data at the start of my metrics, which caused all of my problems. I swapped it for a Java PrintWriter, which is specifically for text, and hey presto: my metrics are now clean.
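
The binary data described above is consistent with how Java object serialization works: a new ObjectOutputStream writes a four-byte stream header (0xAC 0xED 0x00 0x05) before any data, and that header contains a NUL byte. Read as Latin-1, the first two bytes display as "¬í", which matches the junk at the start of the logged metric names. The sketch below illustrates this in Python. The function names are hypothetical, the line format is assumed to be carbon's usual plain-text "metric value timestamp" line, and exactly what else the original sender wrote after the header is not known from this thread.

```python
# Stream header written by a new Java ObjectOutputStream:
# magic number 0xACED followed by stream version 0x0005.
JAVA_STREAM_HEADER = b"\xac\xed\x00\x05"


def has_java_stream_header(raw):
    """True if the payload starts with the Java serialization header."""
    return raw.startswith(JAVA_STREAM_HEADER)


def build_plaintext_line(metric, value, timestamp):
    """Build one newline-terminated text line: metric, value, timestamp."""
    line = "%s %s %d\n" % (metric, value, timestamp)
    return line.encode("ascii")  # fails loudly if the text is not plain ASCII


# What a text-only writer would put on the wire.
clean = build_plaintext_line("uk.test.wlst.jvm.AdminServer.usedjvm", 42, 1344934322)

# The same line with the serialization header in front of it (illustrative).
polluted = JAVA_STREAM_HEADER + clean
```

Here has_java_stream_header(polluted) is true and the payload contains b"\x00", whereas the clean line contains only printable text and a newline. A receiver that splits on whitespace would treat the header bytes as part of the metric name, which is how a NUL could end up in the path passed to os.stat().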

Thanks for the help Michael, you gave me enough to crack it in the end.