Memcache(DB) as a backend

Asked by Sebastien Estienne

I'd like to modify whisper to store the contents of the db in memcached or MemcacheDB and use it as shared storage.

Would it be enough to reimplement open, seek, and close to store the binary db in memcache and process the db in memory instead of on disk?

I may also gzip the db to store more of them in memory.

Any tips about how to implement this?

Question information

Language: English
Status: Answered
For: Graphite
Assignee: No assignee
chrismd (chrismd) said :
#1

Storing the database file in memory is an interesting idea, though I'm not sure memcached would be ideal: adding a single data point would require you to retrieve the entire file, update it, and then re-insert it into memcache. This may be fine if your files are not very large or you don't update them frequently, and you may be able to do something more clever depending on the needs of your application.

As for having the whisper library itself operate purely on memory, there are two approaches I would recommend. The simplest is to create a ramdisk and store the files there, so no code needs to change. Of course, this is not always an option. The other approach is to write a new function in the whisper module called open() that provides the same API as the regular Python open() call but returns a StringIO object (which behaves just like a file but is stored in memory). Your open() function could use the path it is given as the memcached key for storing and retrieving the contents of the StringIO object.

The trick is that the existing whisper functions assume they are working on a regular file, so they do not return the file object when they are done with it, meaning they will not return your StringIO object either. You would either need to modify the functions to return their file object, or your open() function would have to be clever and make the StringIO objects it creates available through some other mechanism, such as putting them in a global dict keyed by file path, though that would not be safe for concurrency.
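
A minimal sketch of the open() idea, not whisper's actual code. It assumes Python 3, where io.BytesIO is the binary counterpart of StringIO. The names FakeCache, CachedFile, and cache_open are hypothetical, and FakeCache is a dict standing in for a real memcached client with get/set. Rather than handing the object back through a global dict, this variant writes the buffer back to the cache when the file is closed, which is one possible way around the "functions do not return the file object" problem:

```python
import io


class FakeCache:
    """Hypothetical stand-in for a memcached client (only get/set are assumed)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


class CachedFile(io.BytesIO):
    """In-memory file that stores its whole contents in the cache on close()."""

    def __init__(self, cache, key, initial=b""):
        super().__init__(initial)
        self._cache = cache
        self._key = key

    def close(self):
        if not self.closed:
            # The cache only supports whole-value writes, so the entire
            # buffer is re-inserted even if a single point changed.
            self._cache.set(self._key, self.getvalue())
        super().close()


def cache_open(cache, path, mode="r+b"):
    """Hypothetical replacement for open(): the path doubles as the cache key."""
    data = cache.get(path)
    if "w" in mode:
        data = b""  # "w" truncates, as with a regular file
    elif data is None:
        raise FileNotFoundError(path)
    return CachedFile(cache, path, data)


cache = FakeCache()
with cache_open(cache, "metrics/cpu.wsp", "wb") as fh:
    fh.write(b"\x00" * 16)          # create a 16-byte "file"
with cache_open(cache, "metrics/cpu.wsp") as fh:
    fh.seek(8)                      # seek works as on a real file
    fh.write(b"\x01")               # overwrite one byte in place
```

Like the global-dict approach, this sketch is not safe for concurrent writers: two processes that fetch the same key, modify it, and close will overwrite each other's changes.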

As for gzipping it, you could use the zlib module from the standard library to compress and decompress the value of your StringIO objects (zlib uses the same DEFLATE compression as gzip and works on in-memory strings, so it does not require an actual file). However, I would caution you against compressing whisper files: their contents tend to change on a regular basis, which means you'd have to decompress and re-compress the entire file very frequently, and that may be expensive. Another (admittedly lame) approach to saving space would be to modify whisper to store smaller values if your application permits it. By default whisper stores values as doubles, which are typically 8 bytes, but if you do not need that precision (say your values are always integers less than 1,000) you could modify the struct format definitions to use a smaller numeric type.
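
Both space-saving ideas can be tried with the standard library alone. The sketch below is illustrative only. The point layout "!Ld" (a 4-byte timestamp followed by an 8-byte double) is an assumption about whisper's format and should be checked against the struct format definitions in your copy of whisper.py. The names DOUBLE_POINT, SHORT_POINT, and the sample data are hypothetical:

```python
import struct
import zlib

# Assumed layouts, network byte order (no padding):
DOUBLE_POINT = "!Ld"  # 4-byte unsigned timestamp + 8-byte double
SHORT_POINT = "!LH"   # 4-byte unsigned timestamp + 2-byte unsigned int (0..65535)

print(struct.calcsize(DOUBLE_POINT))  # 12
print(struct.calcsize(SHORT_POINT))   # 6

# One day of made-up per-minute points with integer values below 1,000.
points = [(1300000000 + 60 * i, i % 1000) for i in range(1440)]

raw_double = b"".join(struct.pack(DOUBLE_POINT, ts, float(v)) for ts, v in points)
raw_short = b"".join(struct.pack(SHORT_POINT, ts, v) for ts, v in points)

# zlib works directly on in-memory bytes; no file object is involved.
compressed = zlib.compress(raw_double)
restored = zlib.decompress(compressed)

print(len(raw_double), len(raw_short), len(compressed))
```

Note that every update to a compressed value means a full decompress, modify, and compress cycle, whereas the smaller struct format keeps in-place updates cheap.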
