alternate backend storage

Asked by bhardy

Has anyone successfully implemented an alternate backend storage for Graphite other than whisper?

The overall goal is to use less expensive storage than SSDs or a SAN, possibly using SSDs for short-term storage and an alternate backend for long-term storage. I am currently running on SSDs, which provide the needed IOPS, but this gets more expensive as we try to handle both a growing number of metrics and long retention times (13 months). It looks as if Ceres never (really) took off and, as you know, the Graphite webapp does not combine data from various whisper files.
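
To make the short-term/long-term split concrete, here is a minimal sketch of how a read could be routed across two tiers and stitched together. It is not part of Graphite or of any existing plugin: `fetch_tiered`, `fetch_hot`, `fetch_cold` and `hot_window` are hypothetical names, and the two fetchers are assumed to return `(timestamp, value)` pairs for a half-open time range.

```python
from typing import Callable, List, Optional, Tuple

Point = Tuple[int, Optional[float]]           # (unix timestamp, value)
Fetcher = Callable[[str, int, int], List[Point]]


def fetch_tiered(metric: str, start: int, end: int, now: int,
                 hot_window: int,
                 fetch_hot: Fetcher, fetch_cold: Fetcher) -> List[Point]:
    """Fetch [start, end) for one metric from two storage tiers.

    Anything older than `now - hot_window` is read from the cold
    (cheap, long-term) tier; anything newer comes from the hot (SSD)
    tier. The two results are concatenated in time order.
    """
    cutoff = now - hot_window
    points: List[Point] = []
    if start < cutoff:
        # Older part of the range: long-term backend.
        points.extend(fetch_cold(metric, start, min(end, cutoff)))
    if end > cutoff:
        # Recent part of the range: short-term SSD backend.
        points.extend(fetch_hot(metric, max(start, cutoff), end))
    return points


def make_fake_fetcher(step: int, value: float) -> Fetcher:
    """Build a stand-in fetcher that emits a constant value every `step` seconds."""
    def fetch(metric: str, start: int, end: int) -> List[Point]:
        return [(t, value) for t in range(start, end, step)]
    return fetch


if __name__ == "__main__":
    hot = make_fake_fetcher(step=60, value=1.0)    # 1-minute resolution
    cold = make_fake_fetcher(step=300, value=0.0)  # 5-minute resolution
    series = fetch_tiered("servers.web01.load", 5000, 6000, now=6000,
                          hot_window=600, fetch_hot=hot, fetch_cold=cold)
    print(len(series), series[0], series[-1])
```

A real implementation would also have to reconcile the different resolutions of the two tiers before handing the series to the rendering functions; the sketch leaves that out.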

Question information

Language: English
Status: Solved
For: Graphite
Assignee: No assignee
Solved by: bhardy

Denis Zhdanov (deniszhdanov) said :
#1

BTW, Ceres is used in production, especially in big installations, e.g. at Yandex; they even have their own branch with Ceres-specific fixes - https://github.com/dkulikovsky. Ceres gave quite a big gain in disk space, but not such a big gain in IOPS - SSDs are still required under heavy load.
You can also try Cyanite - https://github.com/pyr/cyanite + https://github.com/brutasse/graphite-cyanite - but I'm not sure a whole Cassandra cluster is cheaper than a bigger SSD...

Denis Zhdanov (deniszhdanov) said :
#2
Jason Dixon (jason-dixongroup) said :
#3

Sure, if you like a TSDB that changes its clustering design every 3 months. *cough*

Denis Zhdanov (deniszhdanov) said :
#4

PPS: There is an OpenTSDB storage plugin for Graphite - https://github.com/mikebryant/graphite-opentsdb-finder
http://opentsdb.net/ is required, of course - and it's much harder to run (from an operational point of view) than Cassandra IMO :)

bhardy (bhardy) said :
#5

I have been specifically waiting for InfluxDB because it holds a lot of promise, but it has yet to mature. I had version 0.8.x installed and had the Graphite webapp reading from it, but there are numerous changes in version 0.9.x. I think it will be a leader once a production-ready version is released.

There is also the concern of having the Graphite webapp read from an alternate backend, since many of them use a tag/label-and-value data model rather than Graphite's dotted metric paths. The functions in the Graphite webapp still seem to be unparalleled.

I know this post is a loaded question, but I figured I would at least be humored. The link below seems to match my concern.
https://wikitech.wikimedia.org/wiki/Graphite/Scaling

The time-series database space will be awesome in a year, but what do we do now?
http://prometheus.io/
OpenTSDB
InfluxDB

Jason Dixon (jason-dixongroup) said :
#6

I love (sarcastic) theoretical discussions like this which completely overlook the fact that numerous companies use Graphite at significant scale in production.

bhardy (bhardy) said :
#7

I hope I didn't come across as sarcastic, because I sincerely was hoping to find ideas for a short-term solution before undertaking a larger project. As I mentioned in my previous comment, I also knew it would be a loaded question. Not trying to offend anyone. Our current Graphite infrastructure handles 1.5 million metrics/min, but I'm pausing and looking around before scaling it further.
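
For a rough sense of what these numbers mean on disk, here is a back-of-the-envelope sketch of whisper's fixed-size file layout (16 bytes of metadata, 12 bytes per archive header, 12 bytes per datapoint). The retention schema is hypothetical, as is the assumption that 1.5 million metrics/min corresponds to about 1.5 million distinct metrics at one-minute resolution; neither figure comes from this thread.

```python
from typing import List, Tuple

METADATA_SIZE = 16      # aggregation type, max retention, xFilesFactor, archive count
ARCHIVE_INFO_SIZE = 12  # offset, seconds per point, number of points
POINT_SIZE = 12         # 4-byte timestamp + 8-byte double value

DAY = 86400


def whisper_file_size(archives: List[Tuple[int, int]]) -> int:
    """Size in bytes of one whisper file.

    `archives` is a list of (seconds_per_point, retention_seconds).
    Whisper preallocates every slot, so the size does not depend on
    how many datapoints have actually been written.
    """
    points = sum(retention // step for step, retention in archives)
    return METADATA_SIZE + ARCHIVE_INFO_SIZE * len(archives) + POINT_SIZE * points


if __name__ == "__main__":
    # Hypothetical schema: 1-minute points for 30 days,
    # then 5-minute points for roughly 13 months (396 days).
    schema = [(60, 30 * DAY), (300, 396 * DAY)]
    metric_count = 1_500_000  # assumed number of distinct metrics

    per_file = whisper_file_size(schema)
    total = per_file * metric_count
    print(f"per metric: {per_file / 1024**2:.2f} MiB")
    print(f"all metrics: {total / 1024**4:.2f} TiB")
```

Because the space is preallocated per metric, capacity grows linearly with both the metric count and the number of retained points, which is why long retention at fine resolution gets expensive on SSDs.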

Jason Dixon (jason-dixongroup) said :
#8

Not at all, it was my own personal observation / frustration towards the community at large, which wants a magic pill for time-series storage. This thread just seemed like as good a place as any to release a brief rant. ;-)

bhardy (bhardy) said :
#9

Thanks for the brainstorming and comments, all.