automated error detection/reporting based on takesnapshot.log.bz2

Asked by Tommy

I'd like to run a daily cron job to check for errors in all my BIT jobs. Thankfully, BIT stores a helpful log in each snapshot's takesnapshot.log.bz2 file. I want to know how to *most accurately* detect errors reported by BIT in these logs, or at least be highly confident that I've caught any problems. My current solution centers on this command:

bzgrep '^\[E\] Error' /path/to/snapshot/takesnapshot.log.bz2

Any output from the command above is piped to an email alerting system. Does this look like a good solution? I don't think it covers all scenarios, so in your opinion, how could it be improved? Any and all feedback is appreciated :D
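For illustration, the approach described above could be wrapped in a small function that walks every snapshot log under a backup root. This is only a sketch: the function name `scan_snapshot_logs`, the idea of locating the logs with `find`, and the mail command in the trailing comment are assumptions, not part of BIT or of the script actually in use.

```shell
#!/bin/sh
# Sketch: report error lines from every takesnapshot.log.bz2 below a directory.
# scan_snapshot_logs is a hypothetical helper name; the backup root you pass in
# and the layout beneath it depend on your own setup.
scan_snapshot_logs() {
    root=$1
    find "$root" -type f -name takesnapshot.log.bz2 | sort |
    while IFS= read -r log; do
        # Same pattern as the command above; bzgrep exits non-zero when
        # nothing matches, in which case the log is skipped.
        matches=$(bzgrep '^\[E\] Error' "$log") || continue
        printf '== %s ==\n%s\n' "$log" "$matches"
    done
}

# Possible use from cron (recipient and mailer are assumptions):
#   out=$(scan_snapshot_logs /path/to/snapshot)
#   [ -n "$out" ] && printf '%s\n' "$out" | mail -s 'BIT errors' root
```

Printing the log path before its matches makes it easier to tell which snapshot an alert came from when several profiles are checked in one run.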

Question information

Language: English
Status: Answered
For: Back In Time
Assignee: No assignee

Germar (germar) said :
#1

This will only find errors reported by rsync. Other errors are logged to syslog, so it might be good to add a syslog rule that alerts on warning- and error-level messages from backintime.

cron also sends an email if the process prints anything to stdout or stderr. At the moment BIT always installs cron jobs with >/dev/null 2>&1 to keep cron silent, but I'm planning to make this optional so that every warning and error triggers an email.
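As a rough illustration of how redirection decides what cron gets to see, the sketch below uses a stand-in function in place of a real backup job. `fake_job` and its two messages are made up for the example; only the redirections themselves are the point.

```shell
#!/bin/sh
# Stand-in for a backup job: one line on stdout, one on stderr.
# Both messages are hypothetical and are not real BIT output.
fake_job() {
    echo "INFO: snapshot taken"
    echo "ERROR: something failed" >&2
}

# Both streams discarded: cron would see no output and send no mail.
silent=$( { fake_job >/dev/null 2>&1; } 2>&1 )

# Only stdout discarded: stderr still gets through, so cron would mail it.
errors_only=$( { fake_job >/dev/null; } 2>&1 )

printf 'silent=[%s]\n' "$silent"
printf 'errors_only=[%s]\n' "$errors_only"
```

Here the first capture is empty and the second contains only the stderr line, which is the behaviour an "email on errors only" cron entry relies on.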

Tommy (tbutler-ubuntu) said :
#2

I can adjust the cron job and contribute a prototype alerting mechanism (as a patch?). For now, I've put together a proof-of-concept script that does the error detection; it can be run from cron at a configurable interval to monitor all running backups.

https://gist.github.com/tommybutler/8560c627d81245fa492b

Tommy (tbutler-ubuntu) said :
#3

Please let me know what kind of message / pattern to grep for in the syslog in order to detect errors or warnings that happened there.

Germar (germar) said :
#4

This should show all error and warning messages from syslog:
grep -e 'backintime (\w*): \(WARNING\|ERROR\):' /var/log/syslog
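A minimal sketch of how that pattern might be wrapped for a cron check follows. The function name `bit_syslog_problems` is hypothetical, the default log path is an assumption (some systems log elsewhere or only to the journal), and the pattern is rewritten for `grep -E` so it does not depend on GNU-specific escapes.

```shell
#!/bin/sh
# Sketch: print backintime WARNING/ERROR lines from a syslog file.
# bit_syslog_problems is a hypothetical helper; pass a different file
# as the first argument if your system does not use /var/log/syslog.
bit_syslog_problems() {
    logfile=${1:-/var/log/syslog}
    # Extended-regex form of the pattern above: literal parentheses around
    # the user name, then either severity keyword followed by a colon.
    grep -E 'backintime \([[:alnum:]_]*\): (WARNING|ERROR):' "$logfile"
}

# Possible use from cron (recipient and mailer are assumptions):
#   problems=$(bit_syslog_problems) &&
#       printf '%s\n' "$problems" | mail -s 'Back In Time problems' root
```

Since this scans the whole file on every run, a daily job would report the same lines again until the log is rotated, so some form of de-duplication may be worth adding.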

Code contributions are always very welcome, thanks :D
I'll have a look at your alert script the week after next, once I'm back from vacation.

For the crontab stdout and stderr redirection, I have something in mind where you can switch off one or the other in Expert Options. The default for new profiles should be stdout redirected and stderr raising the alarm. Old profiles should keep both redirected for backwards compatibility.

The crontab command is created in common/config.py line 1549

Kind regards,
Germar

Germar (germar) said :
#5

I already added the stdout and stderr redirection options.

Cheers,
Germar
