Is it possible to make incremental backups while the full backup runs?

Asked by Patrick Allemann on 2010-11-13

Hi,

I am backing up 50 GB of data on a remote server over a 2 Mbit/s line. The full backup takes roughly 4 days, an incremental backup about 15 minutes. My settings are:

/usr/bin/duplicity --name backup1 --encrypt-key masked --sign-key masked --verbosity 6 --full-if-older-than 1M --asynchronous-upload --ssh-askpass /home scp://<email address hidden>/backup/

Now my big concern is that when a full backup is triggered after one month, my incrementals would not be uploaded during the 4 days it takes to upload the full backup. If I simply start another process, another full backup would be triggered.
I am looking for a best practice to prevent my backups from having a 4-day "hole" every month.

Thank you for your help

Question information

Language: English
Status: Answered
For: Duplicity
Assignee: No assignee
Last query: 2010-11-13
Last reply: 2011-06-01
edso (ed.so) said : #1

A backup over a timeframe of 4 days is really something. Consider doing your backups to local file space (e.g. a USB disk) first and using a separate application, e.g. rsync, to do the uploading. This way your fulls would be "finished" much faster, at least locally.

To circumvent a hole in your backup chain, you can put the new full in a new folder and keep doing daily incrementals into the old folder until the full is uploaded completely.
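A minimal sketch of that scheme, assuming hypothetical folder names backup-old and backup-new and a placeholder user@host (none of these appear in the thread):

# start the new monthly full in a fresh, empty folder (= a new repository)
duplicity full --name backup-new /home scp://user@host/backup-new/
# meanwhile, keep extending last month's chain with daily incrementals
duplicity incremental --name backup-old /home scp://user@host/backup-old/

Because each folder is a separate repository with its own --name, the two runs keep separate caches and do not step on each other.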

I see you are using the ssh backend. I have read tests suggesting the rsync-over-ssh backend is faster, but in your case the line is probably the bottleneck; you could test it, though.

ede/duply.net

Patrick Allemann (pallemann) said : #2

Hi ede,

yeah - I did think of that workaround but I would still be at risk as rsync would also upload the files sequentially. I would have a backup on my usb-disk but it still would not be stored off-site in case of a catastrophic event.

As far as I understand the concept, an incremental backup is always connected to a backup chain which starts with a full backup. I don't yet see how my incremental backups could be added to last month's backup chain as long as this month's full backup is not finished yet. The new full backup only becomes valid on the date it finishes, I guess. Can't I force duplicity to connect to the old backup chain? In duply you have the possibility to force the process into full or incremental - but it gets messed up with the cached files if there are two running processes.

patrick

edso (ed.so) said : #3

On 13.11.2010 13:26, Patrick Allemann wrote:
> Hi ede,
>
> yeah - I did think of that workaround but I would still be at risk as
> rsync would also upload the files sequentially. I would have a backup on

What do you do when duplicity fails because your line chokes? Restart the full? And if it chokes again? Restart again? Recent duplicity is supposed to resume an interrupted backup, but personally I wouldn't count on it for production usage.

The local storage workaround lets duplicity finish backups quickly. This way you separate backup creation from the upload process, and hence from the bottleneck that your line is.
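A minimal sketch of that separation, assuming a hypothetical local staging directory /mnt/usb/backup and a placeholder user@host (both illustrative):

# back up to fast local storage first; duplicity finishes in minutes, not days
duplicity --name backup1 --encrypt-key masked --sign-key masked --full-if-older-than 1M /home file:///mnt/usb/backup
# then ship the volumes off-site separately; --partial lets rsync resume an interrupted transfer
rsync -av --partial /mnt/usb/backup/ user@host:/backup/

If the line chokes, only the rsync has to be restarted, and it picks up where it left off.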

> my usb-disk but it still would not be stored off-site in case of a
> catastrophic event.

This way you can, in parallel to your ongoing upload of the new full, upload the much smaller incremental backups of the old chain. They should of course be much smaller, so they can be transferred within a day despite your other upload(s). How much daily change do you generate?

> As far as I understand the concept, an incremental backup is always
> connected to a backup chain which starts with a full backup. I don't
> yet see how my incremental backups could be added to last month's
> backup chain as long as this month's full backup is not finished yet.
> The new full backup only becomes valid on the date it finishes, I guess.

Right. Duplicity looks at what is latest in the repository and acts accordingly. That's why I wrote 'new folder'. Reread my last comment ;).
A new folder is a new, empty repository, hence the old backup chain in the old repository can be incrementally extended as you like.
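If you want to see which chain duplicity considers latest in a given repository, collection-status shows it (user@host is a placeholder):

# list the backup chains and sets duplicity finds at the target
duplicity collection-status --name backup1 scp://user@host/backup/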

> Can't I
> force duplicity to connect to the old backup chain? In duply you have
> the possibility to force the process into full or incremental - but it

duply only simplifies duplicity. You can of course force duplicity to do a full or an incremental (if a full already exists) without duply.
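A minimal sketch, with the same placeholder user@host target as above:

# force a full backup even if the last one is recent
duplicity full /home scp://user@host/backup/
# force an incremental; this fails if no full exists at the target yet
duplicity incremental /home scp://user@host/backup/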

> gets messed up with the cached files if there are two running processes.
>

Yes. Duplicity currently expects that there is only one instance running per repository.

..ede/duply.net

Allo (allo) said : #4

I had the same question. For me it would not be 4 days, but over 20 hours at the best rate, so my daily backup may start before the full backup is uploaded, because the upload may take more than 24 hours to finish.

My ideas:
Maybe using flock(1):
flock -w 1 /var/lock/duplicity.lock duplicity --bla --blub # if another job holds the lock, wait 1 second, then exit without backing up

duplicity full /tobackup file:///remote/path/new-full
rm -r /remote/path/backup
mv /remote/path/new-full /remote/path/backup

and then run incremental backups against /remote/path/backup (the paths assume the remote storage is mounted locally).

But with --name "bla", duplicity should check for other instances running on job "bla" and stop when it finds another instance.

Adam Porter (alphapapa) said : #5

How much of your 50 GB changes in a month? Maybe you could drop older sets and continue making incrementals for a longer period. If only a small part of your 50 GB changes in a month, it seems... tedious to re-upload it all every month.
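A minimal sketch of that approach, assuming a placeholder target and an illustrative 6-month retention (the cut-off is not from the thread):

# run fulls less often, e.g. every 6 months instead of monthly
duplicity --name backup1 --full-if-older-than 6M /home scp://user@host/backup/
# and drop chains older than the retention period (--force actually deletes)
duplicity remove-older-than 6M --force scp://user@host/backup/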

edso (ed.so) said : #6

Adam,

the issue with this strategy is that you will temporarily lose all changes in the incrementals you delete or move out of the way. If I chose to go that route, I'd rather copy the base full plus metadata files to another folder locally on the backend and point duplicity at it.
Then run an incremental against it; only if this is successful will you have a kind of recent full.
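A minimal sketch of that, with illustrative paths; the duplicity-full* glob assumes the usual naming of full volumes, manifests and signatures, so check your backend before relying on it:

# on the backend: copy the base full's volumes and metadata into a fresh folder
mkdir /backup/rebased
cp /backup/old/duplicity-full* /backup/rebased/
# from the client: extend the copied chain with one incremental
duplicity incremental /home scp://user@host/backup/rebased/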

ede/duply.net

PS: Thanks to the almighty for broadband access.
