Feature request: back up some files as full every time

Asked by Frank Bicknell on 2017-07-31

Some files, like virtual machine images, are very large and take a really long time (many hours) to back up incrementally when the file has changed since the last backup.

Would it be possible to somehow flag certain files as "full backup only" instead of having duplicity try to wedge in the changes to an incremental backup?

The rest of the backup would still be incremental, but flagged files would just get a full copy every time.

Seems to me this would apply in some circumstances.

Question information

Language: English
Status: Answered
For: Duplicity
Assignee: No assignee
Last query: 2017-07-31
Last reply: 2017-08-01

Frank Bicknell (fbicknel) said : #1

... assuming of course that I didn't want to put those files in a separate filesystem. ^.^

edso (ed.so) said : #2

On 31.07.2017 22:03, Frank Bicknell wrote:
> New question #654413 on Duplicity:
> https://answers.launchpad.net/duplicity/+question/654413
>
> Some files, like virtual machine images, are very large and take a really long time (many hours) to back up incrementally when the file has changed since the last backup.
>
> Would it be possible to somehow flag certain files as "full backup only" instead of having duplicity try to wedge in the changes to an incremental backup?
>
> The rest of the backup would still be incremental, but flagged files would just get a full copy every time.
>
> Seems to me this would apply in some circumstances.
>

hey Frank,

the idea of incrementals itself is meant to speed up the backup (duplicity's librsync, e.g., uses the rolling checksum: https://rsync.samba.org/tech_report/node3.html).
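
For readers unfamiliar with the rolling checksum mentioned above, here is a minimal Python sketch of the weak checksum as the linked rsync tech report defines it. It is an illustration only, not librsync's actual code (librsync's implementation differs in detail), and the names `weak_checksum` and `roll` are made up for this example. The point is that sliding the window by one byte costs a constant amount of work instead of re-summing the whole block.

```python
M = 1 << 16  # modulus used in the rsync tech report


def weak_checksum(block):
    """Compute the (a, b) pair for one block of bytes from scratch."""
    n = len(block)
    a = sum(block) % M
    # Each byte is weighted by its distance from the end of the block.
    b = sum((n - i) * x for i, x in enumerate(block)) % M
    return a, b


def roll(a, b, out_byte, in_byte, n):
    """Slide an n-byte window one position: drop out_byte, take in in_byte."""
    a = (a - out_byte + in_byte) % M
    b = (b - n * out_byte + a) % M
    return a, b


def combined(a, b):
    """Pack both halves into the single 32-bit value s = a + 2^16 * b."""
    return a + (b << 16)
```

Starting from `weak_checksum(data[:n])`, repeated calls to `roll` give the checksum of every n-byte window in the file in a single pass, which is why scanning a large file for changes should cost roughly one read of it.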

can you tell why you assume that backing up fully would be faster? did you do tests?

..ede/duply.net

Frank Bicknell (fbicknel) said : #3

I occasionally find duplicity still running from the wee hours backup and it's working on backing up one of those virtualbox machine images that changed the day before. I don't use my virtual Windows box(es) every day, so it's a marked contrast to "normal" incremental backup days.

I assumed it was trying to reconcile changes in a huge, encrypted file. It always finishes, but there's always a lot of cpu utilization (not to the point of choking the machine, mind you).

Just seemed to me that if my premise is correct, it might make sense to back up such files full.

It would obviously take up more space at the backend, but it would be faster.

I fully admit my findings are subjective and could be improved on.

If no one else agrees, then consider it just a passing thought and we can all get on with our lives.

edso (ed.so) said : #4

hey Frank,

On 01.08.2017 17:09, Frank Bicknell wrote:
> Question #654413 on Duplicity changed:
> https://answers.launchpad.net/duplicity/+question/654413
>
> Frank Bicknell posted a new comment:
> I occasionally find duplicity still running from the wee hours backup
> and it's working on backing up one of those virtualbox machine images
> that changed the day before. I don't use my virtual Windows box(es)
> every day, so it's a marked contrast to "normal" incremental backup
> days.

ic

> I assumed it was trying to reconcile changes in a huge, encrypted file.
> It always finishes, but there's always a lot of cpu utilization (not to
> the point of choking the machine, mind you).
>
> Just seemed to me that if my premise is correct, it might make sense to
> back up such files full.

yeah, if. why do you assume that the full would be faster? :)

> It would obviously take up more space at the backend, but it would be
> faster.

but why? the librsync algorithm is designed to be as fast as a full passthrough, but only delivering the changes.

> I fully admit my findings are subjective and could be improved on.
>
> If no one else agrees, then consider it just a passing thought and we
> can all get on with our lives.
>

not so fast cowboy ;). how about the next time this happens you remember the file and do a test?

first back up the whole file to a file:// location.
then do an incremental to that same location.
compare both times.
if you want you could change some stuff in the vm and run the incremental again, so that you effectively have changes to back up.
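
The test described above could be scripted roughly as follows. This is only a sketch under some assumptions: `duplicity` is on the PATH, the source is a directory holding just the VM image, and `--no-encryption` is used to keep GnuPG out of the measurement (drop it to time the encrypted case). The helper names `duplicity_cmd`, `timed_run` and `compare` are invented for this example.

```python
import subprocess
import time


def duplicity_cmd(action, source_dir, target_url):
    """Build a duplicity command line; action is 'full' or 'incremental'."""
    if action not in ("full", "incremental"):
        raise ValueError(f"unsupported action: {action}")
    return ["duplicity", action, "--no-encryption", source_dir, target_url]


def timed_run(cmd):
    """Run cmd to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def compare(source_dir, target_url):
    """Time a full backup, then an incremental to the same location."""
    results = {}
    for action in ("full", "incremental"):
        results[action] = timed_run(duplicity_cmd(action, source_dir, target_url))
    return results
```

For example, `compare("/path/to/vm-dir", "file:///path/to/test-backup")` (both paths are placeholders) returns the two timings in seconds. To measure the case with real changes, modify something inside the VM and time one more `incremental` run against the same location.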

not saying you are not right. just saying, it shouldn't be so, and of course: where's the data (proof ;)?

..ede/duply.net

PS: https://www.ted.com/talks/talithia_williams_own_your_body_s_data
