Python2 support ends soon. Is an upgrade to Python3 feasible?

Asked by Byron on 2019-12-27

Python2 support ends soon. Can we upgrade to Python3?

Upgrading would keep us on a supported interpreter and would also make it possible to add support for mgzip, which provides parallel (multi-core) gzip compression, a nice performance boost on multi-core systems. I have Python experience, so I can help with all of this.

Question information

Language: English
Status: Answered
For: Duplicity
Assignee: No assignee
Last query: 2019-12-27
Last reply: 2019-12-28

Hello Byron,

Duplicity already has Python 3 support in the 0.8 series. Anything that doesn't work with Python 3 is a bug that should definitely be fixed.

We currently support both Python 2 and 3 (see the requirements here: https://bazaar.launchpad.net/~duplicity-team/duplicity/0.8-series/view/head:/README ).

The Python 3 support is relatively recent (see the history here: https://blueprints.launchpad.net/duplicity/+spec/python3 and in recent commits) and the backends in particular had some bugs. For that reason we have not yet, for example, switched the stable snaps over to using Python 3, but distros are shipping duplicity for use with Python 3 and anything that does not work is a bug.

If you can show that using mgzip would reduce backup times, we would happily accept merge requests. Let us know if you need any help to get started with the code.

Byron (byronester) said : #2

OK, sorry. I saw the python2 shebangs and missed the Python version check in setup.py. I see now.

I've made a set of preliminary changes that should cover all the compression cases at least. I have also run some preliminary tests to check my suspicions.

Here is the original duplicity run (version 0.8.04):
byron@Desktop:~/src/OpenSource$ time duplicity full --no-encryption /home/byron/VirtualBox\ VMs/Ubuntu\ x64\ -\ ReachCMS/ file:///tmp/VMBackup-gzip
Local and Remote metadata are synchronised, no sync needed.
Last full backup date: none
--------------[ Backup Statistics ]--------------
StartTime 1577528067.50 (Sat Dec 28 18:14:27 2019)
EndTime 1577528225.29 (Sat Dec 28 18:17:05 2019)
ElapsedTime 157.79 (2 minutes 37.79 seconds)
SourceFiles 10
SourceFileSize 5597132154 (5.21 GB)
NewFiles 10
NewFileSize 5597132154 (5.21 GB)
DeletedFiles 0
ChangedFiles 0
ChangedFileSize 0 (0 bytes)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 10
RawDeltaSize 5597123962 (5.21 GB)
TotalDestinationSizeChange 1602344515 (1.49 GB)
Errors 0
-------------------------------------------------

real 2m40.195s
user 2m35.594s
sys 0m3.209s

Here is the same backup, but with my mgzip modifications:

(venv) byron@Desktop:~/src/OpenSource$ time python3 venv/bin/duplicity full --no-encryption /home/byron/VirtualBox\ VMs/Ubuntu\ x64\ -\ ReachCMS/ file:///tmp/VMBackup
Local and Remote metadata are synchronised, no sync needed.
Last full backup date: none
--------------[ Backup Statistics ]--------------
StartTime 1577527973.04 (Sat Dec 28 18:12:53 2019)
EndTime 1577528010.94 (Sat Dec 28 18:13:30 2019)
ElapsedTime 37.90 (37.90 seconds)
SourceFiles 10
SourceFileSize 5597132154 (5.21 GB)
NewFiles 10
NewFileSize 5597132154 (5.21 GB)
DeletedFiles 0
ChangedFiles 0
ChangedFileSize 0 (0 bytes)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 10
RawDeltaSize 5597123962 (5.21 GB)
TotalDestinationSizeChange 1609150119 (1.50 GB)
Errors 0
-------------------------------------------------

real 0m43.352s
user 2m49.017s
sys 0m5.035s

This is a substantial speed increase (wall-clock time drops from 2m40s to 43s, roughly 3.7x faster), which is expected on my 6-core machine.
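For readers wondering why gzip can be parallelised at all: the gzip format allows several independently compressed members to be concatenated into one valid stream, so chunks can be compressed on different cores and written out in order. The sketch below illustrates that general idea using only the standard library. It is not mgzip's code and not the change made in the branch discussed here; the names `parallel_gzip`, `read_chunks` and `CHUNK_SIZE` are made up for illustration.

```python
import gzip
import io
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 1 << 20  # 1 MiB per independently compressed block (arbitrary choice)


def read_chunks(fileobj, size=CHUNK_SIZE):
    """Yield successive chunks of at most `size` bytes from a binary file object."""
    while True:
        chunk = fileobj.read(size)
        if not chunk:
            return
        yield chunk


def parallel_gzip(src, dst, workers=4):
    """Compress `src` into `dst` as a series of concatenated gzip members.

    Threads are used on the assumption that CPython's zlib releases the GIL
    while compressing, so several chunks can be compressed at once.
    Note: Executor.map submits every chunk up front, so this sketch keeps the
    whole input in memory; a real implementation would bound the queue.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so members are written in sequence.
        for member in pool.map(gzip.compress, read_chunks(src)):
            dst.write(member)


if __name__ == "__main__":
    data = b"duplicity " * 500_000
    out = io.BytesIO()
    parallel_gzip(io.BytesIO(data), out)
    # The standard gzip module reads multi-member streams transparently.
    assert gzip.decompress(out.getvalue()) == data
```

Because each chunk is compressed without knowledge of the others, the output tends to be slightly larger than a single-stream gzip, which would be consistent with the small size difference between the two runs above (1.49 GB vs 1.50 GB).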

I need some assistance now, though, because restoring the mgzip backup fails. The backup itself looks OK: I can look inside the difftar.gz files and test their integrity, and they seem fine. On a cursory look, the contents of the difftar files seem fine as well. Same with the sigtar.

(venv) byron@Desktop:/tmp$ time python3 ~/src/OpenSource/venv/bin/duplicity restore --no-encryption file:///tmp/VMBackup/ /tmp/VMRestore
Synchronising remote metadata to local cache...
Copying duplicity-full-signatures.20191228T101253Z.sigtar.gz to local cache.
Copying duplicity-full.20191228T101253Z.manifest to local cache.
Last full backup date: Sat Dec 28 18:12:53 2019
Traceback (innermost last):
  File "/home/byron/src/OpenSource/venv/bin/duplicity", line 101, in <module>
    with_tempdir(main)
  File "/home/byron/src/OpenSource/venv/bin/duplicity", line 87, in with_tempdir
    fn()
  File "/home/byron/src/OpenSource/venv/lib/python3.7/site-packages/duplicity/dup_main.py", line 1539, in main
    do_backup(action)
  File "/home/byron/src/OpenSource/venv/lib/python3.7/site-packages/duplicity/dup_main.py", line 1619, in do_backup
    restore(col_stats)
  File "/home/byron/src/OpenSource/venv/lib/python3.7/site-packages/duplicity/dup_main.py", line 725, in restore
    restore_get_patched_rop_iter(col_stats)):
  File "/home/byron/src/OpenSource/venv/lib/python3.7/site-packages/duplicity/patchdir.py", line 580, in Write_ROPaths
    ITR(ropath.index, ropath)
  File "/home/byron/src/OpenSource/venv/lib/python3.7/site-packages/duplicity/lazy.py", line 362, in __call__
    last_branch.fast_process, args)
  File "/home/byron/src/OpenSource/venv/lib/python3.7/site-packages/duplicity/robust.py", line 41, in check_common_error
    return function(*args)
  File "/home/byron/src/OpenSource/venv/lib/python3.7/site-packages/duplicity/patchdir.py", line 634, in fast_process
    ropath.copy(self.base_path.new_index(index))
  File "/home/byron/src/OpenSource/venv/lib/python3.7/site-packages/duplicity/path.py", line 456, in copy
    other.writefileobj(self.open(u"rb"))
  File "/home/byron/src/OpenSource/venv/lib/python3.7/site-packages/duplicity/path.py", line 652, in writefileobj
    buf = fin.read(_copy_blocksize)
  File "/home/byron/src/OpenSource/venv/lib/python3.7/site-packages/duplicity/patchdir.py", line 227, in read
    if not self.addtobuffer():
  File "/home/byron/src/OpenSource/venv/lib/python3.7/site-packages/duplicity/patchdir.py", line 252, in addtobuffer
    self.tarinfo_list[0] = next(self.tar_iter)
  File "/home/byron/src/OpenSource/venv/lib/python3.7/site-packages/duplicity/patchdir.py", line 359, in __next__
    return next(self.tar_iter)
  File "/home/byron/src/OpenSource/venv/lib/python3.7/tarfile.py", line 2405, in __iter__
    tarinfo = self.next()
  File "/home/byron/src/OpenSource/venv/lib/python3.7/tarfile.py", line 2283, in next
    raise ReadError("unexpected end of data")
 tarfile.ReadError: unexpected end of data

real 0m6.475s
user 0m7.736s
sys 0m2.092s
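One way to narrow this down is to walk a volume with Python's own `tarfile` module, since that is the code path raising the error in the traceback above, rather than relying on external tools. The following is only a diagnostic sketch: `check_tar_gz` is a hypothetical helper, not part of duplicity, and it uses the stock `gzip` reader, so it says nothing about how the modified branch opens the files.

```python
import tarfile
import zlib


def check_tar_gz(path):
    """Read every member of a gzip-compressed tar archive.

    Returns (members_read, error), where error is None if the whole
    archive could be read and the raised exception otherwise.
    """
    count = 0
    try:
        with tarfile.open(path, "r:gz") as tar:
            for member in tar:
                if member.isfile():
                    # Read the payload so the member's data is actually decompressed.
                    payload = tar.extractfile(member)
                    while payload.read(1 << 20):
                        pass
                count += 1
    except (tarfile.ReadError, EOFError, OSError, zlib.error) as exc:
        return count, exc
    return count, None


if __name__ == "__main__":
    import sys

    # Usage: python check_tar.py <one or more .difftar.gz / .sigtar.gz files>
    for name in sys.argv[1:]:
        members, error = check_tar_gz(name)
        print(name, members, "OK" if error is None else repr(error))
```

If every volume reads cleanly this way, the archives themselves are probably fine and the problem is more likely in how the restore path opens or reads them.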

Do you have a branch to point us to?

Byron (byronester) said : #5

I can create a full backup using my mgzip branch and restore it using the duplicity from the Ubuntu repos. The md5sum of a large 5.6 GB file in the restore directory matches the original, which to me shows there is nothing wrong with the mgzip backup itself.
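For anyone who wants to repeat that comparison from Python rather than with the md5sum command, a minimal sketch follows. The helper names `file_md5` and `same_content` are made up for illustration, and the paths are whatever original and restored files you want to compare.

```python
import hashlib


def file_md5(path, block_size=1 << 20):
    """Return the hex MD5 digest of a file, read in blocks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def same_content(original, restored):
    """True if both files have the same MD5 digest."""
    return file_md5(original) == file_md5(restored)
```

MD5 is adequate here because the goal is to detect accidental corruption, not to defend against deliberate tampering.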
