multipathd failed to get udev uid, add missing path

Bug #1875594 reported by slawomir rynkiewicz
This bug affects 13 people
Affects                    Status    Importance   Assigned to   Milestone
multipath-tools (Ubuntu)   Opinion   Undecided    Unassigned
open-vm-tools (Ubuntu)     Opinion   Undecided    Unassigned

Bug Description

On a fresh installation of Ubuntu 20.04 on VMware ESX 6.0, I frequently get these errors in journalctl:

Apr 28 08:56:31 docker-host multipathd[704]: sda: failed to get udev uid: Invalid argument
Apr 28 08:56:31 docker-host multipathd[704]: sda: failed to get sysfs uid: Invalid argument
Apr 28 08:56:31 docker-host multipathd[704]: sda: failed to get sgio uid: No such file or directory
Apr 28 08:56:36 docker-host multipathd[704]: sda: add missing path
Apr 28 08:56:36 docker-host multipathd[704]: sda: failed to get udev uid: Invalid argument
Apr 28 08:56:36 docker-host multipathd[704]: sda: failed to get sysfs uid: Invalid argument
Apr 28 08:56:36 docker-host multipathd[704]: sda: failed to get sgio uid: No such file or directory
Apr 28 08:56:41 docker-host multipathd[704]: sda: add missing path
Apr 28 08:56:41 docker-host multipathd[704]: sda: failed to get udev uid: Invalid argument
Apr 28 08:56:41 docker-host multipathd[704]: sda: failed to get sysfs uid: Invalid argument
Apr 28 08:56:41 docker-host multipathd[704]: sda: failed to get sgio uid: No such file or directory
Apr 28 08:56:46 docker-host multipathd[704]: sda: add missing path
Apr 28 08:56:46 docker-host multipathd[704]: sda: failed to get udev uid: Invalid argument
Apr 28 08:56:46 docker-host multipathd[704]: sda: failed to get sysfs uid: Invalid argument
Apr 28 08:56:46 docker-host multipathd[704]: sda: failed to get sgio uid: No such file or directory
Apr 28 08:56:51 docker-host multipathd[704]: sda: add missing path
Apr 28 08:56:51 docker-host multipathd[704]: sda: failed to get udev uid: Invalid argument
Apr 28 08:56:51 docker-host multipathd[704]: sda: failed to get sysfs uid: Invalid argument
Apr 28 08:56:51 docker-host multipathd[704]: sda: failed to get sgio uid: No such file or directory

ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: multipath-tools 0.8.3-1ubuntu2
ProcVersionSignature: Ubuntu 5.4.0-26.30-generic 5.4.30
Uname: Linux 5.4.0-26-generic x86_64
ApportVersion: 2.20.11-0ubuntu27
Architecture: amd64
CasperMD5CheckResult: pass
Date: Tue Apr 28 08:52:09 2020
InstallationDate: Installed on 2020-04-27 (0 days ago)
InstallationMedia: Ubuntu-Server 20.04 LTS "Focal Fossa" - Release amd64 (20200423)
SourcePackage: multipath-tools
UpgradeStatus: No upgrade log present (probably fresh install)

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in multipath-tools (Ubuntu):
status: New → Confirmed
Changed in multipath-tools (Ubuntu):
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
status: Confirmed → Triaged
Rafael David Tinoco (rafaeldtinoco) wrote :

Hello Slawomir,

could you please provide the following information:

$ sudo multipath -v3 -ll

$ sudo multipath -t

$ sudo udevadm info -e

$ sudo multipathd show devices

$ sudo multipathd show paths

$ sudo multipathd show maps status

$ sudo multipathd show maps stats

Feel free to attach output files into this bug so I can check it.
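
One way to gather everything requested above in a single pass is a small shell function that writes each command's output to its own file. This is only a sketch: the function name collect_multipath_diag, the default directory name and the file names are assumptions made for illustration; the commands themselves are the ones listed above.

```shell
# Hypothetical helper: run each requested command and save its output
# (stdout and stderr) to a separate file under the given directory.
collect_multipath_diag() {
    outdir="${1:-multipath-diag}"   # default directory name is an assumption
    mkdir -p "$outdir" || return 1
    sudo multipath -v3 -ll           > "$outdir/multipath-v3-ll.txt"  2>&1
    sudo multipath -t                > "$outdir/multipath-t.txt"      2>&1
    sudo udevadm info -e             > "$outdir/udevadm-info-e.txt"   2>&1
    sudo multipathd show devices     > "$outdir/show-devices.txt"     2>&1
    sudo multipathd show paths       > "$outdir/show-paths.txt"       2>&1
    sudo multipathd show maps status > "$outdir/show-maps-status.txt" 2>&1
    sudo multipathd show maps stats  > "$outdir/show-maps-stats.txt"  2>&1
}

# Usage: collect_multipath_diag ./lp1875594-logs
```

Some of the files may legitimately be empty (for example when no maps exist), so an empty file is not by itself a sign that the command failed.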

Rafael David Tinoco (rafaeldtinoco) wrote :

I have found some upstream patches I'd like to bring to multipath-tools for Focal:

5abeb33afedc013fb9996555ac79c4c711612fea (HEAD -> focal) libmultipath: fix sgio_get_vpd looping
d5cb76dc185ec5fd6822d9c8acc8b5a1218f2666 libmultipath: _init_foreign(): fix possible memory leak
c007a54882e45787f9e7c098d23c8d336b871406 libmultipath: remove_wwids(): fix possible leaks
8b15c32fb2bdb8ff2c086292da242badc1432248 libmultipath: replace_wwids(): fix possible fd leak
b2ed2805ffda32ffa8cdd5cb3a661632553c906f libmultipath: uevent_listen(): fix poll() retval check
2ccbc0072e5415b7acb0b36b33c180ef988a9f3a libmultipath: path_discovery: handle libudev errors
0069fcc68bf0d6e5169c5a57f7b2926c9b3988f0 libmultipath: lookup_binding(): don't rely on int overflow
f42f37f82fb4baf20934215b56e4c8591545f49e libmultipath: scan_devname: fix int overflow detection
1ae17e570b11e5361e1fa310481bc6f369918b30 libmultipath: format_devname: avoid buffer overflow

So I'd like to provide you a PPA with a new package containing those fixes. For your particular case, the patch:

2ccbc0072e5415b7acb0b36b33c180ef988a9f3a libmultipath: path_discovery: handle libudev errors

will likely still report the error for the device, but multipath should not repeat it because the device won't be monitored. (The output requested in my previous comment is still needed as well.)

I'll add here a PPA and ask you to test packages from it if that is ok for you.

Thanks in advance

tags: added: block-proposed-focal
Rafael David Tinoco (rafaeldtinoco) wrote :

Could you use the following PPA:

https://launchpad.net/~rafaeldtinoco/+archive/ubuntu/lp1875594

and, as feedback, send the output of the commands listed in comment #3?

I'd like to hear whether, with the PPA version, the errors appeared just once or did not appear at all (you can even compare before and after). The PPA contains a multipath-tools package with several fixes, but I may still need to add something else for your particular case.

Thanks a lot for reporting this.

-rafaeldtinoco

slawomir rynkiewicz (slawomir-rynkiewicz) wrote :

I have uploaded the logs you asked for, and I will try to install and test the PPA.

.slawek

slawomir rynkiewicz (slawomir-rynkiewicz) wrote :

I installed the packages and rebooted the server. The errors continue to appear in a loop.

I have uploaded the logs you asked for, taken after installing the PPA.
The output of "multipathd show maps stats" and "multipathd show maps status" is empty, so I did not attach it.

Apr 29 10:12:06 docker-host multipathd[746]: sda: add missing path
Apr 29 10:12:06 docker-host multipathd[746]: sda: failed to get udev uid: Invalid argument
Apr 29 10:12:06 docker-host multipathd[746]: sda: failed to get sysfs uid: Invalid argument
Apr 29 10:12:06 docker-host multipathd[746]: sda: failed to get sgio uid: No such file or directory
Apr 29 10:12:11 docker-host multipathd[746]: sda: add missing path
Apr 29 10:12:11 docker-host multipathd[746]: sda: failed to get udev uid: Invalid argument
Apr 29 10:12:11 docker-host multipathd[746]: sda: failed to get sysfs uid: Invalid argument
Apr 29 10:12:11 docker-host multipathd[746]: sda: failed to get sgio uid: No such file or directory
Apr 29 10:12:16 docker-host multipathd[746]: sda: add missing path
Apr 29 10:12:16 docker-host multipathd[746]: sda: failed to get udev uid: Invalid argument
Apr 29 10:12:16 docker-host multipathd[746]: sda: failed to get sysfs uid: Invalid argument
Apr 29 10:12:16 docker-host multipathd[746]: sda: failed to get sgio uid: No such file or directory
Apr 29 10:12:21 docker-host multipathd[746]: sda: add missing path
Apr 29 10:12:21 docker-host multipathd[746]: sda: failed to get udev uid: Invalid argument
Apr 29 10:12:21 docker-host multipathd[746]: sda: failed to get sysfs uid: Invalid argument
Apr 29 10:12:21 docker-host multipathd[746]: sda: failed to get sgio uid: No such file or directory
Apr 29 10:12:26 docker-host multipathd[746]: sda: add missing path
Apr 29 10:12:26 docker-host multipathd[746]: sda: failed to get udev uid: Invalid argument
Apr 29 10:12:26 docker-host multipathd[746]: sda: failed to get sysfs uid: Invalid argument
Apr 29 10:12:26 docker-host multipathd[746]: sda: failed to get sgio uid: No such file or directory

Rafael David Tinoco (rafaeldtinoco) wrote :

VMware SCSI emulated disks do not seem to provide VPD page 0x83 (pg83) support, as this output shows:

Apr 29 09:27:26 | sda: uid_attribute = ID_SERIAL (setting: multipath internal)
Apr 29 09:27:26 | sda: no ID_SERIAL attribute
Apr 29 09:27:26 | sda: failed to get udev uid: Invalid argument
Apr 29 09:27:26 | 32:0:0:0: attribute vpd_pg83 not found in sysfs
Apr 29 09:27:26 | failed to read sysfs vpd pg83
Apr 29 09:27:26 | sda: failed to get sysfs uid: Invalid argument
Apr 29 09:27:26 | failed to issue vpd inquiry for pg83
Apr 29 09:27:26 | sda: failed to get sgio uid: No such file or directory

Multipath does not have a blacklist rule for those disks; I'll work on providing one. For now, you should blacklist them by adding:

blacklist {
    device {
        vendor "VMware"
        product "Virtual disk"
    }
}

to your /etc/multipath.conf file and then restarting the multipath-tools service:

$ sudo systemctl restart multipath-tools

Please let me know if the messages disappear.

Thank you!
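
A quick way to confirm that the blacklist took effect is to count the uid failure messages logged since the restart. The following is a minimal sketch, assuming the messages are logged under the multipathd unit seen in the journal excerpts in this report; the function name count_uid_failures is hypothetical.

```shell
# Hypothetical helper: count "failed to get udev uid" lines in log text
# read from stdin. grep -c prints the count (0 if there are no matches).
count_uid_failures() {
    grep -c 'failed to get udev uid'
}

# Usage (run a few minutes after restarting the service):
#   journalctl -u multipathd --since "5 minutes ago" | count_uid_failures
```

A count of 0 over an interval in which the message used to repeat every five seconds suggests that the disk is no longer being probed. Note that grep exits non-zero when the count is 0, which matters if the function is used in a script running with "set -e".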

slawomir rynkiewicz (slawomir-rynkiewicz) wrote :

Yep, it's clean now. Nice job.
Thank You :)

Last log from restarting the multipath-tools service:

Apr 29 18:28:21 docker-host systemd[1]: Stopping Device-Mapper Multipath Device Controller...
Apr 29 18:28:21 docker-host multipathd[746]: --------shut down-------
Apr 29 18:28:21 docker-host systemd[1]: multipathd.service: Succeeded.
Apr 29 18:28:21 docker-host systemd[1]: Stopped Device-Mapper Multipath Device Controller.
Apr 29 18:28:21 docker-host systemd[1]: Starting Device-Mapper Multipath Device Controller...
Apr 29 18:28:21 docker-host multipathd[15483]: --------start up--------
Apr 29 18:28:21 docker-host multipathd[15483]: read /etc/multipath.conf
Apr 29 18:28:21 docker-host multipathd[15483]: path checkers start up
Apr 29 18:28:21 docker-host systemd[1]: Started Device-Mapper Multipath Device Controller.
Apr 29 18:28:21 docker-host sudo[15470]: pam_unix(sudo:session): session closed for user root

slawek

slawomir rynkiewicz (slawomir-rynkiewicz) wrote :

If you want to test something, I am at your disposal. I am also keeping the PPA enabled.

Rafael David Tinoco (rafaeldtinoco) wrote :

@paelzer,

I really think the right place for a fix like this would be to have a rule at:

/lib/udev/rules.d/99-vmware-scsi-udev.rules

from open-vm-tools package.

From a previous fix I did for sg3-utils, I remember ignoring disks without VPDs and giving them a SCSI_ID based on something else. We could do the same here. Do you agree? I can provide the patch.

Changed in open-vm-tools (Ubuntu):
status: New → Triaged
Changed in multipath-tools (Ubuntu):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Christian Ehrhardt  (paelzer) wrote :

Hi Rafael,
I agree to give the approach a chance, but we'd want buy-in from the VMware people just to be sure.
You know, if we fix a disk type that never works, that is great. But I'd want to avoid changing behavior for a disk type that only "sometimes" fails, because it might regress people for whom it is working right now.
So if it isn't too much effort to come up with something, we can have it discussed when upstreaming to [1].

About the details - would that rule make the output of sg_id or whatever is used internally work then - or does it provide a different way for multipath to pick up an ID?

[1]: https://github.com/vmware/open-vm-tools

Hannes Fuchs (hfuchs) wrote :

Hello,

I stumbled upon this bug after migrating two non-critical VMs and after setting up new ones with Ubuntu 20.04. The host system for the VMs is running ESXi 6.7.

After searching for the error I found one interesting forum post [1], which solves the problem with an option on the host system (VMware ESX). After setting the option "disk.EnableUUID" to "true", multipathd no longer reports any "failed to get ... uid" errors. But not everyone has access to the host system.

Also the solution in Comment 17 [2] solves the problem.

One VM uses raw disks and there the error doesn't occur. Also the vendor and product of raw disks differs from the default vmdk in a data store.

Some interesting finds:
"multipath -v3 -ll":
  Using a default vmdk:
    sda: fail to get serial
  Using a default vmdk with disk.EnableUUID enabled:
    sda: serial = 6000c29adee74d1efcd3057f58416b7e
  Using a raw disk:
    sda: serial = 021845000405

I hope that helps. If you need more information, let me know.

[1] https://ubuntuforums.org/showthread.php?t=2441797&p=13951248#post13951248
[2] https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1875594/comments/17
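
To check from inside the guest whether the host-side option had the intended effect, one can look for the ID_SERIAL udev property that multipath uses by default. The following is a minimal sketch under stated assumptions: the .vmx line shown in the comment is the commonly used form of the option and normally has to be set while the VM is powered off, and the function name id_serial_from_props is hypothetical.

```shell
# Host side (assumption: a line in the VM's .vmx file, or the equivalent
# advanced configuration parameter in the vSphere client):
#   disk.EnableUUID = "TRUE"

# Guest side: print the ID_SERIAL value found in udev KEY=VALUE properties
# read from stdin; exits non-zero if the property is absent or empty.
id_serial_from_props() {
    sed -n 's/^ID_SERIAL=//p' | grep .
}

# Usage:
#   udevadm info --query=property --name=/dev/sda | id_serial_from_props
```

If the function prints nothing and returns a non-zero status, the disk still has no serial and the blacklist workaround from comment 17 remains the relevant option.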

Rafael David Tinoco (rafaeldtinoco) wrote :

Since we have a workaround for those who don't have access to ESXi hosts:

comment #17

and a final solution for those who have:

comment #22

and, based on feedback from @paelzer, we would need VMware's buy-in for a udev rule that creates a serial for devices that lack one, which is a weaker case now that there is a way to give those devices a serial...

I'm marking this bug as Opinion. Thank you very much @hfuchs and @slawomir-rynkiewicz for reporting this and working together.

-rafaeldtinoco

Changed in multipath-tools (Ubuntu):
status: Triaged → Opinion
Changed in open-vm-tools (Ubuntu):
status: Triaged → Opinion
Vince Thyng (vthyng) wrote :

I wanted to point out that the solution in comment #22 to set disk.EnableUUID to true in the VMX file has perils to be aware of. This discussion surfaces them:

https://communities.vmware.com/t5/VMware-vSphere-Discussions/Why-is-disk-enableuuid-TRUE-not-the-default/td-p/518472

The perils are mostly around cloning a VM, which is pretty common in virtualized environments. If the user does not update the UUID in the cloned VM, backup software may conclude that the UUID has already been backed up and not back up that VM's data. Changing the UUID on a cloned VM can confuse the OS when booting, because it sees the disk as a new one.
