Unable to PXE boot on overcloud deploy

Asked by Arnold Song

Hi all,

I'm having some trouble with overcloud deployment using the TripleO Train release and wondered if this looks familiar to anyone.

The nodes to be provisioned as overcloud nodes all pass introspection cleanly and I’m able to move them into the `available` state. I had set `profile:compute` and `profile:control` in the instackenv.json file. The provisioning of the compute and controller nodes times out.

It seems pretty obvious to me that the DHCP requests from the nodes are being ignored by the ironic-inspector-dnsmasq service, but I can't for the life of me figure out what is misconfigured to cause this behavior.

Just for reference, when PXE booting is working during introspection, the tcpdump output looks like this:

17:28:39.035851 c8:1f:66:c2:f6:63 > Broadcast, ethertype IPv4 (0x0800), length 442: 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from c8:1f:66:c2:f6:63, length 400
17:28:39.036251 c8:1f:66:c3:0e:87 > Broadcast, ethertype IPv4 (0x0800), length 373: 10.232.16.7.bootps > 255.255.255.255.bootpc: BOOTP/DHCP, Reply, length 331

Here you see a BOOTP/DHCP request from the PXE-enabled NIC (in this case, c8:1f:66:c2:f6:63), followed shortly by a reply from the PXE boot server (10.232.16.7).
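For anyone comparing longer captures, the request/reply pattern described above can be checked with a small script. This is only a rough sketch: it assumes the tcpdump line layout shown in this post, the function names are made up for illustration, and the 5-second window is an arbitrary choice. Because the broadcast reply line does not name the client, a request is treated as answered if any reply follows it within the window.

```python
import re

# Matches the tcpdump line layout shown above (timestamp, source MAC, BOOTP kind).
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d+) (?P<src>[0-9a-f:]{17}) > \S+, .*?"
    r"BOOTP/DHCP, (?P<kind>Request|Reply)"
)


def parse_bootp(lines):
    """Return (timestamp, source MAC, kind) for every BOOTP/DHCP line."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            events.append((match.group("time"), match.group("src"), match.group("kind")))
    return events


def to_seconds(timestamp):
    """Convert HH:MM:SS.ffffff to seconds since midnight."""
    hours, minutes, seconds = timestamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def unanswered_requests(events, window=5.0):
    """List requests that have no reply within `window` seconds (heuristic)."""
    reply_times = [to_seconds(t) for t, _, kind in events if kind == "Reply"]
    missing = []
    for timestamp, mac, kind in events:
        if kind != "Request":
            continue
        start = to_seconds(timestamp)
        if not any(0 <= reply - start <= window for reply in reply_times):
            missing.append((timestamp, mac))
    return missing
```

Feeding it the lines of a saved capture, for example `unanswered_requests(parse_bootp(open("capture.txt")))` with a hypothetical file name, would list the requests that never got an answer.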

In the ironic-inspector-dnsmasq container, during introspection you first see this:
Jul 13 17:25:25 dnsmasq[7]: inotify, new or changed file /var/lib/ironic-inspector/dhcp-hostsdir/c8:1f:66:c2:f6:63
Jul 13 17:25:25 dnsmasq-dhcp[7]: read /var/lib/ironic-inspector/dhcp-hostsdir/c8:1f:66:c2:f6:63

This is when the ignore flag is removed from /var/lib/ironic-inspector/dhcp-hostsdir/c8:1f:66:c2:f6:63. The change is detected and then read by dnsmasq-dhcp. This allows the PXE boot request to be received by the PXE boot server, and then you see the following:
Jul 13 17:28:28 dnsmasq-dhcp[7]: DHCPDISCOVER(br-ctlplane) c8:1f:66:c2:f6:63
Jul 13 17:28:28 dnsmasq-dhcp[7]: DHCPOFFER(br-ctlplane) 10.232.18.2 c8:1f:66:c2:f6:63
Jul 13 17:28:32 dnsmasq-dhcp[7]: DHCPREQUEST(br-ctlplane) 10.232.18.2 c8:1f:66:c2:f6:63
Jul 13 17:28:32 dnsmasq-dhcp[7]: DHCPACK(br-ctlplane) 10.232.18.2 c8:1f:66:c2:f6:63

The introspection completes and you see the /var/lib/ironic-inspector/dhcp-hostsdir/c8:1f:66:c2:f6:63 file change again and get read by dnsmasq-dhcp with the ignore flag now included.
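To see which MACs currently carry the ignore flag, the files in the hostsdir can be scanned with something like the sketch below. It assumes each file is named after a MAC address and holds a single dnsmasq host entry of the form `MAC` or `MAC,ignore`; that layout is an assumption based on the behavior described above and should be checked against the actual files. The function name is hypothetical.

```python
from pathlib import Path


def read_filter_state(hostsdir):
    """Map each file name (assumed to be a MAC) to True if it is flagged 'ignore'."""
    state = {}
    for entry in sorted(Path(hostsdir).iterdir()):
        if not entry.is_file():
            continue
        # Assumed format: comma-separated fields, MAC first, optional 'ignore' after it.
        fields = [field.strip() for field in entry.read_text().strip().split(",")]
        state[entry.name] = "ignore" in fields[1:]
    return state
```

Run inside the ironic-inspector-dnsmasq container as `read_filter_state("/var/lib/ironic-inspector/dhcp-hostsdir")`, it would show at a glance whether a node's MAC is still being ignored.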

When things aren’t working during the heat stack build, in the tcpdump output you see the BOOTP/DHCP request as before, but no reply from the PXE boot server.

You also don’t see any of the files in /var/lib/ironic-inspector/dhcp-hostsdir change, so you see the following:
Jul 13 17:51:28 dnsmasq-dhcp[7]: DHCPDISCOVER(br-ctlplane) c8:1f:66:c2:f6:63 ignored

This is repeated until the provisioning times out waiting on a callback from the overcloud nodes, which makes sense since they never PXE boot and are never loaded with the overcloud image.
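The repeated "ignored" entries can be tallied per MAC from the container log with a short script. This is a sketch that assumes the dnsmasq-dhcp log line format shown above; the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as: "... DHCPDISCOVER(br-ctlplane) c8:1f:66:c2:f6:63 ignored"
IGNORED_RE = re.compile(
    r"DHCPDISCOVER\((?P<iface>[^)]+)\) "
    r"(?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2}) ignored\s*$"
)


def count_ignored(lines):
    """Count ignored DHCPDISCOVER messages per client MAC address."""
    counts = Counter()
    for line in lines:
        match = IGNORED_RE.search(line)
        if match:
            counts[match.group("mac")] += 1
    return counts
```

Comparing the resulting MACs with the PXE NICs of the nodes being deployed shows whether every stalled node is hitting the same filter.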

Any help or insight would be so very much appreciated!

Question information

Language:
English
Status:
Expired
For:
tripleo
Assignee:
No assignee
Launchpad Janitor (janitor) said :
#1

This question was expired because it remained in the 'Open' state without activity for the last 15 days.