Bug #1562249 “Failed to deploy machine with HP Smart Array Raid ...” : Bugs : Landscape Server

Robin (robinrego) wrote on 2016-03-26:

#1

api.log (97.1 KiB, text/plain)
appserver.log (191.1 KiB, text/plain)
async-frontend.log (9.0 KiB, text/plain)
job-handler.log (21.4 KiB, text/plain)
juju-sync.log (792 bytes, text/plain)
landscape-profiles.log (4.5 KiB, text/plain)
message-server.log (56.9 KiB, text/plain)
package-search.log (394 bytes, text/plain)
package-upload.log (93.7 KiB, text/plain)
pingserver.log (54.7 KiB, text/plain)
process-alerts.log (1.1 KiB, text/plain)
update-alerts.log (4.3 KiB, text/plain)

Andreas Hasenack (ahasenack) wrote on 2016-03-28:

#2

I have two suggestions:
a) do a node deployment without juju, just maas.
b) do a juju node deployment

For (a), all you have to do is import a public ssh key into your MAAS user and then select a node and hit "deploy". It should install ubuntu on the node and you should be able to ssh in using that key and as the ubuntu user.

If (a) worked, then you can try (b). To do that, configure juju to use MAAS as a provider following the instructions here: https://jujucharms.com/docs/stable/config-maas

Then do a juju bootstrap. This is what the autopilot does.

Andreas Hasenack (ahasenack) wrote on 2016-03-28:

#3

You can also check in the MAAS UI, specifically the node node-089caa3c-f27c-11e5-8e87-0014c2c1fead, to see if maas logged any errors about it. It should be near the bottom of the page in an option called "installation output" in the dropdown menu.

Changed in landscape:
status:	New → Incomplete

Robin (robinrego) wrote on 2016-03-29:

#4

I tried suggestion (a) and it works on the other nodes. I am able to deploy them and they show 'Deployed' in the MAAS UI.
When I try it with my problem node (an HP DL380 G4 server, LAN MAC 0014c2c1fead), it shows 'failed deployment' in the MAAS UI.

Even though the node shows failed deployment, I was able to ssh into the node from the maas server.

The errors logged in the maas UI - installation output for the 'failed deployment' node are pasted below.

Error: Partition(s) 5 on /dev/cciss/c0d0 have been written, but we have been unable to inform the kernel of the change, probably because it/they are in use. As a result, the old partition(s) will remain in use. You should reboot now before making further changes.
An error occured handling 'cciss!c0d0': OSError - [Errno 2] No such file or directory: '/sys/block/c0d0/holders'
[Errno 2] No such file or directory: '/sys/block/c0d0/holders'
Installation failed with exception: Unexpected error while running command.
Command: ['curtin', 'block-meta', 'custom']
Exit code: 3
Reason: -
Stdout: "Error: Partition(s) 5 on /dev/cciss/c0d0 have been written, but we have been unable to inform the kernel of the change, probably because it/they are in use. As a result, the old partition(s) will remain in use. You should reboot now before making further changes.\nAn error occured handling 'cciss!c0d0': OSError - [Errno 2] No such file or directory: '/sys/block/c0d0/holders'\n[Errno 2] No such file or directory: '/sys/block/c0d0/holders'\n"
Stderr: ''

Thanks

Robin (robinrego) wrote on 2016-03-29:

#5

log1.txt (15.0 KiB, text/plain)

re: "Even though the node shows failed deployment, I was able to ssh into the node from the maas server" ... now I'm not sure about this.
I deleted the node, then re-enlisted, commissioned and deployed it. I get the same errors, but this time I am not able to ssh into the node.

Please see the attached file, which has the output of cat /var/log/cloud-init-output.log. It was captured when I was able to ssh into the node in spite of the MAAS UI saying failed deployment.

Thanks

Andreas Hasenack (ahasenack) wrote on 2016-03-29:

#6

What are the disks or block devices attached to the problem node? Anything that makes it different from the others? Like an sd card? usb pendrive? PCIe SSD? Some removable storage? Can you get me the lshw output for it? It should be in the node page in MAAS, in yaml or xml format IIRC.

Robin (robinrego) wrote on 2016-03-29:

#7

LSHW.txt (57.0 KiB, text/plain)

It has 4 SCSI 36.4 GB 15K rpm drives. Earlier I set them up as RAID 0 like the other machines. That didn't work out, so I tried RAID 1 with 2 drives in each array. This is the current setup.

I don't have any other drives attached to it.

I have attached the lshw for this node for your reference.

Thanks.

Robin (robinrego) wrote on 2016-04-01:

#8

LSHW - 2 .txt (37.3 KiB, text/plain)

I did a fresh, clean install of Ubuntu 14.04 LTS this time, installed MAAS and deployed the remaining 4 nodes using the MAAS UI. I had the same problem with that same node: it gives up with the message "Failed Deployment". See the attached file 'LSHW - 2' for your reference.

Thanks.

Robin (robinrego) wrote on 2016-04-01:

#9

I am able to SSH into the node that shows 'failed deployment'. Please let me know if there is any information you need from that machine that might help.

Here is the output of block-devices from the MAAS UI:

HpRS1.maas 00-maas-07-block-devices.out
                
[
 {
  "BLOCK_SIZE": "4096", 
  "NAME": "sda", 
  "ID_PATH": "/dev/disk/by-id/wwn-0x3000000100000001", 
  "PATH": "/dev/sda", 
  "ROTA": "1", 
  "RM": "0", 
  "MODEL": "VIRTUAL-DISK", 
  "RO": "1", 
  "SERIAL": "3000000100000001", 
  "SIZE": "1468006400"
 }, 
 {
  "BLOCK_SIZE": "4096", 
  "NAME": "cciss!c0d0", 
  "ID_PATH": "/dev/disk/by-id/wwn-0x600508b100184439535350395850003d", 
  "PATH": "/dev/cciss/c0d0", 
  "ROTA": "1", 
  "RM": "0", 
  "MODEL": "LOGICAL VOLUME", 
  "RO": "0", 
  "SERIAL": "600508b100184439535350395850003d", 
  "SIZE": "36414750720"
 }, 
 {
  "BLOCK_SIZE": "4096", 
  "NAME": "cciss!c0d1", 
  "ID_PATH": "/dev/disk/by-id/wwn-0x600508b100184439535350395850003e", 
  "PATH": "/dev/cciss/c0d1", 
  "ROTA": "1", 
  "RM": "0", 
  "MODEL": "LOGICAL VOLUME", 
  "RO": "0", 
  "SERIAL": "600508b100184439535350395850003e", 
  "SIZE": "36414750720"
 }
]
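
The listing above can be summarised with a short script. This is only a sketch: the JSON literal is trimmed to the fields it uses, and the helper name `writable_disks` is made up for illustration; it is not part of MAAS or curtin. It shows that `sda` is a small read-only virtual disk, while both cciss logical volumes are writable and about 33.9 GiB each.

```python
import json

# Trimmed copy of the block-devices output above (only the fields used here).
BLOCK_DEVICES_JSON = """
[
 {"NAME": "sda", "PATH": "/dev/sda", "MODEL": "VIRTUAL-DISK",
  "RO": "1", "SIZE": "1468006400"},
 {"NAME": "cciss!c0d0", "PATH": "/dev/cciss/c0d0", "MODEL": "LOGICAL VOLUME",
  "RO": "0", "SIZE": "36414750720"},
 {"NAME": "cciss!c0d1", "PATH": "/dev/cciss/c0d1", "MODEL": "LOGICAL VOLUME",
  "RO": "0", "SIZE": "36414750720"}
]
"""


def writable_disks(devices):
    """Hypothetical helper: keep only devices not flagged read-only."""
    return [d for d in devices if d["RO"] == "0"]


def size_gib(device):
    """Convert the SIZE field (bytes, stored as a string) to GiB."""
    return int(device["SIZE"]) / 2 ** 30


devices = json.loads(BLOCK_DEVICES_JSON)
for dev in devices:
    flag = "read-only" if dev["RO"] == "1" else "writable"
    print("{:<12} {:<16} {:>5.1f} GiB  {}".format(
        dev["NAME"], dev["PATH"], size_gib(dev), flag))
```

Note that the NAME field uses `!` where the PATH field uses `/`; that difference matters later in this thread.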

Here is the output of cat /proc/partitions from the failing machine:

ubuntu@HpRS1:~$ cat /proc/partitions
major minor  #blocks  name

  11        0    1048575 sr0
 104        0   35561280 cciss/c0d0
 104        1   29334528 cciss/c0d0p1
 104        2          1 cciss/c0d0p2
 104        5    6223872 cciss/c0d0p5
 104       16   35561280 cciss/c0d1
   8        0    1433600 sda
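
As a cross-check, the `#blocks` column of /proc/partitions is in 1 KiB units, so it can be compared with the byte sizes MAAS reported. The sketch below parses a copy of the table above; `parse_partitions` is a hypothetical helper written for this illustration only.

```python
# Copy of the /proc/partitions output shown above.
SAMPLE = """major minor  #blocks  name

  11        0    1048575 sr0
 104        0   35561280 cciss/c0d0
 104        1   29334528 cciss/c0d0p1
 104        2          1 cciss/c0d0p2
 104        5    6223872 cciss/c0d0p5
 104       16   35561280 cciss/c0d1
   8        0    1433600 sda
"""


def parse_partitions(text):
    """Return (major, minor, blocks, name) tuples, skipping the header."""
    entries = []
    for line in text.splitlines()[1:]:
        parts = line.split()
        if len(parts) != 4:
            continue  # blank separator line
        major, minor, blocks, name = parts
        entries.append((int(major), int(minor), int(blocks), name))
    return entries


entries = parse_partitions(SAMPLE)
# #blocks is counted in 1 KiB units; convert to bytes.
sizes = {name: blocks * 1024 for _, _, blocks, name in entries}
print(sizes["cciss/c0d0"])
print([name for _, _, _, name in entries if name.startswith("cciss/c0d0p")])
```

The kernel clearly sees c0d0 and its partitions here, under the name `cciss/c0d0` with a slash.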

Here is the output of find /sys/block/ from the failing machine:

ubuntu@HpRS1:~$ find /sys/block/
/sys/block/
/sys/block/fd0
/sys/block/sda
/sys/block/sr0
/sys/block/ram0
/sys/block/ram1
/sys/block/ram2
/sys/block/ram3
/sys/block/ram4
/sys/block/ram5
/sys/block/ram6
/sys/block/ram7
/sys/block/ram8
/sys/block/ram9
/sys/block/loop0
/sys/block/loop1
/sys/block/loop2
/sys/block/loop3
/sys/block/loop4
/sys/block/loop5
/sys/block/loop6
/sys/block/loop7
/sys/block/ram10
/sys/block/ram11
/sys/block/ram12
/sys/block/ram13
/sys/block/ram14
/sys/block/ram15
/sys/block/cciss!c0d0
/sys/block/cciss!c0d1

And the output of lsblk from the failing node:

ubuntu@HpRS1:~$ lsblk
NAME           MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda              8:0    0   1.4G  1 disk /media/root-ro
sr0             11:0    1  1024M  0 rom
cciss!c0d0     104:0    0  33.9G  0 disk
├─cciss!c0d0p1 104:1    0    28G  0 part
├─cciss!c0d0p2 104:2    0     1K  0 part
└─cciss!c0d0p5 104:5    0     6G  0 part
cciss!c0d1     104:16   0  33.9G  0 disk

In all of the above there is an 'sda' virtual drive of some kind, which may be causing c0d0 and c0d1 not to be detected.

If this is the case, please suggest a workaround or solution.

Adam Collard (adam-collard) on 2016-04-25

information type:	Proprietary → Public
Changed in landscape:
status:	Incomplete → Invalid
summary:	- node 'failed deployment' during openstack install
+ Failed to deploy machine with HP Smart Array Raid 6i
description:	updated
tags:	added: kanban-cross-team landscape removed: cloud-install-failure

🤖 Landscape Builder (landscape-builder) on 2016-04-25

tags:	removed: kanban-cross-team

Blake Rouse (blake-rouse) on 2016-04-25

Changed in maas:
status:	New → Invalid

Andres Rodriguez (andreserl) wrote on 2016-04-25:

#11

Hi Robin,

It seems you are using a custom partitioning layout, can you please attach the output of:

maas <maasuser> node get-curtin-config <system id>
maas <maasuser> node read <system id>

Also, please attach a full installation log (you can find it in the webUI at the bottom).

Changed in maas:
status:	Invalid → Incomplete

Ryan Faircloth (ry3n) wrote on 2016-05-03:

#13

I am having similar problems on a DL380G5 and DL360G5 using MAAS beta 3

Robin (robinrego) wrote on 2016-05-04:

#16

output of: maas node read (37.0 KiB, text/plain)

Robin (robinrego) wrote on 2016-05-04:

#17

output of: maas node get-curtin-config (4.4 KiB, text/plain)

Hi Andres

I have attached the files requested.

I do not need to use any custom partitioning and I am willing to make changes to the server configuration if necessary.

Thank you for your suggestions so far.

Robin (robinrego) on 2016-05-12

Changed in maas:
status:	Incomplete → Confirmed

Ryan Harper (raharper) on 2016-06-22

tags:	added: curtin-clear-holders

Ryan Harper (raharper) on 2016-06-22

tags:	added: curtin-sru

Wesley Wiedenmeier (wesley-wiedenmeier) on 2016-06-26

Changed in curtin:
status:	New → In Progress

Wesley Wiedenmeier (wesley-wiedenmeier) wrote on 2016-06-26:

#18

It appears that others have noticed the same issue and reported it in (LP: 1263181). There are several places in block_meta where curtin makes incorrect assumptions about the layout of /sys. I believe that, in addition to clear_holders not being able to operate on cciss devices, bcache configuration will not work on them either.

I am working on a fix right now, as I don't believe the fix I had been working on earlier is sufficient.
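
To illustrate the kind of /sys assumption involved: the earlier failure looked for '/sys/block/c0d0/holders', while the find /sys/block/ output in this report shows the entry is named 'cciss!c0d0'. The kernel replaces '/' in a device name with '!' under sysfs. The following is a minimal sketch of that mapping, not curtin's actual code; `sysfs_block_name` and `holders_dir` are names invented for this example.

```python
import posixpath


def sysfs_block_name(dev_path):
    """Map a /dev path to its entry name under /sys/block.

    Hypothetical helper: strips the '/dev/' prefix and replaces any
    remaining '/' with '!', as sysfs does for names like cciss/c0d0.
    """
    prefix = "/dev/"
    name = dev_path[len(prefix):] if dev_path.startswith(prefix) else dev_path
    return name.replace("/", "!")


def holders_dir(dev_path, sys_block="/sys/block"):
    """Build the path of the 'holders' directory for a whole disk."""
    return posixpath.join(sys_block, sysfs_block_name(dev_path), "holders")


# Taking only the basename ('c0d0') would yield the missing path from the log.
print(holders_dir("/dev/cciss/c0d0"))
print(holders_dir("/dev/sda"))
```

For ordinary names such as sda the mapping is the identity, which may be why the assumption went unnoticed on most hardware.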

Wesley Wiedenmeier (wesley-wiedenmeier) wrote on 2016-06-28:

#19

I have a fix in lp:~wesley-wiedenmeier/curtin/1562249, but it would be good to test it on a server with an hpsa device.

There is a build of the fixed version here:
https://launchpad.net/~wesley-wiedenmeier/+archive/ubuntu/test2/+build/10181498

Robin (robinrego) wrote on 2016-06-30:

#20

I tested the fix but the HP-DL380-G4 and HP-DL380-G5 servers still could not be deployed.

Since this is my first attempt at trying out a fix for a bug, I'd like to describe how I tested it and would appreciate guidance if I did not do it right.

I SSHed into the MAAS server and ran:
sudo add-apt-repository ppa:wesley-wiedenmeier/test2
sudo apt update
sudo apt upgrade
sudo apt dist-upgrade

I released the HP server nodes and then tried to deploy them. No success.

Then I tried doing the same on a fresh install of Ubuntu Server 14.04 LTS and MAAS, and the result was the same, i.e. all nodes except the HP G4 and G5 in my setup could be deployed.

Please note that I am able to deploy these HP G5 servers if I add the line:

cciss.blacklist=yes modprobe.blacklist=cciss hpsa.hpsa_allow_any=1

to the Global Kernel Parameters (Boot parameters to pass to the kernel by default) section in the Settings page of the MAAS UI.

However, that does not help with the DL380 G4 machine.

Ryan Harper (raharper) wrote on 2016-06-30: Re: [Bug 1562249] Re: Failed to deploy machine with HP Smart Array Raid 6i

#21

On Thu, Jun 30, 2016 at 2:07 AM, Robin <email address hidden> wrote:

> I tested the fix but the HP-DL380-G4 and HP-DL380-G5 servers still could
> not be deployed.
>

Thanks for giving this a try.

>
> Since this is my first attempt and trying out a fix for a bug, I'd like
> to describe how I tested and would appreciate guidance if I did not do
> it right.
>
> SSH into maas server and added ---
> sudo add-apt-repository ppa:wesley-wiedenmeier/test2
> sudo apt update
> sudo apt upgrade
> sudo apt dist-upgrade
>
> Released hp server nodes and then tried to deploy. No success.
>

That looks correct, if you can confirm the curtin package version installed
with:

apt-cache policy python3-curtin

>
> Then I tried doing the same on a fresh install of ubuntu server 14.04 LTS
> and Maas and the result was the same.. i.e all except the HP G4 & G5 in
> mysetup could not be deployed.
>

Which maas version are you running?

Can you do the following and try again to collect some debugging logs?

curtin config:
maas <session> node get-curtin-config <system-id>

enable verbose debugging of curtin:
maas <session> maas set-config name=curtin_verbose value=true

In the node details page, it should display the curtin log.

>
>
> Please note that I am able to deploy these HP G5 servers if I add the
> line:
>
> cciss.blacklist=yes modprobe.blacklist=cciss hpsa.hpsa_allow_any=1
>
> to the Global Kernel Parameters (Boot parameters to pass to the kernel
> by default) section in the Settings for the Maas UI.
>
> However.. that does not help with the DL380-G4 Machine.

Robin (robinrego) wrote on 2016-06-30:

#22

CURTIN PKG VERSION
robin@IbmRS1:~$ apt-cache policy python3-curtin
python3-curtin:
  Installed: (none)
  Candidate: 0.1.0~bzr385-0ubuntu1
  Version table:
     0.1.0~bzr385-0ubuntu1 0
        500 http://ppa.launchpad.net/maas/stable/ubuntu/ trusty/main amd64 Packages
     0.1.0~bzr227-0ubuntu1~14.04.1 0
        500 http://ca.archive.ubuntu.com/ubuntu/ trusty-updates/universe amd64 Packages
     0.1.0~bzr126-0ubuntu1 0
        500 http://ca.archive.ubuntu.com/ubuntu/ trusty/universe amd64 Packages

MAAS VERSION:
robin@IbmRS1:~$ apt-cache policy maas{,-dns,-dhcp} | grep Installed -B1 -A1
maas:
  Installed: 1.9.3+bzr4577-0ubuntu1~trusty1
  Candidate: 1.9.3+bzr4577-0ubuntu1~trusty1
--
maas-dns:
  Installed: 1.9.3+bzr4577-0ubuntu1~trusty1
  Candidate: 1.9.3+bzr4577-0ubuntu1~trusty1
--
maas-dhcp:
  Installed: 1.9.3+bzr4577-0ubuntu1~trusty1
  Candidate: 1.9.3+bzr4577-0ubuntu1~trusty1

Ryan Harper (raharper) wrote on 2016-06-30:

#23

On Thu, Jun 30, 2016 at 11:48 AM, Robin <email address hidden> wrote:

> CURTIN PKG VERSION
> robin@IbmRS1:~$ apt-cache policy python3-curtin
> python3-curtin:
> Installed: (none)
> Candidate: 0.1.0~bzr385-0ubuntu1
> Version table:
> 0.1.0~bzr385-0ubuntu1 0
> 500 http://ppa.launchpad.net/maas/stable/ubuntu/ trusty/main
> amd64 Packages
> 0.1.0~bzr227-0ubuntu1~14.04.1 0
> 500 http://ca.archive.ubuntu.com/ubuntu/ trusty-updates/universe
> amd64 Packages
> 0.1.0~bzr126-0ubuntu1 0
> 500 http://ca.archive.ubuntu.com/ubuntu/ trusty/universe amd64
> Packages
>

It looks like you didn't get the PPA package installed; it supplies python3-curtin ~bzr403.

And that's because there's only a yakkety version. I'll work with Wesley to get a trusty version of the curtin package available in the PPA and let you know.
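
The check described here, reading the Installed and Candidate lines of apt-cache policy, can be scripted. The sketch below parses a copy of the quoted output rather than running apt-cache; `parse_policy` and `is_installed` are hypothetical helpers, not part of apt.

```python
import re

# First lines of the apt-cache policy output quoted above.
SAMPLE = """python3-curtin:
  Installed: (none)
  Candidate: 0.1.0~bzr385-0ubuntu1
  Version table:
     0.1.0~bzr385-0ubuntu1 0
"""


def parse_policy(text):
    """Extract the Installed and Candidate fields from policy output."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"\s*(Installed|Candidate):\s*(.+)$", line)
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields


def is_installed(text):
    """True when apt reports an installed version other than '(none)'."""
    return parse_policy(text).get("Installed", "(none)") != "(none)"


info = parse_policy(SAMPLE)
print(info)
print("installed" if is_installed(SAMPLE) else "not installed")
```

Here the candidate comes from the maas/stable PPA at bzr385, so neither the test PPA nor the bzr403 build is in play.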

Wesley Wiedenmeier (wesley-wiedenmeier) wrote on 2016-06-30:

#24

I just updated my ppa to include a package for trusty, xenial and yakkety, so it should be possible to test with the updated package now. Sorry for the inconvenience.

Ryan Harper (raharper) wrote on 2016-06-30:

#25

On Thu, Jun 30, 2016 at 1:34 PM, Ryan Harper <email address hidden>
wrote:

>
>
> On Thu, Jun 30, 2016 at 11:48 AM, Robin <email address hidden> wrote:
>
>> CURTIN PKG VERSION
>> robin@IbmRS1:~$ apt-cache policy python3-curtin
>> python3-curtin:
>> Installed: (none)
>> Candidate: 0.1.0~bzr385-0ubuntu1
>> Version table:
>> 0.1.0~bzr385-0ubuntu1 0
>> 500 http://ppa.launchpad.net/maas/stable/ubuntu/ trusty/main
>> amd64 Packages
>> 0.1.0~bzr227-0ubuntu1~14.04.1 0
>> 500 http://ca.archive.ubuntu.com/ubuntu/ trusty-updates/universe
>> amd64 Packages
>> 0.1.0~bzr126-0ubuntu1 0
>> 500 http://ca.archive.ubuntu.com/ubuntu/ trusty/universe amd64
>> Packages
>>
>
> It looks like you didn't get the PPA package installed, it supplies
> python3-curtin ~bzr403
>
> And that's because there's only a yakkety version. I'll work with Wesley
> to get a
> trusty version of the curtin package available in the PPA and let you know.
>

OK, trusty version is present, so you can:

sudo add-apt-repository -y ppa:wesley-wiedenmeier/test2
sudo apt-get update
sudo apt-get install python3-curtin

In sudo apt-cache policy python3-curtin, you should see output that points to the PPA:

python3-curtin:
  Installed: (none)
  Candidate: 0.1.0~bzr403-0ubuntu1
  Version table:
     0.1.0~bzr403-0ubuntu1 0
        500 http://ppa.launchpad.net/wesley-wiedenmeier/test2/ubuntu/ trusty/main amd64 Packages
     0.1.0~bzr227-0ubuntu1~14.04.1 0
        500 http://archive.ubuntu.com/ubuntu/ trusty-updates/universe amd64 Packages
     0.1.0~bzr126-0ubuntu1 0
        500 http://archive.ubuntu.com/ubuntu/ trusty/universe amd64 Packages

Robin (robinrego) wrote on 2016-06-30:

#26

Thanks for including the patch for trusty.  Here are the new unsuccessful results:

robin@IbmRS1:~$ sudo apt-cache policy python3-curtin
[sudo] password for robin:
python3-curtin:
  Installed: 0.1.0~bzr403-0ubuntu1
  Candidate: 0.1.0~bzr403-0ubuntu1
  Version table:
 *** 0.1.0~bzr403-0ubuntu1 0
        500 http://ppa.launchpad.net/wesley-wiedenmeier/test2/ubuntu/ trusty/main amd64 Packages
        100 /var/lib/dpkg/status
     0.1.0~bzr385-0ubuntu1 0
        500 http://ppa.launchpad.net/maas/stable/ubuntu/ trusty/main amd64 Packages
     0.1.0~bzr227-0ubuntu1~14.04.1 0
        500 http://ca.archive.ubuntu.com/ubuntu/ trusty-updates/universe amd64 Packages
     0.1.0~bzr126-0ubuntu1 0
        500 http://ca.archive.ubuntu.com/ubuntu/ trusty/universe amd64 Packages

MAAS VERSION: 
Installed: 1.9.3+bzr4577-0ubuntu1~trusty1

DL380-G4 Failed Deployment
Machine output:
                    
Error: Partition(s) 5 on /dev/cciss/c0d0 have been written, but we have been unable to inform the kernel of the change, probably because it/they are in use.  As a result, the old partition(s) will remain in use.  You should reboot now before making further changes.
File descriptor 3 (socket:[13947]) leaked on lvremove invocation. Parent PID 10046: python
File descriptor 4 (/tmp/install.log) leaked on lvremove invocation. Parent PID 10046: python
File descriptor 5 (/tmp/install.log) leaked on lvremove invocation. Parent PID 10046: python
  Volume group "MaaS" not found
  Skipping volume group MaaS
  Volume group name  has invalid characters
File descriptor 3 (socket:[13947]) leaked on vgremove invocation. Parent PID 10046: python
File descriptor 4 (/tmp/install.log) leaked on vgremove invocation. Parent PID 10046: python
File descriptor 5 (/tmp/install.log) leaked on vgremove invocation. Parent PID 10046: python
  Volume group "MaaS" not found
File descriptor 3 (socket:[13947]) leaked on lvremove invocation. Parent PID 10046: python
File descriptor 4 (/tmp/install.log) leaked on lvremove invocation. Parent PID 10046: python
File descriptor 5 (/tmp/install.log) leaked on lvremove invocation. Parent PID 10046: python
  Volume group "MaaS" not found
  Skipping volume group MaaS
  Volume group name  has invalid characters
File descriptor 3 (socket:[13947]) leaked on vgremove invocation. Parent PID 10046: python
File descriptor 4 (/tmp/install.log) leaked on vgremove invocation. Parent PID 10046: python
File descriptor 5 (/tmp/install.log) leaked on vgremove invocation. Parent PID 10046: python
  Volume group "MaaS" not found
Error: Partition(s) 5 on /dev/cciss/c0d0 have been written, but we have been unable to inform the kernel of the change, probably because it/they are in use.  As a result, the old partition(s) will remain in use.  You should reboot now before making further changes.
An error occured handling 'cciss!c0d0': ProcessExecutionError - Unexpected error while running command.
Command: ['parted', '/dev/cciss/c0d0', '--script', 'mklabel', 'msdos']
Exit code: 1
Reason: -
Stdout: ''
Stderr: ''
Unexpected error while running command.
Command: ['parted', '/dev/cciss/c0d0', '--script', 'mklabel', 'msdos']
Exit code: 1
Reason: -
Stdout: ''
Stderr: ''
Installation failed with exception: Unexpected error while running command.
Command: ['curtin', 'block-meta', 'custom']
Exit code: 3
Reason: -
Stdout: 'Error: Partition(s) 5 on /dev/cciss/c0d0 have been written, but we have been unable to inform the kernel of the change, probably because it/they are in use.  As a result, the old partition(s) will remain in use.  You should reboot now before making further changes.\nFile descriptor 3 (socket:[13947]) leaked on lvremove invocation. Parent PID 10046: python\nFile descriptor 4 (/tmp/install.log) leaked on lvremove invocation. Parent PID 10046: python\nFile descriptor 5 (/tmp/install.log) leaked on lvremove invocation. Parent PID 10046: python\n  Volume group "MaaS" not found\n  Skipping volume group MaaS\n  Volume group name  has invalid characters\nFile descriptor 3 (socket:[13947]) leaked on vgremove invocation. Parent PID 10046: python\nFile descriptor 4 (/tmp/install.log) leaked on vgremove invocation. Parent PID 10046: python\nFile descriptor 5 (/tmp/install.log) leaked on vgremove invocation. Parent PID 10046: python\n  Volume group "MaaS" not found\nFile descriptor 3 (socket:[13947]) leaked on lvremove invocation. Parent PID 10046: python\nFile descriptor 4 (/tmp/install.log) leaked on lvremove invocation. Parent PID 10046: python\nFile descriptor 5 (/tmp/install.log) leaked on lvremove invocation. Parent PID 10046: python\n  Volume group "MaaS" not found\n  Skipping volume group MaaS\n  Volume group name  has invalid characters\nFile descriptor 3 (socket:[13947]) leaked on vgremove invocation. Parent PID 10046: python\nFile descriptor 4 (/tmp/install.log) leaked on vgremove invocation. Parent PID 10046: python\nFile descriptor 5 (/tmp/install.log) leaked on vgremove invocation. Parent PID 10046: python\n  Volume group "MaaS" not found\nError: Partition(s) 5 on /dev/cciss/c0d0 have been written, but we have been unable to inform the kernel of the change, probably because it/they are in use.  As a result, the old partition(s) will remain in use.  
You should reboot now before making further changes.\nAn error occured handling \'cciss!c0d0\': ProcessExecutionError - Unexpected error while running command.\nCommand: [\'parted\', \'/dev/cciss/c0d0\', \'--script\', \'mklabel\', \'msdos\']\nExit code: 1\nReason: -\nStdout: \'\'\nStderr: \'\'\nUnexpected error while running command.\nCommand: [\'parted\', \'/dev/cciss/c0d0\', \'--script\', \'mklabel\', \'msdos\']\nExit code: 1\nReason: -\nStdout: \'\'\nStderr: \'\'\n'
Stderr: ''

DL380-G5-28 Failed Deployment
Machine output:
                   
Error: /dev/cciss/c0d1: unrecognised disk label
Error: /dev/cciss/c0d1: unrecognised disk label
An error occured handling 'cciss!c0d0-part1': OSError - [Errno 2] No such file or directory: '/dev/cciss/c0d01'
[Errno 2] No such file or directory: '/dev/cciss/c0d01'
Installation failed with exception: Unexpected error while running command.
Command: ['curtin', 'block-meta', 'custom']
Exit code: 3
Reason: -
Stdout: "Error: /dev/cciss/c0d1: unrecognised disk label\nError: /dev/cciss/c0d1: unrecognised disk label\nAn error occured handling 'cciss!c0d0-part1': OSError - [Errno 2] No such file or directory: '/dev/cciss/c0d01'\n[Errno 2] No such file or directory: '/dev/cciss/c0d01'\n"
Stderr: ''
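
The missing path '/dev/cciss/c0d01' in the G5 output looks like a partition number appended directly to the disk name, whereas the lsblk output earlier in this report shows the kernel naming the partition c0d0p1. A minimal sketch of the usual Linux convention follows, under the assumption that this is what went wrong; `partition_path` is a made-up name, not a curtin function.

```python
def partition_path(disk_path, number):
    """Build the device path of a partition on the given disk.

    Hypothetical helper following the common Linux naming rule: when the
    disk name ends in a digit, a 'p' separates it from the partition
    number; otherwise the number is appended directly.
    """
    separator = "p" if disk_path[-1].isdigit() else ""
    return "{}{}{}".format(disk_path, separator, number)


print(partition_path("/dev/cciss/c0d0", 1))  # disk name ends in a digit
print(partition_path("/dev/sda", 1))         # disk name ends in a letter
```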

The second HP DL380 G5 also fails deployment with similar machine output.

The IBM X3650 deploys successfully.

I have yet to understand how to get the curtin logs, and will report back with them.

Ryan Harper (raharper) wrote on 2016-06-30:

#27

On Thu, Jun 30, 2016 at 4:00 PM, Robin <robinrego@yahoo.com> wrote:

> Thanks for including the patch for trusty.  Here are the new
> unsuccessful results:

> robin@IbmRS1:~$ sudo apt-cache policy python3-curtin
> [sudo] password for robin:
> python3-curtin:
>   Installed: 0.1.0~bzr403-0ubuntu1
>   Candidate: 0.1.0~bzr403-0ubuntu1
>   Version table:
>  *** 0.1.0~bzr403-0ubuntu1 0
>         500 http://ppa.launchpad.net/wesley-wiedenmeier/test2/ubuntu/
> trusty/main amd64 Packages
>         100 /var/lib/dpkg/status
>      0.1.0~bzr385-0ubuntu1 0
>         500 http://ppa.launchpad.net/maas/stable/ubuntu/ trusty/main
> amd64 Packages
>      0.1.0~bzr227-0ubuntu1~14.04.1 0
>         500 http://ca.archive.ubuntu.com/ubuntu/ trusty-updates/universe
> amd64 Packages
>      0.1.0~bzr126-0ubuntu1 0
>         500 http://ca.archive.ubuntu.com/ubuntu/ trusty/universe amd64
> Packages
>
>
> MAAS VERSION:
> Installed: 1.9.3+bzr4577-0ubuntu1~trusty1
>
>
> DL380-G4 Failed Deployment
> Machine Output:
>
> Error: Partition(s) 5 on /dev/cciss/c0d0 have been written, but we have
> been unable to inform the kernel of the change, probably because it/they
> are in use.  As a result, the old partition(s) will remain in use.  You
> should reboot now before making further changes.
> File descriptor 3 (socket:[13947]) leaked on lvremove invocation. Parent
> PID 10046: python
> File descriptor 4 (/tmp/install.log) leaked on lvremove invocation. Parent
> PID 10046: python
> File descriptor 5 (/tmp/install.log) leaked on lvremove invocation. Parent
> PID 10046: python
>   Volume group "MaaS" not found
>   Skipping volume group MaaS
>   Volume group name  has invalid characters
> File descriptor 3 (socket:[13947]) leaked on vgremove invocation. Parent
> PID 10046: python
> File descriptor 4 (/tmp/install.log) leaked on vgremove invocation. Parent
> PID 10046: python
> File descriptor 5 (/tmp/install.log) leaked on vgremove invocation. Parent
> PID 10046: python
>   Volume group "MaaS" not found
> File descriptor 3 (socket:[13947]) leaked on lvremove invocation. Parent
> PID 10046: python
> File descriptor 4 (/tmp/install.log) leaked on lvremove invocation. Parent
> PID 10046: python
> File descriptor 5 (/tmp/install.log) leaked on lvremove invocation. Parent
> PID 10046: python
>   Volume group "MaaS" not found
>   Skipping volume group MaaS
>   Volume group name  has invalid characters
> File descriptor 3 (socket:[13947]) leaked on vgremove invocation. Parent
> PID 10046: python
> File descriptor 4 (/tmp/install.log) leaked on vgremove invocation. Parent
> PID 10046: python
> File descriptor 5 (/tmp/install.log) leaked on vgremove invocation. Parent
> PID 10046: python
>   Volume group "MaaS" not found
> Error: Partition(s) 5 on /dev/cciss/c0d0 have been written, but we have
> been unable to inform the kernel of the change, probably because it/they
> are in use.  As a result, the old partition(s) will remain in use.  You
> should reboot now before making further changes.
> An error occured handling 'cciss!c0d0': ProcessExecutionError - Unexpected
> error while running command.
> Command: ['parted', '/dev/cciss/c0d0', '--script', 'mklabel', 'msdos']
> Exit code: 1
> Reason: -
> Stdout: ''
> Stderr: ''
> Unexpected error while running command.
> Command: ['parted', '/dev/cciss/c0d0', '--script', 'mklabel', 'msdos']
> Exit code: 1
> Reason: -
> Stdout: ''
> Stderr: ''
> Installation failed with exception: Unexpected error while running command.
> Command: ['curtin', 'block-meta', 'custom']
> Exit code: 3
> Reason: -
> Stdout: 'Error: Partition(s) 5 on /dev/cciss/c0d0 have been written, but
> we have been unable to inform the kernel of the change, probably because
> it/they are in use.  As a result, the old partition(s) will remain in use.
> You should reboot now before making further changes.\nFile descriptor 3
> (socket:[13947]) leaked on lvremove invocation. Parent PID 10046:
> python\nFile descriptor 4 (/tmp/install.log) leaked on lvremove invocation.
> Parent PID 10046: python\nFile descriptor 5 (/tmp/install.log) leaked on
> lvremove invocation. Parent PID 10046: python\n  Volume group "MaaS" not
> found\n  Skipping volume group MaaS\n  Volume group name  has invalid
> characters\nFile descriptor 3 (socket:[13947]) leaked on vgremove
> invocation. Parent PID 10046: python\nFile descriptor 4 (/tmp/install.log)
> leaked on vgremove invocation. Parent PID 10046: python\nFile descriptor 5
> (/tmp/install.log) leaked on vgremove invocation. Parent PID 10046:
> python\n  Volume group "MaaS" not found\nFile descriptor 3 (socket:[13947])
> leaked on lvremove invocation. Parent PID 10046: python\nFile descriptor 4
> (/tmp/install.log) leaked on lvremove invocation. Parent PID 10046:
> python\nFile descriptor 5 (/tmp/install.log) leaked on lvremove invocation.
> Parent PID 10046: python\n  Volume group "MaaS" not found\n  Skipping
> volume group MaaS\n  Volume group name  has invalid characters\nFile
> descriptor 3 (socket:[13947]) leaked on vgremove invocation. Parent PID
> 10046: python\nFile descriptor 4 (/tmp/install.log) leaked on vgremove
> invocation. Parent PID 10046: python\nFile descriptor 5 (/tmp/install.log)
> leaked on vgremove invocation. Parent PID 10046: python\n  Volume group
> "MaaS" not found\nError: Partition(s) 5 on /dev/cciss/c0d0 have been
> written, but we have been unable to inform the kernel of the change,
> probably because it/they are in use.  As a result, the old partition(s)
> will remain in use.  You should reboot now before making further
> changes.\nAn error occured handling \'cciss!c0d0\': ProcessExecutionError -
> Unexpected error while running command.\nCommand: [\'parted\',
> \'/dev/cciss/c0d0\', \'--script\', \'mklabel\', \'msdos\']\nExit code:
> 1\nReason: -\nStdout: \'\'\nStderr: \'\'\nUnexpected error while running
> command.\nCommand: [\'parted\', \'/dev/cciss/c0d0\', \'--script\',
> \'mklabel\', \'msdos\']\nExit code: 1\nReason: -\nStdout: \'\'\nStderr:
> \'\'\n'
> Stderr: ''
>
>
> DL380-G5-28 Failed Deployment
> Machine output:
>
> Error: /dev/cciss/c0d1: unrecognised disk label
> Error: /dev/cciss/c0d1: unrecognised disk label
> An error occured handling 'cciss!c0d0-part1': OSError - [Errno 2] No such
> file or directory: '/dev/cciss/c0d01'
> [Errno 2] No such file or directory: '/dev/cciss/c0d01'
> Installation failed with exception: Unexpected error while running command.
> Command: ['curtin', 'block-meta', 'custom']
> Exit code: 3
> Reason: -
> Stdout: "Error: /dev/cciss/c0d1: unrecognised disk label\nError:
> /dev/cciss/c0d1: unrecognised disk label\nAn error occured handling
> 'cciss!c0d0-part1': OSError - [Errno 2] No such file or directory:
> '/dev/cciss/c0d01'\n[Errno 2] No such file or directory:
> '/dev/cciss/c0d01'\n"
> Stderr: ''
>
>
> The second HpDL380-G5 also fails deployment with a similar machine output.
>

Thanks that's quite helpful.

>
> The IbmX3650 deploys successfully.
>
> I have yet to understand how to get the curtin logs, and will report
> back with that.
>

You should be able to follow the maascli guide:

https://maas.ubuntu.com/docs/maascli.html

Once you've logged in to a session, this will dump YAML output to stdout:

maas <session> node get-curtin-config <system-id>

And prior to running a deployment, you can do:

maas <session> maas set-config name=curtin_verbose value=true

Then redeploy the failure case; the output should include more curtin
debugging output.

That said, the general issue seems to be around the various kernel levels
for the cciss driver: sometimes the path /dev/cciss/<disk> exists (say, on
Vivid, Wily, Xenial), but on older releases it does not.

Can you try deploying the Wily or Xenial Ubuntu release instead of Trusty to
your target node?
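
One detail in the G5 output quoted above: the path that could not be opened, '/dev/cciss/c0d01', looks like the disk path with the partition number appended directly. A minimal sketch of the usual Linux partition naming rule, assuming that is what is going on here; partition_path is a hypothetical helper for illustration, not curtin's actual code:

```python
def partition_path(disk_path: str, number: int) -> str:
    """Return the device node for partition `number` of `disk_path`.

    Linux inserts a "p" before the partition number when the disk name
    itself ends in a digit (cciss/c0d0, nvme0n1, mmcblk0); otherwise the
    number is appended directly (sda -> sda1).
    """
    separator = "p" if disk_path[-1].isdigit() else ""
    return f"{disk_path}{separator}{number}"


print(partition_path("/dev/cciss/c0d0", 1))  # /dev/cciss/c0d0p1
print(partition_path("/dev/sda", 1))         # /dev/sda1
```

Under that rule the first partition of /dev/cciss/c0d0 would be /dev/cciss/c0d0p1, which may be why a lookup for c0d01 finds nothing.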

> --
> You received this bug notification because you are subscribed to curtin.
> Matching subscriptions: curtin-bugs-all
> https://bugs.launchpad.net/bugs/1562249
>
> Title:
>   Failed to deploy machine with HP Smart Array Raid 6i
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/curtin/+bug/1562249/+subscriptions
>

Revision history for this message

Robin (robinrego) wrote on 2016-07-01:

#28

I released one of the failing nodes (DL380-G4) and ran: maas <session> maas set-config name=curtin_verbose value=true
I then deployed the node and waited until it failed.
Then I ran: maas maaster node get-curtin-config node-1e6e4f34-3e88-11e6-8da5-001a640920e4
and this is the stdout:
robin@IbmRS1:~$ maas maaster node get-curtin-config node-1e6e4f34-3e88-11e6-8da5-001a640920e4
Success.
Machine-readable output follows:
apt_mirrors:
  ubuntu_archive: http://archive.ubuntu.com//ubuntu
  ubuntu_security: http://archive.ubuntu.com//ubuntu
apt_proxy: http://192.168.1.150:8000/
debconf_selections:
  maas: 'cloud-init   cloud-init/datasources  multiselect MAAS

cloud-init   cloud-init/maas-metadata-url  string http://192.168.1.150/MAAS/metadata/

cloud-init   cloud-init/maas-metadata-credentials  string oauth_token_key=BW2fFMWap4qWJAvA4w&oauth_token_secret=3xfcj8MAtuSRNhJpntC5w86yww2LVzhb&oauth_consumer_key=vnC7UzTtzRU47BvUFj

cloud-init   cloud-init/local-cloud-config  string apt_preserve_sources_list:
    true\napt_proxy: http://192.168.1.150:8000/\nmanage_etc_hosts: false\nmanual_cache_clean:
    true\nreporting:\n  maas: {consumer_key: vnC7UzTtzRU47BvUFj, endpoint: ''http://192.168.1.150/MAAS/metadata/status/node-1e6e4f34-3e88-11e6-8da5-001a640920e4'',\n    token_key:
    BW2fFMWap4qWJAvA4w, token_secret: 3xfcj8MAtuSRNhJpntC5w86yww2LVzhb,\n    type:
    webhook}\nsystem_info:\n  package_mirrors:\n  - arches: [i386, amd64]\n    failsafe:
    {primary: ''http://archive.ubuntu.com/ubuntu'', security: ''http://security.ubuntu.com/ubuntu''}\n    search:\n      primary:
    [''http://archive.ubuntu.com/ubuntu'']\n      security: [''http://archive.ubuntu.com/ubuntu'']\n  -
    arches: [default]\n    failsafe: {primary: ''http://ports.ubuntu.com/ubuntu-ports'',
    security: ''http://ports.ubuntu.com/ubuntu-ports''}\n    search:\n      primary:
    [''http://ports.ubuntu.com/ubuntu-ports'']\n      security: [''http://ports.ubuntu.com/ubuntu-ports'']\n

'
install:
  log_file: /tmp/install.log
  post_files:
  - /tmp/install.log
kernel:
  mapping: {}
  package: linux-generic
late_commands:
  maas:
  - wget
  - --no-proxy
  - http://192.168.1.150/MAAS/metadata/latest/by-id/node-1e6e4f34-3e88-11e6-8da5-001a640920e4/
  - --post-data
  - op=netboot_off
  - -O
  - /dev/null
network:
  config:
  - id: eth0
    mac_address: 00:14:c2:c1:fe:ad
    mtu: 1500
    name: eth0
    subnets:
    - address: 10.1.1.152/24
      dns_nameservers: []
      gateway: 10.1.1.100
      type: static
    type: physical
  - id: eth1
    mac_address: 00:14:c2:c1:fe:ac
    mtu: 1500
    name: eth1
    subnets:
    - type: manual
    type: physical
  - address: 192.168.1.150
    search:
    - maas
    type: nameserver
  version: 1
network_commands:
  builtin:
  - curtin
  - net-meta
  - custom
partitioning_commands:
  builtin:
  - curtin
  - block-meta
  - custom
power_state:
  mode: reboot
reporting:
  maas:
    consumer_key: vnC7UzTtzRU47BvUFj
    endpoint: http://192.168.1.150/MAAS/metadata/status/node-1e6e4f34-3e88-11e6-8da5-001a640920e4
    token_key: BW2fFMWap4qWJAvA4w
    token_secret: 3xfcj8MAtuSRNhJpntC5w86yww2LVzhb
    type: webhook
showtrace: true
storage:
  config:
  - grub_device: true
    id: cciss!c0d0
    model: LOGICAL VOLUME
    name: cciss!c0d0
    ptable: msdos
    serial: 600508b1001844395353503958500047
    type: disk
    wipe: superblock
  - id: cciss!c0d1
    model: LOGICAL VOLUME
    name: cciss!c0d1
    serial: 600508b1001844395353503958500048
    type: disk
    wipe: superblock
  - id: cciss!c0d2
    model: LOGICAL VOLUME
    name: cciss!c0d2
    serial: 600508b1001844395353503958500049
    type: disk
    wipe: superblock
  - id: cciss!c0d3
    model: LOGICAL VOLUME
    name: cciss!c0d3
    serial: 600508b100184439535350395850004a
    type: disk
    wipe: superblock
  - device: cciss!c0d0
    id: cciss!c0d0-part1
    name: cciss!c0d0-part1
    number: 1
    offset: 4194304B
    size: 36406558720B
    type: partition
    uuid: db74597b-1b1e-432b-ad4f-6df6047f53cb
    wipe: superblock
  - fstype: ext4
    id: cciss!c0d0-part1_format
    label: root
    type: format
    uuid: f0544d0e-7696-4b88-b0d0-8c7b9857b742
    volume: cciss!c0d0-part1
  - device: cciss!c0d0-part1_format
    id: cciss!c0d0-part1_mount
    path: /
    type: mount
  version: 1
verbosity: 3
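
The disk ids in the storage config above use '!' where the device path has '/', which matches how sysfs names block devices whose kernel name contains a slash. A small sketch of that mapping, assuming the ids are the sysfs names; sysfs_name_to_dev_path is a hypothetical helper, not part of MAAS or curtin:

```python
def sysfs_name_to_dev_path(name: str) -> str:
    """Map a sysfs block device name to its /dev node.

    sysfs cannot hold "/" inside an entry name, so the kernel exposes
    "cciss/c0d0" as "cciss!c0d0"; reversing that gives the /dev path.
    """
    return "/dev/" + name.replace("!", "/")


# Disk ids copied from the storage config in this comment.
disk_ids = ["cciss!c0d0", "cciss!c0d1", "cciss!c0d2", "cciss!c0d3"]
for disk_id in disk_ids:
    print(disk_id, "->", sysfs_name_to_dev_path(disk_id))
```

Names without a '!' (for example sda) pass through unchanged apart from the /dev/ prefix.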

Next I will try to deploy the failing node with Wily.

I appreciate your easy-to-understand guidance, and I am eager to help with this and any other testing.

Revision history for this message

Robin (robinrego) wrote on 2016-07-01:

#29

I was not able to deploy the G4 or the G5 with the Wily or Xenial images.

Machine Output G4:

start: cmd-install/stage-partitioning/builtin/cmd-block-meta: started: curtin command block-meta
start: cmd-install/stage-partitioning/builtin/cmd-block-meta: started: configuring disk: cciss!c0d0
get_path_to_storage_volume for volume cciss!c0d0
Processing serial 600508b1001844395353503958500047 via udev to 600508b1001844395353503958500047
devsync for /dev/cciss/c0d0
Running command ['partprobe', '/dev/cciss/c0d0'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
devsync happy - path /dev/cciss/c0d0 now exists
return volume path /dev/cciss/c0d0
Running command ['mdadm', '--assemble', '--scan'] with allowed return codes [0, 1, 2] (shell=False, capture=True)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
clear_holders running on '/sys/class/block/cciss!c0d0/cciss!c0d0p1', with holders '[]'
wiping 1M on /dev/cciss/c0d0p1 at offsets [0, -1048576]
clear_holders running on '/sys/class/block/cciss!c0d0', with holders '[]'
wiping 1M on /dev/cciss/c0d0 at offsets [0, -1048576]
labeling device: '/dev/cciss/c0d0' with 'msdos' partition table
Running command ['parted', '/dev/cciss/c0d0', '--script', 'mklabel', 'msdos'] with allowed return codes [0] (shell=False, capture=False)
get_path_to_storage_volume for volume cciss!c0d0
Processing serial 600508b1001844395353503958500047 via udev to 600508b1001844395353503958500047
devsync for /dev/cciss/c0d0
Running command ['partprobe', '/dev/cciss/c0d0'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
devsync happy - path /dev/cciss/c0d0 now exists
return volume path /dev/cciss/c0d0
Running command ['blkid', '-o', 'export', '/dev/cciss/c0d0'] with allowed return codes [0, 2] (shell=False, capture=True)
Writing dname udev rule '['SUBSYSTEM=="block"', 'ACTION=="add|change"', 'ENV{DEVTYPE}=="disk"', 'ENV{ID_PART_TABLE_UUID}=="9c73f80b"', 'SYMLINK+="disk/by-dname/cciss!c0d0"']'
finish: cmd-install/stage-partitioning/builtin/cmd-block-meta: SUCCESS: finished: configuring disk: cciss!c0d0
start: cmd-install/stage-partitioning/builtin/cmd-block-meta: started: configuring disk: cciss!c0d1
get_path_to_storage_volume for volume cciss!c0d1
Processing serial 600508b1001844395353503958500048 via udev to 600508b1001844395353503958500048
devsync for /dev/cciss/c0d1
Running command ['partprobe', '/dev/cciss/c0d1'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
devsync happy - path /dev/cciss/c0d1 now exists
return volume path /dev/cciss/c0d1
Running command ['mdadm', '--assemble', '--scan'] with allowed return codes [0, 1, 2] (shell=False, capture=True)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
clear_holders running on '/sys/class/block/cciss!c0d1', with holders '[]'
wiping 1M on /dev/cciss/c0d1 at offsets [0, -1048576]
get_path_to_storage_volume for volume cciss!c0d1
Processing serial 600508b1001844395353503958500048 via udev to 600508b1001844395353503958500048
devsync for /dev/cciss/c0d1
Running command ['partprobe', '/dev/cciss/c0d1'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
devsync happy - path /dev/cciss/c0d1 now exists
return volume path /dev/cciss/c0d1
Running command ['blkid', '-o', 'export', '/dev/cciss/c0d1'] with allowed return codes [0, 2] (shell=False, capture=True)
Can't find a uuid for volume: cciss!c0d1. Skipping dname.
finish: cmd-install/stage-partitioning/builtin/cmd-block-meta: SUCCESS: finished: configuring disk: cciss!c0d1
start: cmd-install/stage-partitioning/builtin/cmd-block-meta: started: configuring disk: cciss!c0d2
get_path_to_storage_volume for volume cciss!c0d2
Processing serial 600508b1001844395353503958500049 via udev to 600508b1001844395353503958500049
devsync for /dev/cciss/c0d2
Running command ['partprobe', '/dev/cciss/c0d2'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
devsync happy - path /dev/cciss/c0d2 now exists
return volume path /dev/cciss/c0d2
Running command ['mdadm', '--assemble', '--scan'] with allowed return codes [0, 1, 2] (shell=False, capture=True)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
clear_holders running on '/sys/class/block/cciss!c0d2', with holders '[]'
wiping 1M on /dev/cciss/c0d2 at offsets [0, -1048576]
get_path_to_storage_volume for volume cciss!c0d2
Processing serial 600508b1001844395353503958500049 via udev to 600508b1001844395353503958500049
devsync for /dev/cciss/c0d2
Running command ['partprobe', '/dev/cciss/c0d2'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
devsync happy - path /dev/cciss/c0d2 now exists
return volume path /dev/cciss/c0d2
Running command ['blkid', '-o', 'export', '/dev/cciss/c0d2'] with allowed return codes [0, 2] (shell=False, capture=True)
Can't find a uuid for volume: cciss!c0d2. Skipping dname.
finish: cmd-install/stage-partitioning/builtin/cmd-block-meta: SUCCESS: finished: configuring disk: cciss!c0d2
start: cmd-install/stage-partitioning/builtin/cmd-block-meta: started: configuring disk: cciss!c0d3
get_path_to_storage_volume for volume cciss!c0d3
Processing serial 600508b100184439535350395850004a via udev to 600508b100184439535350395850004a
devsync for /dev/cciss/c0d3
Running command ['partprobe', '/dev/cciss/c0d3'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
devsync happy - path /dev/cciss/c0d3 now exists
return volume path /dev/cciss/c0d3
Running command ['mdadm', '--assemble', '--scan'] with allowed return codes [0, 1, 2] (shell=False, capture=True)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
clear_holders running on '/sys/class/block/cciss!c0d3', with holders '[]'
wiping 1M on /dev/cciss/c0d3 at offsets [0, -1048576]
get_path_to_storage_volume for volume cciss!c0d3
Processing serial 600508b100184439535350395850004a via udev to 600508b100184439535350395850004a
devsync for /dev/cciss/c0d3
Running command ['partprobe', '/dev/cciss/c0d3'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
devsync happy - path /dev/cciss/c0d3 now exists
return volume path /dev/cciss/c0d3
Running command ['blkid', '-o', 'export', '/dev/cciss/c0d3'] with allowed return codes [0, 2] (shell=False, capture=True)
Can't find a uuid for volume: cciss!c0d3. Skipping dname.
finish: cmd-install/stage-partitioning/builtin/cmd-block-meta: SUCCESS: finished: configuring disk: cciss!c0d3
start: cmd-install/stage-partitioning/builtin/cmd-block-meta: started: configuring partition: cciss!c0d0-part1
get_path_to_storage_volume for volume cciss!c0d0
Processing serial 600508b1001844395353503958500047 via udev to 600508b1001844395353503958500047
devsync for /dev/cciss/c0d0
Running command ['partprobe', '/dev/cciss/c0d0'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
devsync happy - path /dev/cciss/c0d0 now exists
return volume path /dev/cciss/c0d0
get_path_to_storage_volume for volume cciss!c0d0
Processing serial 600508b1001844395353503958500047 via udev to 600508b1001844395353503958500047
devsync for /dev/cciss/c0d0
Running command ['partprobe', '/dev/cciss/c0d0'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
devsync happy - path /dev/cciss/c0d0 now exists
return volume path /dev/cciss/c0d0
c0d0 logical_block_size_bytes: 512
adding partition 'cciss!c0d0-part1' to disk 'cciss!c0d0' (ptable: 'msdos')
partnum: 1 offset_sectors: 2048 length_sectors: 71106559
Running command ['parted', '/dev/cciss/c0d0', '--script', 'mkpart', 'primary', '2048s', '71108607s'] with allowed return codes [0] (shell=False, capture=True)
get_path_to_storage_volume for volume cciss!c0d0-part1
get_path_to_storage_volume for volume cciss!c0d0
Processing serial 600508b1001844395353503958500047 via udev to 600508b1001844395353503958500047
devsync for /dev/cciss/c0d0
Running command ['partprobe', '/dev/cciss/c0d0'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
devsync happy - path /dev/cciss/c0d0 now exists
return volume path /dev/cciss/c0d0
devsync for /dev/cciss/c0d0
Running command ['partprobe', '/dev/cciss/c0d0'] with allowed return codes [0, 1] (shell=False, capture=False)
Running command ['udevadm', 'settle'] with allowed return codes [0] (shell=False, capture=False)
devsync happy - path /dev/cciss/c0d0 now exists
return volume path /dev/cciss/c0d01
An error occured handling 'cciss!c0d0-part1': ValueError - %s: not an existing file or block device
finish: cmd-install/stage-partitioning/builtin/cmd-block-meta: FAIL: failed: configuring partition: cciss!c0d0-part1
finish: cmd-install/stage-partitioning/builtin/cmd-block-meta: FAIL: failed: curtin command block-meta
Traceback (most recent call last):
  File "/curtin/curtin/commands/main.py", line 210, in main
    ret = args.func(args)
  File "/curtin/curtin/commands/block_meta.py", line 62, in block_meta
    meta_custom(args)
  File "/curtin/curtin/commands/block_meta.py", line 1136, in meta_custom
    handler(command, storage_config_dict)
  File "/curtin/curtin/commands/block_meta.py", line 647, in partition_handler
    mode=info.get('wipe'))
  File "/curtin/curtin/block/__init__.py", line 678, in wipe_volume
    quick_zero(path, partitions=False)
  File "/curtin/curtin/block/__init__.py", line 589, in quick_zero
    raise ValueError("%s: not an existing file or block device")
ValueError: %s: not an existing file or block device
%s: not an existing file or block device
Installation failed with exception: Unexpected error while running command.
Command: ['curtin', 'block-meta', 'custom']
Exit code: 3
Reason: -
Stdout: b'start: cmd-install/stage-partitioning/builtin/cmd-block-meta: started: curtin command block-meta\nstart: cmd-install/stage-partitioning/builtin/cmd-block-meta: started: configuring disk: cciss!c0d0\nget_path_to_storage_volume for volume cciss!c0d0\nProcessing serial 600508b1001844395353503958500047 via udev to 600508b1001844395353503958500047\ndevsync for /dev/cciss/c0d0\nRunning command [\'partprobe\', \'/dev/cciss/c0d0\'] with allowed return codes [0, 1] (shell=False, capture=False)\nRunning command [\'udevadm\', \'settle\'] with allowed return codes [0] (shell=False, capture=False)\ndevsync happy - path /dev/cciss/c0d0 now exists\nreturn volume path /dev/cciss/c0d0\nRunning command [\'mdadm\', \'--assemble\', \'--scan\'] with allowed return codes [0, 1, 2] (shell=False, capture=True)\nRunning command [\'udevadm\', \'settle\'] with allowed return codes [0] (shell=False, capture=False)\nclear_holders running on \'/sys/class/block/cciss!c0d0/cciss!c0d0p1\', with holders \'[]\'\nwiping 1M on /dev/cciss/c0d0p1 at offsets [0, -1048576]\nclear_holders running on \'/sys/class/block/cciss!c0d0\', with holders \'[]\'\nwiping 1M on /dev/cciss/c0d0 at offsets [0, -1048576]\nlabeling device: \'/dev/cciss/c0d0\' with \'msdos\' partition table\nRunning command [\'parted\', \'/dev/cciss/c0d0\', \'--script\', \'mklabel\', \'msdos\'] with allowed return codes [0] (shell=False, capture=False)\nget_path_to_storage_volume for volume cciss!c0d0\nProcessing serial 600508b1001844395353503958500047 via udev to 600508b1001844395353503958500047\ndevsync for /dev/cciss/c0d0\nRunning command [\'partprobe\', \'/dev/cciss/c0d0\'] with allowed return codes [0, 1] (shell=False, capture=False)\nRunning command [\'udevadm\', \'settle\'] with allowed return codes [0] (shell=False, capture=False)\ndevsync happy - path /dev/cciss/c0d0 now exists\nreturn volume path /dev/cciss/c0d0\nRunning command [\'blkid\', \'-o\', \'export\', \'/dev/cciss/c0d0\'] with allowed return codes [0, 2] 
(shell=False, capture=True)\nWriting dname udev rule \'[\'SUBSYSTEM=="block"\', \'ACTION=="add|change"\', \'ENV{DEVTYPE}=="disk"\', \'ENV{ID_PART_TABLE_UUID}=="9c73f80b"\', \'SYMLINK+="disk/by-dname/cciss!c0d0"\']\'\nfinish: cmd-install/stage-partitioning/builtin/cmd-block-meta: SUCCESS: finished: configuring disk: cciss!c0d0\nstart: cmd-install/stage-partitioning/builtin/cmd-block-meta: started: configuring disk: cciss!c0d1\nget_path_to_storage_volume for volume cciss!c0d1\nProcessing serial 600508b1001844395353503958500048 via udev to 600508b1001844395353503958500048\ndevsync for /dev/cciss/c0d1\nRunning command [\'partprobe\', \'/dev/cciss/c0d1\'] with allowed return codes [0, 1] (shell=False, capture=False)\nRunning command [\'udevadm\', \'settle\'] with allowed return codes [0] (shell=False, capture=False)\ndevsync happy - path /dev/cciss/c0d1 now exists\nreturn volume path /dev/cciss/c0d1\nRunning command [\'mdadm\', \'--assemble\', \'--scan\'] with allowed return codes [0, 1, 2] (shell=False, capture=True)\nRunning command [\'udevadm\', \'settle\'] with allowed return codes [0] (shell=False, capture=False)\nclear_holders running on \'/sys/class/block/cciss!c0d1\', with holders \'[]\'\nwiping 1M on /dev/cciss/c0d1 at offsets [0, -1048576]\nget_path_to_storage_volume for volume cciss!c0d1\nProcessing serial 600508b1001844395353503958500048 via udev to 600508b1001844395353503958500048\ndevsync for /dev/cciss/c0d1\nRunning command [\'partprobe\', \'/dev/cciss/c0d1\'] with allowed return codes [0, 1] (shell=False, capture=False)\nRunning command [\'udevadm\', \'settle\'] with allowed return codes [0] (shell=False, capture=False)\ndevsync happy - path /dev/cciss/c0d1 now exists\nreturn volume path /dev/cciss/c0d1\nRunning command [\'blkid\', \'-o\', \'export\', \'/dev/cciss/c0d1\'] with allowed return codes [0, 2] (shell=False, capture=True)\nCan\'t find a uuid for volume: cciss!c0d1. 
Skipping dname.\nfinish: cmd-install/stage-partitioning/builtin/cmd-block-meta: SUCCESS: finished: configuring disk: cciss!c0d1\nstart: cmd-install/stage-partitioning/builtin/cmd-block-meta: started: configuring disk: cciss!c0d2\nget_path_to_storage_volume for volume cciss!c0d2\nProcessing serial 600508b1001844395353503958500049 via udev to 600508b1001844395353503958500049\ndevsync for /dev/cciss/c0d2\nRunning command [\'partprobe\', \'/dev/cciss/c0d2\'] with allowed return codes [0, 1] (shell=False, capture=False)\nRunning command [\'udevadm\', \'settle\'] with allowed return codes [0] (shell=False, capture=False)\ndevsync happy - path /dev/cciss/c0d2 now exists\nreturn volume path /dev/cciss/c0d2\nRunning command [\'mdadm\', \'--assemble\', \'--scan\'] with allowed return codes [0, 1, 2] (shell=False, capture=True)\nRunning command [\'udevadm\', \'settle\'] with allowed return codes [0] (shell=False, capture=False)\nclear_holders running on \'/sys/class/block/cciss!c0d2\', with holders \'[]\'\nwiping 1M on /dev/cciss/c0d2 at offsets [0, -1048576]\nget_path_to_storage_volume for volume cciss!c0d2\nProcessing serial 600508b1001844395353503958500049 via udev to 600508b1001844395353503958500049\ndevsync for /dev/cciss/c0d2\nRunning command [\'partprobe\', \'/dev/cciss/c0d2\'] with allowed return codes [0, 1] (shell=False, capture=False)\nRunning command [\'udevadm\', \'settle\'] with allowed return codes [0] (shell=False, capture=False)\ndevsync happy - path /dev/cciss/c0d2 now exists\nreturn volume path /dev/cciss/c0d2\nRunning command [\'blkid\', \'-o\', \'export\', \'/dev/cciss/c0d2\'] with allowed return codes [0, 2] (shell=False, capture=True)\nCan\'t find a uuid for volume: cciss!c0d2. 
Skipping dname.\nfinish: cmd-install/stage-partitioning/builtin/cmd-block-meta: SUCCESS: finished: configuring disk: cciss!c0d2\nstart: cmd-install/stage-partitioning/builtin/cmd-block-meta: started: configuring disk: cciss!c0d3\nget_path_to_storage_volume for volume cciss!c0d3\nProcessing serial 600508b100184439535350395850004a via udev to 600508b100184439535350395850004a\ndevsync for /dev/cciss/c0d3\nRunning command [\'partprobe\', \'/dev/cciss/c0d3\'] with allowed return codes [0, 1] (shell=False, capture=False)\nRunning command [\'udevadm\', \'settle\'] with allowed return codes [0] (shell=False, capture=False)\ndevsync happy - path /dev/cciss/c0d3 now exists\nreturn volume path /dev/cciss/c0d3\nRunning command [\'mdadm\', \'--assemble\', \'--scan\'] with allowed return codes [0, 1, 2] (shell=False, capture=True)\nRunning command [\'udevadm\', \'settle\'] with allowed return codes [0] (shell=False, capture=False)\nclear_holders running on \'/sys/class/block/cciss!c0d3\', with holders \'[]\'\nwiping 1M on /dev/cciss/c0d3 at offsets [0, -1048576]\nget_path_to_storage_volume for volume cciss!c0d3\nProcessing serial 600508b100184439535350395850004a via udev to 600508b100184439535350395850004a\ndevsync for /dev/cciss/c0d3\nRunning command [\'partprobe\', \'/dev/cciss/c0d3\'] with allowed return codes [0, 1] (shell=False, capture=False)\nRunning command [\'udevadm\', \'settle\'] with allowed return codes [0] (shell=False, capture=False)\ndevsync happy - path /dev/cciss/c0d3 now exists\nreturn volume path /dev/cciss/c0d3\nRunning command [\'blkid\', \'-o\', \'export\', \'/dev/cciss/c0d3\'] with allowed return codes [0, 2] (shell=False, capture=True)\nCan\'t find a uuid for volume: cciss!c0d3. 
Skipping dname.\nfinish: cmd-install/stage-partitioning/builtin/cmd-block-meta: SUCCESS: finished: configuring disk: cciss!c0d3\nstart: cmd-install/stage-partitioning/builtin/cmd-block-meta: started: configuring partition: cciss!c0d0-part1\nget_path_to_storage_volume for volume cciss!c0d0\nProcessing serial 600508b1001844395353503958500047 via udev to 600508b1001844395353503958500047\ndevsync for /dev/cciss/c0d0\nRunning command [\'partprobe\', \'/dev/cciss/c0d0\'] with allowed return codes [0, 1] (shell=False, capture=False)\nRunning command [\'udevadm\', \'settle\'] with allowed return codes [0] (shell=False, capture=False)\ndevsync happy - path /dev/cciss/c0d0 now exists\nreturn volume path /dev/cciss/c0d0\nget_path_to_storage_volume for volume cciss!c0d0\nProcessing serial 600508b1001844395353503958500047 via udev to 600508b1001844395353503958500047\ndevsync for /dev/cciss/c0d0\nRunning command [\'partprobe\', \'/dev/cciss/c0d0\'] with allowed return codes [0, 1] (shell=False, capture=False)\nRunning command [\'udevadm\', \'settle\'] with allowed return codes [0] (shell=False, capture=False)\ndevsync happy - path /dev/cciss/c0d0 now exists\nreturn volume path /dev/cciss/c0d0\nc0d0 logical_block_size_bytes: 512\nadding partition \'cciss!c0d0-part1\' to disk \'cciss!c0d0\' (ptable: \'msdos\')\npartnum: 1 offset_sectors: 2048 length_sectors: 71106559\nRunning command [\'parted\', \'/dev/cciss/c0d0\', \'--script\', \'mkpart\', \'primary\', \'2048s\', \'71108607s\'] with allowed return codes [0] (shell=False, capture=True)\nget_path_to_storage_volume for volume cciss!c0d0-part1\nget_path_to_storage_volume for volume cciss!c0d0\nProcessing serial 600508b1001844395353503958500047 via udev to 600508b1001844395353503958500047\ndevsync for /dev/cciss/c0d0\nRunning command [\'partprobe\', \'/dev/cciss/c0d0\'] with allowed return codes [0, 1] (shell=False, capture=False)\nRunning command [\'udevadm\', \'settle\'] with allowed return codes [0] (shell=False, 
capture=False)\ndevsync happy - path /dev/cciss/c0d0 now exists\nreturn volume path /dev/cciss/c0d0\ndevsync for /dev/cciss/c0d0\nRunning command [\'partprobe\', \'/dev/cciss/c0d0\'] with allowed return codes [0, 1] (shell=False, capture=False)\nRunning command [\'udevadm\', \'settle\'] with allowed return codes [0] (shell=False, capture=False)\ndevsync happy - path /dev/cciss/c0d0 now exists\nreturn volume path /dev/cciss/c0d01\nAn error occured handling \'cciss!c0d0-part1\': ValueError - %s: not an existing file or block device\nfinish: cmd-install/stage-partitioning/builtin/cmd-block-meta: FAIL: failed: configuring partition: cciss!c0d0-part1\nfinish: cmd-install/stage-partitioning/builtin/cmd-block-meta: FAIL: failed: curtin command block-meta\nTraceback (most recent call last):\n  File "/curtin/curtin/commands/main.py", line 210, in main\n    ret = args.func(args)\n  File "/curtin/curtin/commands/block_meta.py", line 62, in block_meta\n    meta_custom(args)\n  File "/curtin/curtin/commands/block_meta.py", line 1136, in meta_custom\n    handler(command, storage_config_dict)\n  File "/curtin/curtin/commands/block_meta.py", line 647, in partition_handler\n    mode=info.get(\'wipe\'))\n  File "/curtin/curtin/block/__init__.py", line 678, in wipe_volume\n    quick_zero(path, partitions=False)\n  File "/curtin/curtin/block/__init__.py", line 589, in quick_zero\n    raise ValueError("%s: not an existing file or block device")\nValueError: %s: not an existing file or block device\n%s: not an existing file or block device\n'
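
A note for readers following the log: the last "return volume path" line gives /dev/cciss/c0d01, while the earlier wipe step used /dev/cciss/c0d0p1. The kernel names partitions of a disk whose name ends in a digit with a "p" before the partition number, and sysfs spells the device name with "!" in place of "/". A minimal sketch of those naming rules follows; the helper names are hypothetical and this is not curtin's code.

```python
def sysfs_name(path):
    """'/dev/cciss/c0d0' -> 'cciss!c0d0' (sysfs replaces '/' with '!')."""
    prefix = "/dev/"
    if not path.startswith(prefix):
        raise ValueError("%s: not a /dev path" % path)
    return path[len(prefix):].replace("/", "!")


def dev_path(name):
    """'cciss!c0d0' -> '/dev/cciss/c0d0'."""
    return "/dev/" + name.replace("!", "/")


def partition_path(disk_path, number):
    """Append a partition number, adding 'p' when the disk name ends in a digit."""
    sep = "p" if disk_path[-1].isdigit() else ""
    return "%s%s%d" % (disk_path, sep, number)


print(partition_path(dev_path("cciss!c0d0"), 1))  # /dev/cciss/c0d0p1
print(partition_path("/dev/sda", 1))              # /dev/sda1
```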

curtin config G4:
robin@IbmRS1:~$ maas maaster node get-curtin-config node-1e6e4f34-3e88-11e6-8da5-001a640920e4
Success.
Machine-readable output follows:
apt_mirrors:
  ubuntu_archive: http://archive.ubuntu.com//ubuntu
  ubuntu_security: http://archive.ubuntu.com//ubuntu
apt_proxy: http://192.168.1.150:8000/
debconf_selections:
  maas: 'cloud-init   cloud-init/datasources  multiselect MAAS

cloud-init   cloud-init/maas-metadata-url  string http://192.168.1.150/MAAS/metadata/

cloud-init   cloud-init/maas-metadata-credentials  string oauth_token_key=bFafkykFbKGEmnJQUd&oauth_token_secret=spHRWZ6MVYGurACNA3VNbNpMLQHUzX9y&oauth_consumer_key=evURzZ65FXhUNhXfQm

cloud-init   cloud-init/local-cloud-config  string apt_preserve_sources_list:
    true\napt_proxy: http://192.168.1.150:8000/\nmanage_etc_hosts: false\nmanual_cache_clean:
    true\nreporting:\n  maas: {consumer_key: evURzZ65FXhUNhXfQm, endpoint: ''http://192.168.1.150/MAAS/metadata/status/node-1e6e4f34-3e88-11e6-8da5-001a640920e4'',\n    token_key:
    bFafkykFbKGEmnJQUd, token_secret: spHRWZ6MVYGurACNA3VNbNpMLQHUzX9y,\n    type:
    webhook}\nsystem_info:\n  package_mirrors:\n  - arches: [i386, amd64]\n    failsafe:
    {primary: ''http://archive.ubuntu.com/ubuntu'', security: ''http://security.ubuntu.com/ubuntu''}\n    search:\n      primary:
    [''http://archive.ubuntu.com/ubuntu'']\n      security: [''http://archive.ubuntu.com/ubuntu'']\n  -
    arches: [default]\n    failsafe: {primary: ''http://ports.ubuntu.com/ubuntu-ports'',
    security: ''http://ports.ubuntu.com/ubuntu-ports''}\n    search:\n      primary:
    [''http://ports.ubuntu.com/ubuntu-ports'']\n      security: [''http://ports.ubuntu.com/ubuntu-ports'']\n

'
install:
  log_file: /tmp/install.log
  post_files:
  - /tmp/install.log
kernel:
  mapping: {}
  package: linux-generic
late_commands:
  maas:
  - wget
  - --no-proxy
  - http://192.168.1.150/MAAS/metadata/latest/by-id/node-1e6e4f34-3e88-11e6-8da5-001a640920e4/
  - --post-data
  - op=netboot_off
  - -O
  - /dev/null
network:
  config:
  - id: eth0
    mac_address: 00:14:c2:c1:fe:ad
    mtu: 1500
    name: eth0
    subnets:
    - address: 10.1.1.151/24
      dns_nameservers: []
      gateway: 10.1.1.100
      type: static
    type: physical
  - id: eth1
    mac_address: 00:14:c2:c1:fe:ac
    mtu: 1500
    name: eth1
    subnets:
    - type: manual
    type: physical
  - address: 192.168.1.150
    search:
    - maas
    type: nameserver
  version: 1
network_commands:
  builtin:
  - curtin
  - net-meta
  - custom
partitioning_commands:
  builtin:
  - curtin
  - block-meta
  - custom
power_state:
  mode: reboot
reporting:
  maas:
    consumer_key: evURzZ65FXhUNhXfQm
    endpoint: http://192.168.1.150/MAAS/metadata/status/node-1e6e4f34-3e88-11e6-8da5-001a640920e4
    token_key: bFafkykFbKGEmnJQUd
    token_secret: spHRWZ6MVYGurACNA3VNbNpMLQHUzX9y
    type: webhook
showtrace: true
storage:
  config:
  - grub_device: true
    id: cciss!c0d0
    model: LOGICAL VOLUME
    name: cciss!c0d0
    ptable: msdos
    serial: 600508b1001844395353503958500047
    type: disk
    wipe: superblock
  - id: cciss!c0d1
    model: LOGICAL VOLUME
    name: cciss!c0d1
    serial: 600508b1001844395353503958500048
    type: disk
    wipe: superblock
  - id: cciss!c0d2
    model: LOGICAL VOLUME
    name: cciss!c0d2
    serial: 600508b1001844395353503958500049
    type: disk
    wipe: superblock
  - id: cciss!c0d3
    model: LOGICAL VOLUME
    name: cciss!c0d3
    serial: 600508b100184439535350395850004a
    type: disk
    wipe: superblock
  - device: cciss!c0d0
    id: cciss!c0d0-part1
    name: cciss!c0d0-part1
    number: 1
    offset: 4194304B
    size: 36406558720B
    type: partition
    uuid: db74597b-1b1e-432b-ad4f-6df6047f53cb
    wipe: superblock
  - fstype: ext4
    id: cciss!c0d0-part1_format
    label: root
    type: format
    uuid: f0544d0e-7696-4b88-b0d0-8c7b9857b742
    volume: cciss!c0d0-part1
  - device: cciss!c0d0-part1_format
    id: cciss!c0d0-part1_mount
    path: /
    type: mount
  version: 1
verbosity: 3
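
The partition size in this config can be checked against the parted call in the failing log. The figures are consistent with a 512-byte logical block size and a length of one sector less than size/512, with the start sector (2048) taken from the log line rather than from the config's offset field. This reading is inferred from the numbers in the log, not quoted from curtin's source, and the function name is hypothetical.

```python
def mkpart_range(size_bytes, start_sector, block_size=512):
    """Return (length_sectors, end_sector) for a partition of size_bytes."""
    # The start sector counts toward the size, so the length is one less.
    length_sectors = size_bytes // block_size - 1
    end_sector = start_sector + length_sectors
    return length_sectors, end_sector


length, end = mkpart_range(36406558720, 2048)
print(length, end)  # 71106559 71108607, matching 'mkpart primary 2048s 71108607s'
```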

DeeVee (deevee) wrote on 2016-07-04:

#30

Getting the same issue with HP 400i Smart Array ctrl.

Wesley Wiedenmeier (wesley-wiedenmeier) wrote on 2016-07-07:

#31

I have published a new curtin package in my ppa which should contain a complete fix.

So far one user has reported that they are able to complete an installation with curtin at revision 414. If any more users who have reported this want to try the package out, it's available at:
https://launchpad.net/~wesley-wiedenmeier/+archive/ubuntu/test2/+packages

I am going to try to add some verification into curtin's vmtests for cciss devices using fake paths, but I am not sure how well that is going to work so having verification from physical systems is really useful.

Gustaf Nilsson (gustafnilsson) wrote on 2016-07-07:

#32

Thanks for the patch. The new revision works like a charm for me. Good job!

DeeVee (deevee) wrote on 2016-07-08:

#33

Sorry, newbie here. Could you pass along instructions for applying the fix? I will test as well.

Thanks!

DeeVee (deevee) wrote on 2016-07-08:

#34

I figured out the install; now I get the following:

An error occured handling 'cciss/c0d0': FileNotFoundError - [Errno 2] No such file or directory: '/tmp/tmp64mzihdf/scratch/rules.d/cciss/c0d0'
[Errno 2] No such file or directory: '/tmp/tmp64mzihdf/scratch/rules.d/cciss/c0d0'
Installation failed with exception: Unexpected error while running command.
Command: ['curtin', 'block-meta', 'custom']
Exit code: 3
Reason: -
Stdout: b"An error occured handling 'cciss/c0d0': FileNotFoundError - [Errno 2] No such file or directory: '/tmp/tmp64mzihdf/scratch/rules.d/cciss/c0d0'\n[Errno 2] No such file or directory: '/tmp/tmp64mzihdf/scratch/rules.d/cciss/c0d0'\n"
Stderr: ''

Andres Rodriguez (andreserl) on 2016-07-08

Changed in maas:
status:	Confirmed → Invalid

Robin (robinrego) wrote on 2016-07-08:

#35

I tried the new curtin package

robin@IbmRS1:~$ sudo apt-cache policy python3-curtin
python3-curtin:
  Installed: 0.1.0~bzr414-0ubuntu1
  Candidate: 0.1.0~bzr414-0ubuntu1
  Version table:
*** 0.1.0~bzr414-0ubuntu1 0
        500 http://ppa.launchpad.net/wesley-wiedenmeier/test2/ubuntu/ trusty/main amd64 Packages
        100 /var/lib/dpkg/status
     0.1.0~bzr385-0ubuntu1 0
        500 http://ppa.launchpad.net/maas/stable/ubuntu/ trusty/main amd64 Packages
     0.1.0~bzr227-0ubuntu1~14.04.1 0
        500 http://ca.archive.ubuntu.com/ubuntu/ trusty-updates/universe amd64 Packages
     0.1.0~bzr126-0ubuntu1 0
        500 http://ca.archive.ubuntu.com/ubuntu/ trusty/universe amd64 Packages
-----------------------------------------------------------------------------------

robin@IbmRS1:~$ sudo apt-cache policy maas
maas:
  Installed: 1.9.3+bzr4577-0ubuntu1~trusty1
  Candidate: 1.9.3+bzr4577-0ubuntu1~trusty1
  Version table:
*** 1.9.3+bzr4577-0ubuntu1~trusty1 0
        500 http://ppa.launchpad.net/maas/stable/ubuntu/ trusty/main amd64 Packages
        100 /var/lib/dpkg/status
     1.7.6+bzr3376-0ubuntu3~14.04.1 0
        500 http://ca.archive.ubuntu.com/ubuntu/ trusty-updates/main amd64 Packages
     1.5.4+bzr2294-0ubuntu1.2 0
        500 http://security.ubuntu.com/ubuntu/ trusty-security/main amd64 Packages
     1.5+bzr2252-0ubuntu1 0
        500 http://ca.archive.ubuntu.com/ubuntu/ trusty/main amd64 Packages

-----------------------------------------------------------------------------------

Initially it appeared to work, as one of the HP DL380 G5 servers showed STDOUT similar to that of a successful deployment. However, I decided to delete all nodes and try again with enlisting, commissioning and deploying. The result is that all three HP servers show failed deployment.

I will retry with a fresh install and post results.

Wesley Wiedenmeier (wesley-wiedenmeier) wrote on 2016-07-08:

#36

Hi DeeVee,

Could you please post the configuration that was given to curtin when you installed? You can ask maas for the configuration with:
maas <session> node get-curtin-config <system-id>

I believe that what happened in your case was that the 'id' attribute for the disk at path '/dev/cciss/c0d0' was 'cciss/c0d0', which has a slash in it. Due to the way curtin generates dname rules, a slash in the id of a device with a name attribute would cause curtin to be unable to write a dname rule link file, as the OS would interpret the slash as a directory separator instead of just part of the filename.

If this is the case, then we have to decide whether it would be better for curtin to replace a slash in the id attribute for storage config elements when generating dname rules or for maas not to emit storage config ids with special characters.
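
The failure mode described above can be reproduced outside curtin. This is a minimal sketch, not curtin's code: write_rule and try_write are hypothetical helpers, and the only assumption is that the rule file is opened at rules_dir joined with the id.

```python
import os
import tempfile


def write_rule(rules_dir, basename, content):
    """Write content to rules_dir/basename and return the path used."""
    path = os.path.join(rules_dir, basename)
    with open(path, "w") as fp:
        fp.write(content)
    return path


def try_write(basename):
    """Return 'ok' or the name of the exception raised for this basename."""
    with tempfile.TemporaryDirectory() as rules_dir:
        try:
            write_rule(rules_dir, basename, "# placeholder rule\n")
        except OSError as exc:
            return type(exc).__name__
        return "ok"


for candidate in ("cciss!c0d0", "cciss/c0d0"):
    print(candidate, "->", try_write(candidate))
```

With 'cciss/c0d0' the open() call looks for a 'cciss' directory under rules_dir that does not exist, which gives the same FileNotFoundError ([Errno 2]) as in the reported output.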

Blake Rouse (blake-rouse) wrote on 2016-07-08:

#37

Wesley,

"name: " should be used for dname not "id: ". "id: " should only be for curtin to reference other items in the yaml.

Wesley Wiedenmeier (wesley-wiedenmeier) wrote on 2016-07-08:

#38

The 'name' attr is used for the actual target of the link in /dev/disk/by-dname/ but the configuration file itself has a filename based on 'id' at the moment. I have another branch where I have dname rules all generated in a single file during a later part of the install process. This may be a good time to update that branch and merge it into this one, so that the id is no longer used for the name of the rules file.
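
To make the distinction concrete, here is a rough sketch of a dname rule built from the keys that appear in the "Writing dname udev rule" line of the install log earlier in this report. The function name and the way the keys are joined are assumptions for illustration, not curtin's implementation. Note that 'name' only appears in the SYMLINK target, so the rules file itself can be named independently of 'id'.

```python
def dname_rule(name, ptable_uuid):
    """Build one udev rule line that links a disk under /dev/disk/by-dname/."""
    parts = [
        'SUBSYSTEM=="block"',
        'ACTION=="add|change"',
        'ENV{DEVTYPE}=="disk"',
        'ENV{ID_PART_TABLE_UUID}=="%s"' % ptable_uuid,
        'SYMLINK+="disk/by-dname/%s"' % name,
    ]
    # udev rule keys on one line are separated by commas
    return ", ".join(parts)


print(dname_rule("cciss!c0d0", "9c73f80b"))
```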

Wesley Wiedenmeier (wesley-wiedenmeier) wrote on 2016-07-08:

#39

The curtin deb in my ppa has been updated to no longer base the filenames for the dname .rules files on the storage config element id. This should resolve the remaining issue.

If anyone would like to test, the package has been published in lp:wesley-wiedenmeier/test2 for yakkety, xenial and trusty.

Thanks

DeeVee (deevee) wrote on 2016-07-08:

#40

Hi Wesley,

I got a failed deployment again:

Leaving 'diversion of /etc/init/ureadahead.conf to /etc/init/ureadahead.conf.disabled by cloud-init'
Setting up swapspace version 1, size = 8 GiB (8589930496 bytes)
no label, UUID=8bbfdae8-9228-47ec-b256-99807d12d2bb
[Errno 21] Is a directory: '/tmp/tmp0wfxjsqg/scratch/rules.d/cciss'
Installation failed with exception: Unexpected error while running command.
Command: ['curtin', 'curthooks']
Exit code: 3
Reason: -

I will post the output you requested earlier in a bit.

Wesley Wiedenmeier (wesley-wiedenmeier) wrote on 2016-07-08:

#41

Hi DeeVee,

Sorry about that, thanks for testing so many times.

With curtin version curtin_0.1.0~bzr415-0ubuntu1, the latest in the repo, the name of the dname rule file is based only on the 'name' attribute of the disk; the 'id' is no longer used for the filename. I think what's happened for you is that the name of the disk contained a slash. From the error you posted I would guess that the 'name' attribute of your cciss device is 'cciss/c0d0' or something like that.

Because the name attribute is used as the target for a link in /dev/disk/by-dname it can't really have a slash in it.

Ryan Harper (raharper) wrote on 2016-07-08:

#42

On Fri, Jul 8, 2016 at 5:53 PM, Wesley Wiedenmeier <
<email address hidden>> wrote:

> Hi DeeVee,
>
> Sorry about that, thanks for testing so many times.
>
> With curtin version curtin_0.1.0~bzr415-0ubuntu1, the latest in the
> repo, the name of the dname rule file is just based on the 'name'
> attribute of the disk, the 'id' is no longer used for the filename. I
> think what's happened for you is that the name of the disk contained a
> slash. From the error you posted I would guess that the 'name' attribute
> of your cciss device is 'cciss/c0d0' or something like that.
>
> Because the name attribute is used as the target for a link in /dev/disk
> /by-dname it can't really have a slash in it.
>

I bet maas is generating those names from what it discovered in commissioning. We likely need to sanitize the name, much like with serial numbers, so something like replacing ['/', ' ',] with '_'
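
A minimal sketch of that replacement, assuming only the two characters listed above; sanitize_name is a hypothetical helper, not an existing maas or curtin function.

```python
def sanitize_name(name, bad_chars=("/", " "), replacement="_"):
    """Replace each character in bad_chars with replacement."""
    for ch in bad_chars:
        name = name.replace(ch, replacement)
    return name


print(sanitize_name("cciss/c0d0"))      # cciss_c0d0
print(sanitize_name("LOGICAL VOLUME"))  # LOGICAL_VOLUME
```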

Wesley Wiedenmeier (wesley-wiedenmeier) wrote on 2016-07-09:

#43

I can sanitize the names in the config file, but we should probably also emit a warning back to the controller that the name's been changed so we don't surprise a user who may have been expecting a name with a special character.

DeeVee (deevee) wrote on 2016-07-09:

#44

Thanks guys; I'll keep testing if it will keep hope alive.

This is a proof of concept for me / lab and I was given a bunch of DL360 G5's with P400i controllers, so it's a brick wall I'm hitting right now.

Only caveat this next week is that I'm off camping with my boys from Monday to Thursday, so my testing may be delayed.

But definitely, thanks for your assistance!

I don't know if you still need the maas CLI output; even this is giving me issues.

Thanks again,

Darren

Robin (robinrego) wrote on 2016-07-10:

#45

Installation steps (4.4 KiB, text/plain)

My clean install test unfortunately continues to produce Failed Deployment on HP DL380 G4 & G5 servers.

maas:
Installed: 1.9.3+bzr4577-0ubuntu1~trusty1
python3-curtin:
Installed: 0.1.0~bzr415-0ubuntu1

See attached files for detailed info:
'Fresh Test Jul10' which has my notes and steps involved in testing.

Robin (robinrego) wrote on 2016-07-10:

#46

Curtin Config DL380G4 (4.1 KiB, text/plain)

Robin (robinrego) wrote on 2016-07-10:

#47

Curtin Config HPDl380g5-28Ghz (4.3 KiB, text/plain)

Robin (robinrego) wrote on 2016-07-10:

#48

Curtin Config HpDL380G5-30Ghz (4.0 KiB, text/plain)

Robin (robinrego) wrote on 2016-07-10:

#49

Curtin Config Deployed IBMx3650b (4.4 KiB, text/plain)

Robin (robinrego) wrote on 2016-07-10:

#50

Machine Output (59.1 KiB, text/plain)

Please note that in spite of M3-hpDL380G5-28Ghz showing 'curtin installation finished', the node still shows failed deployment status.

See the attached file for the machine output of the deployed IBM x3650 and the failed HP DL380s.

Robin (robinrego) wrote on 2016-07-10:

#51

Finally success !!!
Using Xenial images for deployment allowed those HP nodes to Deploy Successfully.
Thank you.

Wesley Wiedenmeier (wesley-wiedenmeier) wrote on 2016-07-11:

#52

There's a new package in the ppa that sanitizes dnames. It should be able to work in the case encountered earlier where installation failed due to dnames containing a slash.

The sanitization replaces any character other than A-Z, a-z, 0-9, - and _ with a dash, and issues a warning that the dname had to be changed so that the user is notified.
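
As a rough illustration of that behavior (a sketch written from the description above, not the code in the package; the function and logger names are made up):

```python
import logging
import re

LOG = logging.getLogger("dname-sketch")


def sanitize_dname(dname):
    """Replace every character outside A-Z, a-z, 0-9, '-' and '_' with '-'."""
    sanitized = re.sub(r"[^A-Za-z0-9_-]", "-", dname)
    if sanitized != dname:
        # Tell the user the name was changed rather than changing it silently.
        LOG.warning("dname %r contained invalid characters, using %r",
                    dname, sanitized)
    return sanitized


print(sanitize_dname("cciss/c0d0"))  # cciss-c0d0
print(sanitize_dname("sda"))         # sda
```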

DeeVee (deevee) wrote on 2016-07-11:

#53

It worked !!!!! Woohoo!

Thank you!

Leaving 'diversion of /etc/init/ureadahead.conf to /etc/init/ureadahead.conf.disabled by cloud-init'
Setting up swapspace version 1, size = 4 GiB (4294963200 bytes)
no label, UUID=6be45c8e-433b-41c3-a56d-fecee46d5003
Error: Partition(s) 1 on /dev/cciss/c0d0 have been written, but we have been unable to inform the kernel of the change, probably because it/they are in use. As a result, the old partition(s) will remain in use. You should reboot now before making further changes.
Replacing config file /etc/default/grub with new version
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-4.4.0-28-generic
Found initrd image: /boot/initrd.img-4.4.0-28-generic
done
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-4.4.0-28-generic
Found initrd image: /boot/initrd.img-4.4.0-28-generic
done
Installing for i386-pc platform.
Installation finished. No error reported.
--2016-07-11 15:35:11-- http://192.168.x.x/MAAS/metadata/latest/by-id/4y3h7y/
Connecting to 192.168.x.x:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: '/dev/null'

0K 67.1K=0s

2016-07-11 15:35:11 (67.1 KB/s) - '/dev/null' saved [2]

curtin: Installation finished.

Robin (robinrego) wrote on 2016-07-17:

#54

Deploying those HP nodes with 6i and P400 controllers works when deploying from the maas UI. However, I am not able to successfully complete an OpenStack Autopilot installation with the fix in ppa 418. The HP nodes continue to get a status of 'Failed deployment'.

python3-curtin:
Installed: 0.1.0~bzr418-0ubuntu1
I can provide logs, but I was wondering if there is something else that I should be aware of, such as whether I would need to modify any OpenStack files or repositories, or wait until the fix is applied to them.

Revision history for this message

Robin (robinrego) wrote on 2016-07-21:

#55

How can I force Xenial images to load on nodes during the OpenStack Autopilot install?
Currently, trusty images are loaded by default.

Revision history for this message

Andreas Hasenack (ahasenack) wrote on 2016-07-21:

#56

That's not configurable in the autopilot. You can upload xenial images (or any other) into the cloud yourself using standard openstack tools, like "glance" or the horizon GUI.

Revision history for this message

Robin (robinrego) wrote on 2016-07-30:

#57

I'm not quite sure if I have understood Andreas's answer to my previous question.

I am trying to carry out an openstack autopilot install on 6 machines.
M0 - IBM x3650 - Maas Server, Ubuntu 14.04 LTS
M1 - IBM x3650 - Node1
M2 - HP DL380 G5 - (P400 Raid controller) - Node 2
M3 - HP DL380 G5 - (P400 Raid controller) - Node 3
M4 - HP DL380 G4 - (6i Raid Controller) - Node 4
M5 - PC - node 5.

The reason for this Bug was that M2, M3 and M4 would fail deployment during the openstack cloud deployment.
This occurs after selecting the openstack components and then selecting hardware on which to deploy the cloud and then clicking on install.

I am using the PPA packages provided by Wesley, and these allow successful manual deployment from the MAAS UI only when I select 16.04 images for deployment (in the MAAS UI settings). However, during the autopilot install, Landscape deploys the same machines to install the cloud but uses a 14.04 image, and thus these HP nodes fail deployment.

My question is: is there a way to make these HP machines request a wily or xenial image for successful cloud deployment?

Thanks.

Wesley Wiedenmeier (wesley-wiedenmeier) on 2016-07-31

Changed in curtin:
status:	In Progress → Fix Committed

Ryan Harper (raharper) on 2016-10-03

description:

updated

Revision history for this message

Andy Whitcroft (apw) wrote on 2016-10-05: Please test proposed package

#58

Hello Robin, or anyone else affected,

Accepted curtin into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/curtin/0.1.0~bzr425-0ubuntu1~16.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags:

added: verification-needed

Andy Whitcroft (apw) on 2016-10-05

Changed in curtin (Ubuntu):
status:	New → Fix Released
Changed in curtin (Ubuntu Xenial):
status:	New → Fix Committed

Revision history for this message

Jon Grimm (jgrimm) wrote on 2016-10-06:

#59

Robin (robinrego), or anyone who has commented thus far that this affects them:

Any chance you could verify that the fix in trusty proposed works for you? See

https://bugs.launchpad.net/landscape/+bug/1562249/comments/58

Thanks!!

Revision history for this message

Jon Grimm (jgrimm) wrote on 2016-10-06:

#60

Err. I meant xenial-proposed.

Scott Moser (smoser) on 2016-10-07

Changed in curtin (Ubuntu Trusty):
importance:	Undecided → Medium
status:	New → Confirmed

Revision history for this message

Robin (robinrego) wrote on 2016-10-07:

#61

Working on testing trusty proposed .. but unsure how to enable trusty proposed on ubuntu server. I will read up and figure it out, but any help would be appreciated.

Thanks!!

Revision history for this message

Robin (robinrego) wrote on 2016-10-10:

#62

MaaS UI hp nodes fail deployment.PNG (54.3 KiB, image/png)

Hi Jon and Scott.. I tried to test the fix but I am unable to do so. I am looking for someone to help me with this. Here is what I tried so far.

First I was hoping to test this fix for ubuntu Server 14.04.5 LTS. Even after a fresh install, I had no success. So I figured the fix here is only for Xenial... and so I started over.

After a fresh install of Xenial server 16.04 LTS. I did the following to install the -proposed pkg.

sudo nano /etc/apt/sources.list:

added line -- > deb http://archive.ubuntu.com/ubuntu/ xenial-proposed restricted main multiverse universe

then, sudo nano /etc/apt/preferences.d/proposed-updates

Added the following lines -->

Package: *
Pin: release a=xenial-proposed
Pin-Priority: 400

then sudo apt-get upgrade

then .. robin@MaaS:~$ sudo apt-cache policy python3-curtin
python3-curtin:
  Installed: 0.1.0~bzr399-0ubuntu1~16.04.1
  Candidate: 0.1.0~bzr399-0ubuntu1~16.04.1
  Version table:
     0.1.0~bzr425-0ubuntu1~16.04.1 400
        400 http://archive.ubuntu.com/ubuntu xenial-proposed/main amd64 Packages
        400 http://archive.ubuntu.com/ubuntu xenial-proposed/main i386 Packages
*** 0.1.0~bzr399-0ubuntu1~16.04.1 500
        500 http://ca.archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
        500 http://ca.archive.ubuntu.com/ubuntu xenial-updates/main i386 Packages
        100 /var/lib/dpkg/status
     0.1.0~bzr365-0ubuntu1 500
        500 http://ca.archive.ubuntu.com/ubuntu xenial/main amd64 Packages
        500 http://ca.archive.ubuntu.com/ubuntu xenial/main i386 Packages

I am guessing I am not able to get the -proposed pkg installed and I must be doing something wrong. I tried to enlist, commission and deploy nodes with this .. and the result is that only the hp nodes fail deployment. See attached maas UI screenshot.

I am wondering if I need to install the file: curtin_0.1.0~bzr425.orig.tar.gz but I will need to learn how to do that over SSH connection or on the server directly.

Thanks.

Revision history for this message

Ryan Harper (raharper) wrote on 2016-10-11: Re: [Bug 1562249] Re: Failed to deploy machine with HP Smart Array Raid 6i

#63

On Sun, Oct 9, 2016 at 7:04 PM, Robin <email address hidden> wrote:

> Hi Jon and Scott.. I tried to test the fix but I am unable to do so. I
> am looking for someone to help me with this. Here is what I tried so
> far.
>
> First I was hoping to test this fix for ubuntu Server 14.04.5 LTS. Even
> after a fresh install, I had no success. So I figured the fix here is
> only for Xenial... and so I started over.
>

Thanks for giving this a try. Sorry for the confusion; you guessed
correctly; the update is to Xenial.

>
> After a fresh install of Xenial server 16.04 LTS. I did the following
> to install the -proposed pkg.
>
> sudo nano /etc/apt/sources.list:
>
> added line -- > deb http://archive.ubuntu.com/ubuntu/ xenial-proposed
> restricted main multiverse universe
>

All you should need to do on your MAAS host is:

echo "deb http://archive.ubuntu.com/ubuntu/ xenial-proposed restricted main multiverse universe" | sudo tee -a /etc/apt/sources.list
sudo apt update; sudo apt install curtin

> then, sudo nano /etc/apt/preferences.d/proposed-updates
>
> Added the following lines -->
>
> Package: *
> Pin: release a=xenial-proposed
> Pin-Priority: 400
>

The Pin config is not needed.

>
> then sudo apt-get upgrade
>

I typically specify the package; we don't need you to upgrade everything.

>
> then .. robin@MaaS:~$ sudo apt-cache policy python3-curtin
> python3-curtin:
> Installed: 0.1.0~bzr399-0ubuntu1~16.04.1
> Candidate: 0.1.0~bzr399-0ubuntu1~16.04.1
> Version table:
> 0.1.0~bzr425-0ubuntu1~16.04.1 400
> 400 http://archive.ubuntu.com/ubuntu xenial-proposed/main amd64
> Packages
> 400 http://archive.ubuntu.com/ubuntu xenial-proposed/main i386
> Packages
> *** 0.1.0~bzr399-0ubuntu1~16.04.1 500
> 500 http://ca.archive.ubuntu.com/ubuntu xenial-updates/main amd64
> Packages
> 500 http://ca.archive.ubuntu.com/ubuntu xenial-updates/main i386
> Packages
> 100 /var/lib/dpkg/status
> 0.1.0~bzr365-0ubuntu1 500
> 500 http://ca.archive.ubuntu.com/ubuntu xenial/main amd64 Packages
> 500 http://ca.archive.ubuntu.com/ubuntu xenial/main i386 Packages
>
> I am guessing I am not able to get the -proposed pkg installed ad I must
> be doing something wrong. I tried to enlist, coommission and deploy
> nodes with this .. and the result is that only the hp nodes fail
> deployment. see attached maas UI screenshot.
>

It appears that the updated curtin isn't installed. Try installing it directly
with sudo apt install curtin; your policy output shows it's available, but not
yet installed.

>
> I am wondering if I need to install the file:
> curtin_0.1.0~bzr425.orig.tar.gz but I will need to learn how to how to
> do that over SSH connection or on the server directly.
>

The curtin package needs to be updated only on the MAAS host; MAAS handles
getting curtin over to the ephemeral environment.

>
> Thanks.
>
>
>
> ** Attachment added: "MaaS UI hp nodes fail deployment.PNG"
> https://bugs.launchpad.net/landscape/+bug/1562249/+
> attachment/4758425/+files/MaaS%20UI%20hp%20nodes%20fail%20deployment.PNG
>
> --
> You received this bug notification because you are subscribed to curtin.
> Matching sub...
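
The policy output quoted above can also be read mechanically. In it, the xenial-proposed version carries priority 400 (from the pin file), while the installed version carries 500, so the candidate stays at the installed version. Below is a minimal Python sketch of that reading. The function names parse_policy and outranked_versions are hypothetical helpers, not part of apt or curtin, and the sample text is abbreviated from the output in this thread.

```python
import re

# Abbreviated sample taken from the policy output quoted in this thread.
SAMPLE = """\
python3-curtin:
  Installed: 0.1.0~bzr399-0ubuntu1~16.04.1
  Candidate: 0.1.0~bzr399-0ubuntu1~16.04.1
  Version table:
     0.1.0~bzr425-0ubuntu1~16.04.1 400
        400 http://archive.ubuntu.com/ubuntu xenial-proposed/main amd64 Packages
 *** 0.1.0~bzr399-0ubuntu1~16.04.1 500
        500 http://ca.archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
        100 /var/lib/dpkg/status
     0.1.0~bzr365-0ubuntu1 500
        500 http://ca.archive.ubuntu.com/ubuntu xenial/main amd64 Packages
"""

# A version row is "<version> <priority>", optionally prefixed by "***".
# Source rows ("500 http://... Packages") have extra fields and do not match.
ROW = re.compile(r"^[ \t]*(?:\*\*\*[ \t]+)?(\S+)[ \t]+(-?\d+)[ \t]*$", re.M)


def parse_policy(text):
    """Extract installed version, candidate version and the version table."""
    installed = re.search(r"^[ \t]*Installed:[ \t]*(\S+)", text, re.M).group(1)
    candidate = re.search(r"^[ \t]*Candidate:[ \t]*(\S+)", text, re.M).group(1)
    table = [(version, int(prio)) for version, prio in ROW.findall(text)]
    return {"installed": installed, "candidate": candidate, "table": table}


def outranked_versions(info):
    """Return versions whose priority is lower than the candidate's priority."""
    priorities = dict(info["table"])
    candidate_prio = priorities.get(info["candidate"])
    if candidate_prio is None:
        return []
    return [v for v, p in info["table"] if p < candidate_prio]


info = parse_policy(SAMPLE)
print(outranked_versions(info))
```

Run against the sample, this lists the bzr425 build as the only version held below the candidate, which matches Ryan's reading that the package is available but not installed.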

Revision history for this message

Robin (robinrego) wrote on 2016-10-11:

#64

I was able to install and test the updated curtin pkg "curtin 0.1.0~bzr425-0ubuntu1~16.04.1".

It resolves my problem, I am now able to have all the servers deployed.

Thank you Ryan and everyone else for the updated pkg and guidance in testing.

Robin (robinrego) on 2016-10-11

tags:

added: verification-done
removed: verification-needed

Revision history for this message

Robin (robinrego) wrote on 2016-10-12:

#65

Please let me know if an updated package of this 'curtin 0.1.0~bzr425' will be built for 14.04 trusty as well.

I would be happy to test it as well as test the Autopilot set up on this rig.

Thanks.

Revision history for this message

Launchpad Janitor (janitor) wrote on 2016-10-17:

#66

This bug was fixed in the package curtin - 0.1.0~bzr425-0ubuntu1~16.04.1

---------------
curtin (0.1.0~bzr425-0ubuntu1~16.04.1) xenial-proposed; urgency=medium

  [ Scott Moser ]
  * debian/new-upstream-snapshot: add writing of debian changelog entries.

  [ Ryan Harper ]
  * New upstream snapshot.
    - unittest,tox.ini: catch and fix issue with trusty-level mock of open
    - block/mdadm: add option to ignore mdadm_assemble errors (LP: #1618429)
    - curtin/doc: overhaul curtin documentation for readthedocs.org
      (LP: #1351085)
    - curtin.util: re-add support for RunInChroot (LP: #1617375)
    - curtin/net: overhaul of eni rendering to handle mixed ipv4/ipv6 configs
    - curtin.block: refactor clear_holders logic into block.clear_holders and
      cli cmd
    - curtin.apply_net should exit non-zero upon exception. (LP: #1615780)
    - apt: fix bug in disable_suites if sources.list line is blank.
    - vmtests: disable Wily in vmtests
    - Fix the unittests for test_apt_source.
    - get CURTIN_VMTEST_PARALLEL shown correctly in jenkins-runner output
    - fix vmtest check_file_strippedline to strip lines before comparing
    - fix whitespace damage in tests/vmtests/__init__.py
    - fix dpkg-reconfigure when debconf_selections was provided.
      (LP: #1609614)
    - fix apt tests on non-intel arch
    - Add apt features to curtin. (LP: #1574113)
    - vmtest: easier use of parallel and controlling timeouts
    - mkfs.vfat: add force flag for formating whole disks (LP: #1597923)
    - block.mkfs: fix sectorsize flag (LP: #1597522)
    - block_meta: cleanup use of sys_block_path and handle cciss knames
      (LP: #1562249)
    - block.get_blockdev_sector_size: handle _lsblock multi result return
      (LP: #1598310)
    - util: add target (chroot) support to subp, add target_path helper.
    - block_meta: fallback to parted if blkid does not produce output
      (LP: #1524031)
    - commands.block_wipe: correct default wipe mode to 'superblock'
    - tox.ini: run coverage normally rather than separately
    - move uefi boot knowledge from launch and vmtest to xkvm

-- Ryan Harper <email address hidden> Mon, 03 Oct 2016 13:43:54 -0500
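
The changelog entry "handle cciss knames" relates to how the kernel names these devices in sysfs: a device node such as /dev/cciss/c0d0 appears under /sys/block as cciss!c0d0, with "/" replaced by "!". That is why the lookup of /sys/block/c0d0/holders in the bug description failed. The following is a minimal Python sketch of that mapping for whole disks only. It is an illustration, not curtin's actual code, and the function names are hypothetical.

```python
def sys_block_dir(devpath, sysfs_root="/sys/block"):
    """Map a whole-disk device node to its directory under /sys/block.

    The kernel replaces '/' in a device name with '!' in sysfs, so
    /dev/cciss/c0d0 becomes /sys/block/cciss!c0d0. Partitions live in a
    subdirectory of their parent disk and are not handled by this sketch.
    """
    prefix = "/dev/"
    if not devpath.startswith(prefix):
        raise ValueError("expected a path under /dev: %r" % devpath)
    kname = devpath[len(prefix):].replace("/", "!")
    return sysfs_root + "/" + kname


def holders_dir(devpath):
    """Path of the 'holders' directory for a whole-disk device node."""
    return sys_block_dir(devpath) + "/holders"


print(holders_dir("/dev/cciss/c0d0"))
print(holders_dir("/dev/sda"))
```

For ordinary names such as /dev/sda the mapping is the identity, so only controllers that expose nested device names (like cciss) are affected.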

Changed in curtin (Ubuntu Xenial):
status:	Fix Committed → Fix Released

Revision history for this message

Martin Pitt (pitti) wrote on 2016-10-17: Update Released

#67

The verification of the Stable Release Update for curtin has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message

Scott Moser (smoser) wrote on 2017-12-15: Fixed in Curtin 17.1

#68

This bug is believed to be fixed in curtin 17.1. If this is still a problem for you, please make a comment and set the state back to New.

Thank you.

Changed in curtin:
status:	Fix Committed → Fix Released

Affects	Status	Importance	Assigned to
Landscape Server	Invalid	Undecided	Unassigned
MAAS	Invalid	Undecided	Unassigned
curtin	Fix Released	Undecided	Unassigned
curtin (Ubuntu)	Fix Released	Undecided	Unassigned
curtin (Ubuntu Trusty)	Confirmed	Medium	Unassigned
curtin (Ubuntu Xenial)	Fix Released	Undecided	Unassigned