Several panda boards have a bad master image

Bug #824178 reported by Paul Larson
Affects: LAVA Validation Lab
Status: Fix Released
Importance: Critical
Assigned to: Dave Pigott
Milestone: 2011.09

Bug Description

...or that's my best guess at least, based on what we saw at Connect. There was an image floating around at Connect with a new hwpack that had a bug where the kernel would not recognize eth0. This causes *big* problems when a job tries to send back its results and there is no eth0. Did the master images change? If not, then something is changing the master image kernel, which is really bad.

root@master:/tmp/lava/results# ifconfig -a
lo        Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING MTU:16436 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

sit0      Link encap:IPv6-in-IPv4
          NOARP MTU:1480 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

root@master:/tmp/lava/results# uname -a
Linux master 3.0.0-1002-linaro-omap #3~ppa~natty-Ubuntu SMP PREEMPT Fri Jul 29 01:18:40 UTC 2011 armv7l armv7l armv7l GNU/Linux

This particular example was on panda03.
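
For anyone who wants to check other boards for the same symptom, here is a minimal sketch in Python that scans `ifconfig -a` text for a given interface. The function names `interface_names` and `has_interface` are made up for illustration and are not part of LAVA. It assumes the net-tools layout shown above, where each interface block starts in column 0 and detail lines are indented.

```python
def interface_names(ifconfig_output):
    """Return the interface names found in `ifconfig -a` text.

    Assumption: interface blocks start in column 0 and their detail
    lines are indented, as in the transcript in this report.
    """
    names = []
    for line in ifconfig_output.splitlines():
        # Skip blank lines and indented detail lines.
        if line and not line[0].isspace():
            # Newer ifconfig builds print "eth0:"; drop a trailing colon.
            names.append(line.split()[0].rstrip(":"))
    return names


def has_interface(ifconfig_output, name="eth0"):
    """True if the named interface appears in the ifconfig text."""
    return name in interface_names(ifconfig_output)


# Abbreviated copy of the panda03 output from this report.
SAMPLE = """\
lo        Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0

sit0      Link encap:IPv6-in-IPv4
          NOARP MTU:1480 Metric:1
"""

print(interface_names(SAMPLE))  # ['lo', 'sit0']
print(has_interface(SAMPLE))    # False: eth0 is missing
```

On a healthy master image the same check should report eth0 as present.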

Paul Larson (pwlars)
Changed in lava-lab:
importance: Undecided → Critical
assignee: nobody → Dave Pigott (dpigott)
Paul Larson (pwlars) wrote :

Ok, I know what's happening now. First, here's the known list of affected boards:
panda01
panda02
panda03
[panda04 is down still, so I can't check]
beaglexm01
beaglexm03

We have a new test running now, bootchart, which calls flash_kernel. That installs a new kernel on the board's default partition even though we run the test in a chroot, so the master image kernel gets overwritten. I'm working on a workaround for this, and have disabled bootchart in the daily runs for now. In the meantime, we need to get the boards fixed, starting with the panda boards, because we currently have *no* panda boards available for testing.

Paul Larson (pwlars) wrote :

Update! I was able to remotely restore panda01, panda02, and beaglexm01. So we are now down just 2 panda boards and a beaglexm, and one of those pandas was already down.

Dave Pigott (dpigott) wrote :

beaglexm03 and panda03 seem to boot up into the master image just fine, and eth0 is up. I'll enable them and see if everything goes properly now.

As I said previously, panda04 has not returned from Connect, so it won't be back up until I have it in my hands and can install it in the farm.

Paul Larson (pwlars) wrote :

panda03 and panda04 still have a 3.0 kernel on them, and in particular, panda03 gets some strange errors sometimes. If you could reimage those, that would be good.
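
A rough sketch of how the kernel check could be scripted, in Python. The helper names `kernel_release` and `needs_reimage`, and the choice of "3.0." as the prefix that marks a bad master image, are assumptions for illustration based on the `uname -a` output earlier in this report. The sketch only inspects a `uname -a` line that has already been collected from the board.

```python
def kernel_release(uname_line):
    """Return the kernel release from a `uname -a` line.

    `uname -a` prints kernel name, hostname, then kernel release,
    so the release is the third whitespace-separated field.
    """
    fields = uname_line.split()
    if len(fields) < 3:
        raise ValueError("not a uname -a line: %r" % uname_line)
    return fields[2]


def needs_reimage(uname_line, bad_prefix="3.0."):
    """True if the release starts with the prefix treated as bad.

    Assumption: a 3.0 kernel on the master image means it was
    overwritten and the board should be reimaged.
    """
    return kernel_release(uname_line).startswith(bad_prefix)


# The panda03 line quoted in the bug description.
PANDA03 = ("Linux master 3.0.0-1002-linaro-omap #3~ppa~natty-Ubuntu SMP "
           "PREEMPT Fri Jul 29 01:18:40 UTC 2011 armv7l armv7l armv7l GNU/Linux")

print(kernel_release(PANDA03))  # 3.0.0-1002-linaro-omap
print(needs_reimage(PANDA03))   # True
```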

Changed in lava-lab:
milestone: none → 2011.09
Dave Pigott (dpigott) wrote :

I have re-flashed both and put them back online. There is some weirdness with the hwpacks: the last good one was from Aug 17th, and after that u-boot.bin is missing.

Paul Larson (pwlars) wrote :

I think this is complete, right? If not, please reopen and explain why, but I know all 4 of these boards are back online.

Changed in lava-lab:
status: New → Fix Released