Connectivity issues with BCM5722-based NIC on Dell T110ii server running 12.04 LTS

Bug #1048817 reported by Jeff Lane 
This bug affects 3 people
Affects                    Status        Importance  Assigned to  Milestone
firmware-nonfree (Debian)  Fix Released  Unknown
linux (Ubuntu)             Won't Fix     Medium      Unassigned

Bug Description

NOTE: I am NOT the requester; I am converting a question from answers.launchpad.net to a bug for the real requester: https://launchpad.net/~robert-grizilo

OS: Ubuntu 12.04 Server LTS x64
HW: Dell PowerEdge T110 ii

The integrated BCM5722 NIC has connectivity issues: from time to time it stops working and drops any running session (web, SSH, anything). Sometimes it comes back and sometimes it does not; while it is down you cannot connect to the server or even ping it.

I've tried disabling all BIOS IPMI features, but the problem didn't go away.
I've tried all speeds (10/100/1000) using ethtool, but the problem didn't go away.
I've tried disabling various offload features using ethtool, but the problem didn't go away.
I've installed a second NIC, an Intel 82579, and it works perfectly on the same network switch and cables.

After every reboot the BCM5722 works, but after a few minutes it starts behaving strangely again (the LEDs show the correct states) and there is no connectivity.
The server is brand new, so I think it is not a hardware issue.

Jeff Lane  (bladernr)
affects: ubuntu-certification → linux (Ubuntu)
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1048817

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: precise
Kent Baxley (kentb) wrote :

I'd also like to see logs from this system and more details about how to reproduce the problem. I have a T110ii that does not experience these problems, even with a GA release of 12.04. I've done a simultaneous download of files of size 3.2GB and 1.6GB via www and haven't had anything drop out yet. While those files were downloading, I also scp'ed some 3.2GB files over to another system. So far, so good.

Jeff Lane  (bladernr) wrote :

Hi Robert,

Let me summarize the information we need from you regarding your server and the NIC issue you're seeing.

1: You mention 12.04 on this. Have you run other previous versions of Ubuntu on this machine without seeing this issue? What versions?

2: Have you tried 12.10 (in development)?

3: What is the firmware level of your BCM5722 NIC? This sounds suspiciously like a firmware problem that has been seen in the past, so this could be a simple matter of updating the firmware on your Broadcom NIC. To get this information, just run the following command and paste the output into your reply:

ethtool -i eth0

That should get the info for your onboard NIC (make sure you haven't disabled it in the BIOS first). Since you installed that additional NIC, make sure you are pointing ethtool at the right device.

4: As Brad and Kent requested, please run the following: apport-collect 1048817
This will attach various system logs that will help us debug this issue.
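
A minimal shell sketch for item 3, assuming the onboard BCM5722 is the interface bound to the tg3 driver (as the ethtool output elsewhere in this report shows) and that the interface names eth0 and eth1 are only examples. The helper name ethtool_field is hypothetical; it simply picks one field out of `ethtool -i` output so the driver can be confirmed before reading the firmware level.

```shell
#!/bin/sh
# ethtool_field NAME: read `ethtool -i` output on stdin and print the
# value of the field called NAME (for example "driver").
# The function name is hypothetical, not part of ethtool.
ethtool_field() {
    # Fields look like "name: value"; split on ": " and match the name exactly.
    awk -v key="$1" -F': ' '$1 == key { print $2 }'
}

# Example (interface names are assumptions; adjust to your system):
#   for dev in eth0 eth1; do
#       printf '%s: ' "$dev"; ethtool -i "$dev" | ethtool_field driver
#   done
#   ethtool -i eth0 | ethtool_field firmware-version
```

The interface that reports the tg3 driver should be the onboard Broadcom NIC; its firmware-version line is the one worth pasting into the reply.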

Kent Baxley (kentb) wrote :

From what I can tell, the most current firmware for the Broadcom NIC is 7.2.14 for this system. I just updated mine from 6.2.1.

On my test T110ii, I have updated the Ubuntu OS to the latest 12.04.1 server code (kernel and all) in addition to updating the BCM NIC firmware. The BIOS on my T110ii is current at 2.0.5.

I've kicked off an iperf test on the T110ii. Iperf is basically a network benchmarking tool, but I'm going to let it run for a few hours to see if I drop any connections at any point along the way.
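
Alongside a throughput run like this, a simple reachability log can show when the link drops. The sketch below is not what was run here; it assumes the iputils `ping` (for the `-W` timeout option), and the function name watch_link, the peer address and the log file name are all hypothetical.

```shell
#!/bin/sh
# watch_link PEER INTERVAL COUNT
# Send one ping to PEER every INTERVAL seconds, COUNT times, and print a
# timestamped line for every probe that gets no reply.
watch_link() {
    peer=$1
    interval=$2
    count=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        # -c 1: single echo request; -W 2: wait at most 2 s for the reply
        if ! ping -c 1 -W 2 "$peer" >/dev/null 2>&1; then
            echo "$(date '+%Y-%m-%d %H:%M:%S') no reply from $peer"
        fi
        i=$((i + 1))
        sleep "$interval"
    done
}

# Example: probe a hypothetical gateway every 5 s, 3600 times (about 5 hours).
#   watch_link 192.168.1.1 5 3600 >> nic-drops.log
```

An empty log after the run would suggest the link stayed up; timestamps in the log can be matched against dmesg entries from the tg3 driver.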

Kent Baxley (kentb) wrote :

I kicked off an iperf test and let it run for about 5 hours on this server.

With the latest 12.04 kernel and latest firmware on the Broadcom card, I did not get any interruptions in service on the NIC during the test run.

On my T110ii here, things seem to be working OK with the onboard Broadcom 5722 NIC.

Kent Baxley (kentb) wrote :

Ethtool information for the NIC in my T110ii, which has so far *not* given me any problems:

Settings for eth0:
 Supported ports: [ TP ]
 Supported link modes: 10baseT/Half 10baseT/Full
                         100baseT/Half 100baseT/Full
                         1000baseT/Half 1000baseT/Full
 Supported pause frame use: No
 Supports auto-negotiation: Yes
 Advertised link modes: 10baseT/Half 10baseT/Full
                         100baseT/Half 100baseT/Full
                         1000baseT/Half 1000baseT/Full
 Advertised pause frame use: Symmetric
 Advertised auto-negotiation: Yes
 Speed: 1000Mb/s
 Duplex: Full
 Port: Twisted Pair
 PHYAD: 1
 Transceiver: internal
 Auto-negotiation: on
 MDI-X: Unknown
 Supports Wake-on: g
 Wake-on: d
 Current message level: 0x000000ff (255)
          drv probe link timer ifdown ifup rx_err tx_err
 Link detected: yes

driver: tg3
version: 3.121
firmware-version: 7.2.14 bc 5722-v3.11
bus-info: 0000:02:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes

Jeff Lane  (bladernr) wrote :

@Robert: another thought, have you tried different cables? What is the server plugged into? Could the port on your switch be going bad?

Changed in linux (Ubuntu):
importance: Undecided → Medium
Robert Grizilo (robert-grizilo) wrote : apport information

AlsaDevices:
 total 0
 crw-rw---T 1 root audio 116, 1 Sep 9 11:23 seq
 crw-rw---T 1 root audio 116, 33 Sep 9 11:23 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.0.1-0ubuntu5
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory
DistroRelease: Ubuntu 12.04
HibernationDevice: RESUME=UUID=45d339f3-112b-4955-8d0e-e943ca7194e8
InstallationMedia: Ubuntu-Server 12.04 LTS "Precise Pangolin" - Release amd64 (20120424.1)
MachineType: Dell Inc. PowerEdge T110 II
Package: linux (not installed)
PciMultimedia:

ProcEnviron:
 TERM=xterm
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB:

ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.2.0-23-generic root=UUID=2a6cd5aa-2acd-4039-a967-3e193c14f8f6 ro
ProcVersionSignature: Ubuntu 3.2.0-23.36-generic 3.2.14
RelatedPackageVersions:
 linux-restricted-modules-3.2.0-23-generic N/A
 linux-backports-modules-3.2.0-23-generic N/A
 linux-firmware 1.79
RfKill: Error: [Errno 2] No such file or directory
Tags: precise
Uname: Linux 3.2.0-23-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

dmi.bios.date: 03/13/2012
dmi.bios.vendor: Dell Inc.
dmi.bios.version: 2.0.5
dmi.board.name: 015TH9
dmi.board.vendor: Dell Inc.
dmi.board.version: A08
dmi.chassis.type: 17
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvr2.0.5:bd03/13/2012:svnDellInc.:pnPowerEdgeT110II:pvr:rvnDellInc.:rn015TH9:rvrA08:cvnDellInc.:ct17:cvr:
dmi.product.name: PowerEdge T110 II
dmi.sys.vendor: Dell Inc.

tags: added: apport-collected
Robert Grizilo (robert-grizilo) wrote : apport information (attachments)

AcpiTables.txt
BootDmesg.txt
CurrentDmesg.txt
IwConfig.txt
Lspci.txt
Lsusb.txt
ProcCpuinfo.txt
ProcInterrupts.txt
ProcModules.txt
UdevDb.txt
UdevLog.txt
WifiSyslog.txt

Robert Grizilo (robert-grizilo) wrote :

ethtool -i eth0
driver: tg3
version: 3.121
firmware-version: 6.2.1 bc 5722-v3.11
bus-info: 0000:02:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes

switched to 10Mbps - not one ping
switched to 100Mbps - 3 of 10 pings pass (randomly)
switched to 1000Mbps - not one ping
the switch is some Cisco GbE auto-everything model

with: ethtool -K eth0 rx off tx off - same behaviour
with: ethtool -K eth0 tso off - same behaviour
After limited testing it seems like a hardware problem:
the same UTP cable, moved on the server side to a no-name USB NIC (dm9601), works perfectly.
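
The speed-by-speed ping check above could be repeated more systematically with something like the following sketch. It assumes the summary line format printed by iputils `ping`, and it forces only 10 and 100 Mb/s, since gigabit over copper normally relies on autonegotiation. The names ping_loss and speed_sweep, the interface and the peer address are hypothetical. It should be run from the console, not over SSH on the NIC being tested.

```shell
#!/bin/sh
# ping_loss: read ping output on stdin and print the packet-loss
# percentage from its summary line (e.g. "70" for "70% packet loss").
ping_loss() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# speed_sweep IFACE PEER
# Force IFACE to 10 and then 100 Mb/s full duplex, ping PEER ten times
# at each speed, print the loss, then turn autonegotiation back on.
speed_sweep() {
    iface=$1
    peer=$2
    for speed in 10 100; do
        ethtool -s "$iface" speed "$speed" duplex full autoneg off
        sleep 5   # give the link time to come back up
        loss=$(ping -c 10 "$peer" | ping_loss)
        echo "$speed Mb/s: ${loss:-unknown}% loss"
    done
    ethtool -s "$iface" autoneg on
}

# Example (needs root; names are assumptions):
#   speed_sweep eth0 192.168.1.1
```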

Kent Baxley (kentb) wrote :

Thanks, Robert.

If updating the OS to the latest code and the firmware on the NIC to 7.2.14 doesn't help, then I'd recommend opening a service call on the system.

Changed in firmware-nonfree (Debian):
status: Unknown → New
Changed in firmware-nonfree (Debian):
status: New → Incomplete
Jeff Lane  (bladernr) wrote :

Unable to reproduce on hardware here. The firmware on the failing system (6.2.1) looks to be at least one or two levels behind. The system we have access to, with the latest firmware (7.2.14), works fine.
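
The gap between the two firmware levels reported in this bug can be checked mechanically. A sketch, assuming GNU `sort` with its `-V` (version sort) option; the helper names version_lt and fw_level are hypothetical, and it also assumes that the first word of the firmware-version field is the firmware level, which matches the ethtool output quoted in this report.

```shell
#!/bin/sh
# version_lt A B: succeed if dotted version A is strictly older than B.
# Relies on GNU sort's -V (version sort) option.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# fw_level: read `ethtool -i` output on stdin and print the first word of
# the firmware-version field, e.g. "6.2.1" from
# "firmware-version: 6.2.1 bc 5722-v3.11".
fw_level() {
    awk '$1 == "firmware-version:" { print $2 }'
}

# Example with the values from this report (eth0 is an assumption):
#   installed=$(ethtool -i eth0 | fw_level)
#   version_lt "$installed" 7.2.14 && echo "firmware $installed is older than 7.2.14"
```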

Changed in linux (Ubuntu):
status: Incomplete → Opinion
Jeff Lane  (bladernr) wrote :

The Debian bug for this hasn't been touched since January of last year, and this bug has sat idle about as long. Can we close this bug now?

Robert Grizilo (robert-grizilo) wrote :

Since the server was already in production, I've installed an Intel PCIe NIC because I had no more time to experiment, and that NIC is working without any problems. Thanks, folks, for all the suggestions.

Changed in linux (Ubuntu):
status: Opinion → Incomplete
Jeff Lane  (bladernr) wrote :

Changed to Won't Fix based on Robert's last comment and that we were unable to reproduce this on similar hardware.

Changed in linux (Ubuntu):
status: Incomplete → Won't Fix
Changed in firmware-nonfree (Debian):
status: Incomplete → Fix Released