High IO-Wait, load avgs and poor performance with E200i HW-RAIDed disks under 8.04.1, 8.10 server editions

Asked by Joonas Koivunen

Hi,

I've just begun installing our new server (HP ML350G5, integrated E200(i) controller without battery backed write cache, 5 disks in RAID 1+0 with one spare). Filesystems used are ext3, relatime.

On 8.04.1 Server edition I noticed I was getting very poor performance on bonnie++ writes. dstat and htop show that all 4 cores are 100% in WAIT, yet only 15-20MB/s peaks can be observed in dstat or the bonnie++ results.
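
For anyone who wants to check the WAIT figure without dstat or htop, here is a minimal sketch that derives the iowait share from two samples of the aggregate `cpu` line in /proc/stat. The function names are made up for illustration, and it assumes the usual field order (user, nice, system, idle, iowait, irq, softirq, steal).

```python
def parse_cpu_line(line):
    """Turn a /proc/stat 'cpu ...' line into a list of integer tick counters."""
    fields = line.split()
    return [int(value) for value in fields[1:]]


def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two 'cpu' line samples."""
    b = parse_cpu_line(before)
    a = parse_cpu_line(after)
    deltas = [y - x for x, y in zip(b, a)]
    # Only the first 8 counters are summed; guest time, if present,
    # is already accounted for in the user counter.
    total = sum(deltas[:8])
    if total == 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is iowait


def read_cpu_line(path="/proc/stat"):
    """Read the aggregate 'cpu' line (the first line of /proc/stat)."""
    with open(path) as f:
        return f.readline()
```

To use it, call `read_cpu_line()` twice a second or so apart while the benchmark is running and pass both results to `iowait_percent()`.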

HP, or their cciss (RAID drivers) team, don't seem to want to respond to any questions of this nature, so I decided to do a distribution upgrade to 8.10.

Now, in 8.10, I suddenly get 28-34MB/s writing peaks and the overall bonnie++ result has doubled. Nothing on the hardware settings was changed. But still, all 4 cores are 100% in WAIT.

It's like no DMA is being used when writing, but reading gives nice results, with peaks around 300MB/s that bonnie++ averages to 150MB/s.

Any ideas? Current kernel is 2.6.27-7-server, and I've tried (on this latest kernel) the deadline, anticipatory and cfq io-schedulers with no change in the speeds.
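
To confirm which scheduler is actually active after switching, the kernel exposes the list in sysfs with the current one in square brackets. Below is a small sketch that parses that format. The helper names are hypothetical, and the device name `cciss/c0d0` in the usage note is an assumption (the usual name of the first logical drive on a cciss controller), not something taken from this setup.

```python
def parse_scheduler(text):
    """Parse e.g. 'noop anticipatory deadline [cfq]' into (active, available)."""
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]  # the bracketed entry is the active scheduler
            active = name
        available.append(name)
    return active, available


def scheduler_path(device):
    """Build the sysfs path; sysfs replaces '/' in device names with '!'."""
    return "/sys/block/%s/queue/scheduler" % device.replace("/", "!")
```

For example, reading the file at `scheduler_path("cciss/c0d0")` and passing its contents to `parse_scheduler()` shows whether a change to deadline or cfq really took effect. Writing a scheduler name to the same file as root is how the switch is made at runtime.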

I'm running bonnie++ with -s 10000M (which equals twice the amount of RAM I have).

Also, writing with dd from a file that is completely in RAM (no reads are made) straight to a partition gives the same 5-15MB/s results.
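
A rough equivalent of that dd test can be sketched in Python: write a buffer that already sits in memory, force it to disk with fsync, and divide by the elapsed time. This is only an illustration under some assumptions: the function name and default sizes are made up, and it writes to an ordinary file (so the filesystem is involved) rather than straight to a partition as in the dd run above.

```python
import os
import time


def measure_write_mb_per_s(path, total_mb=64, block_kb=1024):
    """Write total_mb of zeros to path in block_kb chunks and return MB/s."""
    block = b"\0" * (block_kb * 1024)  # source data lives in RAM, so no reads
    blocks = (total_mb * 1024) // block_kb
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # make sure the data has left the page cache
    elapsed = time.perf_counter() - start
    written_mb = blocks * block_kb / 1024
    return written_mb / elapsed
```

As with bonnie++, the total size should be well above the amount of RAM if the fsync is removed, otherwise the page cache hides the real disk speed.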

Question information

Language: English
Status: Solved
For: Ubuntu
Assignee: No assignee
Solved by: Joonas Koivunen

Joonas Koivunen (joonas-koivunen) said :
#1

It seems that now that I've enabled the drives' write cache feature (which is inherently dangerous), I get writes of up to 250-300MB/s, with the same IO-waits as before.

That starts to sound more reasonable. It also sounds like I need to purchase that BBWC and a UPS, so that I can use both caches safely. Assuming there will be hard lockups :) There's quite a lot of data to be lost in a second at speeds this high, with the additional chance of filesystem failure.