Slow write performance

Asked by Rustam A

I'm testing a Swift 1.4.4 setup with 4 nodes/zones on RHEL 5.7 and have run into slow writes. I used swift-bench to generate load by writing hundreds of 4K files. Results:
 - Writes - ~3 PUTs/sec (very slow)
 - Reads - ~25 GETs/sec (ok)

The writes do not appear to be limited by I/O. To check that, I increased the file size to 8K, 32K and 64K. In every case, even at 64K, I got the same result: 3 PUTs/sec.

In the object server logs I can see that each PUT operation took around 0.3 sec, which is in line with the 3 PUTs/sec reported by swift-bench.
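As a rough sanity check on those numbers, here is a back-of-the-envelope sketch. It assumes requests are effectively handled one at a time from end to end; `expected_rate` is a hypothetical helper written for this post, not part of swift-bench.

```python
def expected_rate(latency_s, concurrency=1):
    """Requests per second if each of `concurrency` clients issues
    one request at a time and every request takes `latency_s` seconds.

    Assumption: no other bottleneck, so throughput = concurrency / latency.
    """
    return concurrency / latency_s


if __name__ == "__main__":
    # 0.3 s per PUT, one request in flight at a time
    print(round(expected_rate(0.3), 1))       # 3.3
    # the same latency with 10 requests truly in flight in parallel
    print(round(expected_rate(0.3, 10), 1))   # 33.3
```

If the benchmark is really issuing several PUTs in parallel, a measured rate this close to 1/0.3 would suggest that the requests are being serialized somewhere along the path.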

Is that the expected performance for such a small cluster? Can I break down the 0.3 seconds spent on a PUT operation to see where the bottleneck is?

Any advice for troubleshooting this is welcome.

Question information

Language: English
Status: Expired
For: OpenStack Object Storage (swift)
Assignee: No assignee

Rustam A (rustam-code) said :
#1

So far I have tried everything I could think of, but PUT speed remains very slow. To speed things up I tried various combinations of workers, disabled the disk mount check, enabled the DEBUG log level, read the source code to understand the logic, and finally added another node. No errors turned up, and the best I could squeeze out was 5 PUTs/sec.

I described the full setup here: http://paste.openstack.org/show/3885/

The only thing that looks suspicious to me is these errors:

Dec 18 04:01:28 ec01 object-server ERROR container update failed with 10.0.1.3:6001/d01 (saving for async update later): Timeout (3s) (txn: txdf95ad5a10844ee0b74d70d8a7638082)
Dec 18 04:01:28 ec01 object-server ERROR container update failed with 10.0.1.2:6001/d01 (saving for async update later): Timeout (3s) (txn: txee2545ba4610430fa3a6a166ca50c574)
Dec 18 04:01:28 ec01 object-server ERROR container update failed with 10.0.1.8:6001/d01 (saving for async update later): Timeout (3s) (txn: tx2546b29b15c643ec90a122a753dfddd3)

I don't really know where else to dig. Any help is appreciated.
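To see whether the container update timeouts cluster on one container server, the log lines can be tallied by target. This is a minimal sketch, assuming the log format is exactly as in the three lines quoted above; `count_update_timeouts` and the regular expression are my own illustration, not a Swift tool.

```python
import re
from collections import Counter

# Matches the object-server "container update failed" lines quoted above.
LINE_RE = re.compile(
    r"container update failed with "
    r"(?P<host>[\d.]+):(?P<port>\d+)/(?P<device>\S+)"
    r".*?Timeout \((?P<timeout>[\d.]+)s\)"
)


def count_update_timeouts(lines):
    """Count timed-out container updates per (host, device) target."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[(match.group("host"), match.group("device"))] += 1
    return counts


# The three log lines from this post, used as sample input.
SAMPLE = [
    "Dec 18 04:01:28 ec01 object-server ERROR container update failed with 10.0.1.3:6001/d01 (saving for async update later): Timeout (3s) (txn: txdf95ad5a10844ee0b74d70d8a7638082)",
    "Dec 18 04:01:28 ec01 object-server ERROR container update failed with 10.0.1.2:6001/d01 (saving for async update later): Timeout (3s) (txn: txee2545ba4610430fa3a6a166ca50c574)",
    "Dec 18 04:01:28 ec01 object-server ERROR container update failed with 10.0.1.8:6001/d01 (saving for async update later): Timeout (3s) (txn: tx2546b29b15c643ec90a122a753dfddd3)",
]

if __name__ == "__main__":
    for (host, device), n in sorted(count_update_timeouts(SAMPLE).items()):
        print(f"{host}/{device}: {n} timed-out container update(s)")
```

On a real log, pass the lines of the file instead of `SAMPLE`. A count spread evenly across all container servers would point away from a single bad node.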

Launchpad Janitor (janitor) said :
#2

This question was expired because it remained in the 'Open' state without activity for the last 15 days.