Resharding blocks write I/O
Ceph version 12.2.8 (Luminous)
Hi,
We are new to Ceph. The most recent issue I have come across, seemingly at random, is resharding causing a complete pause in writes to the cluster. We have 6 data nodes, 3 monitors (shared with 3 of the data nodes), and 3 RGW nodes, for a total of 9 physical nodes in this particular cluster.
When we run our intensive load tests, throughput is stable at 6.2 GBps (5.4 GBps writes / 900 MBps reads). The next morning, however, we saw that there had been an hour of disruption early in the morning. The errors are very clear and correlate perfectly with the disruption:
2019-04-12 09:26:40.649588 7f53f8e51700 0 NOTICE: resharding operation on bucket index detected, blocking
2019-04-12 09:26:40.660673 7f53ce5fc700 0 NOTICE: resharding operation on bucket index detected, blocking
2019-04-12 09:26:41.581099 7f53fae55700 0 NOTICE: resharding operation on bucket index detected, blocking
2019-04-12 09:26:46.088665 7f53f2644700 0 NOTICE: resharding operation on bucket index detected, blocking
2019-04-12 09:26:46.160134 7f53f7e4f700 0 NOTICE: resharding operation on bucket index detected, blocking
2019-04-12 09:27:25.659575 7f53f8e51700 0 block_while_
2019-04-12 09:27:25.671357 7f53ce5fc700 0 block_while_
2019-04-12 09:27:25.671641 7f53f8e51700 0 NOTICE: resharding operation on bucket index detected, blocking
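To make the correlation easier to check against the throughput graphs, the blocking notices can be tallied per minute. The following is only a minimal sketch: it assumes the log line layout shown in the excerpt above (timestamp, thread ID, log level, message), and the function name `count_reshard_blocks` is made up for illustration.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpt above:
# "2019-04-12 09:26:40.649588 7f53f8e51700 0 NOTICE: ..."
# Group 1 keeps the timestamp down to the minute, group 2 is the message.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2}\.\d+ \S+\s+-?\d+ (.*)$"
)
RESHARD_MSG = "resharding operation on bucket index detected, blocking"


def count_reshard_blocks(lines):
    """Return a Counter mapping 'YYYY-MM-DD HH:MM' to the number of blocking notices."""
    per_minute = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and RESHARD_MSG in match.group(2):
            per_minute[match.group(1)] += 1
    return per_minute
```

Passing an open log file (or any iterable of lines) to `count_reshard_blocks` and printing the sorted items gives one count per minute, which can then be lined up with the window in which writes stalled.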
I also see mentions of thresholds, which I'm unfamiliar with:
2019-04-12 10:36:44.158795 7f53dae15700 0 check_bucket_
The limit check output is as follows:
[
{
"user_id": "svc.cb1local",
"buckets": [
{
},
{
}
]
}
]
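For reading output of this shape, a minimal sketch follows. It assumes the JSON was produced by `radosgw-admin bucket limit check` and that each bucket entry carries a `fill_status` field with the value `OK` when the bucket is under its threshold. Neither the command nor that field appears in the output pasted above, so both are assumptions, and `buckets_over_limit` is a made-up helper name.

```python
import json


def buckets_over_limit(report_json):
    """Return a list of (user_id, bucket_entry) pairs whose fill_status is not 'OK'.

    Entries without a fill_status field (such as the empty objects in the
    output above) are skipped rather than guessed at.
    """
    flagged = []
    for user in json.loads(report_json):
        for bucket in user.get("buckets", []):
            # "fill_status" is an assumed field name, not shown in the post.
            status = bucket.get("fill_status")
            if status is not None and status != "OK":
                flagged.append((user.get("user_id"), bucket))
    return flagged
```

Run against the output as pasted, this returns an empty list, because the bucket entries carry no fields to inspect.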
If someone can provide a link or an explanation of what resharding is and how to deal with it, that would be most helpful. Apologies if this has been asked before - I didn't see anything when searching for the words "shard" or "reshard".
Thanks,
Aaron
Question information
- Language: English
- Status: Expired
- For: Ubuntu
- Assignee: No assignee