404 errors after rebalancing the object ring
I have a SAIO cluster with 30 HDDs, with weight=6000 on each device.
Then I expanded the cluster by adding 45 new HDDs of the same size, and set weight=300 for each of them (to limit the load caused by replication and rebalancing).
I waited one day and saw in my logs that the replication process was very slow:
-- 162/161696 (0.10%) partitions replicated in 21300.66s (0.01/sec, 5899h remaining) --
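As a sanity check on that log line, the estimate can be reproduced with simple arithmetic. This is only a minimal sketch: it assumes the remaining time is a linear extrapolation from the rate seen so far, and the function name `remaining_hours` is made up for illustration.

```python
def remaining_hours(done, total, elapsed_s):
    """Linearly extrapolate the remaining time from the progress so far."""
    rate = done / elapsed_s              # partitions replicated per second
    remaining_s = (total - done) / rate  # seconds left at that same rate
    return remaining_s / 3600.0

# Numbers taken from the replicator log line above.
hours = remaining_hours(162, 161696, 21300.66)
print(f"{hours:.0f}h remaining, about {hours / 24:.0f} days")
```

Under that assumption the result lines up with the figure in the log, which is well over half a year at the current rate.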
I have buckets with millions of small objects. I then decided to increase the weight of the new HDDs to 6000 and rebalance the ring. After this I noticed that the load on the new disks increased, but I ran into another problem: some objects were no longer accessible from storage, and debugging showed that the new object.ring.gz file pointed to the wrong paths for these objects.
Some debug output for the object location:
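To get a feel for how much data this weight change asks the cluster to move, here is a rough sketch. It assumes the ring assigns partitions roughly in proportion to device weight and ignores zone and replica placement constraints; `new_device_share` is a hypothetical helper, not part of Swift.

```python
def new_device_share(old_count, old_weight, new_count, new_weight):
    """Fraction of the total ring weight held by the newly added devices."""
    old_total = old_count * old_weight
    new_total = new_count * new_weight
    return new_total / (old_total + new_total)

# 30 original HDDs at weight 6000, 45 new HDDs at weight 300 and then 6000.
before = new_device_share(30, 6000, 45, 300)
after = new_device_share(30, 6000, 45, 6000)
print(f"new disks at weight=300:  {before:.1%} of total weight")
print(f"new disks at weight=6000: {after:.1%} of total weight")
```

Under that assumption the new disks go from roughly 7% to 60% of the total weight in a single rebalance, so a large share of the partitions is reassigned at once.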
---------
swift-get-nodes /etc/swift/
Account AUTH_d26a7f8b5e
Partition 63316 Hash f7545a8af6daa9b
Server:Port Device 10.0.0.2:6000 21
Server:Port Device 10.0.0.3:6000 18
Server:Port Device 10.1.0.1:6000 11
Server:Port Device 10.0.0.3:6000 4 [Handoff]
Server:Port Device 10.1.0.1:6000 23 [Handoff]
Server:Port Device 10.0.0.2:6000 15 [Handoff]
curl -g -I -XHEAD "http://
Use your own device location of servers:
such as "export DEVICE=/srv/node"
ssh 10.0.0.2 "ls -lah ${DEVICE:
-------------
I checked and found this object in another location:
10.0.0.2: /4/objects/
So it looks like the object exists, but the ring points to the wrong path for it, and I get 404 Not Found.
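To illustrate why the same object can look "missing" after a ring change: as far as I understand the on-disk layout, the object directory is built from the device, the partition, the last three characters of the name hash, and the hash itself. The sketch below is a simplified model of that, not Swift's own code; the hash is a placeholder, `/srv/node` is the default mentioned in the swift-get-nodes output, and it assumes the partition number is the same in both locations.

```python
import posixpath

def object_dir(devices_root, device, partition, name_hash):
    """Simplified model of the on-disk directory for an object."""
    suffix = name_hash[-3:]  # last three hex characters of the hash
    return posixpath.join(devices_root, str(device), "objects",
                          str(partition), suffix, name_hash)

# Placeholder hash for illustration only, not the real (truncated) one above.
EXAMPLE_HASH = "0123456789abcdef0123456789abcdef"

ring_says = object_dir("/srv/node", 21, 63316, EXAMPLE_HASH)  # primary in the new ring
found_on = object_dir("/srv/node", 4, 63316, EXAMPLE_HASH)    # device where the data was found
print(ring_says)
print(found_on)
```

Only the device component differs between the two paths, which is enough for a lookup that follows the new ring to miss data that has not been replicated to its new location yet.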
----
Then I restored the previous object.ring.gz file, and the paths to the objects were correct again.
So my question: do I need to wait until the replication process has finished (5899h!) and only then rebalance my ring with the new device weights?
Question information
- Language: English
- Status: Expired
- Assignee: No assignee