Write per-zone (datacentre)

Asked by Ben Rowland on 2012-12-12

Hi,

I'm evaluating Swift and have a question about the Ring Builder and zones.

Say I have 2 datacentres and I want object PUTs to always be made synchronously to both DCs, but there are potentially many devices in any one DC. I could define the 2 DCs as separate zones, where (for simplicity) zone 1 has one device and zone 2 has two devices. With a replica count of 3, a majority of writes (2 of 3) would then be forced to span the 2 zones. So I create a simple ring (with a partition power of 4, i.e. 16 partitions, for simplicity) as follows:

swift-ring-builder /tmp/ring create 4 3 1
swift-ring-builder /tmp/ring add z1-<server1>:8888/sda1 100.0
swift-ring-builder /tmp/ring add z2-<server2>:8888/sda1 100.0
swift-ring-builder /tmp/ring add z2-<server3>:8888/sda1 100.0
swift-ring-builder /tmp/ring rebalance
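(The create arguments are part_power, replicas and min_part_hours; a quick sanity check of the partition count, as a sketch rather than Swift code:)

```python
# swift-ring-builder ... create <part_power> <replicas> <min_part_hours>
part_power, replicas, min_part_hours = 4, 3, 1

# A part power of 4 gives 2**4 = 16 partitions, matching the arrays below.
assert 2 ** part_power == 16
```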

The abbreviated output of unpickling the ring:

array('H', [1, 2, 2, 0, 2, 0, 0, 0, 0, 2, 1, 2, 2, 1, 1, 1]),
array('H', [0, 0, 0, 1, 0, 2, 2, 1, 1, 0, 0, 0, 0, 0, 0, 0]),
array('H', [2, 1, 1, 2, 1, 1, 1, 2, 2, 1, 2, 1, 1, 2, 2, 2])
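Mapping the device ids back to zones (the devices get ids 0, 1 and 2 in the order they were added above, so id 0 is in zone 1 and ids 1 and 2 are in zone 2), the layout can be checked with a short sketch:

```python
# Device id -> zone, per the add order in the commands above.
zone_of = {0: 1, 1: 2, 2: 2}

# The three arrays from the unpickled ring: one row per replica,
# one column per partition, each entry a device id.
replica2part2dev = [
    [1, 2, 2, 0, 2, 0, 0, 0, 0, 2, 1, 2, 2, 1, 1, 1],
    [0, 0, 0, 1, 0, 2, 2, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    [2, 1, 1, 2, 1, 1, 1, 2, 2, 1, 2, 1, 1, 2, 2, 2],
]

for part in range(16):
    zones = sorted(zone_of[row[part]] for row in replica2part2dev)
    # Every partition lands with one replica in zone 1 and two in zone 2.
    assert zones == [1, 2, 2]
```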

This seems surprising as the documentation says "For a given partition number, each replica’s device will not be in the same zone as any other replica’s device." But here each partition has a replica at both of the devices in zone 2.

Does Swift do anything in this case to ensure that during writes, the majority of 2 nodes actually reside in different zones? For example, the single device in zone 1 could be unavailable - this would leave a majority of 2 nodes available in zone 2, so the write would only be done there?

Many thanks,

Ben

Question information

Language: English
Status: Solved
For: OpenStack Object Storage (swift)
Assignee: No assignee
Solved by: Samuel Merritt
Solved: 2012-12-13
Last query: 2012-12-13
Last reply: 2012-12-12
Best answer: Samuel Merritt (torgomatic) said: #1

Well, you've got 2 zones and 3 replicas, so you pretty much have to have >1 replica within a zone. That's just the pigeonhole principle at work. The docs should probably say something like "For a given partition number, each replica's device will not be in the same zone as any other replica's device, *as long as there are at least as many zones as replicas*."

The quorum of write nodes is not distributed in any particular way, so it's true that a write could go entirely to zone 2 if the device in zone 1 is down. Once the fault in zone 1 is repaired, replication will ensure that a copy of the object makes its way to zone 1.
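For concreteness, the majority arithmetic works out like this (a sketch of the quorum rule, floor(replicas / 2) + 1; the helper name here is illustrative, not necessarily the exact function Swift uses in your version):

```python
def quorum_size(replicas: int) -> int:
    """Smallest number of successful writes that forms a majority."""
    return replicas // 2 + 1

# With 3 replicas, a quorum is 2 successful writes.
assert quorum_size(3) == 2

# Replicas are split 1/2 across zones 1 and 2, so if the single device
# in zone 1 is down, the two devices in zone 2 alone can satisfy the
# quorum; replication later restores the copy in zone 1.
```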

Ben Rowland (ben-rowland) said : #2

Thanks Samuel Merritt, that solved my question.