Failover principle of Swift

Asked by Fatih Güçlü Akkaya

Hi,

During my tests I came across the following behaviour:

My environment consists of 1 proxy node and 6 storage nodes (account/container/object). I shut down three of the storage nodes and tried some GET requests for certain objects. I found that the Swift proxy kept trying to reach the nodes that were down and therefore received connection timeouts. I guess I did not configure my ring properly for failover. What am I missing in my configuration? You can see my configuration below:

Ring configuration: partition power 18, replica count 3.

account-server.conf

[DEFAULT]
bind_ip = 0.0.0.0
workers = 8

[pipeline:main]
pipeline = account-server

[app:account-server]
use = egg:swift#account

[account-replicator]
run_pause=900

[account-auditor]

[account-reaper]

container-server.conf

[DEFAULT]
bind_ip = 0.0.0.0
workers = 8

[pipeline:main]
pipeline = container-server

[app:container-server]
use = egg:swift#container

[container-replicator]
run_pause=900

[container-updater]

[container-auditor]

object-server.conf

[DEFAULT]
bind_ip = 0.0.0.0
workers = 8

[pipeline:main]
pipeline = object-server

[app:object-server]
use = egg:swift#object

[object-replicator]
run_pause=900
ring_check_interval=900

[object-updater]

[object-auditor]

Question information

Language: English
Status: Solved
For: OpenStack Object Storage (swift)
Assignee: No assignee
Solved by: Fatih Güçlü Akkaya

This question was reopened

Fatih Güçlü Akkaya (gucluakkaya) said (#2):

Sorry for the previous comment; I accidentally pressed the Solved button. Here are my account, container and object rings.

Account ring (port 6002):
id zone ip address port name weight partitions balance meta
 0    1 ip1        6002 sdb1 100.00     131072    0.00
 1    2 ip2        6002 sdb1 100.00     131072    0.00
 2    3 ip3        6002 sdb1 100.00     131072    0.00
 3    4 ip4        6002 sdb1 100.00     131072    0.00
 4    5 ip5        6002 sdb1 100.00     131072    0.00
 5    6 ip6        6002 sdb1 100.00     131072    0.00

Container ring (port 6001):
id zone ip address port name weight partitions balance meta
 0    1 ip1        6001 sdb1 100.00     131072    0.00
 1    2 ip2        6001 sdb1 100.00     131072    0.00
 2    3 ip3        6001 sdb1 100.00     131072    0.00
 3    4 ip4        6001 sdb1 100.00     131072    0.00
 4    5 ip5        6001 sdb1 100.00     131072    0.00
 5    6 ip6        6001 sdb1 100.00     131072    0.00

Object ring (port 6000):
id zone ip address port name weight partitions balance meta
 0    1 ip1        6000 sdb1 100.00     131072    0.00
 1    2 ip2        6000 sdb1 100.00     131072    0.00
 2    3 ip3        6000 sdb1 100.00     131072    0.00
 3    4 ip4        6000 sdb1 100.00     131072    0.00
 4    5 ip5        6000 sdb1 100.00     131072    0.00
 5    6 ip6        6000 sdb1 100.00     131072    0.00

clayg (clay-gerrard) said (#3):

If you only stop two nodes, do you still have problems?

It seems like, on a three-replica system with 50% of the cluster offline, there are going to be some objects that only exist on the downed nodes?
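As a rough back-of-the-envelope estimate (assuming each partition's three replicas land in three distinct zones chosen roughly uniformly from your six single-device zones):

from math import comb

replicas = 3   # replica count from the ring
zones = 6      # one device per zone in this cluster
down = 3       # storage nodes stopped during the test

# Fraction of partitions whose replicas all sit on the downed nodes.
all_replicas_down = comb(down, replicas) / comb(zones, replicas)
print(all_replicas_down)  # 0.05 -> roughly 1 in 20 partitions

Any object in one of those partitions can only be read again once at least one of its nodes comes back.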

-clayg

Fatih Güçlü Akkaya (gucluakkaya) said (#4):

Thank you for your answer. It seems our application that inserts and retrieves containers had a problem of its own. You are right that with 50% of the cluster offline some objects cannot be retrieved. After more tests I verified that failover works properly: if one node is down, Swift will look for another node.
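
For anyone checking the same thing, here is a rough sketch of how to see which primary (and handoff) nodes a given object maps to, using Swift's Python ring API. The ring path and the account/container/object names below are just placeholders; the swift-get-nodes command-line tool gives the same information.

from swift.common.ring import Ring

# Placeholder path and names -- substitute your own ring file and object.
ring = Ring('/etc/swift/object.ring.gz')
part, nodes = ring.get_nodes('AUTH_test', 'mycontainer', 'myobject')

print('partition:', part)
for node in nodes:
    # Primary replicas: the proxy contacts these first.
    print('primary:', node['ip'], node['port'], node['device'])

# Handoff nodes the proxy can fall back to when primaries are unreachable.
for node in ring.get_more_nodes(part):
    print('handoff:', node['ip'], node['port'], node['device'])
    break  # just the first handoff, for illustration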

Sorry for the inconvenience.