Gearman Jobserver Failover

Asked by Felix Gorodishter on 2010-05-24

It appears that libgearman currently does not fail over properly to a second job server.

I run two Gearman job servers, and if I take the first one down, jobs simply fail with this error:
   gearman_connection_flush:could not connect

(For what it's worth, I use the Gearman PHP extension, currently version 0.7.0, to interface with Gearman.)

The error appears to come from line 488 of libgearman/connection.c, where connection->addrinfo_next is NULL.

Any advice would be much appreciated.

Question information

Language: English
Status: Answered
For: Gearman
Assignee: No assignee
Last query: 2010-07-11
Last reply: 2010-11-19

timuckun (timuckun) said : #1

Has this been fixed in a subsequent version?

This appears to still be a problem in the current release.

Howard Ha (bluespire) said : #3

I have the exact same problem as of Nov 19 2010, using the PHP PECL extension 0.7.0. The full error I get is:

 gearman_connection_flush:lost connection to server (32).

I have additional observations about this:

1) The order in which you add servers determines which server is ignored. It appears that the server that must not go down is the last of the two servers added.
2) To reproduce: call addServer() for server A and then for server B. If you shut down gearmand on server A, everything works fine. If you shut down gearmand on server B, you get the error.
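
One possible client-side workaround, which is not something confirmed in this thread and does not change libgearman's own behaviour, is to probe each job server before registering it and only add the ones that accept a connection. The sketch below uses Python only to illustrate the idea. The helper name reachable_servers is hypothetical, and the probe is a plain TCP connect, which shows that something is listening but not that gearmand is healthy.

```python
import socket


def reachable_servers(servers, timeout=1.0):
    """Return the (host, port) pairs that accept a TCP connection.

    Hypothetical helper: a successful connect only shows that a listener
    exists on that port, not that it is a working gearmand.
    """
    alive = []
    for host, port in servers:
        try:
            # Open and immediately close a connection as a liveness probe.
            with socket.create_connection((host, port), timeout=timeout):
                alive.append((host, port))
        except OSError:
            # Refused, unreachable, or timed out: skip this server.
            pass
    return alive
```

The same check could be done in PHP before each addServer() call, so that a server that is down is never registered with the client. A server that goes down after the check would still hit the error described above.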
