SIGTERM signal handler in Swift services

Asked by Anuj

Hello,

I have configured container-server to create 2 worker processes at startup. The container-server listens on 0.0.0.0, port 7001. I used the command below to start the service.

$> swift-init container-server start

swift 6841 1 0 01:01 ? 00:00:00 /usr/bin/python /usr/bin/swift-container-server /fs/container-server.conf
swift 6941 6841 0 01:01 ? 00:00:00 /usr/bin/python /usr/bin/swift-container-server /fs/container-server.conf
swift 6945 6841 0 01:01 ? 00:00:00 /usr/bin/python /usr/bin/swift-container-server /fs/container-server.conf

I ran into a problem when I explicitly killed PID 6841 with the "kill -9 6841" command. PID 6841 terminates, but its child processes 6941 and 6945 stay alive. As a result, I get the error below when restarting container-server: it cannot bind to 0.0.0.0:7001 again.

fstat(3, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
lseek(3, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
read(3, "Traceback (most recent call last):\n", 8192) = 35
read(3, " File \"/usr/bin/swift-container-server\", line 22, in <module>\n run_wsgi(conf_file, 'container-server', default_port=6001, **options)\n", 8157) = 137
read(3, " File \"/usr/lib/python2.6/site-packages/swift/common/wsgi.py\", line 125, in run_wsgi\n", 8020) = 86
read(3, " sock = get_socket(conf, default_port=kwargs.get('default_port', 8080))\n", 7934) = 75
read(3, " File \"/usr/lib/python2.6/site-packages/swift/common/wsgi.py\", line 93, in get_socket\n", 7859) = 87
read(3, " bind_addr[0], bind_addr[1], bind_timeout))\n", 7772) = 47
read(3, "Exception: Could not bind to 0.0.0.0:7001 after trying for 30 seconds\n", 7725) = 70

I observed the same behavior for object-server and account-server. Does this mean that the SIGTERM signal is not handled?

Thanks,
-Anuj

Question information

Language: English
Status: Solved
For: OpenStack Object Storage (swift)
Assignee: No assignee
Solved by: Anuj

clayg (clay-gerrard) said:
#1

I think "kill -9" sends SIGKILL, not SIGTERM. SIGKILL cannot be caught, so there's no way to "not implement" it that I know of, and the parent never gets a chance to clean up its children.

For some implementations of kill (sometimes it's a shell builtin) you can specify the pid as a negative number to send the signal to the whole process group.
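
As a rough illustration of the process-group approach, here is a minimal Python sketch. It is not taken from Swift: the helper names `terminate_group` and `demo_group_signal` and the sleeping stand-in process are made up for the example. `os.killpg(pgid, sig)` does from Python what `kill -TERM -<pgid>` does from the shell.

```python
import os
import signal
import subprocess
import sys

# Hypothetical stand-in for a server parent: it forks one worker, then both sleep.
STAND_IN = "import os, time; os.fork(); time.sleep(60)"


def terminate_group(pgid, sig=signal.SIGTERM):
    """Send sig to every process in the process group pgid."""
    os.killpg(pgid, sig)  # equivalent to os.kill(-pgid, sig)


def demo_group_signal():
    # start_new_session=True puts the stand-in in its own session and process
    # group, so signalling the group does not hit this script.
    parent = subprocess.Popen([sys.executable, "-c", STAND_IN],
                              start_new_session=True)
    terminate_group(os.getpgid(parent.pid))
    # A negative return code means the process was killed by that signal number.
    return parent.wait()


if __name__ == "__main__":
    print("parent exit status:", demo_group_signal())
```

This only helps while the processes still share a process group, which I am assuming is the case for the parent and its workers.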

I think swift-orphans might help you out. Alternatively, you can send SIGTERM, which is handled in common.wsgi and common.daemon and should attempt to reap the child processes.
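
To show the general idea of a parent that handles SIGTERM and reaps its workers, here is a small Python sketch. It is not Swift's actual code from common.wsgi or common.daemon; the function name `supervise` and the sleeping workers are assumptions made for illustration only.

```python
import os
import signal
import time


def supervise(num_workers=2):
    """Fork workers; on SIGTERM, forward the signal to them and reap them."""
    stop = {"requested": False}

    def on_sigterm(signum, frame):
        # Only record the request; the main loop does the actual work.
        stop["requested"] = True

    previous = signal.signal(signal.SIGTERM, on_sigterm)
    children = []
    try:
        for _ in range(num_workers):
            pid = os.fork()
            if pid == 0:
                # Worker: restore default SIGTERM behaviour, then pretend to serve.
                signal.signal(signal.SIGTERM, signal.SIG_DFL)
                time.sleep(60)
                os._exit(0)
            children.append(pid)

        while not stop["requested"]:
            time.sleep(0.1)  # polling keeps the sketch free of pause() races

        for pid in children:
            try:
                os.kill(pid, signal.SIGTERM)
            except ProcessLookupError:
                pass  # worker already gone
        # waitpid() reaps each worker and returns its raw exit status.
        return [os.waitpid(pid, 0)[1] for pid in children]
    finally:
        if previous is not None:
            signal.signal(signal.SIGTERM, previous)
```

With a handler along these lines, a plain "kill <pid>" (SIGTERM) lets the parent take its workers down with it, whereas "kill -9" gives the parent no chance to run any of this, which is why the workers are left holding the port.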

Anuj (anuj-garg) said :
#2

Thanks, swift-orphans can help me here.