attaching to percona-server with gdb disconnects clients

Bug #805805 reported by Greg Hazel
This bug affects 2 people
Affects                                                       Status        Importance  Assigned to        Milestone
Percona Server moved to https://jira.percona.com/projects/PS  Fix Released  High        Laurynas Biveinis
5.1                                                           Fix Released  High        Laurynas Biveinis
5.5                                                           Fix Released  High        Laurynas Biveinis

Bug Description

Simply attaching gdb to a running Percona mysqld instance (even when detaching immediately) causes Percona Server to disconnect clients. The same happens if you attach gdb and then continue. Clients get "Lost connection to MySQL server during query", but reconnecting works fine and the server seems to be healthy afterwards. Testing with telnet shows the socket being closed when gdb attaches. This bug makes it impossible to use "pstack", the "poor man's profiler", or even Aspersa's "connect" tool without disconnecting clients.

Percona-server 5.5.13 (installed from yum repo), CentOS 5.6 (running on EC2; also happens with the Amazon AMI), gdb 7.0.1

Interestingly, it does not happen with original mysqld (5.5.10 from webtatic or 5.5.13 built from source).

Tags: gdb

description: updated
Changed in percona-server:
assignee: nobody → Valentine Gostev (longbow)
Changed in percona-server:
importance: Undecided → Low
Alexey Kopytov (akopytov) wrote :

It looks rather serious. Why was it set to low importance?

Changed in percona-server:
importance: Low → High
Valentine Gostev (longbow) wrote :

reproduced with 5.5.17

Changed in percona-server:
status: New → In Progress
Valentine Gostev (longbow) wrote :

It looks like gdb sends SIGSTOP and SIGCONT signals to the mysqld process while attaching. PS drops connections, while vanilla MySQL does not.

Another way to reproduce:
kill -STOP $mysqld_pid
kill -CONT $mysqld_pid
PS interrupts the query; vanilla MySQL continues running it.
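
The same STOP/CONT sequence can be scripted for repeated testing. What follows is a minimal Python sketch and not part of the original report: the function name stop_and_continue and its pause parameter are made up for illustration, and it assumes the caller has permission to signal the target process.

```python
import os
import signal
import time


def stop_and_continue(pid, pause=0.1):
    """Send SIGSTOP, wait briefly, then send SIGCONT to the given process.

    Mirrors `kill -STOP $pid; kill -CONT $pid`. The pause is an
    illustrative assumption so that the stop is observable.
    """
    os.kill(pid, signal.SIGSTOP)
    time.sleep(pause)
    os.kill(pid, signal.SIGCONT)
```

Calling it with the mysqld pid while a client query is running should, on an affected build, produce the same "Lost connection" error on the client side.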

Changed in percona-server:
status: In Progress → Confirmed
Takenori Akagi (anonimo) wrote :

Confirmed on 5.5

Changed in percona-server:
status: Confirmed → Triaged
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

Tested on both Percona Server 5.5.22 and MySQL 5.5.22; the bug is confirmed on Percona Server 5.5.22.

The following patch fixes it:
==================================

--- sql/net_serv.cc 2012-05-06 23:20:51.968530130 +0530
+++ /tmp/net_serv.cc 2012-05-06 21:11:01.505038180 +0530
@@ -835,7 +835,7 @@

          DBUG_PRINT("info",("vio_read returned %ld errno: %d",
                             (long) length, vio_errno(net->vio)));
-#if !defined(NO_ALARM) && (!defined(__WIN__) || defined(MYSQL_SERVER))
+#if !defined(__WIN__) || defined(MYSQL_SERVER)
          /*
            We got an error that there was no data on the socket. We now set up
            an alarm to not 'read forever', change the socket to non blocking

====================================================

Tested the patch as well.

Raghavendra D Prabhu (raghavendra-prabhu) wrote :

The way I discovered it is as follows:

1. I ran strace while sending STOP/CONT signals; here is the strace: https://gist.github.com/d5deace9f70fc0514083

   I compared it to the strace for MySQL 5.5.22 and noticed that a shutdown(2) is called on the new-sock spawned for the new connection -- https://gist.github.com/d5deace9f70fc0514083#gistcomment-305803

2. Next, I attached gdb to the same process, with the same STOP/CONT signals and mysql client; here is the gdb backtrace -- https://gist.github.com/c4a76c2342c50088d6bd

3. Next, I stepped through do_handle_one_connection and noticed the following:

  for (;;)
  {
    bool rc;
    bool create_user= TRUE;

    rc= thd_prepare_connection(thd);
    if (rc)
    {
      create_user= FALSE;
      goto end_thread;
    }

    while (thd_is_connection_alive(thd))
    {
      mysql_audit_release(thd);
      if (do_command(thd)) /* <-- it returned here, causing the connection loop to exit on the signal */
        break;
    }

4. So I compared the PS and MySQL code for the functions that do_command calls and noticed the difference, !defined(NO_ALARM), which I posted as the patch.

5. Tested it with the change and it seems to work.
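
The behaviour traced in the steps above comes down to whether an interrupted read is retried or treated as a fatal error. The sketch below is a generic Python illustration of retry-on-EINTR, not the logic in net_serv.cc; read_with_retry, read_once and max_retries are hypothetical names. Since Python 3.5 the interpreter normally retries interrupted system calls itself, so read_once here stands in for a raw read that surfaces EINTR to the caller.

```python
def read_with_retry(read_once, max_retries=10):
    """Call read_once() until it returns, retrying when it is interrupted.

    read_once is any zero-argument callable that performs one read and
    raises InterruptedError (errno EINTR) if a signal interrupted it.
    """
    retries = 0
    while True:
        try:
            return read_once()
        except InterruptedError:
            # EINTR means the call was interrupted, not that the peer
            # went away, so try again instead of closing the connection.
            retries += 1
            if retries > max_retries:
                raise
```

Bounding the retries is a design choice in this sketch so that a read that keeps getting interrupted eventually surfaces as an error.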

Laurynas Biveinis (laurynas-biveinis) wrote :

Raghu, thanks for your excellent analysis and fix.

Valentine, would it be easy to create a regression test for the test suite based on this? If it's not very easy, IMHO we can fix without the test in this case.

Stewart Smith (stewart)
Changed in percona-server:
assignee: Valentine Gostev (longbow) → nobody
Laurynas Biveinis (laurynas-biveinis) wrote :

Not reproducible on 5.1. The MTR testcase for this bug is the same as for bug 1060136:

--source include/not_windows.inc

let $mysqld_pid_file=`SELECT @@GLOBAL.pid_file`;

system kill -STOP `cat $mysqld_pid_file`;
system kill -CONT `cat $mysqld_pid_file`;

# Server gone!
SELECT 2+2;

Laurynas Biveinis (laurynas-biveinis) wrote :

By code analysis, attaching GDB to mysqld while it is writing to a socket will result in the same issue. This is harder to reproduce, though.

Mario Splivalo (mariosplivalo) wrote :

This is easy to reproduce - issue 'select sleep(90)' in mysql-cli, and then resize the terminal.

Laurynas Biveinis (laurynas-biveinis) wrote :

Mario -

This bug concerns the server not being able to handle SIGSTOP/SIGCONT. The terminal resize issue is the client not handling SIGWINCH correctly; that is bug 925343, fixed in the upcoming release.

Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports, so this bug report has been migrated to: https://jira.percona.com/browse/PS-489
