query optimizer ignores secondary indexes after restoration, at least until OPTIMIZE TABLE

Asked by Remi Broemeling on 2013-04-09

Software Involved:
 - Debian 6.0.7
 - Percona Server 5.5.29-rel30.0-451.squeeze
 - Percona XtraBackup 2.0.6-521.squeeze

We've recently begun experiencing issues with our innobackupex backups.

Specifically, after restoration, the server has massive problems answering simple queries (queries that *should* take a fraction of a second instead take more than a minute).
After looking into this, I've found that the optimizer is outright ignoring some of the restored secondary indexes.

For example, given the query:

SELECT COUNT(DISTINCT `contacts`.`id`)
  FROM `contacts`
  INNER JOIN `groups` ON `contacts`.`group_id` = `groups`.`id`
  INNER JOIN `user_group_memberships` ON `groups`.`id` = `user_group_memberships`.`group_id`
  INNER JOIN `users` ON `user_group_memberships`.`user_id` = `users`.`id`
WHERE `contacts`.`type` IN ('Company') AND `users`.`account_id` = 0123456789 \G

Directly after a restoration, EXPLAIN shows:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: contacts
         type: ALL
possible_keys: index_contacts_on_group_id_and_updated_at,index_contacts_on_group_id_and_number
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1
        Extra: Using where
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: groups
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: contacts.group_id
         rows: 1
        Extra: Using index
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: user_group_memberships
         type: ref
possible_keys: index_user_group_memberships_on_group_id,index_user_group_memberships_on_user_id_and_group_id
          key: index_user_group_memberships_on_group_id
      key_len: 5
          ref: groups.id
         rows: 1
        Extra: Using where
*************************** 4. row ***************************
           id: 1
  select_type: SIMPLE
        table: users
         type: eq_ref
possible_keys: PRIMARY,index_users_on_account_id
          key: PRIMARY
      key_len: 4
          ref: user_group_memberships.user_id
         rows: 1
        Extra: Using where

And then, after optimizing the involved tables (`contacts`, `groups`, `user_group_memberships`, and `users`), EXPLAIN shows:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: users
         type: ref
possible_keys: PRIMARY,index_users_on_account_id
          key: index_users_on_account_id
      key_len: 5
          ref: const
         rows: 1
        Extra: Using where; Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: user_group_memberships
         type: ref
possible_keys: index_user_group_memberships_on_group_id,index_user_group_memberships_on_user_id_and_group_id
          key: index_user_group_memberships_on_user_id_and_group_id
      key_len: 5
          ref: users.id
         rows: 1
        Extra: Using where; Using index
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: groups
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: user_group_memberships.group_id
         rows: 1
        Extra: Using index
*************************** 4. row ***************************
           id: 1
  select_type: SIMPLE
        table: contacts
         type: ref
possible_keys: index_contacts_on_group_id_and_updated_at,index_contacts_on_group_id_and_number
          key: index_contacts_on_group_id_and_number
      key_len: 5
          ref: user_group_memberships.group_id
         rows: 116
        Extra: Using where

The first query plan is highly suspicious (only 1 row per table?) and executes in 70-90 seconds, whereas the second query plan executes in 0.01s. This makes perfectly normal, reasonable queries completely unusable on a restored dataset until all of the tables are optimized. I could work around this by simply optimizing all of the tables after a dataset restoration, but doing so is extremely expensive; if there is any way for me to work around this problem and have the dataset correct and usable as soon as it is restored, that would be preferable.

Our backup procedure looks like this:

innobackupex --defaults-file="<PATH TO BACKUP CNF>" --safe-slave-backup --slave-info --no-timestamp "<TARGET DIR>"
innobackupex --apply-log --export "<TARGET DIR>"
mysqldump --no-data --single-transaction "<DATABASE>" > "<TARGET DIR>/<DATABASE>.ddl.sql"

This lets us restore bits and pieces of the database (up to the entire thing).

Our restoration procedure looks like this:

mysql -e '
  SET GLOBAL innodb_import_table_from_xtrabackup=1;
  DROP DATABASE IF EXISTS `<DATABASE>`;
  CREATE DATABASE `<DATABASE>` /*!40100 DEFAULT CHARACTER SET utf8 */;
'
mysql "<DATABASE>" < "<SOURCE DIR>/<DATABASE>.ddl.sql"
for FILE in "<SOURCE DIR>/"*.exp; do
  TABLE="$(basename "${FILE}" .exp)"
  mysql -e 'ALTER TABLE `<DATABASE>`.`'"${TABLE}"'` DISCARD TABLESPACE;'
  mv -f "<SOURCE DIR>/${TABLE}.exp" "/var/lib/mysql/<DATABASE>/${TABLE}.exp"
  mv -f "<SOURCE DIR>/${TABLE}.ibd" "/var/lib/mysql/<DATABASE>/${TABLE}.ibd"
  mysql -e 'ALTER TABLE `<DATABASE>`.`'"${TABLE}"'` IMPORT TABLESPACE;'
done
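The "bits and pieces" restore this enables can be made concrete with a small helper in the spirit of the loop above: the set of tables restored is simply the set of .exp files present in the source directory, so a partial restore just means copying only the wanted .exp/.ibd pairs there first. This is a sketch; the function name and directory layout are illustrative, not part of the actual script.

```shell
# Derive the table list from whichever .exp files are present in a
# source directory. Copying only a subset of .exp/.ibd pairs into the
# directory then restores only those tables.
tables_in() {
  local dir="$1" f
  for f in "${dir}"/*.exp; do
    [ -e "${f}" ] || return 0   # glob matched nothing: no .exp files
    basename "${f}" .exp        # strip path and .exp suffix
  done
}
```

The loop body from the restoration procedure could then iterate over `tables_in "<SOURCE DIR>"` instead of the raw glob.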

The restoration works without significant errors, but the proper indexes/query execution plans simply aren't used until the tables are optimized. Log entries for the table imports from mysql-error.log are below:

130409 21:13:09 InnoDB: Error: page 0 log sequence number 3990169120440
InnoDB: is in the future! Current system log sequence number 19635334322.
InnoDB: Your database may be corrupt or you may have copied the InnoDB
InnoDB: tablespace but not the InnoDB log files. See
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html
InnoDB: for more information.
InnoDB: Import: The extended import of <DATABASE>/contacts is being started.
InnoDB: Import: 7 indexes have been detected.
InnoDB: Progress in %: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 done.
...
130409 21:16:47 InnoDB: Error: page 0 log sequence number 3950126966142
InnoDB: is in the future! Current system log sequence number 19635463071.
InnoDB: Your database may be corrupt or you may have copied the InnoDB
InnoDB: tablespace but not the InnoDB log files. See
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html
InnoDB: for more information.
InnoDB: Import: The extended import of <DATABASE>/groups is being started.
InnoDB: Import: 2 indexes have been detected.
InnoDB: Progress in %: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 done.
...
130409 21:26:02 InnoDB: Error: page 0 log sequence number 3982840113438
InnoDB: is in the future! Current system log sequence number 19635691515.
InnoDB: Your database may be corrupt or you may have copied the InnoDB
InnoDB: tablespace but not the InnoDB log files. See
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html
InnoDB: for more information.
InnoDB: Import: The extended import of <DATABASE>/user_group_memberships is being started.
InnoDB: Import: 3 indexes have been detected.
InnoDB: Progress in %: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 done.
...
130409 21:26:06 InnoDB: Error: page 0 log sequence number 3986296616915
InnoDB: is in the future! Current system log sequence number 19635708107.
InnoDB: Your database may be corrupt or you may have copied the InnoDB
InnoDB: tablespace but not the InnoDB log files. See
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html
InnoDB: for more information.
InnoDB: Import: The extended import of <DATABASE>/users is being started.
InnoDB: Import: 7 indexes have been detected.
InnoDB: Progress in %: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 done.

It's worth mentioning that this all worked perfectly until we updated to the newest versions of XtraBackup and Percona Server about a month ago. The backup/restoration process hasn't seen any meaningful changes in quite a few months, and restorations have always worked before without requiring OPTIMIZE TABLE. I'm wondering if the new version(s) broke secondary indexes in a non-obvious way.

Any ideas?

Question information

Language: English
Status: Solved
For: Percona XtraBackup (moved to https://jira.percona.com/projects/PXB)
Assignee: No assignee
Solved by: Alexey Kopytov
Solved: 2013-04-10
Last query: 2013-04-10
Last reply: 2013-04-10
Best answer: Alexey Kopytov (akopytov) said: #1

IMPORT TABLESPACE does not update table statistics in either MySQL or Percona Server (regardless of the innodb_import_table_from_xtrabackup value). This means the optimizer will be using stale stats (i.e. those corresponding to the table before DISCARD TABLESPACE), hence suboptimal query execution plans. The workaround is to update them explicitly with ANALYZE TABLE (which is much cheaper than OPTIMIZE TABLE) right after the import.
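A minimal sketch of this workaround, wired into the restoration loop from the question (the helper function and the `mydb`/`contacts` names below are illustrative, not from the original script):

```shell
# Build the ANALYZE TABLE statement for a given database and table.
# Running it right after IMPORT TABLESPACE refreshes the optimizer
# statistics that the import step leaves stale.
analyze_stmt() {
  printf 'ANALYZE TABLE `%s`.`%s`;' "$1" "$2"
}

# In the loop from the question, immediately after the
# "... IMPORT TABLESPACE;" step, one could add:
#   mysql -e "$(analyze_stmt '<DATABASE>' "${TABLE}")"
analyze_stmt mydb contacts
```

Unlike OPTIMIZE TABLE, this does not rebuild the table; it only re-samples the index statistics, which is why it is cheap enough to run for every imported table.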

Not sure why it seemed to work correctly before the upgrade. It might be because someone/something did SHOW TABLE STATUS or ANALYZE TABLE after IMPORT TABLESPACE and before the table was accessed. Or persistent statistics were used previously (i.e. innodb_use_sys_stats_table) but are not anymore.

Hi Alexey --

Indeed, ANALYZE TABLE completely solves the issue, and it is quite inexpensive/quick, so it is no problem to add it to our backup restoration procedure.

I'm not sure how we had it working before. Looking at the old restoration shell script, it follows the same process I've listed above -- i.e. it does neither an ANALYZE TABLE nor an OPTIMIZE TABLE -- but it seemed to work well enough. You must be correct that we were issuing SHOW TABLE STATUS commands or taking other actions that implicitly told MySQL to update its table statistics.

The version upgrade must have been a red herring, with some more subtle shift in our workflow changing how things _actually_ worked.

Many thanks for providing guidance on this -- it's just a one-line fix for us.

Thanks Alexey Kopytov, that solved my question.