myloader freezes/hangs

Asked by Daniel Kaelin on 2011-10-05

I am currently in the process of testing mydumper and myloader on a set of test MySQL servers, hoping they can be a replacement for standard mysqldumps.

I have been able to get the dumping process to work just fine and have no real issues there. The problem I am running into is with myloader. This is the exact command I am running in this case:

time myloader -u root -p *server password* -d /export-20111005-081950/

I tried to increase the verbosity of this command, but adding -v or --verbose anywhere in the statement results in an error, whether I put it at the front right after "myloader" or at the end. Where should that flag land in this command?

It appears the database has been restored after I run this command: the database is accessible, I am able to browse through it, and the data seems to be complete. The problem is that the myloader command never finishes. After I enter the above command and run it, the command line just hangs there and never comes back to the prompt.

I was running the command below and finally hit CTRL-C to exit after letting it run for 51 minutes; it should be taking nowhere near this long to restore. Am I doing something wrong with this command, or SHOULD it be taking this long to restore?

[root@localhost /]# time myloader -u root -p *server password* -d /export-20111005-081950/

real 51m9.282s
user 0m8.731s
sys 0m3.284s
[root@localhost /]#

Question information

Language: English
Status: Answered
For: MySQL Data Dumper
Assignee: No assignee
Last query: 2011-10-24
Last reply: 2011-11-04
Andrew Hutchings (linuxjedi) said : #1

Hi Daniel,

What version of mydumper are you using?

myloader takes --verbose=? where ? is a number; --help will give you details.
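
For example (assuming the levels listed by --help, typically 0 = silent, 1 = errors, 2 = warnings, 3 = info), the flag can sit anywhere among the options:

```shell
# Restore with info-level messages; --verbose= can go anywhere:
myloader --verbose=3 -u root -p *server password* -d /export-20111005-081950/
myloader -u root -p *server password* -d /export-20111005-081950/ --verbose=3
```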

You may find that if you have a particularly large table and it wasn't broken up into pieces by mydumper (i.e. --rows wasn't used, or the table has a varchar pkey) then that table will be restored single-threaded and may take a while to complete.

MySQL's SHOW PROCESSLIST would give a better idea of what is going on.
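
For example, from a second terminal while myloader appears hung (using the same root credentials as the restore):

```shell
# List every connection and the full statement it is running;
# long-running INSERTs here mean myloader is still working, not hung.
mysql -u root -p -e 'SHOW FULL PROCESSLIST'
```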

Daniel Kaelin (dkaelin) said : #2

I am actually using version 0.51 of the dumper. The latest iteration released 8/12/2011.

So for verbosity I have to specify the level then. That would make sense; I guess I just missed that aspect when reading through the documentation. Can the verbosity option appear anywhere in the myloader command?

If I am breaking the dump into pieces, do you have a recommendation on how many rows I should use when grouping the data? The databases are from Moodle installations, so the number of rows is usually under 10,000 in any given table.

Can you elaborate on the varchar pkey? I am not familiar with that particular terminology.

I appreciate the response.

Thanks

Daniel

Daniel Kaelin (dkaelin) said : #3

The installation is being run on CentOS as well.

If you feel like adding any details to your current documentation, these were the dependencies I had to install before I was able to compile the files and get mydumper and myloader working properly on the server.

All of the below were installed through yum install (insert module name):

Dependencies:
cmake
make
gcc
gcc-c++
zlib
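
For reference, the list above boils down to one command on CentOS (package names exactly as listed; depending on the release you may need zlib-devel for the headers rather than plain zlib — treat that as an assumption):

```shell
yum install cmake make gcc gcc-c++ zlib
```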

Andrew Hutchings (linuxjedi) said : #4

Hi Daniel,

Yes, --verbose= can appear anywhere in the command parameters.

The number of rows is entirely dependent on your judgement; maybe 2500 or 5000 to start with (note that the figure is a rough guide and mydumper will make adjustments based on InnoDB statistics).
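
As a sketch of the dump side with chunking enabled, reusing the credentials from the question (--rows sets the approximate chunk size; the output directory flag -o is an assumption here, not taken from the thread):

```shell
# Split each table into ~5000-row chunks so myloader can restore
# a single large table across several threads:
mydumper -u root -p *server password* --rows=5000 -o /export-20111005-081950/
```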

By varchar pkey I mean a primary key that is not an integer-based column; mydumper cannot yet estimate how to break such a table up, so it is dumped as one large dump file. That is a feature we plan to implement in the future.

If you have an apparent hang again please run SHOW FULL PROCESSLIST in the MySQL command line and paste it here so we can see what myloader is doing.

Daniel Kaelin (dkaelin) said : #5

It appears your suggestions are working perfectly for the myloader. Thank you for the input on that.

I was watching the restoration of one of the databases using the myloader command and I was curious how the module handles sequential files when it is loading them into the server.

The mdl_log was the largest table in the restore, with 1.3 million rows. I noticed the threads were jumping all over the place as far as which parts were being loaded back in.

Starts with 409, then 511, 355, 276, 115, etc.

** Message: Thread 5 restoring `mdl_bryan`.`mdl_log` part 409
** Message: Thread 8 restoring `mdl_bryan`.`mdl_log` part 511
** Message: Thread 4 restoring `mdl_bryan`.`mdl_log` part 355
** Message: Thread 3 restoring `mdl_bryan`.`mdl_log` part 276
** Message: Thread 7 restoring `mdl_bryan`.`mdl_question_states` part 57
** Message: Thread 7 restoring `mdl_bryan`.`mdl_log` part 115
** Message: Thread 6 restoring `mdl_bryan`.`mdl_log` part 81
** Message: Thread 1 restoring `mdl_bryan`.`mdl_log` part 152
** Message: Thread 2 restoring `mdl_bryan`.`mdl_log` part 287
** Message: Thread 5 restoring `mdl_bryan`.`mdl_question_states` part 10
** Message: Thread 5 restoring `mdl_bryan`.`mdl_log` part 59
** Message: Thread 8 restoring `mdl_bryan`.`mdl_role_allow_override` part 0
** Message: Thread 8 restoring `mdl_bryan`.`mdl_questionnaire_resp_single` part 1
** Message: Thread 8 restoring `mdl_bryan`.`mdl_message_read` part 3
** Message: Thread 8 restoring `mdl_bryan`.`mdl_log` part 45
** Message: Thread 4 restoring `mdl_bryan`.`mdl_log` part 139
** Message: Thread 3 restoring `mdl_bryan`.`mdl_log` part 127
** Message: Thread 7 restoring `mdl_bryan`.`mdl_log` part 451
** Message: Thread 6 restoring `mdl_bryan`.`mdl_question_states` part 152
** Message: Thread 6 restoring `mdl_bryan`.`mdl_question_states` part 34
** Message: Thread 6 restoring `mdl_bryan`.`mdl_log` part 343

Also, would you be able to provide any additional information regarding daemon mode? Aside from the documentation below, I didn't see much else on it.

*** From Your Documentation ***

Mydumper has a daemon mode which will snapshot the dump data every so often whilst continuously retrieving the binary log files. This gives a continuous consistent backup right up to the point where the database server fails. To use this you simply need to use the --daemon option.

In the following example mydumper will use daemon mode, creating a snapshot every half an hour and log to an output file:

mydumper --daemon --snapshot-interval=30 --logfile=dump.log

*** end ***

A few questions I had about the daemon mode.

What information gets put in the log file if you have that enabled?

Are you supposed to specify a database in this mode or is it designed to cover all the databases on a server?

If it does cover all of the databases how does it know where to place the various parts / deltas / updates for each database without the directory option being used?

Is the daemon only designed to periodically update the dumps? I didn't see anything about a daemon for myloader if this were to be run on a backup database server. Is that best handled by a server script run through cron, periodically running myloader to continuously update the database while mydumper updates the dumps through the daemon?

Once again I appreciate your responsiveness and your help.

Thanks,

Daniel

Launchpad Janitor (janitor) said : #6

This question was expired because it remained in the 'Open' state without activity for the last 15 days.

Daniel Kaelin (dkaelin) said : #7

Any input on the daemon from my post above?

Andrew Hutchings (linuxjedi) said : #8

Hi Daniel,

I apologise for the delay, I have been on vacation.

We should probably do something about the random order issue, I'll file a bug on this. On InnoDB at least we should in theory get higher performance for sequential inserts.

> What information gets put in the log file if you have that enabled?

Just time and date info of snapshots IIRC, basically what is normally sent to the terminal.

> Are you supposed to specify a database in this mode or is it designed to cover all the databases on a server?

Both are possible.

> If it does cover all of the databases how does it know where to place the various parts / deltas / updates for each database without the directory option being used?

It stores two entire snapshots at any time at the moment, in the future I would like it to store deltas or something similar instead. It will automatically create subdirectories for these snapshots as well as a symlink pointing to the last good snapshot.

> Is the daemon only designed to periodically update the dumps? I didn't see anything about a daemon for myloader if this was to be ran on a backup database server. Is that best handled in a server script through a CRON to periodically run the myloader to continuously update the database while mydumper is updating the dumps through the daemon?

It is designed to periodically create a dump of the data as well as continuously retrieve the binary logs. This way you have an up-to-the-second backup of your data. Unfortunately I have not completed the code for myloader to process the binlogs yet.
