How about compressing binlog?

Asked by choury

Hi All,

I'm a DBA at Tencent Inc. We make extensive use of MySQL as the database for our business.

But in our business (mainly games), a lot of binlog is usually generated in a short time. For example, a game called "QQ Dancer" has one million players online at the same time. We use more than 200 machines as its database servers, and together they generate about 1.6T of binlog per hour. This is with MIXED mode; it would be even larger in ROW mode. And we have more than 10 games just like it. Such a large amount of binlog not only takes up disk space, it also consumes a lot of network bandwidth and makes long-distance backups very difficult. We searched for a solution and found only these:
https://bugs.mysql.com/bug.php?id=48396
https://bugs.mysql.com/bug.php?id=46435
I don't know why this hasn't been implemented for so long. Given this, we came up with the idea of compressing the binlog as it is generated, and we have already implemented it.

The solution is as follows:
We added new event types for the compressed versions of existing events:
     QUERY_COMPRESSED_EVENT,
     WRITE_ROWS_COMPRESSED_EVENT,
     UPDATE_ROWS_COMPRESSED_EVENT,
     DELETE_ROWS_COMPRESSED_EVENT.
These events inherit from their uncompressed counterparts. One of the constructors and the write function are overridden, to uncompress and compress respectively. Everything else is exactly the same. The format of these events is shown in this picture:
http://i.imgur.com/4Kf80Tr.png
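
To illustrate the inheritance idea only, here is a minimal Python sketch. The class names, the 4-byte length prefix, and the use of zlib are all assumptions made for this example; the real on-disk layout is the one in the picture and in the patch, not this one.

```python
import struct
import zlib


class QueryEvent:
    """Hypothetical stand-in for an uncompressed query event."""

    def __init__(self, query: bytes):
        self.query = query

    def write(self) -> bytes:
        # The uncompressed event stores the statement text as-is.
        return self.query


class QueryCompressedEvent(QueryEvent):
    """Same event; only reading and writing of the payload differ."""

    @classmethod
    def from_payload(cls, payload: bytes) -> "QueryCompressedEvent":
        # Reading side: strip the assumed 4-byte length prefix, then inflate.
        (orig_len,) = struct.unpack_from("<I", payload, 0)
        query = zlib.decompress(payload[4:])
        if len(query) != orig_len:
            raise ValueError("length mismatch after decompression")
        return cls(query)

    def write(self) -> bytes:
        # Writing side: original length first, then the deflated body.
        return struct.pack("<I", len(self.query)) + zlib.compress(self.query)
```

Everything that works on the decoded `query` field stays untouched, which is the point of overriding only those two members.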

On the slave, the IO thread uncompresses and converts these events when it receives them from the master,
so the SQL and worker threads can stay unchanged.
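
The conversion step on the IO thread could look roughly like the sketch below. It is only an illustration: the type names are used as plain strings, the function name `convert_for_relay_log` is hypothetical, and zlib is assumed as the compression library.

```python
import zlib
from typing import Tuple

# Compressed event type -> the uncompressed type it is converted to.
# String names are used here for readability; they are not real event codes.
UNCOMPRESSED_TYPE = {
    "QUERY_COMPRESSED_EVENT": "QUERY_EVENT",
    "WRITE_ROWS_COMPRESSED_EVENT": "WRITE_ROWS_EVENT",
    "UPDATE_ROWS_COMPRESSED_EVENT": "UPDATE_ROWS_EVENT",
    "DELETE_ROWS_COMPRESSED_EVENT": "DELETE_ROWS_EVENT",
}


def convert_for_relay_log(event_type: str, body: bytes) -> Tuple[str, bytes]:
    """Return the (type, body) pair that would go into the relay log."""
    plain_type = UNCOMPRESSED_TYPE.get(event_type)
    if plain_type is None:
        # Not a compressed event: pass it through unchanged.
        return event_type, body
    # Compressed event: inflate the body and relabel it as the plain type.
    return plain_type, zlib.decompress(body)
```

Because the relay log then contains only ordinary events, nothing downstream of the IO thread needs to know that compression was used on the wire.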

We also added two options for this feature: "log_bin_compress" and "log_bin_compress_min_len". The former switches binlog compression on or off, and the latter is the minimum length a SQL statement (in statement mode) or a record (in row mode) must have to be compressed. The logic can be described by the following pseudocode:

 if binlog_format == statement {
     if log_bin_compress == true and query_len >= log_bin_compress_min_len
         create a Query_compressed_log_event;
     else
         create a Query_log_event;
 }
 if binlog_format == row {
     if log_bin_compress == true and record_len >= log_bin_compress_min_len
         create a Write_rows_compressed_log_event (when INSERT);
     else
         create a Write_rows_log_event (when INSERT);
 }
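
The same decision can be written as a small runnable Python function. This is a sketch of the rule above, not code from the patch: the function name is hypothetical, the UPDATE and DELETE class names are assumed to follow the same pattern as the INSERT case, and the threshold of 256 in the usage lines is an arbitrary illustrative value.

```python
# Row operation -> base class name (UPDATE/DELETE assumed by analogy).
ROW_EVENT_BASE = {
    "INSERT": "Write_rows",
    "UPDATE": "Update_rows",
    "DELETE": "Delete_rows",
}


def choose_event_class(binlog_format, log_bin_compress, length,
                       log_bin_compress_min_len, op="INSERT"):
    """Return the name of the event class to create.

    `length` is the statement length in statement mode and the record
    length in row mode.
    """
    if binlog_format == "statement":
        base = "Query"
    elif binlog_format == "row":
        base = ROW_EVENT_BASE[op]
    else:
        raise ValueError("unsupported binlog_format: %r" % (binlog_format,))

    # Compress only when the switch is on and the payload is long enough.
    compressed = log_bin_compress and length >= log_bin_compress_min_len
    return base + ("_compressed_log_event" if compressed else "_log_event")


print(choose_event_class("statement", True, 300, 256))
print(choose_event_class("row", True, 100, 256))
```

Short statements and records stay uncompressed even with the switch on, since compressing a tiny payload saves little.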

The complete change for Percona Server 5.6 can be found at:
https://github.com/choury/percona-server/commit/bdf5a83164ff19a5017cde6507427c0b5bc70645
We have tested it on some of our games for months, and the result is clear: the amount of binlog is reduced by 42% to 70%. We would be very glad if you could accept our patch.
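
As a rough back-of-the-envelope check, the snippet below applies that reduction range to the 1.6T per hour figure quoted earlier. It assumes the same range would hold for that workload, which was not measured separately here; the function name is made up for the example.

```python
def remaining_volume(volume_tb_per_hour, reduction):
    """Binlog volume left after compression, for a fractional reduction."""
    return volume_tb_per_hour * (1.0 - reduction)


BINLOG_TB_PER_HOUR = 1.6  # figure quoted for "QQ Dancer" in MIXED mode

for reduction in (0.42, 0.70):
    left = remaining_volume(BINLOG_TB_PER_HOUR, reduction)
    print("reduction %.0f%% -> about %.2fT per hour" % (reduction * 100, left))
```

Under that assumption the hourly volume would fall to somewhere between roughly 0.5T and 0.9T.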

If you have any other questions, please don't hesitate to reply to me!

Question information

Language: English
Status: Expired
For: Percona Server moved to https://jira.percona.com/projects/PS
Assignee: No assignee
Launchpad Janitor (janitor) said :
#1

This question was expired because it remained in the 'Open' state without activity for the last 15 days.