How innodb guarantee first writer firstly write redo ?

Asked by Lewis Chan

Hi. A situation like this:
Trx A and B are concurrent,

A writes page P first,
B writes page P after A

As we know, log is written into global log buffer when mtr commits

How to guarantee A's redo is always ahead of B ?

Question information

Language:
English Edit question
Status:
Answered
For:
MariaDB Edit question
Assignee:
No assignee Edit question
Last query:
Last reply:

This question was reopened

Revision history for this message
Marko Mäkelä (dr-m) said :
#1

First of all, InnoDB follows the ARIES protocol, also known as write-ahead logging. It must not write out modified data pages before it has durably written the corresponding redo log records. The redo log covers not only index and BLOB pages, but also undo log pages, transaction metadata (including whether a transaction is committed), and page allocation metadata.

Your question could be answered by the 2018 presentation
Deep Dive: InnoDB Transactions and Write Paths
https://vimeo.com/258533150
https://docs.google.com/presentation/d/1bP3yh57B58yfGDd-34TZ5MeiacIdaseaTLpd_Ov5NI8/

There are more recent presentation that touch the same area, say:
https://fosdem.org/2022/schedule/event/mariadb_innodb/

Any modification of a page will be protected by a non-shared page latch that will be held for a short time. This is typically microseconds, at most milliseconds when an operation needs to wait for some other pages to be read into the buffer pool. Page latches and redo log are combined in mini-transactions.

In your example, both transactions would consist of multiple mini-transactions. At the minimum, one mini-transaction to allocate undo log and write the first undo log record covering the change, another to change the page, and finally one to mark the transaction as committed.

The mini-transaction commit a.k.a. mtr_t::commit() is serialized: an exclusive latch will be acquired to assign a log sequence number (LSN) to the mini-transaction. The LSN is the logical time of InnoDB, similar to the Oracle SCN (system change number). This and the page latches are ultimately what will serialize the two transactions.

Revision history for this message
Lewis Chan (baiwfg2) said :
#2

Thanks Marko Mäkelä, that solved my question.

Revision history for this message
Lewis Chan (baiwfg2) said :
#3

Hi, @Marko

I'm still confused with your term "page latches". Many material say the following:

> MySQL uses table-level locking for ISAM, MyISAM, MEMORY (HEAP), page-level locking for BDB tables, and row-level locking for InnoDB tables.

So for innodb, it doesn't use page lock, right ? Then what do you mean by "page latch" ? Is there any differences between "page lock" and "page latch" in innodb world ?

According to this(https://jira.mariadb.org/browse/MDEV-16232), I guess you mean "buf_block_t::lock" ? But buf_lock_t is not the real database page frame, it's just a wrapper of the page. So buf_block_t::lock is just ordinary memory lock which protects the struct from concurrent access, right ?

Revision history for this message
Sergei Golubchik (sergii) said :
#4

Correct. A "latch" in InnoDB is, as far as I understand, a short-term lock, like a mutex, that, indeed, protects the struct from concurrent access.

Can you help with this problem?

Provide an answer of your own, or ask Lewis Chan for more information if necessary.

To post a message you must log in.