Comment 63 for bug 1861359

Andrea Righi (arighi) wrote :

TL;DR @seth-arnold, as a test can you try to set the following options?

  $ echo $((32 * 1024 * 1024)) | sudo tee /proc/sys/vm/dirty_bytes
  $ echo $((32 * 1024 * 1024)) | sudo tee /proc/sys/vm/dirty_background_bytes

Repeat the test and see if the system is still unresponsive.
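While repeating the test it may help to watch how much dirty memory actually accumulates. A minimal sketch, assuming the Dirty and Writeback counters in /proc/meminfo are a good enough view of this (show_dirty is a hypothetical helper name, not an existing tool):

```shell
#!/bin/sh
# Print the Dirty and Writeback counters (in kB) from a meminfo-style file.
# Defaults to /proc/meminfo; another file can be passed for a dry run.
show_dirty() {
    awk '$1 == "Dirty:" || $1 == "Writeback:" { print $1, $2, $3 }' "${1:-/proc/meminfo}"
}

# Take one sample now, if the file is readable on this system.
if [ -r /proc/meminfo ]; then
    show_dirty
fi

# To sample once per second while the writer is running:
#   while sleep 1; do show_dirty; done
```

If the Dirty value stays near the configured limit instead of growing to several GB, the new thresholds are taking effect.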

Details below.

This is what I think is happening in this last scenario: interactive performance is killed when a large I/O writer is running.

The large I/O writer generates a lot of dirty pages, and nothing forces those pages to be synced to the backing store until the dirty_ratio (=20%) / dirty_background_ratio (=10%) thresholds are hit. With the default settings, those thresholds can be quite high on systems with a lot of RAM.

For example, on a system with 16GB of free/reclaimable memory, the amount of dirty memory that is allowed before a writer is actively forced to flush those pages to the backing store is: 16GB * 20 / 100 = 3.2GB. Flusher threads are started earlier, when the amount of dirty memory reaches 16GB * 10 / 100 = 1.6GB.
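The arithmetic above can be reproduced with a few lines of shell. This is only a rough sketch: dirty_threshold_kb is a hypothetical helper, and using MemAvailable from /proc/meminfo as a stand-in for the kernel's internal "free/reclaimable" figure is an assumption, so the live numbers are estimates.

```shell
#!/bin/sh
# Dirty threshold in kB for a given amount of memory (kB) and ratio (%).
dirty_threshold_kb() {
    echo $(( $1 * $2 / 100 ))
}

# Worked example from the text: 16GB of free/reclaimable memory.
mem_kb=$((16 * 1024 * 1024))
echo "writer throttled at: $(dirty_threshold_kb "$mem_kb" 20) kB"
echo "flushers started at: $(dirty_threshold_kb "$mem_kb" 10) kB"

# Same estimate for the running system.
# Assumption: MemAvailable approximates the memory the kernel bases the
# ratios on. The ratios read as 0 when the *_bytes knobs are in use.
if [ -r /proc/meminfo ] && [ -r /proc/sys/vm/dirty_ratio ]; then
    avail_kb=$(awk '$1 == "MemAvailable:" { print $2 }' /proc/meminfo)
    if [ -n "$avail_kb" ]; then
        ratio=$(cat /proc/sys/vm/dirty_ratio)
        bg_ratio=$(cat /proc/sys/vm/dirty_background_ratio)
        echo "this system, dirty_ratio=$ratio: $(dirty_threshold_kb "$avail_kb" "$ratio") kB"
        echo "this system, dirty_background_ratio=$bg_ratio: $(dirty_threshold_kb "$avail_kb" "$bg_ratio") kB"
    fi
fi
```

For comparison, the 32MB limit suggested at the top is 32768 kB, roughly a hundred times lower than the 20% threshold in the 16GB example.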

So, if the writer doesn't stop, it will consume all the free pages in the system and at that point we are going to have a lot of dirty pages. Then the kernel needs to decide what to do to free up some pages.

Reclaimable memory is the first choice: cached clean pages that already have a copy on the corresponding backing store are easy to reclaim, because they just need to be dropped from the page cache (no I/O involved). Dirty pages are more expensive to reclaim, because they need to be flushed to the backing store before the page can be freed. The same goes for anonymous memory, which needs to be written to the swap device before the page can be re-used.

So when the system starts to reclaim pages, we see some swap activity and also some I/O due to the flushing of dirty pages. I think the system becomes sluggish because there are too many dirty pages: the kernel spends too much time selecting the right pages to reclaim, and interactive performance is killed.

This looks like a bug/regression in the kernel, and I think we should definitely investigate more and track down the cause of the problem. In the meantime, as a test to prove this theory, I think we could try to reduce the amount of dirty pages allowed in the system by tuning the dirty thresholds, vm.dirty_bytes and vm.dirty_background_bytes (using the *_bytes tunables for more fine-grained control over those thresholds), and see if there is any benefit in the specific scenario reported by Seth.
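Since this is only meant as a test, it may be convenient to save the current thresholds first and put them back afterwards. A minimal sketch with hypothetical helper names (save_dirty_settings, restore_dirty_settings) and a VM_DIR variable that I am introducing only so the functions can be tried on a scratch directory. It relies on the ratio/bytes pairs being mutually exclusive: when one of the pair is written, the other reads as 0, so only non-zero values are written back.

```shell
#!/bin/sh
# Directory holding the vm tunables. Assumption: overridable only for dry runs.
VM_DIR=${VM_DIR:-/proc/sys/vm}
KNOBS="dirty_ratio dirty_background_ratio dirty_bytes dirty_background_bytes"

# Save the current value of each knob as "name value" lines in file $1.
save_dirty_settings() {
    for knob in $KNOBS; do
        echo "$knob $(cat "$VM_DIR/$knob")"
    done > "$1"
}

# Write the saved values from file $1 back, skipping knobs that were 0
# (the inactive half of each ratio/bytes pair).
# On a real system this needs to run as root.
restore_dirty_settings() {
    while read -r knob value; do
        [ "$value" -eq 0 ] || echo "$value" > "$VM_DIR/$knob"
    done < "$1"
}
```

Usage would be something like save_dirty_settings /tmp/dirty.saved before changing the thresholds, and restore_dirty_settings /tmp/dirty.saved from a root shell once the test is done.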