Why so many context switches and page faults

Asked by tomdean

> uname -rsi
Linux 5.8.0-59-generic x86_64
> lsb_release -ds
Ubuntu 20.04.2 LTS

CPU: AMD Threadripper 3970X
RAM: 32G

I have a short application that runs a loop just to load the CPU (source at the bottom).

> /usr/bin/time -f "elapsed %esec cpu %P ctx-sw (vol %w uvol %c) pg-flt %R io %F" ck-load
elapsed 3.89sec cpu 99% ctx-sw (vol 1 uvol 350) pg-flt 76 io 0

Why does this have 350 involuntary context switches?
And why 76 page faults?

nice -n -20 does not help.

[code]
/*
 * load.c - create a load on the cpu
 */
#include <math.h>
#include <stdint.h> /* int64_t */
#include <stdlib.h> /* llabs */

volatile long double pi;
volatile long double u;
volatile long double v;
volatile long double w;
volatile long double x;
volatile long double y;
volatile long double z;

volatile int64_t a;
volatile int64_t b;
volatile int64_t c;
volatile int64_t d;
volatile int64_t e;
volatile int64_t f;

volatile int64_t loop_count;

int main() {
    pi = atanl(1.0);
    w = -2.0 * pi;

    loop_count = 0;

    for (loop_count=0; loop_count < 10000000; loop_count++) {
        x = sinl(w);
        a = llabs(x);
        y = cosl(w);
        b = llabs(y);
        z = tanl(w);
        c = llabs(z);
        u = expl(fabsl(y));
        d = llabs(u);
        v = sqrtl(fabsl(x*y));
        e = llabs(v);
        a += b;
        c += d;
        e += a * b;
        f = c * d;
        w += 0.000001;
        if ( w > 2.0 * pi) {
            w = -2.0 * pi;
        }
    }
    return 0;
}

[/code]

Bernard Stafford (bernard010) said :
#1

You are using an AMD Ryzen Threadripper 3970X, which has 32 cores and 64 threads, to load the one CPU.
The loop is to ensure that all 64 threads of the CPU are fully loaded.

Manfred Hampl (m-hampl) said :
#2

From a mathematical point of view the expression "pi = atanl(1.0);" is wrong, because pi usually denotes the ratio of a circle's circumference to its diameter, which is four times the arctangent of 1.

I suggest that you try running your program with different values for loop_count. Do the numbers of page faults and context switches rise linearly with loop_count, or do they stay more or less constant? Constant numbers would point to the cost of starting the program. Do these numbers change when the rest of the system is under higher or lower load?
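To make that experiment easier to repeat, the counters can also be read from inside the program. The following is only a sketch: the helper names parse_loop_count and report_usage are made up for illustration and are not part of load.c. It assumes Linux, where getrusage(2) fills in ru_nvcsw, ru_nivcsw, ru_minflt and ru_majflt, the same values /usr/bin/time prints as %w, %c, %R and %F.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>

/* Hypothetical helper: parse the iteration count from a command-line
 * string. Returns `fallback` when the argument is missing, empty,
 * not a plain decimal number, or not positive. */
long parse_loop_count(const char *arg, long fallback) {
    if (arg == NULL) {
        return fallback;
    }
    char *end;
    long n = strtol(arg, &end, 10);
    return (end != arg && *end == '\0' && n > 0) ? n : fallback;
}

/* Hypothetical helper: print this process's context-switch and
 * page-fault counters in the same layout as the /usr/bin/time format
 * string used in the question. Returns 0 on success, -1 on error. */
int report_usage(FILE *out) {
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0) {
        return -1;
    }
    fprintf(out, "ctx-sw (vol %ld uvol %ld) pg-flt %ld io %ld\n",
            ru.ru_nvcsw,   /* voluntary context switches   (%w) */
            ru.ru_nivcsw,  /* involuntary context switches (%c) */
            ru.ru_minflt,  /* minor page faults            (%R) */
            ru.ru_majflt); /* major page faults            (%F) */
    return 0;
}
```

In load.c this would mean using parse_loop_count(argc > 1 ? argv[1] : NULL, 10000000) as the loop bound and calling report_usage(stderr) just before main returns, so one binary covers every loop count.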

tomdean (tomdean) said :
#3

On 7/19/21 8:55 AM, Manfred Hampl wrote:
> Your question #698038 on Ubuntu changed:
> https://answers.launchpad.net/ubuntu/+question/698038

I saw that. This is only to load the cpu, the value does not matter.
But, I fixed it anyway.

loop-count  elapsed (s)  cpu  vol  uvol  pg-flt  io
    100000         0.05  98%    1     6      76   0
   1000000         0.51  99%    1    46      75   0
  10000000         4.30  99%    1   399      75   0
 100000000        42.64  99%    1  4030      74   0

It looks like the involuntary context switches increase linearly with the time the
process is running. Page faults stay the same.

tomdean (tomdean) said :
#4

I hit send too soon.

The page faults come from starting and ending the process, since their number does not change with the loop count.

The context switches look like time-slice expirations.
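One way to check the page-fault part is to sample the fault counter immediately before and after a CPU-only loop. This is a rough sketch, not the code from load.c: minor_faults, spin and faults_during_spin are hypothetical names, and spin is a simplified stand-in for the real loop. It assumes Linux's getrusage(2). If the delta stays near zero while the total is around 75, the faults belong to process startup (loading the binary and libm, setting up the stack) rather than to the loop.

```c
#include <sys/resource.h>

/* Hypothetical helper: minor page faults charged to this process so
 * far, or -1 if getrusage fails. */
long minor_faults(void) {
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0) {
        return -1;
    }
    return ru.ru_minflt;
}

/* Written through a volatile so the compiler cannot drop the loop. */
volatile double sink;

/* CPU-only work that allocates no memory (stand-in for the loop in
 * load.c). */
void spin(long iterations) {
    for (long i = 0; i < iterations; i++) {
        sink = sink + (double)i * 1e-9;
    }
}

/* Minor faults that occurred while spin() ran, or -1 on error. */
long faults_during_spin(long iterations) {
    long before = minor_faults();
    spin(iterations);
    long after = minor_faults();
    if (before < 0 || after < 0) {
        return -1;
    }
    return after - before;
}
```

The delta is not guaranteed to be exactly zero, because the first touch of a new stack or data page still counts as a minor fault.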

Manfred Hampl (m-hampl) said :
#5

According to https://stackoverflow.com/a/21778209:

"An involuntary context switch occurs when a thread has been running too long (usually something like 10 ms) without making a system call that blocks and there are processes waiting for the CPU."

This fits your figures well (interval = elapsed time / involuntary context switches):
loop-count  elapsed (s)  uvol  interval
    100000         0.05     6  once every  8.3 ms
   1000000         0.51    46  once every 11.1 ms
  10000000         4.30   399  once every 10.8 ms
 100000000        42.64  4030  once every 10.6 ms
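The arithmetic behind those intervals is just elapsed time divided by the number of involuntary switches. A small sketch in C, where ms_per_switch and print_intervals are names made up for this illustration and the data are the four runs reported above:

```c
#include <stdio.h>

/* Average time between involuntary context switches, in milliseconds.
 * Returns 0.0 when no switches were recorded, to avoid dividing by
 * zero. */
double ms_per_switch(double elapsed_sec, long switches) {
    if (switches <= 0) {
        return 0.0;
    }
    return elapsed_sec * 1000.0 / (double)switches;
}

/* Print one line per measured run from the thread. */
void print_intervals(FILE *out) {
    static const struct {
        long loop_count;
        double elapsed_sec;
        long uvol;
    } runs[] = {
        {100000L, 0.05, 6},
        {1000000L, 0.51, 46},
        {10000000L, 4.30, 399},
        {100000000L, 42.64, 4030},
    };
    for (size_t i = 0; i < sizeof runs / sizeof runs[0]; i++) {
        fprintf(out, "%9ld  uvol %4ld  once every %.1f ms\n",
                runs[i].loop_count, runs[i].uvol,
                ms_per_switch(runs[i].elapsed_sec, runs[i].uvol));
    }
}
```

All four runs come out between roughly 8 and 11 ms per switch, which is why the numbers look like a periodic scheduler preemption rather than something the loop itself is doing.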
