Full CPU load on dual AMD EPYC system

  • Full CPU load on dual AMD EPYC system

    In case anyone else comes across the same problem,

    BurnInTest didn't fully load an HP system with dual AMD EPYC 7601 CPUs on Windows Server 2019 (x64).

    Each EPYC 7601 CPU supports 64 threads, so the system should support 128 threads in total. But with BurnInTest V9.0 build 1013 and below, only 64 threads were started for the CPU test, resulting in around 60% CPU load instead of near 100%.

    With this many cores & threads we suspected the problem might be related to Windows processor groups. A processor group contains up to 64 logical processors (threads), so this machine should have had 2 processor groups. BurnInTest supported 4 processor groups (256 logical processors), so it should have been OK.

    But after debugging, it turns out that this machine reports 8 active processor groups, which seems wrong.
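For anyone who wants to check what their own machine reports, here is a rough sketch that queries the active processor groups through the Win32 calls `GetActiveProcessorGroupCount` and `GetActiveProcessorCount` via Python's `ctypes`. This is not how BurnInTest does it internally; the function name `active_processor_groups` is made up for this example, and on non-Windows systems it simply returns an empty list.

```python
import ctypes
import sys


def active_processor_groups():
    """Return the logical-processor count of each active processor group.

    Hypothetical helper for illustration. Returns [] when not on Windows.
    """
    if sys.platform != "win32":
        return []
    kernel32 = ctypes.WinDLL("kernel32")
    # WORD GetActiveProcessorGroupCount(void)
    kernel32.GetActiveProcessorGroupCount.restype = ctypes.c_ushort
    # DWORD GetActiveProcessorCount(WORD GroupNumber)
    kernel32.GetActiveProcessorCount.argtypes = [ctypes.c_ushort]
    kernel32.GetActiveProcessorCount.restype = ctypes.c_ulong
    group_count = kernel32.GetActiveProcessorGroupCount()
    return [kernel32.GetActiveProcessorCount(g) for g in range(group_count)]


if __name__ == "__main__":
    sizes = active_processor_groups()
    print(f"{len(sizes)} active group(s): {sizes}, total {sum(sizes)} logical processors")
```

On a machine that behaves as the Microsoft documentation describes, a 128-thread system would be expected to show two groups of 64. A longer list of smaller groups would match what we saw here.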

    According to the Microsoft documentation on processor groups there should be two groups.

    When the system starts, the operating system creates processor groups and assigns logical processors to the groups. If the system is capable of hot-adding processors, the operating system allows space in groups for processors that might arrive while the system is running. The operating system minimizes the number of groups in a system. For example, a system with 128 logical processors would have two processor groups with 64 processors in each group, not four groups with 32 logical processors in each group.
    We are not sure if this is an AMD bug, or HP bug, or an attempt to deal with the unusual NUMA memory layout (in which case it is a Microsoft documentation bug).

    So starting from V9.0 build 1014, BurnInTest will support 12 active processor groups (instead of the current 4) as a workaround. In testing, this allowed the CPU test to fully load this AMD EPYC system.
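To illustrate why a cap on the number of groups matters, here is a small sketch of the arithmetic. It assumes one test thread per logical processor and that the 8 reported groups each held 16 logical processors. That even split is only a guess that happens to match the 64 threads observed; the per-group sizes were not recorded above. The function name `threads_started` is hypothetical and this is not BurnInTest's actual code.

```python
def threads_started(group_sizes, max_groups):
    """Count test threads if only the first max_groups groups are used.

    Assumes one thread per logical processor in each group that is used.
    """
    return sum(group_sizes[:max_groups])


# ASSUMPTION: 8 groups of 16 logical processors each (128 in total).
assumed_layout = [16] * 8
# Layout described in the Microsoft documentation for 128 logical processors.
documented_layout = [64, 64]

old_limit = threads_started(assumed_layout, max_groups=4)
new_limit = threads_started(assumed_layout, max_groups=12)
documented = threads_started(documented_layout, max_groups=4)

print(f"cap of 4 groups:  {old_limit} of {sum(assumed_layout)} threads")
print(f"cap of 12 groups: {new_limit} of {sum(assumed_layout)} threads")
print(f"documented layout, cap of 4: {documented} threads")
```

Under these assumptions the old cap of 4 groups covers only half of the logical processors, while the same cap would have been enough had the machine reported the two groups of 64 that the documentation describes.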