Topic: Hyperthreading
Can disabling hyperthreading ("virtual CPUs") speed up the response of Pianoteq?
This question came up recently in the thread "Silent laptop advice".
I tried to get an impression with the following quick test:
To get a value for the overall latency, I clicked PTQ's virtual keyboard with the mouse (velocity fixed at 100) and recorded both the click noise and the sound from the PC speakers with the notebook's internal microphone.
Then I temporarily switched off hyperthreading by booting the kernel with the option "nosmt", which stands for "no simultaneous multithreading", and recorded the same PTQ sound again.
The time span between the mouse click and the piano sound, measured in Audacity, is the latency (example screenshot below).
With hyperthreading on (smt) I got an average overall latency of 17.1 ms over 8 clicks (19, 14, 17, 17, 15, 22, 20, 13).
With hyperthreading turned off (nosmt) the average was 18.9 ms over 9 clicks (18, 16, 22, 13, 16, 22, 19, 20, 24).
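The 1.8 ms difference between the averages is well within the scatter of the individual measurements (13 to 24 ms), so I found no significant difference in latency in this experiment.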
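For anyone who wants to check the numbers: the following is only a rough sketch of how the two series could be compared, using the measurements quoted above. The variable and function names are my own, and Welch's t statistic is just one possible choice for such a small sample, not something that was part of the original test.

```python
import math
import statistics

# Latencies in ms as listed above.
smt = [19, 14, 17, 17, 15, 22, 20, 13]
nosmt = [18, 16, 22, 13, 16, 22, 19, 20, 24]


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    var_a = statistics.variance(a)  # sample variance (n - 1 in the denominator)
    var_b = statistics.variance(b)
    standard_error = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(b) - statistics.mean(a)) / standard_error


mean_smt = statistics.mean(smt)
mean_nosmt = statistics.mean(nosmt)
t_value = welch_t(smt, nosmt)

print(f"smt:   mean {mean_smt:.1f} ms, stdev {statistics.stdev(smt):.1f} ms")
print(f"nosmt: mean {mean_nosmt:.1f} ms, stdev {statistics.stdev(nosmt):.1f} ms")
print(f"Welch t = {t_value:.2f}")
```

The t value comes out at roughly 1.1, which is well below the value of about 2 that is usually needed to call a difference significant at the 5 % level with this few samples.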
Some technical details, in case anyone is interested:
With smt:
---------
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 78
Model name: Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz
Stepping: 3
CPU MHz: 2799.902
CPU max MHz: 2800.0000
CPU min MHz: 400.0000
BogoMIPS: 4800.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 3072K
NUMA node0 CPU(s): 0-3
Without smt ("nosmt"):
----------------------
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0,1
Off-line CPU(s) list: 2,3
Thread(s) per core: 1
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 78
Model name: Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz
Stepping: 3
CPU MHz: 2800.048
CPU max MHz: 2800.0000
CPU min MHz: 400.0000
BogoMIPS: 4800.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 3072K
NUMA node0 CPU(s): 0,1
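If you want to check the SMT state in a script instead of reading the listing by eye, something like the following sketch could work. It only parses text in the "key: value" form shown above; the function names and the shortened sample text are my own illustration.

```python
def parse_lscpu(text):
    """Turn 'key: value' lines from lscpu into a dictionary."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            fields[key.strip()] = value.strip()
    return fields


def smt_active(fields):
    """SMT is in use when more than one thread runs per core."""
    return int(fields["Thread(s) per core"]) > 1


# Shortened excerpt of the "nosmt" listing above.
sample = """CPU(s): 4
On-line CPU(s) list: 0,1
Off-line CPU(s) list: 2,3
Thread(s) per core: 1
Core(s) per socket: 2
"""

info = parse_lscpu(sample)
print("SMT active:", smt_active(info))
print("On-line CPUs:", info["On-line CPU(s) list"])
```

On a real machine the text could come from `subprocess.run(["lscpu"], capture_output=True, text=True).stdout` instead of the sample string.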
It is an office notebook on which I installed the Pianoteq trial v6.5.3 just for this test. The only optimisation was setting the scaling_governor to "performance" on each CPU.
The test sound was Steinway D Prelude at 44.1 kHz with a buffer size of 64 samples, which gives better latency than the default of 512 (!) samples. The OS is Debian Linux "Stretch".
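To give an idea of why the buffer size matters: one buffer of N samples takes N divided by the sample rate to play. A minimal sketch of that arithmetic follows (the function name is my own). Note that this is only the duration of a single buffer; the real output latency also depends on how many buffers the audio driver keeps queued and on the hardware, which I have not measured separately.

```python
def buffer_duration_ms(buffer_samples, sample_rate_hz):
    """Duration of one audio buffer in milliseconds."""
    return 1000.0 * buffer_samples / sample_rate_hz


SAMPLE_RATE_HZ = 44100

for size in (64, 512):
    duration = buffer_duration_ms(size, SAMPLE_RATE_HZ)
    print(f"{size:4d} samples -> {duration:.2f} ms per buffer")
```

So 64 samples correspond to about 1.5 ms per buffer and 512 samples to about 11.6 ms, which would be a large share of the roughly 17 to 19 ms measured overall.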
Example latency measured in the wave editor: