Well, you gave me an idea, and since I am not very busy this weekend, I made some latency measurements with the oscilloscope out of curiosity.
I covered the following items:
*Time between keystroke and MIDI message (on a Kawai CL35)
*Time between MIDI message and the sound produced on the Kawai CL35 and MP11
*Time between MIDI message and the sound produced by Pianoteq
*Difference in latency between GNU/Linux and Windows
Hardware used: OWON SDS 1022 oscilloscope, Kawai CL35 digital piano, Kawai MP11 digital piano, MSI GL62M 7RD laptop (cpu: i7 4770hq), generic Chinese laptop (i7 4710mq), miniDSP 2x4HD audio interface, ESI UDJ6 audio interface, ESI MIDIMate II USB-MIDI interface.
Software: Pianoteq 6, Windows 10, Ubuntu Studio 20.04.
I discovered that it is possible to watch MIDI events on the oscilloscope non-invasively, just by attaching the probe to the MIDI cable without any electrical contact, as shown in the following picture:
https://postimg.cc/fJBRPrfW

This way, I got the waveform of the MIDI signal, somewhat distorted but good enough to trigger the scope. Notice the duration of a bit, which corresponds to 31250 baud, the speed of MIDI serial communication.
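As a quick sanity check on the bit duration seen on the scope, here is a minimal Python sketch. It assumes the standard MIDI serial framing of 10 bits per byte (1 start, 8 data, 1 stop), which is not stated above, and the function names are only for illustration.

```python
# MIDI over a DIN cable is asynchronous serial at 31250 baud.
# Assumption: each byte is framed as 1 start bit + 8 data bits + 1 stop bit.
BAUD = 31250
BITS_PER_BYTE = 10


def bit_time_us(baud=BAUD):
    """Duration of a single bit in microseconds."""
    return 1_000_000 / baud


def message_time_us(n_bytes, baud=BAUD):
    """Time needed to transmit n_bytes back to back, in microseconds."""
    return n_bytes * BITS_PER_BYTE * bit_time_us(baud)


print(f"bit time: {bit_time_us():.0f} us")                # bit time: 32 us
print(f"3-byte Note On: {message_time_us(3):.0f} us")     # 3-byte Note On: 960 us
```

So a complete 3-byte Note On takes almost 1 ms just to travel over the cable, which is worth keeping in mind when reading the key press to MIDI figures.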

In order to get the time at which keys are pressed, I opened the CL35 and connected the other probe to one of the OR diodes of the scan matrix. I don't know whether that diode corresponds to the first or second sensor. The connection is not visible, but the picture gives an idea of how it was made.

Keys are scanned at 11 kHz, as can be seen in the waveform:


So far, we have the time between a key being played and the note being sent through the MIDI cable.
Just for comparison, I measured the latency of the digital piano itself (from a key being played to the sound being produced). I performed these measurements on my Kawai CL35.

Next, I wanted to know the latency of a digital piano when playing notes from an external source. In this case I tested both the CL35 and the MP11. To achieve this, I sent notes from the PC and measured the time between the arrival of the MIDI message and the production of the sound. In this section, we don't know the time added by the USB to MIDI conversion; we are only looking at the latency between MIDI and audio.

The same procedure was followed to measure the latency of Pianoteq with different interfaces and buffer settings. In this case, we can estimate the total latency as:
Total latency = key press to MIDI message latency + MIDI message to sound latency.
It is worth noting that latency is not constant across successive tests.
Results:
The following values are averages of at least 3 measurements:
*CL35 Key press - MIDI message sent latency: 3.98 ms
*CL35 Key press - sound produced latency: 5.92 ms
*CL35 MIDI event received - sound reproduction latency: 1.97 ms
*MP11 MIDI event received - sound reproduction latency: 2.50 ms
So if we wanted, for instance, to play MP11 sounds from the CL35 keyboard over a DIN5 MIDI cable (who would do that?), we would hear the sound from the MP11 3.98 + 2.50 = 6.48 ms after pressing the key, instead of the 5.92 ms that the CL35 takes to play its own sounds. Curiously, the higher-end MP11 is a bit slower than the cheaper CL35.
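The sums can be reproduced with a few lines of Python. This is only a sketch using the averages listed in the results; the constant and function names are made up for the example. It also compares the estimate for the CL35 alone (key press to MIDI plus MIDI to sound) against the directly measured key press to sound value.

```python
# Averages from the results list, in milliseconds.
CL35_KEY_TO_MIDI = 3.98
CL35_KEY_TO_SOUND = 5.92   # measured directly
CL35_MIDI_TO_SOUND = 1.97
MP11_MIDI_TO_SOUND = 2.50


def total_latency_ms(key_to_midi, midi_to_sound):
    """Total latency = key press to MIDI + MIDI to sound."""
    return round(key_to_midi + midi_to_sound, 2)


cl35_to_mp11 = total_latency_ms(CL35_KEY_TO_MIDI, MP11_MIDI_TO_SOUND)
cl35_estimate = total_latency_ms(CL35_KEY_TO_MIDI, CL35_MIDI_TO_SOUND)

print(f"CL35 keys -> MP11 sound: {cl35_to_mp11:.2f} ms")
print(f"CL35 keys -> CL35 sound (estimated): {cl35_estimate:.2f} ms")
print(f"CL35 keys -> CL35 sound (measured):  {CL35_KEY_TO_SOUND:.2f} ms")
```

The estimate for the CL35 (5.95 ms) lands within a few hundredths of a millisecond of the measured 5.92 ms, which suggests that adding the two partial latencies is a reasonable approximation.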
Now let's turn to the PC's latency. Here the picture gets a little more complicated because we can set different buffer sizes. In addition, in Windows we can choose between the native drivers and ASIO4all. For Ubuntu GNU/Linux there are many settings that could be tuned, but I used the stock image. The sample rate was 48 kHz and the buffer size 64 or 128 samples in most cases. Measurements made on the MSI laptop are marked with M, the ones made on the generic laptop with G. Just to add some information, I took old values from a 2013 post of mine on the Piano World forum, in which I complained that I wasn't able to get a decent latency (http://forum.pianoworld.com/ubbthreads...._USB_.html). Back then I had measured the latency as the time difference between the audio signals coming from Pianoteq and from the CL35. Since the CL35's own sound lags the MIDI message by about 1.97 ms, we need to add 1.97 ms to those old values to make a fair comparison.

The PC used back then was a laptop with an Intel Pentium 2020M processor. Finally, I took the measurements from another post, https://forum.modartt.com/viewtopic.php...13#p967413, made with an ODROID XU4 (ARM processor, Raspbian Linux OS) and the UDJ6. This gives four extra values for the UDJ6 interface under Linux. I label these values 2020M and ARM.
It is worth mentioning that the delays measured with this procedure (timing between MIDI signal and sound) agree perfectly with the ones measured with the Teensy microcontroller procedure mentioned before.
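To put the buffer sizes in perspective, the following Python sketch converts a buffer size in samples into the time one buffer holds at 48 kHz. This is only the duration of a single buffer: the measured latencies also include driver, USB and conversion delays, and possibly more than one buffer, so these figures are best read as a floor and not as a prediction. The function name is invented for the example.

```python
def buffer_duration_ms(buffer_samples, sample_rate_hz=48_000):
    """Time span covered by one audio buffer, in milliseconds."""
    return 1000 * buffer_samples / sample_rate_hz


# Buffer sizes that appear in the measurements.
for n in (64, 96, 128, 192):
    print(f"{n:>3} samples @ 48 kHz = {buffer_duration_ms(n):.2f} ms")
```

Going from 64 to 128 samples adds only about 1.33 ms of buffered audio, so when a measured latency grows by much more than that between the two settings, the extra delay has to come from somewhere else in the chain.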

Generally speaking, I feel that latencies of about 10 milliseconds are OK. I don't feel comfortable when the latency is about 15 ms or above. I don't know if anybody else has experienced this, but my hands get tired if I play with that much latency.
To make a comparison, taking the speed of sound in air to be 343 m/s, 3 ms of latency is equivalent to 1.03 m of distance from the speakers, so 15 ms of latency would be equivalent to having the speakers about 5 meters away. I don't think that is very far, but I started doing these measurements in 2013 because I noticed the latency (at that time, I thought that the word ASIO was a guarantee of zero latency).
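The latency to distance conversion is simple enough to script. A minimal sketch, using the same 343 m/s figure; the function names are just for illustration:

```python
SPEED_OF_SOUND_M_S = 343.0  # speed of sound in air, as assumed in the text


def equivalent_distance_m(latency_ms):
    """Distance sound travels in air during latency_ms milliseconds."""
    return SPEED_OF_SOUND_M_S * latency_ms / 1000


def equivalent_latency_ms(distance_m):
    """Time sound needs to cover distance_m meters, in milliseconds."""
    return 1000 * distance_m / SPEED_OF_SOUND_M_S


for ms in (3, 10, 15):
    print(f"{ms:>2} ms of latency ~ {equivalent_distance_m(ms):.2f} m from the speakers")
```

With these numbers, the 10 ms that feels fine corresponds to roughly 3.4 m, and the 15 ms that starts to feel uncomfortable to a little over 5 m.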
It is interesting that the performance of the ASIO4all driver varied widely between systems with the same interface. The best option seems to be the internal sound card, but I must point out that neither the ESI UDJ6 nor the miniDSP 2x4HD is intended as an audio interface for live playing: the first one is for DJs (no offense intended, but they don't play in the same sense a pianist plays) and the miniDSP is more an audio processor than an interface. I wonder how well an entry-level Focusrite / Behringer / M-Audio, etc. would perform. On the other hand, ALSA is not great but is consistent across systems.
Appendix: values of latency
*miniDSP 2x4HD
M, Profile ‘minimum latency’, sample rate: 48 kHz, buffer size: 64 samples, latency: 10.6 ms
G, Profile ‘minimum latency’, sample rate: 48 kHz, buffer size: 64 samples, latency: 11.65 ms
M, Profile ‘minimum latency’, sample rate: 48 kHz, buffer size: 128 samples, latency: 14.19 ms
G, Profile ‘minimum latency’, sample rate: 48 kHz, buffer size: 128 samples, latency: 15.14 ms
M, Windows Audio Exclusive mode, 48 kHz, 192 samples: 23.28 ms
M, ASIO4all, 48 kHz, 64 samples: 13.84 ms
G, ASIO4all, 48 kHz, 64 samples: 15.36 ms
M, ASIO4all, 48 kHz, 128 samples: 14.3 ms
G, ASIO4all, 48 kHz, 128 samples: 16.4 ms
M, Linux (ALSA direct hardware without any conversion), 64 samples: 15.16 ms.
G, Linux (ALSA direct hardware without any conversion), 64 samples: 15.19 ms.
M, Linux (ALSA direct hardware without any conversion), 128 samples: 23.6 ms.
G, Linux (ALSA direct hardware without any conversion), 128 samples: 22.35 ms.
*Onboard sound chip
M, ASIO4all, 64 samples: 9.27 ms
M, ASIO4all, 128 samples: 12.6 ms
M, Linux (ALSA direct hardware without any conversion), 64 samples: not usable (pops).
M, Linux (ALSA direct hardware without any conversion), 128 samples: 11.33 ms.
G, ASIO4all, 64 samples: 9.28 ms
G, ASIO4all, 128 samples: 11.88 ms
G, Linux (ALSA direct hardware without any conversion), 64 samples: 8.68 ms.
G, Linux (ALSA direct hardware without any conversion), 128 samples: 11.77 ms.
*ESI UDJ6
M, ESI driver, 48 kHz, 96 samples (interface default): 10.44 ms
M, ESI driver, 48 kHz, 128 samples: 12.66 ms
G, ESI driver, 48 kHz, 128 samples: 13.57 ms
M, ASIO4all, 48 kHz, 64 samples: 37.47 ms
G, ASIO4all, 48 kHz, 64 samples: 7.76 ms
M, ASIO4all, 48 kHz, 128 samples: 42.63 ms
G, ASIO4all, 48 kHz, 128 samples: 10.99 ms
M, ALSA (direct hardware without any conversion), 48 kHz, 64 samples: 12.10 ms
G, ALSA, 48 kHz, 64 samples: 12.06 ms
M, ALSA, 48 kHz, 128 samples: 20.33 ms
G, ALSA, 48 kHz, 128 samples: 20.5 ms
2020M, ALSA, 48 kHz, 64 samples: 15.9 ms
2020M, ALSA, 48 kHz, 128 samples: 15.9 ms
ARM, ALSA, 48 kHz, 64 samples: 12.1 ms
ARM, ALSA, 48 kHz, 128 samples: 20.3 ms
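To compare the two laptops more easily, here is a small Python sketch over a subset of the values above (only the 64-sample ASIO4all and ALSA entries that exist for both M and G). It prints the difference between the two machines and an estimated key press to sound latency, obtained by adding the 3.98 ms key press to MIDI average of the CL35. The dictionary layout and function names are just one way to organize the data, not part of the measurements.

```python
KEY_TO_MIDI_MS = 3.98  # CL35 key press to MIDI message, from the results

# (interface, driver) -> (M laptop, G laptop), 64 samples, MIDI to sound in ms.
# Subset of the appendix; onboard ALSA is left out because M was not usable.
MEASURED = {
    ("miniDSP 2x4HD", "ASIO4all"): (13.84, 15.36),
    ("miniDSP 2x4HD", "ALSA"): (15.16, 15.19),
    ("Onboard", "ASIO4all"): (9.27, 9.28),
    ("ESI UDJ6", "ASIO4all"): (37.47, 7.76),
    ("ESI UDJ6", "ALSA"): (12.10, 12.06),
}


def spread_ms(pair):
    """Absolute difference between the M and G measurements."""
    return round(abs(pair[0] - pair[1]), 2)


def key_to_sound_ms(midi_to_sound_ms):
    """Estimated total latency: key press to MIDI + MIDI to sound."""
    return round(KEY_TO_MIDI_MS + midi_to_sound_ms, 2)


for (interface, driver), (m, g) in MEASURED.items():
    print(f"{interface:14} {driver:9} "
          f"total M={key_to_sound_ms(m):5.2f} G={key_to_sound_ms(g):5.2f} "
          f"spread={spread_ms((m, g)):5.2f} ms")
```

The ALSA rows differ by a few hundredths of a millisecond between machines, while ASIO4all with the UDJ6 differs by almost 30 ms.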