Hardware-wise I usually go with this chain: MP11SE - USB - PC (Pianoteq standalone) - RME HDSP 9632 - JDS Atom amp - Sennheiser HD600.
I have tested multiple variants, and the latency is lowest in this setup: about 3.4 ms in total compared to the internal sound. 5-pin MIDI unfortunately is limited to 31.25 kbaud, which adds a lot to latency and jitter.
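To put a rough number on the 5-pin MIDI limit, here is a small back-of-the-envelope sketch. It only counts serial transmission time: 31.25 kbaud with 10 bits on the wire per byte (start bit, 8 data bits, stop bit) and 3 bytes per note-on. Whether a given keyboard uses running status is an assumption here, so both cases are shown. The function name is made up for illustration.

```python
MIDI_BAUD = 31_250      # bits per second on a 5-pin DIN MIDI link
BITS_PER_BYTE = 10      # 1 start bit + 8 data bits + 1 stop bit


def din_midi_time_ms(num_notes: int, running_status: bool = False) -> float:
    """Time in milliseconds to serialize num_notes note-on messages."""
    if num_notes <= 0:
        return 0.0
    if running_status:
        # Only the first message carries the status byte;
        # the following ones send just note number and velocity.
        num_bytes = 3 + 2 * (num_notes - 1)
    else:
        num_bytes = 3 * num_notes
    return num_bytes * BITS_PER_BYTE / MIDI_BAUD * 1000.0


if __name__ == "__main__":
    for n in (1, 4, 10):
        plain = din_midi_time_ms(n)
        running = din_midi_time_ms(n, running_status=True)
        print(f"{n:2d} notes: {plain:.2f} ms, with running status {running:.2f} ms")
```

So a single note costs just under 1 ms on the wire, and the last note of a big chord arrives several milliseconds after the first, which is where the jitter comes from. USB MIDI does not have this serial bottleneck.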
Otherwise I use Reaper for additional processing. I have always had some gripes with Pianoteq's default sounds, as they are too dark and muddy to my ears. For one thing, it is designed to reproduce the unmodified live sound of an acoustic piano, but it is even darker than that. Most recordings, especially pop, on the other hand have a very bright sound. For a clean sound and headphone monitoring I choose a basic preset (Model D, 290VC etc., i.e. a preset with no adjective in its name) with binaural output, and turn off all effects except EQ. For recording I opt for a close-mic setup and do additional processing with external reverb and compressors. If you feel that it sounds decent in headphones, it is probably fine in a recording, given that you have decent headphones. The problem lies with speaker monitoring, since there are a lot more variables with monitors, such as frequency response, placement, directivity and room modes.
You can start by adjusting the mic setup. The aims of monitoring that "plays like a real acoustic" and monitoring that "plays like a recording" are very different. Lots of presets are for the former, and you may have selected one of these while your monitors are not up to the task. Presets for recording are easier to analyze because you can use a reference recording, e.g. using FabFilter Pro-Q to match the frequency profile of Pianoteq to a recording. But adjust the mics first. I feel the Pianoteq sound tends to have a weak central image and lackluster spatial definition. So for speakers a clean, solid source sound is crucial, and a narrower stereo width helps with this. Add a tiny bit of delay and plate reverb using high-quality external plugins to recreate the spatial aspect. Additionally, you can separate the close-mic and distant-mic outputs, send them to different processing chains, and add delays and reverbs to each accordingly.
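For anyone wondering what "narrower stereo width" does under the hood, here is a minimal sketch of the usual mid/side approach: split left/right into mid (sum) and side (difference), scale the side part down, and recombine. This is a generic illustration, not what any particular plugin does internally, and the function name and the 0.7 default are just placeholders.

```python
def narrow_stereo(left, right, width=0.7):
    """Scale the side (L-R) component of a stereo signal.

    width = 1.0 leaves the signal unchanged, 0.0 collapses it to mono.
    left and right are equal-length sequences of float samples.
    """
    out_left, out_right = [], []
    for l, r in zip(left, right):
        mid = 0.5 * (l + r)    # what both channels share (the central image)
        side = 0.5 * (l - r)   # what differs between the channels
        out_left.append(mid + width * side)
        out_right.append(mid - width * side)
    return out_left, out_right


if __name__ == "__main__":
    # A sound panned hard left moves toward the center as width shrinks.
    print(narrow_stereo([1.0], [0.0], width=1.0))
    print(narrow_stereo([1.0], [0.0], width=0.5))
    print(narrow_stereo([1.0], [0.0], width=0.0))
```

In practice you would just use the width control in Reaper or in your plugin of choice; the point is only that reducing width keeps the mid content intact, which is why it firms up the central image.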
I don't use Velpro but a custom JSFX to calibrate the per-key velocities using data from key presses. The MP11SE has a lot of variability in key response, and it can be calibrated to be as even as possible. But that also makes the touch a little bit odd, like losing a tiny bit of the connection between the physical idiosyncrasies and the aural ones, so that finger control is a bit reduced.
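To give an idea of what per-key calibration can look like, here is a rough Python sketch. It is not my JSFX, just one simple way to do it under some assumptions: you record a calibration run where every key is struck with the same intended force, take each key's average velocity, and shift each key so its average lands on the overall average. All names here are hypothetical, and a real calibration would probably want a per-key curve rather than a single offset.

```python
from collections import defaultdict
from statistics import mean


def build_offsets(presses):
    """Compute a per-key velocity offset from calibration data.

    presses: iterable of (note, velocity) pairs from a hypothetical
    calibration run with the same intended force on every key.
    Returns {note: offset} that moves each key's mean to the global mean.
    """
    by_key = defaultdict(list)
    for note, velocity in presses:
        by_key[note].append(velocity)
    key_means = {note: mean(vels) for note, vels in by_key.items()}
    target = mean(key_means.values())
    return {note: target - m for note, m in key_means.items()}


def calibrate(note, velocity, offsets):
    """Apply the offset for this key and clamp to the valid 1..127 range."""
    corrected = round(velocity + offsets.get(note, 0.0))
    return max(1, min(127, corrected))


if __name__ == "__main__":
    # Made-up example data: key 60 reads hot, key 61 reads weak.
    presses = [(60, 70), (60, 74), (61, 60), (61, 64)]
    offsets = build_offsets(presses)
    print(offsets)
    print(calibrate(60, 72, offsets), calibrate(61, 62, offsets))
```

Keys that were not in the calibration data are passed through unchanged. The more you flatten the keys this way, the more of that "odd touch" effect you get, so it may be worth applying only part of the offset.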