Hi Philippe,
Thanks for your kind words of welcome! True, I am a new poster, but I have been reading the forum from time to time for a long while. :-)
And thank you for your answer. So, as I understand it, you use an automated procedure for the "rough tuning". Probably a kind of "brute force" approach: I imagine you have a "MIDI test" procedure, you "explore" all the combinations of the (key) parameters, and you select the best candidate (or the most promising ones), that is, the one with the best fitness score against the target sounds. I speculate that you could run some Fourier analysis on the input sound and the output sound to measure how close the two signals are. That works for a single "sample", but the fitness should also consider a series of samples, since the sound evolves over time. There are probably other methods (it looks like a complicated problem to me), but in any case this could represent a lot of computation.
Do you need some kind of "super computer" or "cloud computing power" for this?
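To make my guess a bit more concrete, here is a rough sketch of the kind of brute-force search I have in mind. It is only my own illustration, not your actual procedure: toy_engine, its two parameters (inharmonicity, decay) and the grid values are all made up, and the fitness is a plain squared difference between Fourier magnitude spectra, summed over a few frames so that the evolution in time counts too.

```python
import cmath
import itertools
import math

SAMPLE_RATE = 8000   # Hz, low on purpose so the naive DFT below stays fast
FRAME_SIZE = 128     # samples per analysis frame

def toy_engine(inharmonicity, decay, n_frames=3):
    """Hypothetical stand-in for the piano engine: a few decaying partials.
    Returns several short frames taken 50 ms apart, to follow the sound in time."""
    frames = []
    for f in range(n_frames):
        frame = []
        for i in range(FRAME_SIZE):
            t = f * 0.05 + i / SAMPLE_RATE
            sample = 0.0
            for k in range(1, 5):
                # Partials are stretched upwards by the inharmonicity parameter.
                freq = 220.0 * k * math.sqrt(1.0 + inharmonicity * k * k)
                sample += math.exp(-decay * k * t) * math.sin(2 * math.pi * freq * t) / k
            frame.append(sample)
        frames.append(frame)
    return frames

def magnitude_spectrum(frame):
    """Naive DFT magnitudes (lower half of the bins), standard library only."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * b * i / n) for i, x in enumerate(frame)))
        for b in range(n // 2)
    ]

def spectra(frames):
    """One magnitude spectrum per frame."""
    return [magnitude_spectrum(frame) for frame in frames]

def spectral_distance(spectra_a, spectra_b):
    """Sum of squared magnitude differences over all frames; lower is better."""
    return sum(
        (a - b) ** 2
        for spec_a, spec_b in zip(spectra_a, spectra_b)
        for a, b in zip(spec_a, spec_b)
    )

def brute_force_search(target_frames, grid):
    """Try every parameter combination and keep the closest one."""
    target_spectra = spectra(target_frames)
    best_params, best_score = None, float("inf")
    for values in itertools.product(*grid.values()):
        params = dict(zip(grid, values))
        score = spectral_distance(spectra(toy_engine(**params)), target_spectra)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

# The "target" is produced by the toy engine itself, so the right answer is known.
grid = {"inharmonicity": [0.0, 0.001, 0.002], "decay": [10.0, 20.0, 40.0]}
target = toy_engine(inharmonicity=0.001, decay=20.0)
best_params, best_score = brute_force_search(target, grid)
print(best_params, best_score)
```

Even in this toy, the number of candidates is the product of the number of values per parameter (3 x 3 = 9 here), so it grows very quickly as parameters are added. That is what makes me wonder about the computing power.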
Regarding your last sentence: "requires some aesthetic judgement which is not easy to put inside an objective function of a neural network". I am not a specialist, but I have seen how image recognition works by training a network on a set of "good responses" (tags describing the images). So if we had a data set of "sounds" coming from known pianos, we could train a network to recognize them. The network could then say: I recognize this sound as piano A at x%, piano B at y%. Then, by changing a parameter of the engine, you could see how it "closes the gap" to a particular piano.
The funny thing is that you could use "sampled pianos" to generate your dataset!
Does that make sense? Or perhaps the recognition is too complicated (as with speech recognition, where going from 95% to 99% is the difference between useless and useful recognition, but that success rate is almost impossible to achieve and requires a huge amount of data to converge). In that case, a manual process could turn out to be the best effort/reward approach...
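For what it's worth, here is the "recognize piano A at x%, piano B at y%" idea as a toy sketch. Big caveats: I swapped the neural network for a much simpler nearest-centroid classifier on normalized spectra, with a softmax to turn distances into percentages. synth_note, its brightness parameter and the sharpness constant are hypothetical, and the two "pianos" are just synthetic tones standing in for recordings of sampled pianos.

```python
import cmath
import math
import random

SAMPLE_RATE = 8000   # Hz
N = 128              # samples per note snippet

def synth_note(brightness, noise=0.0, rng=None):
    """Hypothetical tone generator: 'brightness' sets how loud the upper partials are."""
    rng = rng or random.Random(0)
    return [
        sum(brightness ** (k - 1) * math.sin(2 * math.pi * 250.0 * k * i / SAMPLE_RATE)
            for k in range(1, 6))
        + noise * rng.gauss(0.0, 1.0)
        for i in range(N)
    ]

def features(signal):
    """Naive DFT magnitudes, normalized so that overall loudness does not matter."""
    n = len(signal)
    mags = [
        abs(sum(x * cmath.exp(-2j * math.pi * b * i / n) for i, x in enumerate(signal)))
        for b in range(n // 2)
    ]
    total = sum(mags) or 1.0
    return [m / total for m in mags]

def train(dataset):
    """dataset: {label: [signals]} -> {label: mean feature vector (centroid)}."""
    model = {}
    for label, signals in dataset.items():
        vectors = [features(s) for s in signals]
        model[label] = [sum(col) / len(vectors) for col in zip(*vectors)]
    return model

def recognise(model, signal, sharpness=20.0):
    """Return {label: percentage} from a softmax over negative distances."""
    f = features(signal)
    weights = {label: math.exp(-sharpness * math.dist(f, c)) for label, c in model.items()}
    total = sum(weights.values())
    return {label: 100.0 * w / total for label, w in weights.items()}

# Made-up "dataset": piano A is bright, piano B is mellow, five noisy notes each.
dataset = {
    "A": [synth_note(0.8, 0.02, random.Random(seed)) for seed in range(5)],
    "B": [synth_note(0.3, 0.02, random.Random(100 + seed)) for seed in range(5)],
}
model = train(dataset)

# Sweep the engine parameter and watch how the "piano A" score moves.
sweep = [0.3, 0.5, 0.7, 0.8]
scores_a = [recognise(model, synth_note(b))["A"] for b in sweep]
for b, score in zip(sweep, scores_a):
    print(f"brightness={b}: piano A at {score:.1f}%")
```

The "piano A" percentage could then serve as the fitness score while the engine parameter is varied. Whether a real network trained on real recordings would be reliable enough for that is exactly the part I am unsure about.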
Last edited by ziczack (01-10-2017 21:32)