Topic: Offline version of the Magenta Audio to Midi transcription tool

As a follow-up to the discussion about the audio-to-MIDI transcription tool begun in the video contest winning entries thread, I thought it would be preferable to open a new thread.
The tool mentioned by budo is indeed quite remarkable and can be really useful together with Pianoteq:
https://magenta.tensorflow.org/onsets-frames

After further use of the online tool and some searching, I found something that could be useful for anyone interested in trying it:
a very easy-to-use offline version of the tool:
https://github.com/azuwis/magenta_transcribe

I first found an interesting thread about the Magenta tool on the Piano World forum:
http://forum.pianoworld.com/ubbthreads....ost2867819
But then I was really thrilled to discover that in this thread user Shinji Ikari provided a complete package he built that allows you to use this tool offline (see link above)!

A great thing is that apart from the Microsoft Visual C++ Redistributable for Visual Studio 2015, 2017 and 2019 (vc_redist.x64.exe), there is nothing to install.
All necessary files (including the necessary Python version) are in one folder; one just needs to start the program "MagentaTranscribe.exe" from within the folder and then select an audio file.
The program (a command prompt) then automatically converts the audio file to a MIDI file.

Some words about the online versions:
There are two online versions of the Magenta tool:
Piano Scribe: https://piano-scribe.glitch.me/
It worked very well with shorter audio files, but longer files may or may not be converted, or conversion can take a very long time.
I also found a few reports about this issue.

I had more success with a longer file using the second online version:
https://colab.research.google.com/noteb...tion.ipynb
But it did not work with Firefox, as I was unable to upload a file.
I had to use Chrome (with third-party cookies allowed).
It is also a little more time-consuming, as you virtually have to install the environment setup into Google Drive using your Google account.
Uploading a file was also not particularly fast.

Therefore the offline version provided by Shinji Ikari is a great alternative, very fast and useful.
Thanks to him for providing this complete package.

BTW, I just converted the Ravel version of the Pavane offline, and it is amazing how well Ravel's agogics are translated.

I would like to add that this tool is in no way able to produce results comparable to what UrtextMIDI did with "Pianoteq Rubinstein".
Indeed, a lot of art and fine work is necessary to achieve that kind of result.
As I already mentioned, dynamics and articulation are not really well translated by the Magenta tool.
But I find the agogics to be quite fine, and the translation may be the basis for further tweaking.

Last edited by teacue (03-12-2021 21:55)

Re: Offline version of the Magenta Audio to Midi transcription tool

teacue wrote:

As a follow-up to the discussion about the audio-to-MIDI transcription tool begun in the video contest winning entries thread, I thought it would be preferable to open a new thread.

Great idea, thanks.

Questions:

1) does it work with piano only, or with other instruments too?

2) does it work for orchestrations?

3) separating multiple instruments into different MIDI channels?

Thanks

Re: Offline version of the Magenta Audio to Midi transcription tool

Thank you for this very good idea.
I tested it on an extract from a rare concert of Duke Ellington alone at the piano (Duke at Goutelas, vinyl, 1966). I did a test with a Melodyne demo and the result is impressive on some passages. It is necessary, however, to rework the MIDI file a lot: first of all, the software does not detect the use of the sustain pedal, which therefore has to be added, and Duke used it a lot. Some of Duke's growls or noises are considered notes, and some harmonics too. And sometimes some notes are considered harmonics and are deleted by the software. But Melodyne allows you to recover these notes. I did another test with Magenta, but I could only do it on a very short extract and I will try again. I'd love to have the Duke playing for me on my Pianoteq, but I think that's hours and hours of work...

Translated with www.DeepL.com/Translator (free version)

Re: Offline version of the Magenta Audio to Midi transcription tool

dv wrote:

1) does it work with piano only, or with other instruments too?

2) does it work for orchestrations?

3) separating multiple instruments into different MIDI channels?

Thanks

As I discovered this transcription tool just a few days ago thanks to budo's mention in the Pianoteq contest thread, I really do not know a lot about it.
But according to the description here, "https://magenta.tensorflow.org/oaf-js", it seems to be optimized for solo piano performances.

Still, I tried it with a solo classical guitar piece (polyphonic) and obtained a really good MIDI translation with only a few glitches.

I assume this answers your three questions.

YvesTh wrote:

I did another test with Magenta, but I could only do it on a very short extract.

Was it because of the online version?
Did you try the offline package?
The online version seems to have some difficulty with the length of a file, but not the offline version.

Re: Offline version of the Magenta Audio to Midi transcription tool

Here is an example of an audio to midi transcription of the Magenta tool.
https://forum.modartt.com/uploads.php?f...anoteq.mp3

I chose a piano roll recording of the "Pavane pour une infante défunte" played by Ravel himself:
https://www.youtube.com/watch?v=7ASYm3K_PwM

It may not be an ideal example, as it is a transcription of a transcription, but well, I love this interpretation.

I did not change anything in the MIDI data of the transcription, because I am not a pianist and also because it should be an example of what this tool does on its own.
The Pianoteq preset is HB Steinway D Classical with "Long Plate" as reverb.

Re: Offline version of the Magenta Audio to Midi transcription tool

That is so super-cool!

Like a human to piano roll to digital conversion, seemingly keeping nuance, etc.

And I thought that Edison's wax drum recordings would be the oldest auditory recordings...at least for the piano, this technology seems to have transcended that.

:-)

- David

Re: Offline version of the Magenta Audio to Midi transcription tool

dklein wrote:

That is so super-cool!

Like a human to piano roll to digital conversion, seemingly keeping nuance, etc.

And I thought that Edison's wax drum recordings would be the oldest auditory recordings...at least for the piano, this technology seems to have transcended that.

:-)

I am just a humble, low-skilled piano player, doing it only for my own enjoyment, but the even older technique, hand-played piano rolls, often by the composers themselves, is kind of a treasure. Even if they lack dynamics "per key" or have no dynamics at all.

Re: Offline version of the Magenta Audio to Midi transcription tool

teacue wrote:
YvesTh wrote:

I did another test with Magenta, but I could only do it on a very short extract.

Was it because of the online version?
Did you try the offline package?
The online version seems to have some difficulty with the length of a file, but not the offline version.

Hello,
I have installed the "magenta transcribe" package and "Microsoft Visual C++", but it doesn't work on my PC, or I don't know how to use it. So I use the demo version of Melodyne.

Re: Offline version of the Magenta Audio to Midi transcription tool

i'm glad onsets and frames is getting more love ... they did a great job with it and their work really is a significant achievement. 

i myself installed it (on a linux system) using conda:

https://docs.conda.io/en/latest/

it should also work on Mac/PC.  there are other models in the magenta project that are also very fun to play with.  one, for instance, tries to write a four-voice chorale given a short melody.  another model attempts to generate improvisations based on sample music. 

regarding some of the questions asked above, the software was developed to produce midi of solo piano recordings.  there's a lot that goes into it, but the main point (in this kind of machine learning) is that the computer develops its own algorithm to process the wav files and produce midi by training itself on a large dataset.  this dataset has lots of snippets of pairs (audio, corresponding midi).  the model runs repeatedly on the dataset and teaches itself by assigning a score to its most recent attempt and modifying the algorithm to try to do better.  it needs many generations of this before it can reliably handle audio data it hasn't seen before.  this means it can't handle other kinds of music processing problems (like separating instruments, or analyzing music other than solo piano). 
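The guess-score-adjust loop described above can be illustrated with a toy model. This is nothing like the real Onsets and Frames network; it is just a minimal logistic-regression sketch on synthetic "spectral frame / note on-off" data, showing how repeated scoring and adjustment gradually improves the algorithm:

```python
# Toy illustration of the training idea: the model guesses, scores itself
# (the loss), and adjusts its parameters to do better next time.
# NOT the real Onsets and Frames model; a minimal synthetic sketch.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic dataset: 200 "frames" of 8 fake spectral features each,
# labelled 1 (note sounding) or 0 (silence) by a hidden rule.
X = rng.normal(size=(200, 8))
true_w = rng.normal(size=8)
y = (X @ true_w > 0).astype(float)

w = np.zeros(8)                       # the model's parameters, initially ignorant

def loss(w):
    p = 1 / (1 + np.exp(-(X @ w)))    # the model's current guesses
    return -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))

losses = []
for step in range(500):               # many generations of guess -> score -> adjust
    p = 1 / (1 + np.exp(-(X @ w)))
    grad = X.T @ (p - y) / len(y)     # how to change w to reduce the loss
    w -= 0.5 * grad
    losses.append(loss(w))

print(f"loss before: {losses[0]:.3f}, after: {losses[-1]:.3f}")
```

The real model trains on audio/MIDI pairs and is vastly larger, but the principle (iteratively reducing a score on labelled data) is the same.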

on the other hand, just because it was designed to run on solo piano doesn't mean one can't try other things. i did this a while ago; it's a lot of fun to feed it music made with different instruments and see what happens. here are three examples:

https://hearthis.at/budosaurus/5thwav/

https://hearthis.at/budosaurus/01-black-dogwav/

https://hearthis.at/budosaurus/impwav/

Re: Offline version of the Magenta Audio to Midi transcription tool

YvesTh wrote:

I have installed the "magenta transcribe" package and "Microsoft Visual C++", but it doesn't work on my PC, or I don't know how to use it.

As I am usually not good at building environments from GitHub, I was really happy to see that a fully working package had been provided.

In case you inadvertently missed something, here are the steps on how to use it:

1. Go to:
https://github.com/azuwis/magenta_transcribe

2. Go to the paragraph  "Simple Gui for Onsets and Frames Piano Transcription Tool / How to use"

3. Follow points 1 and 2 of the paragraph "How to use".
That means:

4. Go to the link written in point 1 of paragraph "How to use"
(Here the original link: https://docs.microsoft.com/en-US/cpp/wi...=msvc-170)

5. Look for the file "vc_redist.x64.exe", which is under "Architecture / X64"
Here is the permalink: https://aka.ms/vs/17/release/vc_redist.x64.exe

6. Download and install the file "vc_redist.x64.exe"

7. Go back to "https://github.com/azuwis/magenta_transcribe" and go to the second point of the paragraph "How to use"

8. Download and unpack the file "MagentaTranscribe.zip" (There is nothing to install, just unpack)
You should now have a folder on your hard drive called "MagentaTranscribe"

9. That is all you have to do in order to be able to run the Magenta tool offline.

10. Go to the folder "MagentaTranscribe", look for the file "MagentaTranscribe.exe" and double-click it.

11. This should open the Command Prompt and, directly after this, without you doing anything, it should open a window where you can search for and select an audio file.

12. Navigate to your audio file and select it.

13. The tool now begins to work on its own; it should not take very long.
When it is finished, you will see "Press Enter to exit" in the Command Prompt.

14. That's all.
A MIDI file should now be in the folder of the original audio file.

It's working perfectly on my Windows 10 system.

YvesTh wrote:

So I use the demo version of Melodyne.

This is of course the king's tool.

Re: Offline version of the Magenta Audio to Midi transcription tool

budo wrote:

i'm glad onsets and frames is getting more love ... they did a great job with it and their work really is a significant achievement.

Indeed

budo wrote:

on the other hand, just because it was designed to run on solo piano doesn't mean one can't try other things. i did this a while ago; it's a lot of fun to feed it music made with different instruments and see what happens. here are three examples:

Great examples

Re: Offline version of the Magenta Audio to Midi transcription tool

teacue wrote:
YvesTh wrote:

I have installed the "magenta transcribe" package and "Microsoft Visual C++", but it doesn't work on my PC, or I don't know how to use it.

As I am usually not good at building environments from GitHub, I was really happy to see that a fully working package had been provided.

In case you inadvertently missed something, here are the steps on how to use it: [...]

It's working perfectly on my Windows 10 system.

Thank you very much for your help.
I tried the installation again, but it doesn't work on my PC (too old, I think). However, I tried it on another computer, and it works when the wav file is not too long.

Re: Offline version of the Magenta Audio to Midi transcription tool

YvesTh wrote:

Thank you very much for your help.
I tried the installation again, but it doesn't work on my PC (too old, I think). However, I tried it on another computer, and it works when the wav file is not too long.

that was a very impressive installation description from teacue  

regarding the length of the wav file, on my machine the tool can handle fairly long performances but certainly not arbitrarily long.  if the program requests too much memory from the system as it analyzes the file, it kills itself.  i've tried to figure out a workaround but no luck so far.  20-40 min performances seem ok, but beyond that it dies.

Re: Offline version of the Magenta Audio to Midi transcription tool

I managed to create the MIDI file of Duke Ellington's chorus by splitting the wav file into several sections. The result is very impressive on many passages, and the swing is perfectly respected. After correcting various elements (the recording is from 1961 and the piano has a lot of harmonics that disturbed the software), I added the sustain pedal, very important on some passages, all of this with Studio One. Everything works well in my DAW with Pianoteq as a VST, but when I export the result to a MIDI file to read it with Pianoteq standalone, the sustain pedal part disappears. Does anyone know how to create a MIDI file with the sustain pedal?
Thanks

Translated with www.DeepL.com/Translator (free version)

Re: Offline version of the Magenta Audio to Midi transcription tool

Thanks, budo. Clever demonstrations.

...and one of the only times that I've heard Led Zeppelin mentioned on the Pianoteq forum (b.t.w., have you guys heard the new Alison Krauss and Robert Plant album, Raise the Roof?)

- David

Re: Offline version of the Magenta Audio to Midi transcription tool

Please, how do you insert an audio player?
Is there a topic in the forum about inserting pictures, video or audio?
What is the method, step by step?
Thank you very much for your help.

Re: Offline version of the Magenta Audio to Midi transcription tool

YvesTh wrote:

Please, how do you insert an audio player?
Is there a topic in the forum about inserting pictures, video or audio?
What is the method, step by step?
Thank you very much for your help.

you can just put your Duke Ellington video on youtube and then put the full url into the text of the post.  the forum software seems to be clever enough to handle everything.

Re: Offline version of the Magenta Audio to Midi transcription tool

Here is my test of Magenta:

Original file: Duke Ellington solo piano at Goutelas in 1966

https://forum.modartt.com/uploads.php?f...ait%29.mp3

Pianoteq file "Steinway B" with the MIDI file from Magenta (with some corrections).

https://forum.modartt.com/uploads.php?f...ess%29.mp3

I will try to make many other necessary corrections.

Main problems in the MIDI file:
Many excess notes had to be removed: low notes coming from Duke's grunts, high notes coming from very strong harmonics, high notes coming from clapping. The software is not able to detect the sustain pedal, so the notes are extended individually and therefore we don't get the sympathetic resonance of the other strings with Pianoteq. The ideal would be to restore the pedal effects in the MIDI file (I quickly added some). There are some missing notes too. It will be a lot of work, but very interesting.
The original file is about 2 x 10 minutes long.
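Part of the cleanup described above (removing ghost notes created by strong harmonics) could be roughed out automatically: a quiet note starting almost together with a much louder note an octave, twelfth or double octave below it is a likely harmonic artifact. A toy heuristic sketch; the function name and the thresholds are my own assumptions and would need tuning by ear:

```python
# Flag suspected "harmonic ghost" notes in a transcription.
# notes: list of (start_time_seconds, midi_pitch, velocity) tuples.
def likely_harmonic_ghosts(notes, max_onset_gap=0.03, min_vel_ratio=0.5):
    """Return indices of notes that look like harmonics of louder notes.

    A note is suspect if it starts within max_onset_gap seconds of a note
    an octave (12), twelfth (19) or double octave (24) below it, and is
    quieter than min_vel_ratio times that note's velocity.
    """
    suspects = []
    for i, (s1, p1, v1) in enumerate(notes):
        for s2, p2, v2 in notes:
            if (p1 - p2 in (12, 19, 24)
                    and abs(s1 - s2) <= max_onset_gap
                    and v1 < v2 * min_vel_ratio):
                suspects.append(i)
                break
    return suspects
```

The flagged notes would still need checking against the recording; as noted above, a soft real note and a strong harmonic can be genuinely hard to tell apart.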

Last edited by YvesTh (07-12-2021 23:03)

Re: Offline version of the Magenta Audio to Midi transcription tool

Thanks, Yves. I have gone from 'wowed by the technology' to having now glimpsed behind the curtain and seen all the work that must go on behind the scenes. Clearly, it's not 'turn-key' technology the way shooting a digital photo of an old print or slide is, where auto-exposure makes that pretty easy nowadays. I am sure that an 'auto-musicality' function can't be too far behind!  ;-)

- David

Re: Offline version of the Magenta Audio to Midi transcription tool

@ YvesTh
A nice and interesting transcription of Satin Doll

As a non-pianist, I would like to learn in this context how to restore the sustain pedal.
Does anyone have some tips?

As one can clearly hear in YvesTh's transcription, much of the dynamics gets lost.
Besides editing each note manually, what could be done to get the whole dynamic range?
I tried editing the velocity curve but found that it is not easy to achieve good results.

Last edited by teacue (08-12-2021 17:06)

Re: Offline version of the Magenta Audio to Midi transcription tool

teacue wrote:

As a non-pianist, I would like to learn in this context how to restore the sustain pedal.
Does anyone have some tips?

If the notes held are unplayable by the pianist (finger spread), it means that the sustain pedal has been used. Otherwise you have to listen carefully to the original file to see if you can hear the particular resonance of the piano with the pedal depressed. The problem is that pianists often use the pedal on short sequences of notes that are very difficult to determine, and the time and speed of releasing the pedal is just as important as the time and speed of depressing it. On the "Duke" file I have a lot of trouble determining these parameters everywhere...
For classical music you can read the score indications, but for a jazz improvisation you have only your ears...

Translated with www.DeepL.com/Translator (free version)

Re: Offline version of the Magenta Audio to Midi transcription tool

teacue wrote:

As a non-pianist, I would like to learn in this context how to restore the sustain pedal.
Does anyone have some tips?

@YvesTh gave a great answer with everything i wanted to say about this.  i just wanted to add that a longer-term way to solve the problem would be to develop a new magenta model that does this.  it would take as input the same data as onsets and frames (wav file) and produce a guess about application of the sustain pedal.  it could be trained on lots of audio snippets, the same as onsets and frames.  it should actually be easier to do this than what onsets and frames already does, because it's training for a binary state (pedal is on vs pedal is off).  of course it could be made more complicated because there are lots of coloristic effects one can do with the pedal that go beyond on/off, but at least this would be a start.

Re: Offline version of the Magenta Audio to Midi transcription tool

@ YvesTh
Very interesting points, thank you for this.
Not an easy task, and probably even more difficult for a non-pianist.

@ budo
A new Magenta model would indeed be welcome.

You both did not answer my question about dynamics.
Any idea about this?

Re: Offline version of the Magenta Audio to Midi transcription tool

teacue wrote:

You both did not answer my question about dynamics.
Any idea about this?

You are right; in my test the dynamics need to be reworked. I think that the choice of Pianoteq instrument is very important, that it is also necessary to rework the velocity curve for reading the MIDI file, and even sometimes to modify the velocity of some notes individually.

Re: Offline version of the Magenta Audio to Midi transcription tool

MIDI file extract:

With sustain pedal added at the beginning and various corrections:

https://forum.modartt.com/uploads.php?f...idi%29.mid
With velocity curve:
Global Velocity = [0, 26, 46, 72, 102; 0, 23, 44, 79, 127]
Duke Ellington was playing on a Steinway for this recording.
Pictures at this link:
https://ellington.se/2021/02/25/ellingt...elas-1966/
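The velocity curve above is a piecewise-linear map from input to output velocity. As a sketch (my own illustration, not how Pianoteq applies it internally), the same curve can be applied directly to the note velocities in a MIDI file, e.g. to bake the correction into the file itself:

```python
# Apply YvesTh's Pianoteq velocity curve [0,26,46,72,102 ; 0,23,44,79,127]
# as a piecewise-linear remap of MIDI velocities.
import numpy as np

vel_in = [0, 26, 46, 72, 102]    # input velocities (curve x values)
vel_out = [0, 23, 44, 79, 127]   # output velocities (curve y values)

def remap_velocity(v):
    """Map a MIDI velocity (0-127) through the piecewise-linear curve.

    Inputs above the last curve point (102) clamp to the last output (127).
    """
    return int(round(np.interp(v, vel_in, vel_out)))
```

Running every note-on velocity in the file through `remap_velocity` reproduces the curve's effect without relying on the player's settings.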

Last edited by YvesTh (12-12-2021 18:33)

Re: Offline version of the Magenta Audio to Midi transcription tool

@ YvesTh
Thank you for your thoughts

Re: Offline version of the Magenta Audio to Midi transcription tool

teacue wrote:

@ YvesTh
Very interesting points, thank you for this.
Not an easy task, and probably even more difficult for a non-pianist.

@ budo
A new Magenta model would indeed be welcome.

I very much doubt that a model for automatic pedal detection can ever be created.
From a sonic perspective, it is impossible to differentiate whether a note is sustained by the key or by the pedal (yes, the resonances are different if other notes are played at the same time, but I bet my money that the machine-learning model focuses only on fundamental frequencies, as otherwise the analysis would be incredibly complex). Moreover, you have the sustain pedal, which lifts all dampers, and the sostenuto pedal, which prevents lifted dampers from dropping. So, when a note is held, it is impossible to know which of the three methods is being used: key, sustain pedal, or sostenuto pedal.

The only way is to apply a "feasibility" test: what is each finger doing? Does this require more than 10 fingers?
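The feasibility test could be automated as a first pass: scan the transcribed notes and flag moments where more notes are sounding at once than a pianist has fingers. A toy sketch (my own illustration; it ignores hand span, the sostenuto pedal, and everything else that makes real pedaling subtle):

```python
# Flag time points where more notes sound simultaneously than a pianist
# could hold with fingers alone, so a pedal must be involved.
# notes: list of (start_seconds, end_seconds, midi_pitch) tuples.
def pedal_suspected(notes, max_fingers=10):
    """Return the times at which more than max_fingers notes sound at once."""
    events = []
    for start, end, _pitch in notes:
        events.append((start, +1))   # note begins sounding
        events.append((end, -1))     # note stops sounding
    events.sort()                    # ends sort before starts at equal times
    sounding, flagged = 0, []
    for t, delta in events:
        sounding += delta
        if sounding > max_fingers:
            flagged.append(t)
    return flagged
```

An empty result does not prove the pedal was unused, of course; it only catches the unambiguous cases.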

Re: Offline version of the Magenta Audio to Midi transcription tool

Vagporto wrote:
teacue wrote:

@ YvesTh
Very interesting points, thank you for this.
Not an easy task, and probably even more difficult for a non-pianist.

@ budo
A new Magenta model would indeed be welcome.

I very much doubt that a model for automatic pedal detection can ever be created.
From a sonic perspective, it is impossible to differentiate whether a note is sustained by the key or by the pedal (yes, the resonances are different if other notes are played at the same time, but I bet my money that the machine-learning model focuses only on fundamental frequencies, as otherwise the analysis would be incredibly complex). Moreover, you have the sustain pedal, which lifts all dampers, and the sostenuto pedal, which prevents lifted dampers from dropping. So, when a note is held, it is impossible to know which of the three methods is being used: key, sustain pedal, or sostenuto pedal.

The only way is to apply a "feasibility" test: what is each finger doing? Does this require more than 10 fingers?

I totally agree with you...

Re: Offline version of the Magenta Audio to Midi transcription tool

Vagporto wrote:

I very much doubt that a model for automatic pedal detection can ever be created.

I totally agree with you...

respectfully i must disagree.  or at least i have to say, these arguments are not convincing to me.

the point is the way these models work.  they are doing absolutely nothing like the kind of quantitative reasoning a human might do to evaluate whether the pedal has been depressed.  all they do is try to design an algorithm by analyzing lots and lots of data that has already been marked as "pedal depressed" or "pedal not depressed".  the software makes a guess, scores itself, and then improves its guess.  the mathematics behind the process is very interesting and can't really be gone through here, but the point is there is a well-defined way for the software to modify its algorithm based on its performance.  after many many iterations of this, assuming the model is designed well, it will perform very well on the training set, and also on new input it's never seen before.  how it's "thinking" is completely different from how a human actually thinks.

already a lot of the sonic ambiguity mentioned is present in solo piano and is handled very well by onsets and frames.  for instance, how can we distinguish the fundamental of a note from some higher overtone?  the tone of a piano note is very different when it's played loudly or softly.  how do we distinguish that?  we don't ... we just train the model on data and let it sort it out internally.  (to be fair, one weakness of the current model is detecting velocity.  the midi velocity values are reasonable but not close to a human performance, imo.  still it's doing a remarkable job.)

of course the only way to demonstrate success is to succeed.  i'd like to try at some point, but i don't know when i'll have time.  maybe someone else will try

Re: Offline version of the Magenta Audio to Midi transcription tool

On a very good quality audio file I think the software could indeed do as you describe, but the interest of processing a very good file into MIDI is limited. To revive a poor-quality file, as I tried to do with "Duke", the software has great difficulties. Indeed, it very frequently takes harmonics for notes; it can of course improve, but the difference between a strong harmonic and a note really played softly can be extremely difficult to detect in an old file heavily contaminated by other noises. And in this case the resonances with or without pedals are sometimes very blurred. However, reworking the MIDI file while listening to the original in parallel is very interesting.

Re: Offline version of the Magenta Audio to Midi transcription tool

budo wrote:

respectfully i must disagree.  or at least i have to say, these arguments are not convincing to me.

And I, also respectfully, invite you to read the technical paper by the authors of the method called Onsets and Frames, and you will see why I raised the doubts that I maintain. The deconvolution techniques at the core of the method are not capable of making that distinction, other than for notes in clear sonic separation from other notes. And even then I believe the separation could only be made by internal comparison, i.e. by comparing the sonic signature of the same note played at several different moments.

It is much more feasible (although it would amount to the complexity of the best AI programming) to calculate pedaling through the estimation of finger positions and movements.

Re: Offline version of the Magenta Audio to Midi transcription tool

Vagporto wrote:
budo wrote:

respectfully i must disagree.  or at least i have to say, these arguments are not convincing to me.

And I, also respectfully, invite you to read the technical paper by the authors of the method called Onsets and Frames, and you will see why I raised the doubts that I maintain. The deconvolution techniques at the core of the method are not capable of making that distinction, other than for notes in clear sonic separation from other notes. And even then I believe the separation could only be made by internal comparison, i.e. by comparing the sonic signature of the same note played at several different moments.

It is much more feasible (although it would amount to the complexity of the best AI programming) to calculate pedaling through the estimation of finger positions and movements.

no problem, we all have respect here.  i have read the paper.  i'm not advocating doing what they did, or tweaking it, but rather something much more vague: developing a new magenta model.  i have no idea if it's possible, and i'm not an expert, but i am also not convinced that progress couldn't be made.  if one had told me 10 years ago that onsets and frames would exist, i would have been very surprised (actually i would have been just as surprised at the existence of Pianoteq, but i guess that's another topic).  any model would be great, whether based on estimation of finger position or whatever else one thinks is the right way forward.  i was just drawn more to a purely sonic model as a first approximation.  but i probably will never take it on anyway ... too many other incomplete projects.