Topic: Voice control of PianoTeq parameters? Simple to implement?

(Edited for clarity.)

Experimenting today with the speech recognition program included in Windows Vista, I found that all of the PianoTeq commands that are linked to keyboard keys (opening the Mic pane, pressing N to go to the next preset, etc.) work fine. Say "Space" and the midi file plays. Say it again to stop playback. Say "Press N" to move to the next preset. (I'm using a headset with a mic attached, of course, and my system, at least, lets me hear the audio and give commands at the same time. One could also use the headset mic while sending the sound to monitors or a better set of headphones.)

So the obvious thought is--what if letters could be assigned to each parameter\slider or to a midi CC? We could say a letter (U for Unison width, for example), and then a number to move the slider. This almost already works--see below.

Absurd? No, no. For much editing, this arrangement would get rid of mouse clicking or searching a midi-cc interface for the correct knob. In other words, you wouldn't have to take your hands off of the piano keyboard while editing. Even now, it's already nice to just say "press N" to move to the next preset while playing, or "press C" to compare the edited preset with the original, without having to reach for the mouse. The other advantages include being able to just put a monitor behind the midi keyboard, so there is no need to swivel around to click on things, and giving the visually handicapped access to PianoTeq.

And the speech control program plays well with PianoTeq. No dropouts or other problems. It's accurate, too. Complex statements can confuse speech-to-text and voice control programs, but simple commands like "Down 20" work well.

It might be very, very simple to implement: just allow letters to be assigned to a midi cc or slider in the PT Help\"View and edit key mappings" dialog box. The Microsoft program does everything else. This already almost works: If you click on a slider to activate it, saying "Up" moves the slider one step. (Somehow the commands are reversed: "Up" moves the slider left, and "Down" moves it right.) "Down 20" moves the slider 20 steps, which is simpler and more accurate than trying to move 20 steps by hand. All that's needed is the ability to assign letters to sliders.
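
A rough sketch, in Python, of how little parsing the "parameter name, direction, number" idea would need. Everything here is made up for illustration: the letter assignments (U, H), the table name PARAMETER_KEYS, and the function parse_command. PianoTeq has no such mapping today, and this is not how the Microsoft program works internally.

```python
import re

# Hypothetical letter assignments; PianoTeq does not offer these today.
PARAMETER_KEYS = {
    "unison": "U",
    "hammer noise": "H",
}

# "<parameter name> up|down [steps]", e.g. "Unison Down 20".
COMMAND = re.compile(
    r"^(?P<name>[a-z ]+?)\s+(?P<direction>up|down)(?:\s+(?P<steps>\d+))?$"
)


def parse_command(spoken):
    """Return (key letter, direction, steps), or None if not understood."""
    match = COMMAND.match(spoken.strip().lower())
    if match is None:
        return None
    key = PARAMETER_KEYS.get(match.group("name"))
    if key is None:
        return None
    # A bare "Unison Up" counts as a single step.
    steps = int(match.group("steps") or 1)
    return key, match.group("direction"), steps


print(parse_command("Unison Down 20"))
print(parse_command("Hammer noise Up 2"))
```

The result could then be turned into "press the letter, then step the slider," which is the part that already works once a slider has been clicked on.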

(The Microsoft program also lets you say "Show numbers," which displays a number over each selectable feature\button, et al. Say a number and the program responds by clicking on the button. At present, in PianoTeq, this command only puts numbers on the various Windows buttons--Close, etc.)

Would I always want to have this turned on while playing? No. But for editing sounds, it could work better than midi cc's. (To start speech commands in Vista: Start\Control Panel\Speech Recognition. Takes a few minutes to set up, but works fine so far. Clearly, one should turn off the mic while singing or swearing.)

Last edited by Jake Johnson (29-10-2009 17:27)

Re: Voice control of PianoTeq parameters? Simple to implement?

(Edited to correct some errors and add other ways to edit using voice.)

Just found that:

1. Saying "Page-down" or "Page-up" moves the slider in larger increments, but again, in the opposite direction from what you might expect. You can also say "Page-down two times" to move in still larger increments. However, if the command would move the slider past the far left or right edge, the slider will not move.

2. If a given slider is selected (clicked on), saying "Tab" moves control to the next slider, which can then be moved with commands like "Down 10," and you can then move through the entire interface by saying "Tab." This procedure gets tricky, however, after going through each pane, since pressing Tab after the last slider turns the control over to closing or opening the pane shutter. (Say "Enter" to do this.) In other words, after moving the last slider in each pane, you must say or press Tab twice to move to the first slider in the next pane. This situation becomes more problematic after the Quadratic Effect slider. Saying\pressing Tab moves the focus onto the EQ\Velocity pane, and it takes many, many Tabs to reach the next slider--the Volume slider.

3. Once a slider is selected, say "Enter," and the slider's setting box opens, with the parameter's numeric setting selected. You can then say "Type 24," for example, to overwrite the setting with 24. (You may have to say "Type 2" and then "Type 4.") Then say "Enter." Then say "Cancel" to save the setting and close the box. (If you enter a setting that is outside the range of the slider, it will default to the original setting, so be careful, when testing this, to enter acceptable numbers.) Seems cumbersome? At first, the natural tendency is to pause after stating each command, but that's not needed. You can speak quickly, and you don't have to over-enunciate, to issue the commands. I'm getting better at it--it's less cumbersome than moving back and forth between the midi and computer keyboard.

4. If you open the Preset list, you can say "Press N" and "Press P" to go up and down the list of presets. (But be careful NOT to say "Down" or "Up" or "Page-down" or "Page-up"--these commands will affect the last slider you moved in the main interface, instead of moving through the presets.)
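
The steps above can be written out as a small Python sketch, just to make the sequences explicit. It only builds the list of spoken commands from step 3 and models the "won't move past the edge" behavior from step 1; it doesn't talk to PianoTeq or to the speech program. The function names and the 0-100 range used in the example are assumptions for illustration, and whole-number values are assumed.

```python
def value_entry_commands(value, tabs=0, per_digit=False):
    """Spoken commands to reach a slider and type a value into its box."""
    commands = ["Tab"] * tabs      # each "Tab" moves focus one control forward
    commands.append("Enter")       # open the selected slider's setting box
    text = str(value)
    if per_digit:
        # Fallback from step 3: say each digit separately.
        commands.extend(f"Type {ch}" for ch in text)
    else:
        commands.append(f"Type {text}")
    commands.append("Enter")       # confirm the typed value
    commands.append("Cancel")      # close the box, as described in step 3
    return commands


def move_slider(position, delta, lowest, highest):
    """Model of step 1: a move past either edge is ignored, not clamped."""
    target = position + delta
    if target < lowest or target > highest:
        return position
    return target


print(value_entry_commands(24, tabs=2, per_digit=True))
print(move_slider(90, 20, 0, 100))
```

Remember that the number of Tabs needed is not always one per slider, since the pane shutters and the EQ\Velocity pane take focus along the way.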

Last edited by Jake Johnson (27-10-2009 14:48)

Re: Voice control of PianoTeq parameters? Simple to implement?

After the overwhelming response to my previous two posts, I felt moved to investigate further. Microsoft offers a free macro\coding kit for the speech engine. You can't change the sound of the voice (which is much better than the old voices), but you can assign complex commands to words or phrases. Starting both the speech engine and the macro utility is almost instantaneous--you just double-click on the macro utility from the Windows Start menu, and both are loaded. The macro utility is stable, undemanding, and transparent, sitting on the Task Bar in case you need to add or edit macros. It must be loaded, however, to use spoken macros.

Several sites to be aware of:

An overview, which many people won't need, that walks you through what the macros can do and how to download and install the macros program for the speech control system:

http://www.vista4beginners.com/Enhance-...ing-Macros

-----------

The Microsoft download site for the macros utility:

http://www.microsoft.com/downloads/deta...laylang=en

------------------

The Microsoft wiki that explains all of the variables and controls, with brief example coding. Before getting too involved with the coding possibilities, review the simple key presses that can be assigned to spoken words using the various methods that pop up when you double-click on the Macro utility's icon in the Task Bar. A lot can be done very simply, such as assigning a word to run procedures. This feature could become very interesting if we could assign letters to midi cc's or to the sliders. We could say "Unison Down 20" and the slider would jump down 20 steps, and then "Hammer noise Up 2," etc. But for more advanced macros:

http://code.msdn.microsoft.com/wsrmacro...Title=Home


Appears to be more powerful than one might think. You can do silly things, such as have a program speak back to you when you speak to it (which might not be bad for a Help feature), or offer a suggestion when you take a step. It allows much more, such as opening a program, loading a given preset, and entering data. In other words, it's like any other high-level coding thing, but attached to voice control. Seems fairly efficient, and it offers some fun and unusual features. You can make pop-up boxes, in addition to spoken words, appear in response to spoken questions, or open a list of commands for the user to click on based on a question. (You can enter many variations of the question, so several ways of asking a question receive the same oral or written response.) I haven't gone very far into these commands. So far I've just set things up so that whenever I ask or tell the voice to say the word "dog," she responds by saying "I don't want to say 'dog.'" One good thing is that the macro doesn't run if I just use the command words out of context--if I use the word "dog" in a sentence, there is no response.
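
To make the "many phrasings, one response" idea concrete, here is a simplified Python model of it. This is not the Microsoft macro format and not how the speech engine decides what it heard; the table MACROS, the example phrasings, and the function names are all invented for illustration. The point is only that a macro fires when the whole utterance matches one of its phrasings, so the word "dog" inside an ordinary sentence does nothing.

```python
import string

# Hypothetical macro table: several phrasings share one response.
MACROS = {
    ("say dog", "say the word dog", "please say dog"): "I don't want to say 'dog.'",
}


def normalize(utterance):
    """Lowercase, strip punctuation, and collapse whitespace."""
    cleaned = utterance.lower().translate(
        str.maketrans("", "", string.punctuation)
    )
    return " ".join(cleaned.split())


def respond(utterance):
    """Return the macro's response, or None if no phrasing matches exactly."""
    heard = normalize(utterance)
    for phrasings, response in MACROS.items():
        if heard in phrasings:
            return response
    return None


print(respond("Please say dog."))
print(respond("I walked the dog today."))
```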

And here's a forum I ran across that seems fairly serious, though small:

Vista Speech forum:

http://www.knowbrainer.com/PubForum/ind...egoryId=13

Cheers. It appears that I found a way to waste a Sunday on all of this...I toyed with speech recognition when it first came out, and I'm surprised by how much better it's gotten. Text-to-speech still stumbles with odd pauses and changes in pitch, but the voice (named Anna) is less robotic, and controlling programs with speech has gotten much better. Not just more accurate: the interface is much more straightforward. Judging from the sites I've looked through today, the speech recognition engine in Vista was roundly praised. Apparently the program is almost exactly the same in Windows 7, but with a few improvements on the dictation-to-text side:

http://blogs.msdn.com/tsfaware/

Last edited by Jake Johnson (29-10-2009 17:25)