Ouisper will use signal processing algorithms and machine learning techniques to associate real-time vocal tract images with acoustic features, which can then be used to reconstruct a speech signal.
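As a rough illustration of what associating images with acoustic features can look like, the sketch below compresses each image frame with PCA and then fits a ridge regression from the compressed image features to an acoustic feature vector. This is only one possible approach under stated assumptions, not a description of the methods Ouisper actually uses: PCA and ridge regression are stand-ins chosen for simplicity, all function names are hypothetical, and the data are synthetic random numbers rather than real ultrasound frames or speech features.

```python
import numpy as np


def fit_pca(frames, n_components):
    """Fit PCA on flattened frames of shape (n_frames, n_pixels)."""
    mean = frames.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(frames - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(frames, mean, basis):
    """Project frames onto the PCA basis to get low-dimensional features."""
    return (frames - mean) @ basis.T


def fit_ridge(x, y, alpha=1e-3):
    """Closed-form ridge regression with an unpenalized bias term."""
    xb = np.hstack([x, np.ones((x.shape[0], 1))])
    reg = alpha * np.eye(xb.shape[1])
    reg[-1, -1] = 0.0  # do not shrink the bias
    return np.linalg.solve(xb.T @ xb + reg, xb.T @ y)


def predict(x, weights):
    """Apply the fitted linear map (including bias) to new features."""
    xb = np.hstack([x, np.ones((x.shape[0], 1))])
    return xb @ weights


# Synthetic stand-in data (assumption: NOT real ultrasound or speech data).
# A few hidden "articulator" variables drive both the image and the acoustics.
rng = np.random.default_rng(0)
n_train, n_test, n_pixels, n_latent, n_acoustic = 200, 50, 256, 4, 12
latent = rng.normal(size=(n_train + n_test, n_latent))
frames = latent @ rng.normal(size=(n_latent, n_pixels))
frames += 0.01 * rng.normal(size=frames.shape)  # small pixel noise
acoustic = latent @ rng.normal(size=(n_latent, n_acoustic)) + 0.5

# Learn the image-to-acoustic association on the training frames only.
mean, basis = fit_pca(frames[:n_train], n_components=n_latent)
weights = fit_ridge(project(frames[:n_train], mean, basis), acoustic[:n_train])

# Predict acoustic features for held-out frames and measure the error.
pred = predict(project(frames[n_train:], mean, basis), weights)
mse = float(np.mean((pred - acoustic[n_train:]) ** 2))
print(f"held-out mean squared error: {mse:.2e}")
```

In a real system the predicted acoustic features would then be passed to a synthesis stage to reconstruct the speech signal; that stage is not shown here.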
Vocal tract images are taken using an ultrasound machine. The University of Maryland HATS system was employed to immobilize the speaker’s head and support the transducer beneath the chin. An example image, showing the tongue contour (tongue tip to the right) with an embedded lip-profile image, is shown below.
Ultimately, a lighter, wearable system is envisaged.