Towards Multimodal VR Trainer of Voice Emission and Public Speaking: Work-in-Progress
GlossoVR is a virtual reality (VR) application that combines public speaking training in front of a virtual audience with voice emission training in relaxation exercises. It is accompanied by digital signal processing (DSP) and artificial intelligence (AI) modules that provide automatic feedback on the user's vocal performance, behavior and psychophysiology. In particular, we address parameters of speech emotions, prosody and timbre, as well as the user's hand gestures and eye movement. The prototype is in the proof-of-concept phase, and we are developing it in accordance with the user-centered design paradigm. This article reports the work in progress, focusing on the approaches, datasets and algorithms applied in the current state of the GlossoVR project.