Characterising Performative Speech Modes via Analysis by Synthesis


Abstract

Technological advances in recent years have increased interest, in both Linguistics and Artificial Intelligence research, in performative speech and its acoustic and perceptual properties. However, this speech domain is abstract and highly nuanced, which makes it difficult to characterise within the already complicated realm of expressive speech theories and research.

This thesis explores the perception of performative speech in terms of the Bio-informational Dimensions (BIDs) theory, which posits that all expressive speech evolved from animal instincts to control listener behaviour through modulations along these dimensions. The dimensions of primary focus in this work were 1) Size Projection, the projected body size of the speaker, and 2) Dynamicity, the vigorousness of the speaker's speech stream. Also of interest was the parameter of vocal quality, which is often correlated with Size Projection but is difficult to quantify because of gaps between perceptual and productive descriptions.

The main body of this work comprises three experiments. The first was an initial experiment exploring performative speech in terms of the Bio-informational Dimensions, with stimuli resynthesised with different parameter settings to emulate these vocal modulations. The second experiment used a different pitch manipulation method that separates the manipulation of pitch median from that of pitch range, and its results were compared with those of the first in terms of listener perception. The final experiment explored how listeners would perceive dramatic speech when vocal quality was automatically incorporated along with the BIDs, and how this would differ between performative speaking modes.

The thesis concludes that Size Projection and Dynamicity varied in their respective impact on listener perception of performative speech, depending on the intonation of the base stimuli and the freedom listeners had for comparison. This variance reveals the high context dependence of the perception of performative speech, and of expressive speech in general, and the need for further exploration of these highly complex domains.

Date

2025-04-21

Advisors

Post, Brechtje
Knill, Kate

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge

Rights and licensing

Except where otherwise noted, this item's license is described as All rights reserved

Sponsorship

Swarthmore College Lockwood Fellowship

Relationships

Is supplemented by: