abstract
-
There is growing concern that artificial intelligence (AI) conversational agents (e.g., Siri, Alexa) reinforce voice-based social stereotypes. Because little is known about social perceptions of conversational agents’ voices, we investigated the perceptual dimensions that underpin social perceptions of these synthetic voices and the role that acoustic parameters play in these perceptions. In Study 1 (N = 504), Principal Component Analysis of ratings of synthetic voices on a range of traits (trustworthiness, emotional stability, responsibility, sociability, caringness, attractiveness, intelligence, confidence, weirdness, unhappiness, meanness, aggressiveness, dominance, competence, age, masculinity, femininity) suggested that social perceptions of synthetic voices are underpinned by Valence and Dominance components that are highly similar to those previously reported for natural human stimuli. Study 1 also found that scores on the Dominance component were strongly and negatively related to voice pitch. Study 2 (N = 160) found that experimentally manipulating pitch in synthetic voices directly influenced perceptions of their dominance and aggressiveness, but not their competence or trustworthiness. Collectively, these results suggest that, when designing conversational agents, giving greater consideration to the role that voice pitch plays in dominance-related social perceptions will be effective in controlling stereotypic perceptions of their voices and the downstream consequences of those perceptions.

This research was supported by the EPSRC grant ‘Designing Conversational Assistants to Reduce Gender Bias’ (EP/T023783/1), awarded to BCJ. Daria Altenburg was supported by Grant BOF.24Y.2019.0006.01 of Ghent University, awarded to Adriaan Spruyt.