Preprint

Reverse-Engineering Speech and Music Categorization from a Single Sound Source

Abstract

Classifying whether an auditory signal is music or speech is important for both humans and computational systems. Although previous literature suggests that music and speech are easily separable categories, common experimental approaches may bias findings toward this distinction by relying on stimuli from different sound sources and predefined response labels. Here, we use stimulus material from the dùndún drum, a speech surrogate that can signal either speech-related or musical content. We first replicate standard speech-music categorization results (N=108). Then, we depart from the typical experimental procedure by asking new participants (N=180) to sort and label the stimulus material, without predefined categories. Hierarchical clustering of participants’ stimulus groupings reveals multiple organizing dimensions, with the speech–music distinction reliably present but secondary under label-free conditions. By reverse-engineering the relationship between sorting behavior, acoustic features, and semantic labels, we characterize how speech–music categorization relates to other salient perceptual dimensions and how its behavioral prominence depends on task constraints.
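To illustrate the general idea behind clustering free-sort data, the sketch below shows one common way such groupings can be converted into a stimulus-by-stimulus co-occurrence matrix and clustered hierarchically. It is a minimal, hypothetical example: the stimulus names, the toy participant sorts, the distance definition, and the choice of average linkage are assumptions for illustration, not the authors' actual analysis pipeline.

```python
# Minimal sketch: hierarchical clustering of stimuli from free-sort data.
# Each participant's sort is a dict mapping stimulus -> group label.
# Stimuli that are frequently grouped together end up close in the dendrogram.
# Stimulus names, toy data, the distance measure, and average linkage are
# illustrative assumptions, not the preprint's reported method.
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram
from scipy.spatial.distance import squareform

stimuli = ["drum_speech_1", "drum_speech_2", "drum_music_1", "drum_music_2"]
sorts = [  # one dict per (hypothetical) participant
    {"drum_speech_1": "A", "drum_speech_2": "A", "drum_music_1": "B", "drum_music_2": "B"},
    {"drum_speech_1": "x", "drum_speech_2": "y", "drum_music_1": "y", "drum_music_2": "y"},
]

n = len(stimuli)
co = np.zeros((n, n))  # co-occurrence counts: how often two stimuli share a group
for sort in sorts:
    for i in range(n):
        for j in range(n):
            if sort[stimuli[i]] == sort[stimuli[j]]:
                co[i, j] += 1

dist = 1.0 - co / len(sorts)   # co-sorting proportion converted to a distance
np.fill_diagonal(dist, 0.0)    # distance of a stimulus to itself is zero
Z = linkage(squareform(dist), method="average")   # agglomerative clustering
tree = dendrogram(Z, labels=stimuli, no_plot=True)  # set no_plot=False to draw
```

With real sorting data, the resulting dendrogram can then be cut at different heights to inspect which organizing dimensions (e.g., a speech-music split) emerge at which level of the hierarchy.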

Authors

Fink LK; Hörster M; Poeppel D; Wald-Fuhrmann M; Larrouy-Maestri P

Publication date

September 22, 2025

DOI

10.31234/osf.io/2635u_v2

Preprint server

PsyArXiv
