Voice and Speaker Identification: Advances in Forensic Phonetics and Neural Biometrics

Book Title: Computational Criminology: AI Applications in Forensic Science and Criminal Justice

Editors: Dr. Xavier Louis, Dr. Surbhi Girdhar, Ms. Aswathi Chandran Nair, Mr. Ravi Kumar, and Ms. Nandini Katare

Chapter: 17

DOI: https://doi.org/10.59646/704/17

Author: Sherin Shaji

Abstract

The human voice is a rich biometric signal that encodes not only linguistic content but also distinctive physiological and behavioural characteristics (vocal tract morphology, fundamental frequency, formant structure, rhythm, and prosody) that render each individual’s speech acoustically unique. Forensic speaker identification (FSI), the discipline concerned with determining whether a questioned voice sample originates from a particular suspect, has been fundamentally transformed by deep learning-based automatic speaker verification systems. This chapter traces the evolution of FSI from spectrographic analysis and auditory-phonetic comparison, through statistical modelling approaches (GMM-UBM, i-vectors), to the contemporary x-vector and ECAPA-TDNN architectures that represent the state of the art. It examines the forensic-specific challenges of short-duration samples, channel mismatch, disguise, and language variation; evaluates validation frameworks, including the Likelihood Ratio approach; and addresses the admissibility, ethical, and civil liberties dimensions of voice biometric evidence.