Title
Spoofing detection employing infinite impulse response - constant Q transform-based feature representations.
Abstract
Speaker recognition researchers acknowledge that systems which aim to verify speakers automatically based on their pronunciation of an utterance are vulnerable to spoofing attacks using voice conversion and speech synthesis technologies. The first automatic speaker verification spoofing and countermeasures challenge (ASVspoof2015) was designed to stimulate interest in this problem among the speaker recognition communities. In the course of the challenge and subsequently, it became clear that the most effective countermeasures against spoofing attacks are low-level acoustic features (typically extracted at 10 ms intervals) designed to detect artifacts in synthetic or voice-converted speech. In this work, we demonstrate the effectiveness of infinite impulse response constant Q transform (IIR-CQT) spectrum-based cepstral coefficients (ICQC) as an anti-spoofing front end. The IIR-CQT spectrum is estimated by filtering the multi-resolution fast Fourier transform with an infinite impulse response filter. These features can be used on their own with a standard Gaussian mixture model back end to detect spoofing attacks, or they can be used in tandem with bottleneck features, which are extracted from a bottleneck layer in a deep neural network designed to discriminate between synthetic and natural speech. We show that the ICQC features are capable of producing very low equal error rates on the individual spoofing attacks in the ASVspoof2015 data set (0.02% on the known attacks, 0.23% on the unknown attacks, and 0.13% on average). Moreover, with a single decision threshold (common to all of the attacks), the ICQC front end yielded an equal error rate of 0.20%.
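The abstract reports equal error rates both per attack and with a single threshold common to all attacks. The following is a minimal sketch of how an equal error rate can be computed from detection scores, assuming that higher scores mean "more likely natural speech". The function name `equal_error_rate`, the threshold sweep over observed scores, and the toy scores are illustrative assumptions, not the authors' evaluation code.

```python
def equal_error_rate(genuine_scores, spoof_scores):
    """Return (eer, threshold) for a spoofing detector.

    Assumption: a trial is accepted as natural speech when score >= threshold.
    The EER is approximated at the observed score where the false acceptance
    rate (spoofed trials accepted) and the false rejection rate (natural
    trials rejected) are closest, and is reported as their mean.
    """
    if not genuine_scores or not spoof_scores:
        raise ValueError("both score lists must be non-empty")

    best_gap = None
    eer = None
    threshold = None
    # Sweep every observed score as a candidate decision threshold.
    for t in sorted(set(genuine_scores) | set(spoof_scores)):
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        frr = sum(g < t for g in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer, threshold = gap, (far + frr) / 2, t
    return eer, threshold


# Hypothetical toy scores for two attacks (not data from the paper).
genuine = [2.0, 3.0, 4.0, 5.0]
attack_a = [-1.0, 0.0, 1.0, 2.5]
attack_b = [-3.0, -2.0, 0.5, 1.5]

# Per-attack EERs use a separate threshold for each attack.
per_attack = [equal_error_rate(genuine, a)[0] for a in (attack_a, attack_b)]
average_eer = sum(per_attack) / len(per_attack)

# Pooled EER uses one threshold common to all attacks.
pooled_eer, pooled_threshold = equal_error_rate(genuine, attack_a + attack_b)

print("average per-attack EER:", average_eer)
print("pooled EER:", pooled_eer, "at threshold", pooled_threshold)
```

Averaging per-attack values lets each attack use its own best threshold, whereas pooling forces one threshold on all spoofed trials, so the two summaries generally differ.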
Year: 2017
Venue: European Signal Processing Conference
Keywords: spoofing detection, ASVspoof2015, GMM, bottleneck features, ICQC
Field: Constant Q transform, Mel-frequency cepstrum, Speech synthesis, Pattern recognition, Spoofing attack, Computer science, Word error rate, Infinite impulse response, Filter (signal processing), Speech recognition, Speaker recognition, Artificial intelligence
DocType: Conference
ISSN: 2076-1465
Citations: 2
PageRank: 0.35
References: 10
Authors (2)
Name: Jahangir Alam; Order: 1; Citations: 320; PageRank: 38.69
Name: Patrick Kenny; Order: 2; Citations: 2700; PageRank: 214.80