Title
On robustness of speech based biometric systems against voice conversion attack
Abstract
Highlights:
Evaluation of the robustness of SID and SV systems against VC spoofing attacks.
Ranked by decreasing system vulnerability, the VC techniques are GMM, WFW and WFW-.
Among SV systems, GMM-SVM is more resilient than GMM-UBM to VC impostor attacks.
All systems are more robust to cross-gender than to intra-gender converted voices.
An approach for relating the VC score to SV performance is proposed.

Voice conversion (VC), which morphs the voice of a source speaker so that it is perceived as spoken by a specified target speaker, can be used intentionally to deceive speaker identification (SID) and speaker verification (SV) systems that rely on speech biometrics. Voice conversion spoofing attacks that imitate a particular speaker pose a potential threat to such systems. In this paper, we first present an experimental study evaluating the robustness of these systems against voice conversion disguise. We use Gaussian mixture model (GMM) based SID systems, GMM with universal background model (GMM-UBM) based SV systems, and GMM supervector with support vector machine (GMM-SVM) based SV systems. Voice conversion is performed with three techniques: GMM-based VC, weighted frequency warping (WFW) based conversion, and a variant of WFW in which energy correction is disabled (WFW-). Evaluation uses intra-gender and cross-gender voice conversions between fifty male and fifty female speakers taken from the TIMIT database. The effect of an attack is measured as degradation in the percentage of correct identification (POC) for SID systems and degradation in equal error rate (EER) for all SV systems. Experimental results show that GMM-SVM SV systems are more resilient to voice conversion spoofing attacks than GMM-UBM SV systems, and that all SID and SV systems are more vulnerable to GMM-based conversion than to WFW and WFW- based conversion. The results also indicate that, in general, all SID and SV systems are slightly more robust to voices produced by cross-gender conversion than by intra-gender conversion. We further extend the study to examine the relationship between the VC objective score and SV system performance on the CMU ARCTIC database, which is a parallel corpus. The results of this experiment suggest an approach for quantifying a voice conversion objective score that can be related to the ability to spoof an SV system.
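The two metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the threshold sweep, and the toy scores are all assumptions made for illustration. EER is approximated as the operating point where the false acceptance rate and false rejection rate are closest, and POC as the share of test utterances assigned to the correct speaker.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER by sweeping thresholds over the observed scores.

    A trial is accepted when its score is >= the threshold. The EER is
    taken at the threshold where the false acceptance rate (FAR) and the
    false rejection rate (FRR) are closest, as the mean of the two.
    """
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    best_gap, best_eer = None, None
    for t in thresholds:
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


def percent_correct(true_ids, predicted_ids):
    """Percentage of correct identification (POC) for an SID system."""
    correct = sum(t == p for t, p in zip(true_ids, predicted_ids))
    return 100.0 * correct / len(true_ids)


# Toy scores, invented for illustration only (not results from the paper).
genuine = [0.9, 0.8, 0.7, 0.6]            # true-speaker trials
zero_effort = [0.1, 0.2, 0.3, 0.65]       # ordinary impostor trials
converted = [0.5, 0.75, 0.85, 0.65]       # hypothetical VC-spoofed trials

baseline_eer = equal_error_rate(genuine, zero_effort)
spoofed_eer = equal_error_rate(genuine, converted)
poc = percent_correct(["a", "b", "c", "d"], ["a", "b", "x", "d"])
```

In this toy setup the converted-voice impostor scores sit closer to the genuine scores, so the EER rises relative to the baseline. That rise is the kind of degradation the abstract uses to measure an attack.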
Year: 2015
DOI: 10.1016/j.asoc.2015.01.036
Venue: Appl. Soft Comput.
Keywords: gmm, wfw, svm, speaker identification, voice conversion, speaker verification
Field: Speaker verification, Speaker identification, Spoofing attack, Support vector machine, Word error rate, Speech recognition, Robustness (computer science), Biometrics, Mixture model, Mathematics
DocType: Journal
Volume: 30
Issue: C
ISSN: 1568-4946
Citations: 7
PageRank: 0.44
References: 38
Authors: 2
Name            Order  Citations  PageRank
Monisankha Pal  1      25         2.41
Goutam Saha     2      255        23.17