Rerecorded Speech Detection Algorithm
Published: 2018-08-08 12:23
[Abstract]: An impostor can deceive a speaker recognition system by playing back a rerecording of a legitimate user's voice and thereby gain access to the system, which poses a threat to public security. Detecting rerecorded speech is therefore an urgent practical problem, yet published research on it remains scarce. This paper proposes a detection algorithm for rerecorded speech. The algorithm uses statistics of MFCC (Mel-Frequency Cepstral Coefficients) as features for two classifiers, SVM (Support Vector Machine) and KNN (K-Nearest Neighbors); in addition to these two classifiers, the paper also examines the detection performance of an SAE (Sparse Autoencoder). To simulate realistic rerecording scenarios, the experiments test the algorithm across different recording devices, recording distances, and recording environments. The results show that training on a more diverse set of rerecorded speech raises the algorithm's accuracy to 99.67%, indicating good detection performance.
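The pipeline named in the abstract (MFCC statistics fed to a classifier) can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: the abstract does not say which statistics are used, so the per-coefficient mean and standard deviation are assumed here; the frame and filterbank settings are common defaults, not values from the paper; only KNN is shown (the SVM and SAE variants are omitted); and the "rerecorded" signals are a toy stand-in (white noise passed through a moving-average filter), not real replayed speech. All function names are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        lo, mid, hi = bins[i], bins[i + 1], bins[i + 2]
        for k in range(lo, mid):
            fb[i, k] = (k - lo) / max(mid - lo, 1)
        for k in range(mid, hi):
            fb[i, k] = (hi - k) / max(hi - mid, 1)
    return fb

def mfcc(signal, sr=16000, frame_len=400, hop=160, n_fft=512,
         n_filters=26, n_ceps=13):
    # Assumed settings: 25 ms frames, 10 ms hop, 26 mel bands, 13 coefficients.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    log_energy = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    # DCT-II of the log mel energies gives the cepstral coefficients.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_ceps)[:, None] * (2 * n + 1)
                   / (2 * n_filters))
    return log_energy @ basis.T  # shape: (n_frames, n_ceps)

def mfcc_statistics(signal, sr=16000):
    # Assumption: mean and standard deviation of each coefficient over frames.
    c = mfcc(signal, sr)
    return np.concatenate([c.mean(axis=0), c.std(axis=0)])

def knn_predict(train_x, train_y, x, k=3):
    # Majority vote among the k nearest training vectors (Euclidean distance).
    dist = np.linalg.norm(train_x - x, axis=1)
    nearest = train_y[np.argsort(dist)[:k]]
    return int(np.bincount(nearest).argmax())

def toy_dataset(n_per_class, rng):
    # Label 0: synthetic "original"; label 1: low-pass filtered copy that
    # crudely mimics a playback/recording channel. Not real speech data.
    xs, ys = [], []
    for _ in range(n_per_class):
        clean = rng.standard_normal(16000)
        replayed = np.convolve(clean, np.ones(5) / 5.0, mode="same")
        xs += [mfcc_statistics(clean), mfcc_statistics(replayed)]
        ys += [0, 1]
    return np.array(xs), np.array(ys)

rng = np.random.default_rng(0)
train_x, train_y = toy_dataset(5, rng)
test_x, test_y = toy_dataset(3, rng)
pred = np.array([knn_predict(train_x, train_y, x) for x in test_x])
print("toy accuracy:", (pred == test_y).mean())
```

The toy accuracy printed here says nothing about the 99.67% figure reported in the abstract, which was obtained on real recordings across varied devices, distances, and environments.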
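[Affiliations]: School of Information Engineering, Wuyi University; School of Electronics and Information, Guangdong Polytechnic Normal University
[Funding]: National Natural Science Foundation of China (61672173, 61372193, 61072127); National Natural Science Foundation of China, Young Scientists Fund (61100168); Natural Science Foundation of Guangdong Province (S2013010013311, 2014A030313623); Guangdong Province Characteristic Innovation Project for Regular Universities (2015KTSCX083)
[Classification number]: TN912.3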
Article ID: 2171778
Article link: http://sikaile.net/kejilunwen/xinxigongchenglunwen/2171778.html