Research on Speech Signal Processing and Optimization Based on Compressed Sensing

Published: 2018-07-20 21:17

[Abstract]: Traditional Nyquist sampling theory requires a sampling frequency of at least twice the signal's maximum bandwidth. After acquisition, only a small part of the sampled information is kept and most of it is discarded, which places high demands on hardware and wastes resources. With the rapid development of information technology, acquisition based on the Nyquist sampling theorem is increasingly unable to meet the demand for efficient signal processing. Compressed sensing (CS) theory addresses this problem by performing sampling and compression at the same time. The theory states that, provided the signal is sparse, the measurement sequence is obtained as the product of the signal and a measurement matrix, and the high-dimensional original signal can be accurately recovered from a small number of measurements. Speech signals are highly compressible, so CS can be used to compress and reconstruct them. Unlike in the image field, research applying CS to speech signal processing is still scarce and in its infancy, both in China and abroad. This thesis studies CS theory and applies it to speech signal processing, describes how speech compression and reconstruction are carried out, introduces methods for evaluating reconstructed speech, and proposes several optimizations. The main work is as follows. First, the thesis presents CS theory in terms of its three main aspects: sparse representation of signals, construction of the measurement matrix, and reconstruction algorithms.
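The measurement model described above (measurements as the product of a measurement matrix and a sparse signal, followed by recovery from fewer measurements than signal samples) can be illustrated with a minimal sketch. It is not the thesis's implementation: the sizes, the random seed, the nonzero positions and the stopping rule are all illustrative assumptions, and the vector `x_true` stands in for the sparse coefficients of a speech frame in some transform domain. Recovery uses a plain orthogonal matching pursuit written with the standard library only.

```python
import math
import random


def gaussian_matrix(m, n, rng):
    """Return an m x n matrix with i.i.d. N(0, 1/m) entries."""
    sigma = 1.0 / math.sqrt(m)
    return [[rng.gauss(0.0, sigma) for _ in range(n)] for _ in range(m)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def least_squares(cols, y):
    """Solve min ||C c - y|| via the normal equations; C is given by its columns."""
    k = len(cols)
    g = [[dot(cols[i], cols[j]) for j in range(k)] for i in range(k)]  # C^T C
    b = [dot(cols[i], y) for i in range(k)]                            # C^T y
    for i in range(k):  # Gaussian elimination with partial pivoting
        p = max(range(i, k), key=lambda r: abs(g[r][i]))
        g[i], g[p] = g[p], g[i]
        b[i], b[p] = b[p], b[i]
        for r in range(i + 1, k):
            f = g[r][i] / g[i][i]
            for c in range(i, k):
                g[r][c] -= f * g[i][c]
            b[r] -= f * b[i]
    coef = [0.0] * k
    for i in reversed(range(k)):  # back substitution
        tail = sum(g[i][c] * coef[c] for c in range(i + 1, k))
        coef[i] = (b[i] - tail) / g[i][i]
    return coef


def omp(phi, y, max_atoms, tol=1e-8):
    """Greedy OMP: estimate a sparse x from y = phi x."""
    m, n = len(phi), len(phi[0])
    cols = [[phi[i][j] for i in range(m)] for j in range(n)]
    y_norm = math.sqrt(dot(y, y))
    residual, support, coef = list(y), [], []
    for _ in range(max_atoms):
        if math.sqrt(dot(residual, residual)) <= tol * y_norm:
            break  # measurements are explained; stop adding atoms
        # Select the unused column most correlated with the residual.
        best = max((j for j in range(n) if j not in support),
                   key=lambda j: abs(dot(cols[j], residual)))
        support.append(best)
        chosen = [cols[j] for j in support]
        coef = least_squares(chosen, y)  # refit on the whole support
        fit = [sum(c * col[i] for c, col in zip(coef, chosen)) for i in range(m)]
        residual = [yi - fi for yi, fi in zip(y, fit)]
    x_hat = [0.0] * n
    for j, c in zip(support, coef):
        x_hat[j] = c
    return x_hat


rng = random.Random(0)              # fixed seed, chosen only for repeatability
n, m = 64, 40                       # signal length, number of measurements
x_true = [0.0] * n
for index, value in [(5, 1.0), (20, -0.8), (41, 0.6)]:
    x_true[index] = value           # hypothetical 3-sparse coefficient vector
phi = gaussian_matrix(m, n, rng)    # Gaussian random measurement matrix
y = [dot(row, x_true) for row in phi]   # y = phi x
x_hat = omp(phi, y, max_atoms=m // 2)
max_error = max(abs(a - b) for a, b in zip(x_true, x_hat))
print(f"measurements: {m}/{n}, max abs error: {max_error:.2e}")
```

Here 40 measurements stand in for a 64-sample frame. Real speech frames are only approximately sparse, so the reconstruction would be approximate and its quality would depend on the compression ratio and frame length.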
Using read speech from male and female speakers in a professional speech corpus as test material, the thesis compares two common reconstruction algorithms, basis pursuit (BP) and orthogonal matching pursuit (OMP), on speech compression and reconstruction, and studies how compression ratio and frame length affect reconstruction quality. The experiments show that, for a given reconstruction algorithm, male speech is reconstructed better than female speech, and that, for a given utterance, BP reconstructs better than OMP. Second, the thesis analyzes and compares the performance of several common compressed sensing (CS) measurement matrices in speech compression and reconstruction, and offers recommendations for choosing a measurement matrix under different experimental conditions. The experiments show that compression ratio and frame length are the key factors in this choice: different compression ratios and frame lengths call for different measurement matrices to obtain the best reconstruction. Third, starting from sparse representation, the thesis introduces the tight-frame algorithm from redundant dictionaries, which gives a sparser representation of the signal, and compares the resulting reconstructed speech quality with that obtained using the Gaussian random matrix commonly used in CS. The experiments show that the tight-frame matrix outperforms the conventional Gaussian random matrix in speech reconstruction. Fourth, the thesis adds the absolute threshold of hearing from the psychoacoustic model to filter out signal components that the human ear cannot hear, which reduces the number of nonzero values, increases the sparsity of the signal, and thereby improves the quality of the reconstructed speech. The experiments show that adding the absolute threshold of hearing to conventional speech compressed sensing yields better reconstructed speech.
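The idea of using the absolute threshold of hearing to increase sparsity can be sketched as follows. The abstract does not say which threshold formula or calibration the thesis uses, so this sketch makes its own assumptions: it uses Terhardt's widely cited approximation of the threshold in quiet, treats coefficient `k` of a frame as a component at the bin-center frequency `(k + 0.5) * sample_rate / (2 * n)`, and assumes that an amplitude of 1.0 corresponds to 96 dB SPL (the hypothetical `full_scale_db` parameter). The coefficient values are made up for illustration.

```python
import math


def absolute_threshold_db(freq_hz):
    """Terhardt's approximation of the threshold in quiet, in dB SPL."""
    f = freq_hz / 1000.0  # frequency in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)


def prune_inaudible(coeffs, sample_rate, full_scale_db=96.0):
    """Zero the coefficients whose assumed level is below the threshold in quiet.

    full_scale_db is an assumed calibration: amplitude 1.0 maps to that many
    dB SPL. The bin-center frequency mapping is likewise an assumption.
    """
    n = len(coeffs)
    pruned = []
    for k, c in enumerate(coeffs):
        freq = (k + 0.5) * sample_rate / (2.0 * n)
        if c == 0.0:
            pruned.append(0.0)
            continue
        level_db = full_scale_db + 20.0 * math.log10(abs(c))
        pruned.append(c if level_db >= absolute_threshold_db(freq) else 0.0)
    return pruned


# Hypothetical transform coefficients of one frame sampled at 16 kHz.
frame = [0.5, 1e-5, 0.02, 2e-5, 0.0, 1e-4, 3e-5, 1e-5]
pruned = prune_inaudible(frame, sample_rate=16000)
nonzero_before = sum(1 for c in frame if c != 0.0)
nonzero_after = sum(1 for c in pruned if c != 0.0)
print(f"nonzero coefficients: {nonzero_before} -> {nonzero_after}")
```

Because the threshold depends on frequency, two coefficients of similar magnitude can be treated differently: under these assumptions the small component near 3.5 kHz, where hearing is most sensitive, is kept, while equally small components near 1.5 kHz and 7.5 kHz are zeroed.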
[Degree-granting institution]: Xihua University
[Degree]: Master's
[Year degree awarded]: 2017
[Classification number]: TN912.3

Article ID: 2134806


Article link: http://sikaile.net/kejilunwen/xinxigongchenglunwen/2134806.html
