Research on Word-Lattice-Based Spoken Document Retrieval

Published: 2018-04-22 11:28

Topic: spoken document retrieval + word lattice; Source: PLA Information Engineering University, master's thesis, 2012

[Abstract]: Spoken document retrieval is the process of searching a large speech collection and returning the spoken documents or speech segments relevant to a user's query. It has important applications in information security, voice search engines, and the classification and management of speech resources. In recent years, lattice-based retrieval has rapidly become the mainstream approach to spoken document retrieval and has attracted growing attention. However, while the structure of a lattice retains more correct recognition hypotheses, it also introduces new problems and challenges. Targeting the characteristics of Chinese lattices, this thesis studies lattice structure improvement, the selection of optimal recognition and retrieval units, and the re-ranking of relevant documents, with the aim of speeding up retrieval and improving its accuracy. The main work falls into three parts.

(1) Conventional lattice generation ignores phonetic knowledge such as phonological attributes. To address this, a lattice improvement method that incorporates phonological attributes is proposed. Because lattices from different sources carry complementary information, the method first builds a lattice with a speech recognition system based on phonological attribute detection, and then fuses it with the lattice produced by a conventional automatic speech recognition system. Since fusion enlarges the lattice, a position-based segment alignment method is used to compress it, yielding a compact lattice that incorporates phonological attributes. Experiments show that the improved lattice contains more correct recognition results: index coverage rises from 77.83% to 80.34%, the lattice error rate falls from 25.31% to 19.66%, and retrieval performance improves as well.

(2) In Chinese spoken document retrieval, the optimal recognition unit and the optimal retrieval unit are not the same. To address this, an indexing method based on a sub-word PSPL (Position Specific Posterior Lattice) is proposed. The method first decodes the spoken documents with words as the recognition unit to obtain a PSPL. It then segments the PSPL entries into sub-words and, using the posterior probability relationship between each sub-word arc and its original word arc, converts the PSPL into a corresponding sub-word PSPL. Finally, the sub-word PSPL is used as the index for query retrieval, so that words serve as the recognition unit and sub-words as the retrieval unit. Experiments show that the method exploits rich language information while largely resolving the incorrect boundary segmentation of word decoders, and that its retrieval performance is clearly better than that of the commonly used PSPL index in which both the recognition unit and the retrieval unit are words.

(3) Relevant documents in the retrieval results are often ranked inaccurately. To address this, a re-ranking method based on acoustic feature similarity is proposed. The method improves the retrieval system with pseudo-relevance feedback. First, the top N spoken documents by relevance score in the first-pass results are taken as the pseudo-relevant set. Next, for each retrieved document, the acoustic feature similarity between that document and the pseudo-relevant set is computed over the time spans in which the query term occurs. Finally, the original relevance score and the acoustic feature similarity are fused into a new relevance score, and the results are re-ranked by that score. Experiments show that R-precision rises from 69.07% to 75.82% after re-ranking, and that retrieval performance improves further as the number of iterations increases.
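The re-ranking step in part (3) can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not say how acoustic similarity is measured or how the two scores are fused, so the sketch assumes each retrieved document is summarized by one fixed-length feature vector for the query-term time span, uses cosine similarity, and fuses scores by linear interpolation with a hypothetical weight `alpha`. The names `rerank`, `r_precision`, `top_n`, and `alpha` are illustrative only.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rerank(results, features, top_n=2, alpha=0.5):
    """Re-rank first-pass results with pseudo-relevance feedback.

    results  : list of (doc_id, relevance_score) from the first retrieval pass
    features : dict doc_id -> feature vector (assumed representation of the
               acoustics where the query term occurs)
    top_n    : size of the pseudo-relevant set
    alpha    : assumed linear-interpolation weight for the acoustic similarity
    """
    ranked = sorted(results, key=lambda r: r[1], reverse=True)
    pseudo_relevant = [doc for doc, _ in ranked[:top_n]]

    rescored = []
    for doc, relevance in ranked:
        # Exclude the document itself so a pseudo-relevant document does not
        # get a trivial similarity of 1.0 with itself.
        others = [p for p in pseudo_relevant if p != doc]
        if others:
            similarity = sum(cosine(features[doc], features[p]) for p in others) / len(others)
        else:
            similarity = 0.0
        rescored.append((doc, (1 - alpha) * relevance + alpha * similarity))
    return sorted(rescored, key=lambda r: r[1], reverse=True)


def r_precision(ranked_ids, relevant):
    """Fraction of relevant documents among the top R results, R = len(relevant)."""
    r = len(relevant)
    if r == 0:
        return 0.0
    return sum(1 for doc in ranked_ids[:r] if doc in relevant) / r


# Toy data invented for illustration; not taken from the thesis.
first_pass = [("a", 0.9), ("c", 0.8), ("b", 0.75), ("d", 0.7)]
toy_features = {"a": [1.0, 0.0], "c": [0.9, 0.1], "b": [0.0, 1.0], "d": [0.8, 0.2]}
truly_relevant = {"a", "c", "d"}

before = [doc for doc, _ in first_pass]
after = [doc for doc, _ in rerank(first_pass, toy_features)]
print(before, r_precision(before, truly_relevant))
print(after, r_precision(after, truly_relevant))
```

In this toy case the false alarm `b` is acoustically unlike the pseudo-relevant set and drops below `d`, so R-precision goes from 2/3 to 1.0. Repeating the call on its own output corresponds to the iterative use mentioned in the abstract, under the same assumptions.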
[Degree-granting institution]: PLA Information Engineering University
[Degree]: Master's
[Year conferred]: 2012
[Classification number]: TN912.34

Article ID: 1786978

Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/1786978.html
