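Research and Implementation of an FPGA-Based Real-Time Fixed Speech Recognition System
Published: 2018-01-24 12:41
Keywords: fixed speech recognition; real-time; multiple processing units; parallel processing; AXI4 bus; FPGA. Source: PLA Information Engineering University, master's thesis, 2013. Document type: degree thesis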
[Abstract]: Fixed speech recognition identifies, within a dynamic speech stream, the segments that are identical or nearly identical to template utterances in a given template library. As a branch of fixed audio retrieval, it can be applied to spam-voice identification in telecom networks, advertisement broadcast monitoring, copyright management, and similar fields. As the number of speech templates keeps growing, however, fixed speech recognition systems face a real-time performance challenge. Real-time performance reflects the system's capacity for multi-channel real-time processing and is the key problem that must be solved before such systems become practical.

The thesis topic derives from a key project of the National 863 Program. Considering the telecom network's current demand for multi-channel real-time speech processing, it studies real-time fixed speech recognition with a large-capacity speech template library (8000 templates) and achieves the following results:

1. A Multi-PE Parallel Architecture (MPPA) based on a superscalar architecture is proposed. The architecture exploits the parallelism of the fixed speech recognition algorithm at two levels. Inside each processing element (PE), a superword-parallel structure is used: because speech signals are processed frame by frame, the maximum degree of superword parallelism is 8, and an ultra-wide processing unit with a 256-bit data path is designed accordingly. Across the parallel PEs, a superscalar organization is used: each PE can complete a matching task independently, so the architecture scales with the required processing performance.

2. For the MPPA architecture, a data storage and scheduling mechanism is proposed that keeps every PE working on a "best-effort" basis at all times. For template storage, a hybrid structure combining centralized shared storage with distributed storage is studied, and a two-level storage mechanism for template data together with a double buffer (DB) structure inside each PE is designed, yielding an efficient data storage hierarchy. For data scheduling, two template distribution strategies are studied: polling (round-robin) distribution and first-request-first-dispatch distribution.

3. Using the FPGA's embedded hard multipliers directly to square wide operands consumes too many IP core resources. To address this, an efficient squaring method is proposed. It first reduces the operand bit width with a simple logic circuit that can be applied iteratively, and then completes the squaring with the embedded DSP48E, which effectively reduces the number of DSP48E slices used.

4. With the MicroBlaze processor embedded in a Xilinx FPGA as the main processor and an MPPA coprocessor built on the MPPA architecture, a fixed speech recognition SOPC system is implemented on an FPGA platform. Control and data exchange between the main processor and the coprocessor use a shared RAM interface based on the AXI4 bus protocol. Performance tests and resource analysis show that, with 8192 templates, the system handles 22 channels of fixed speech recognition in real time, achieving multi-channel real-time speech processing with a large-capacity template library.
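The two template distribution strategies named in item 2 can be contrasted with a small simulation. This is a minimal sketch and not the thesis's implementation: the abstract does not describe the dispatcher's internals, so the sketch assumes that "polling" means a fixed cyclic assignment of templates to PEs, that "first request first dispatch" means the next template goes to whichever PE becomes free first, and that per-template matching cost varies. The function names and the cost values are hypothetical.

```python
import heapq


def round_robin_makespan(costs, num_pes):
    """Cyclic (polling) dispatch: template i always goes to PE i % num_pes.

    Returns the time at which the last PE finishes (hypothetical time units).
    """
    busy_until = [0] * num_pes
    for i, cost in enumerate(costs):
        busy_until[i % num_pes] += cost
    return max(busy_until)


def first_request_makespan(costs, num_pes):
    """Request-driven dispatch: the next template goes to the PE that frees up first.

    A min-heap of (time_when_free, pe_index) models PEs requesting new work
    as soon as they finish their current template.
    """
    heap = [(0, pe) for pe in range(num_pes)]
    heapq.heapify(heap)
    for cost in costs:
        free_at, pe = heapq.heappop(heap)
        heapq.heappush(heap, (free_at + cost, pe))
    return max(free_at for free_at, _ in heap)


if __name__ == "__main__":
    # Hypothetical matching costs: long and short templates alternate.
    costs = [8, 1, 8, 1, 8, 1, 8, 1]
    print("round robin:", round_robin_makespan(costs, num_pes=2))
    print("first request:", first_request_makespan(costs, num_pes=2))
```

With these assumed costs on two PEs, the cyclic policy finishes at time 32 because one PE receives every long template, while the request-driven policy finishes at time 18, which equals the total work divided by the number of PEs. When all templates cost the same, the two policies give the same finishing time.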
[Degree-granting institution]: PLA Information Engineering University
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TN912.34