Research and Implementation of an FPGA-Based Real-Time Fixed Speech Recognition System
Published: 2018-01-24 12:41
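Keywords: fixed speech recognition; real-time; multiple processing units; parallel processing; AXI4 bus; FPGA. Source: PLA Information Engineering University, master's thesis, 2013. Document type: degree thesis.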
[Abstract]: Fixed speech recognition means identifying, within a live speech stream, the segments that are identical or nearly identical to template utterances in a given template library. As a branch of fixed audio retrieval, it can be applied to spam-voice detection in telecom networks, advertisement broadcast monitoring, copyright management, and similar fields. As the number of speech templates keeps growing, however, fixed speech recognition systems struggle to remain real-time. Real-time performance here reflects how many channels the system can process simultaneously in real time, and it is a key problem that must be solved before such systems can be put into practical use.
The thesis topic comes from a key project of the National 863 Program. Motivated by the telecom network's demand for multi-channel real-time speech processing, it studies real-time fixed speech recognition with a large speech template library (8000 templates) and reports the following results:
1. A Multi-PE Parallel Architecture (MPPA) based on the superscalar architecture is proposed. It exploits the parallelism of the fixed speech recognition algorithm at two levels. Inside each processing element (PE), a superword parallel structure is used; because speech is processed frame by frame, the maximum degree of superword parallelism is 8, so a wide PE with a 256-bit datapath is designed. Across PEs, a superscalar organization is used: each PE can complete a matching task independently, and the number of PEs can be scaled to meet the required processing performance.
2. For the MPPA architecture, a data storage and scheduling mechanism is proposed that keeps every PE working at full effort ("best effort") at all times. For template storage, a hybrid structure combining centralized shared storage with distributed storage is studied, and a two-level storage mechanism for template data together with a double buffer (DB) inside each PE is designed, yielding an efficient storage hierarchy. For data scheduling, two template distribution strategies are studied: round-robin (polling) distribution and first-request-first-served distribution.
3. Computing wide squares directly with the hard multipliers embedded in the FPGA consumes too many IP core resources. An efficient squaring method is proposed to address this: a simple logic circuit that can be applied iteratively first reduces the bit width of the operand, and the embedded DSP48E then completes the square. This effectively reduces the number of DSP48E slices used.
4. A fixed speech recognition SOPC system is implemented on an FPGA platform, with the MicroBlaze processor embedded in a Xilinx FPGA as the main processor and an MPPA coprocessor built on the MPPA architecture. Control and data exchange between the main processor and the coprocessor go through a shared RAM interface based on the AXI4 bus protocol. Performance tests and resource analysis show that, with 8192 templates, the system handles 22 channels of fixed speech recognition in real time, achieving multi-channel real-time speech processing with a large template library.
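The abstract does not describe the bit-width-reduction circuit used by the squaring method. As a rough illustration only, the Python model below shows one identity that matches the description (iterable shift-and-add logic followed by a narrow hardware multiply): each pass strips the leading 1 bit of the operand using (2^k + r)^2 = 2^(2k) + 2^(k+1)·r + r^2, and the remaining narrow square stands in for the DSP48E. The function names, the identity chosen, and the 17-bit default threshold (picked so that an unsigned operand fits the 18-bit signed input of the DSP48E's 25 x 18 multiplier) are assumptions made for this sketch, not the author's design.

```python
def narrow_square(x: int, width: int) -> int:
    """Stand-in for the hard multiplier: squares an operand that is
    guaranteed to fit in `width` bits (hypothetical helper)."""
    if x.bit_length() > width:
        raise ValueError("operand too wide for the narrow multiplier")
    return x * x


def square_by_msb_stripping(x: int, dsp_width: int = 17) -> int:
    """Square a non-negative integer using only shifts and adds until the
    remaining operand fits in `dsp_width` bits, then one narrow square.

    Assumed identity, applied once per pass with x = 2^k + r:
        x^2 = 2^(2k) + 2^(k+1) * r + r^2
    """
    if x < 0:
        raise ValueError("x must be non-negative")
    if dsp_width < 1:
        raise ValueError("dsp_width must be at least 1")

    acc = 0  # accumulates the shift-and-add terms from every pass
    while x.bit_length() > dsp_width:
        k = x.bit_length() - 1      # position of the leading 1 bit
        r = x - (1 << k)            # operand with the leading bit cleared
        acc += (1 << (2 * k)) + (r << (k + 1))
        x = r                       # r is at least one bit narrower than x
    return acc + narrow_square(x, dsp_width)


if __name__ == "__main__":
    value = 0x12345678
    # The model should agree with ordinary integer squaring.
    print(square_by_msb_stripping(value) == value * value)
```

In this model every pass costs two shifted additions and shortens the operand by at least one bit, so the trade-off is extra adder logic in exchange for a single narrow multiplier. A fixed-width hardware version would strip a fixed bit position per stage rather than searching for the leading bit as `bit_length()` does here.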
[Degree-granting institution]: PLA Information Engineering University
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TN912.34