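Design and Implementation of the Shared Memory Component of the Multi-core X-DSPX
Published: 2018-02-22 23:03
Keywords: digital signal processing; multi-core; shared memory; optimization; synthesis. Source: National University of Defense Technology, master's thesis, 2013. Document type: degree thesis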
[Abstract]: As the application domains of digital signal processing (DSP) keep expanding, DSP application systems face increasingly demanding requirements on performance, power consumption, and cost, which is driving a shift from single-core to multi-core DSP technology. At the same time, as the number of threads a processor schedules concurrently grows, giving on-chip computing resources fast access to shared data has become one of the key technical problems that multi-core DSPs must solve. Research on efficient shared memory techniques for multi-core environments is therefore of real practical importance for the further development of multi-core DSP technology.

X-DSPX is a new-generation, high-performance, 32-bit floating-point multi-core DSP microprocessor chip developed independently at our university. This thesis analyzes in depth the functional characteristics of the shared memory components in mainstream DSPs. Based on the X-DSPX design requirements, and working from three aspects (the memory array, the data path, and the controller), it designs and implements the read/write command queues, arbiter, command decoding, address generation, and serial-parallel data conversion of the SMC (Shared Memory Controller) module. The SMC component is also structurally optimized and functionally verified, and a more effective arbitration mechanism is proposed based on the principle of fairness. The main contributions of the thesis are:

1. The SMC component of X-DSPX is designed and implemented. Based on the functional requirements that X-DSPX places on the SMC component, the design supports simultaneous concurrent access by four DSPs to the SMC memory banks, access through the SMC component to DDR2, EMIF, and remote L2 memory space, and background data movement by DMA on the SMC memory banks. Shared data is thereby used efficiently, and the SMC component also serves as a data crossover channel.

2. An optimization method that reduces power consumption and memory access latency is proposed. The structure of the SMC component is adjusted according to the principle of per-bank control of the memory banks; the results show that the power consumption of the SMC component is reduced by 80%. With pipelined operation and concurrent arbitration, the latency of DSP requests to the SMC component is reduced by one clock cycle, which improves memory access efficiency.

3. An arbitration mechanism that combines fixed-priority and rotating-priority algorithms is proposed. Based on the principle of fairness, when request signals of different priorities request an SMC memory bank at the same time, fixed priority does not cause any request to be "starved" or "overfed"; instead, the priorities of the request signals are changed dynamically to determine the order in which they obtain the SMC memory bank. This yields a more reasonable service order when different request signals have a structural hazard.

4. Functional testing and logic synthesis of the SMC component are completed. Test vectors are developed and functional verification is carried out according to the X-DSPX system requirements; with purpose-written test programs, the code coverage of the SMC component exceeds 99%. The SMC component is synthesized with a 65 nm standard-cell process library and reaches a maximum operating frequency of 555 MHz. Under the system's minimum internal clock frequency requirement of 500 MHz, the synthesized SMC component has an area of 65783 μm² and a power consumption of 3.8692 mW, which meets the system design requirements.
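The abstract does not spell out how the fixed-priority and rotating-priority schemes are combined, so the following is only a minimal behavioural sketch of one plausible combination, not the thesis's actual SMC arbiter. It assumes fixed priority between requester groups, rotating priority inside each group, and a wait counter that promotes any requester that has been passed over too long. The class name `HybridArbiter`, the `max_wait` threshold, and the requester names `dsp0`..`dsp3` and `dma` are all hypothetical.

```python
from collections import deque


class HybridArbiter:
    """Illustrative arbiter for one memory bank (all names are assumptions).

    - Groups are listed highest priority first (fixed priority between groups).
    - Inside a group, the last winner moves to the back (rotating priority).
    - A requester that has lost `max_wait` consecutive arbitrations is served
      next, so a lower-priority group cannot be starved.
    """

    def __init__(self, groups, max_wait=4):
        self.groups = [deque(g) for g in groups]
        self.max_wait = max_wait
        self.wait = {name: 0 for g in groups for name in g}

    def _pick(self, active):
        # Active requesters in current priority order (group, then rotation).
        ordered = [n for g in self.groups for n in g if n in active]
        if not ordered:
            return None
        # Promote whichever requester has waited longest past the threshold.
        starved = [n for n in ordered if self.wait[n] >= self.max_wait]
        if starved:
            return max(starved, key=lambda n: self.wait[n])
        return ordered[0]

    def grant(self, requests):
        """Arbitrate one cycle; return the winning requester or None."""
        active = set(requests)
        winner = self._pick(active)
        if winner is not None:
            # Rotate: the winner becomes lowest priority within its own group.
            group = next(g for g in self.groups if winner in g)
            group.remove(winner)
            group.append(winner)
        for name in self.wait:
            if name in active and name != winner:
                self.wait[name] += 1   # lost this cycle
            else:
                self.wait[name] = 0    # served, or not requesting
        return winner


if __name__ == "__main__":
    arb = HybridArbiter([["dsp0", "dsp1", "dsp2", "dsp3"], ["dma"]], max_wait=4)
    for cycle in range(10):
        print(cycle, arb.grant(["dsp0", "dsp1", "dsp2", "dsp3", "dma"]))
```

With these assumed settings, if all five requesters assert on every cycle, the grants cycle through the four DSP cores and then the DMA, so the low-priority DMA group is served once every five cycles instead of never. When only some requesters are active, the arbiter falls back to plain fixed priority between groups.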
[Degree-granting institution]: National University of Defense Technology
[Degree level]: Master's
[Year degree awarded]: 2013
[Classification number]: TP333
Article ID: 1525529
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1525529.html