Research and Implementation of an On-Chip Multi-Core Synchronization Unit and Its Inter-Chip Extension

Published: 2019-01-27 09:31
[Abstract]: Driven by application demands and advances in chip manufacturing processes, a single chip can now integrate more processor and memory resources, and systems-on-chip have gradually moved from single-core to multi-core structures. Multi-core architectures improve performance, but they also place higher demands on inter-core synchronization: exploiting the processing power of every core in a multi-core chip requires an efficient synchronization mechanism. X-DSP is a high-performance multi-core DSP developed independently by our university, with a self-designed architecture and instruction set. It mainly targets fields with bulk data-processing requirements, such as signal and image processing. The chip integrates multiple DSP cores and a global cache, and it communicates with off-chip devices at high speed through a PCIE interface. Its multi-core structure supports parallel execution of multiple tasks, and data communication among those tasks needs an efficient synchronization mechanism to guarantee both correctness and efficiency. Based on the system structure of X-DSP, this thesis implements inter-core synchronization with distributed hardware synchronization units. To let off-chip processor cores take part in synchronization as well, it also completes a PCIE-based interface extension by designing and implementing a PCIE-NI bridge. The main work and contributions are as follows:

(1) Hardware and software synchronization schemes are analyzed and compared, and a hardware synchronization mechanism based on locks and barriers is selected. Synchronization efficiency is improved by reducing the impact of synchronization operations on normal memory accesses.

(2) Taking the architectural characteristics of X-DSP into account, the overall structure of a distributed hardware synchronization unit containing hardware locks and barriers is designed. The hardware lock has two working modes, spin lock and queued spin lock, which effectively reduces the number of lock acquisition requests. The hardware barrier is released by broadcast, which reduces the network hotspot problem caused by the serial release of a traditional barrier.

(3) A PCIE-NI bridge is designed. It performs protocol conversion between the AXI standard interface, the PBUS and DBI interfaces, and the NI interface designed for X-DSP, so that off-chip processor cores can take part in synchronization and share data between on-chip and off-chip.

(4) Following a hierarchical verification methodology, module-level verification is completed, and system-level verification is carried out in the full-chip system environment, together with joint testing of the hardware synchronization unit and the PCIE-NI bridge.

Logic synthesis results show that the design meets the performance requirements.
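The abstract says that the queued spin-lock mode reduces the number of lock acquisition requests, but gives no figures. The following is a toy counting model, not the thesis's hardware design. It assumes that in plain spin mode every waiting core re-sends one acquire request per round and exactly one core wins each round, while in queued mode each core sends a single request that the lock unit queues and grants in arrival order. The function names and the round-based timing are assumptions made purely for illustration.

```python
from collections import deque
from typing import List, Tuple


def spin_mode_requests(num_cores: int) -> int:
    """Count acquire requests when every waiting core polls once per round.

    Assumption: one core acquires and releases the lock per round, so the
    number of waiting cores drops by one each round.
    """
    waiting = num_cores
    requests = 0
    while waiting > 0:
        requests += waiting  # every waiting core sends one request
        waiting -= 1         # exactly one core gets the lock this round
    return requests


def queued_mode_requests(num_cores: int) -> Tuple[int, List[int]]:
    """Count acquire requests when the lock unit queues each request once.

    Assumption: cores request in order 0..num_cores-1 and the lock is
    granted in first-come, first-served order.
    """
    queue = deque()
    requests = 0
    for core_id in range(num_cores):
        queue.append(core_id)  # a single request per core, then it waits
        requests += 1
    grant_order = []
    while queue:
        grant_order.append(queue.popleft())  # grant to the oldest waiter
    return requests, grant_order


if __name__ == "__main__":
    for n in (2, 4, 8):
        queued, _ = queued_mode_requests(n)
        print(n, spin_mode_requests(n), queued)
```

Under these assumptions the request count grows quadratically with the number of contending cores in spin mode and linearly in queued mode, which is consistent with the direction of the claim in the abstract. Real hardware behavior depends on polling intervals and network latency that this model ignores.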
[Degree-granting institution]: National University of Defense Technology
[Degree level]: Master's
[Year degree granted]: 2015
[Classification number]: TP332
