
Research on Key Technologies of Binary Code Reverse Analysis for Software Security

Published: 2018-08-04 15:34
【Abstract】: Binary code reverse analysis is a program analysis technique that targets binary code. It is essential when source code is unavailable. In malware detection and analysis, for example, malware authors rarely release their source code, so binary code reverse analysis is almost the only means of analysis. In security audits and plagiarism detection of commercial software, the absence of source code likewise leaves binary analysis as the only option. Binary code reverse analysis can also be used to harden existing software and reduce security vulnerabilities, and to prevent software from being cracked or pirated, thereby protecting intellectual property. Today, whether on supercomputers, smartphones, or embedded devices, the vast majority of software is released as binary code. Research on binary code reverse analysis therefore has significant theoretical and practical value for improving software security.

Because binary code differs greatly from source code, binary code reverse analysis is much harder than source code analysis. Obfuscation and compiler optimization make it harder still. In addition, to avoid detection and analysis, malware employs various anti-analysis methods, such as anti-modification based on integrity checking and anti-monitoring based on timing attacks. Analyzing such software requires countering these anti-analysis methods, which further increases the difficulty. This thesis studies in depth four key techniques: anti-analysis identification, disassembly, function identification, and library function identification.

Existing work on anti-analysis identification designs a specific method for each specific type of anti-analysis and therefore lacks generality. To address this, the thesis analyzes the conceptual similarity among anti-analysis methods and proposes an identification framework based on information flow. Existing countermeasures against integrity-check-based anti-modification require hardware assistance and cannot handle self-modifying code, so the thesis studies an identification method based on dynamic information flow that needs no hardware assistance. It first uses backward taint analysis to identify executable memory locations, or memory locations used to compute the values of executable locations, and then uses forward taint analysis to identify the checking procedure. The same approach applies to timing-attack-based anti-monitoring: common time-reading instructions and the return values of system calls are taken as taint sources, and taint analysis then identifies the verification procedure. The method successfully identifies the integrity-check-based anti-modification and timing-attack-based anti-monitoring techniques reported in the literature, and it provides the underlying structural information of each identified anti-analysis, which helps analysts design countermeasures.

Disassembly methods that combine static and dynamic analysis still suffer from low coverage, so the thesis studies a multi-path exploration method for disassembly. Static disassembly cannot distinguish data from code within executable regions and cannot handle self-modifying code. Dynamic disassembly has low code coverage because it processes only the paths that were executed. The thesis uses dynamic analysis based on binary instrumentation to record instruction execution traces, and it inverts conditional branches along executed paths to explore multiple paths, which raises the coverage of dynamic analysis. All execution traces are then simplified and merged. Finally, static disassembly is used to discover code in the regions not yet processed. The method disassembles binary code with high accuracy and high coverage.

Current function identification methods cannot identify functions that have neither cross-references nor prologue and epilogue signatures. To address this, the thesis studies a function identification method that uses the function return instruction as its identifying feature. Because every function has at least one return instruction through which control flow leaves it, the return instruction is a more reliable feature than the prologue and epilogue signatures used by traditional methods. The thesis first introduces the Reverse Extended Control Flow Graph (RECFG): within a given code region, the set of all possible control flow graphs that contain a specified return instruction. It then proposes an RECFG-based identification method. Starting from every address in a code region that can be interpreted as a return instruction, the method analyzes control flow backward to construct the RECFG. Four graph pruning rules remove nodes and paths that a compiler would not normally generate. For each independent RECFG, a multi-attribute decision-making method then selects one subgraph as the control flow graph of the function. The method accurately identifies the possible functions in a given code region.

Traditional library function identification methods cannot identify inlined library functions, so the thesis studies a new identification method. Because inlined and optimized library functions are discontiguous and polymorphic, traditional signature matching on the first n bytes of a function cannot identify them. The thesis first introduces the Execution Flow Graph (EFG), which describes the intrinsic behavioral characteristics of binary code. Library functions are then identified by finding similar EFG subgraphs in the target function. Discontiguous inlined library functions are identified by analyzing the execution dependences among instructions, and the polymorphic forms produced by compiler optimization are identified through instruction normalization. Because subgraph isomorphism testing is very time-consuming, the thesis defines five filters that discard subgraphs that cannot match, and it introduces the Reduced Execution Flow Graph (REFG) to accelerate subgraph isomorphism testing. Both the EFG and REFG methods achieve higher precision than current state-of-the-art tools and accurately identify inlined library functions that traditional methods struggle with. Compared with EFG, REFG significantly reduces processing time while maintaining the same precision and recall.

In summary, these methods offer new ideas and approaches for several key technical problems: identifying anti-analysis (including integrity-check-based anti-modification), improving the coverage of dynamic disassembly, identifying functions that have no cross-references and no distinct prologue or epilogue signatures, and quickly identifying library functions.
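The timing-check identification described in the abstract (time-reading instructions as taint sources, followed by taint propagation to find the verification step) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the `Insn` trace record, register-level taint granularity, `rdtsc` as the only taint source, and the generic `jcc` opcode for conditional branches.

```python
from dataclasses import dataclass

# Hypothetical trace record for one executed instruction (not the thesis's format).
@dataclass(frozen=True)
class Insn:
    addr: int          # address of the instruction
    op: str            # simplified opcode: "rdtsc", "mov", "sub", "cmp", "jcc", ...
    dst: tuple = ()    # locations written (registers, flags)
    src: tuple = ()    # locations read

# Assumed taint sources: instructions whose result is a timestamp.
TIME_SOURCES = {"rdtsc"}

def find_timing_checks(trace):
    """Return addresses of conditional branches that depend on a timestamp."""
    tainted = set()
    checks = []
    for insn in trace:
        if insn.op in TIME_SOURCES:
            # A time read taints everything it writes.
            tainted.update(insn.dst)
        elif insn.op == "jcc":
            # A branch on tainted flags is a candidate timing check.
            if any(s in tainted for s in insn.src):
                checks.append(insn.addr)
        elif any(s in tainted for s in insn.src):
            # Forward propagation: tainted input taints the outputs.
            tainted.update(insn.dst)
        else:
            # Overwriting with untainted data clears the taint.
            tainted.difference_update(insn.dst)
    return checks

# Toy trace: two time reads, their difference compared against a threshold,
# followed by an unrelated comparison that must not be reported.
DEMO_TRACE = [
    Insn(0x401000, "rdtsc", dst=("eax", "edx")),
    Insn(0x401002, "mov", dst=("ebx",), src=("eax",)),
    Insn(0x401004, "mov", dst=("ecx",)),
    Insn(0x401009, "rdtsc", dst=("eax", "edx")),
    Insn(0x40100B, "sub", dst=("eax", "flags"), src=("eax", "ebx")),
    Insn(0x40100D, "cmp", dst=("flags",), src=("eax",)),
    Insn(0x401020, "jcc", src=("flags",)),
    Insn(0x401022, "cmp", dst=("flags",), src=("ecx",)),
    Insn(0x401030, "jcc", src=("flags",)),
]

print([hex(a) for a in find_timing_checks(DEMO_TRACE)])
```

On the toy trace, only the branch at `0x401020` is reported, because the flags it tests derive from the two timestamp reads. A practical tool would also track memory locations and treat time-related system call return values as sources, as the abstract describes.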
【Degree-granting institution】: Harbin Institute of Technology
【Degree level】: Doctoral
【Year conferred】: 2015
【Classification number】: TP309

邱景 (Qiu Jing); Research on Key Technologies of Binary Code Reverse Analysis for Software Security [D]; Harbin Institute of Technology; 2015


Article ID: 2164328


Article link: http://sikaile.net/falvlunwen/zhishichanquanfa/2164328.html

