Research on Key Technologies of Binary Code Reverse Analysis for Software Security
[Abstract]: Binary code reverse analysis is a program analysis technique that operates directly on binary code, and it is essential when source code is unavailable. Because malware authors rarely release source code, it is almost the only means of detecting and analyzing malware. Security review and plagiarism detection must likewise work on binary code when no source code is available. Binary code reverse analysis can also be used to harden existing software, reduce security vulnerabilities, prevent cracking and piracy, and protect intellectual property. Most software on smartphones and embedded devices is distributed as binary code. Studying binary code reverse analysis to improve software security therefore has both theoretical significance and practical value. Binary code differs greatly from source code, and analyzing it is much harder than analyzing source code. Obfuscation and compiler optimization increase the difficulty further. In addition, to evade detection and analysis, malware employs various anti-analysis techniques, such as anti-modification based on integrity checks and anti-monitoring based on timing attacks. Analyzing such software requires countering these techniques, which makes reverse analysis harder still. This thesis focuses on four key technologies of binary code reverse analysis: anti-analysis identification, disassembly, function identification, and library function identification.
Existing approaches design a dedicated identification method for each specific anti-analysis technique and therefore lack generality. To address this, the conceptual similarity among anti-analysis techniques is analyzed and an anti-analysis identification framework based on information flow is proposed. For anti-modification based on integrity checks, an identification method based on dynamic information flow that requires no hardware assistance is presented: backward taint analysis first identifies executable memory locations, or memory locations used to compute values at executable locations, and forward taint analysis then identifies the checking process. For anti-monitoring based on timing attacks, common time-acquisition instructions and system call return values serve as taint sources, and taint analysis then identifies the verification process. These methods successfully identify the integrity-check-based anti-modification and timing-attack-based anti-monitoring techniques reported in the existing literature, and they provide basic structural information about the identified anti-analysis code that can help analysts design countermeasures. Because current static and dynamic disassembly methods still suffer from low coverage, a disassembly method based on multi-path exploration is studied. Static disassemblers cannot reliably distinguish data from code in executable regions, while dynamic disassembly has low code coverage because it handles only the paths that were actually executed. This thesis uses dynamic analysis built on binary instrumentation to record the program's instruction execution traces and explores multiple paths by inverting conditional branches along the executed path, which raises the coverage of dynamic analysis. After all execution traces have been simplified, static disassembly is used to find code in the regions not yet processed.
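The multi-path exploration step can be illustrated with a small sketch. It is not the thesis's implementation: the toy instruction set (`op`, `jcc`, `jmp`, `ret`, `data`), the `PROGRAM` table, and the functions `run` and `explore` are all hypothetical, and branch outcomes are simply forced rather than derived from real inputs. The sketch assumes a loop-free program; real traces with loops would need a bound on the number of explored paths.

```python
from collections import deque

# Hypothetical toy program: address -> (mnemonic, branch target or None).
# Addresses 5 and 8 stand for data bytes embedded in the code region.
PROGRAM = {
    0: ("op", None),
    1: ("jcc", 6),      # conditional branch to 6
    2: ("op", None),
    3: ("jcc", 9),      # conditional branch to 9
    4: ("ret", None),
    5: ("data", None),
    6: ("op", None),
    7: ("ret", None),
    8: ("data", None),
    9: ("op", None),
    10: ("ret", None),
}

def run(program, forced, entry=0, max_steps=1000):
    """Execute once and record the trace.

    forced[i] is the outcome imposed on the i-th conditional branch;
    branches beyond the forced prefix default to "not taken".
    Returns (list of executed addresses, list of branch outcomes).
    """
    trace, decisions = [], []
    pc = entry
    for _ in range(max_steps):
        mnemonic, target = program[pc]
        trace.append(pc)
        if mnemonic == "ret":
            break
        if mnemonic == "jmp":
            pc = target
        elif mnemonic == "jcc":
            i = len(decisions)
            taken = forced[i] if i < len(forced) else False
            decisions.append(taken)
            pc = target if taken else pc + 1
        else:
            pc += 1
    return trace, decisions

def explore(program, entry=0):
    """Re-run the program with each not-yet-forced branch inverted and
    return the set of addresses observed as executed code."""
    covered, seen = set(), set()
    work = deque([()])
    while work:
        forced = work.popleft()
        if forced in seen:
            continue
        seen.add(forced)
        trace, decisions = run(program, forced, entry)
        covered.update(trace)
        # Invert every branch that this run decided on its own.
        for i in range(len(forced), len(decisions)):
            work.append(tuple(decisions[:i]) + (not decisions[i],))
    return covered

print(sorted(explore(PROGRAM)))  # addresses 5 and 8 (data) never appear
```

A single run covers only addresses 0 to 4; inverting the two conditional branches adds the blocks at 6 and 9, while the embedded data at 5 and 8 is never mistaken for code.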
The resulting disassembly method achieves high accuracy and high coverage. Current function identification methods cannot identify functions that have no cross references and no recognizable prologue or epilogue. To address this, a function identification method that uses return instructions as the identifying feature is studied. Because a function has at least one return instruction through which control flow leaves it, return instructions are a more reliable feature than the prologue and epilogue patterns used by traditional methods. First, the concept of the Reverse Extended Control Flow Graph (RECFG) is introduced: for a given code region, the RECFG of a return instruction contains all possible control flow graphs that end at that instruction. An RECFG-based function identification method is then proposed. Treating every address in the code region that can be decoded as a return instruction as a starting point, the method analyzes control flow backward and constructs the RECFGs. Four pruning rules are designed to remove nodes and paths that a compiler would not normally generate. For each independent RECFG, multi-attribute decision making then selects one subgraph as the control flow graph of the function. The method accurately identifies the possible functions in a given code region. A new method for identifying library functions is also studied. Because inlined and optimized library functions are discontinuous and polymorphic, traditional signature matching on the first N bytes of a function cannot identify inlined functions. The concept of the Execution Flow Graph (EFG) is introduced to describe the characteristics of binary code, and library functions are identified by finding similar EFG subgraphs in the target function.
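The return-instruction-based idea in this paragraph can be sketched as a backward reachability search. This is only an illustration under simplifying assumptions: the toy instruction table `CODE` and the functions `successors`, `build_recfg`, and `entry_candidates` are hypothetical names. The four pruning rules and the multi-attribute decision step are not modeled, because the abstract does not spell them out.

```python
from collections import defaultdict, deque

# Hypothetical toy code region: address -> (mnemonic, branch target or None).
CODE = {
    0: ("push", None),
    1: ("jcc", 4),
    2: ("op", None),
    3: ("jmp", 5),
    4: ("op", None),
    5: ("ret", None),
    6: ("op", None),    # start of a second, separate function
    7: ("ret", None),
}

def successors(code, addr):
    """Forward control-flow successors of one instruction."""
    mnemonic, target = code[addr]
    if mnemonic == "ret":
        return []
    if mnemonic == "jmp":
        return [target]
    nxt = [addr + 1] if addr + 1 in code else []
    if mnemonic == "jcc":
        nxt.append(target)
    return nxt

def build_recfg(code, ret_addr):
    """Collect every instruction that can reach ret_addr by walking
    control-flow edges backward. Returns (nodes, edges)."""
    preds = defaultdict(set)
    for addr in code:
        for succ in successors(code, addr):
            preds[succ].add(addr)
    nodes, edges = {ret_addr}, set()
    work = deque([ret_addr])
    while work:
        cur = work.popleft()
        for p in preds[cur]:
            edges.add((p, cur))
            if p not in nodes:
                nodes.add(p)
                work.append(p)
    return nodes, edges

def entry_candidates(nodes, edges):
    """Nodes with no predecessor inside the graph: possible function entries."""
    has_pred = {dst for _, dst in edges}
    return nodes - has_pred

for ret in (addr for addr, (m, _) in CODE.items() if m == "ret"):
    nodes, edges = build_recfg(CODE, ret)
    print(ret, sorted(nodes), sorted(entry_candidates(nodes, edges)))
```

In this toy region the graph built from the return at address 5 contains addresses 0 to 5 with 0 as the only entry candidate, and the graph built from the return at 7 contains only 6 and 7, so the two functions separate without relying on prologue patterns or cross references.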
Five filters are defined to discard subgraphs that cannot match, and the Reduced Execution Flow Graph (REFG) is introduced to accelerate subgraph isomorphism testing. The EFG and REFG methods achieve higher precision than current state-of-the-art tools and accurately identify inlined library functions that traditional methods struggle to identify. Compared with the EFG method, the REFG method significantly reduces processing time while maintaining the same precision and recall. In summary, this thesis provides new ideas and methods for several key problems in binary code reverse analysis: identifying anti-analysis techniques, including anti-modification based on integrity checks; improving the coverage of dynamic disassembly; identifying functions without cross references; and rapidly identifying library functions.
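The filter-then-match idea for library function identification can be sketched as follows. This is a minimal illustration, not the thesis's EFG or REFG definition: graphs are plain labeled digraphs, the node labels (`load`, `cmp`, `inc`, and so on) are hypothetical instruction classes, and the single `label_filter` stands in for the five filters, whose details the abstract does not give. The matcher is a naive backtracking search for a non-induced subgraph embedding.

```python
from collections import Counter

def label_filter(p_labels, t_labels):
    """Cheap pre-filter: the target must contain at least as many nodes
    of each label as the pattern, otherwise no embedding can exist."""
    need = Counter(p_labels.values())
    have = Counter(t_labels.values())
    return all(have[label] >= count for label, count in need.items())

def find_embedding(p_labels, p_edges, t_labels, t_edges):
    """Return a dict mapping pattern nodes to target nodes such that labels
    agree and every pattern edge is also a target edge, or None."""
    if not label_filter(p_labels, t_labels):
        return None
    order = list(p_labels)

    def extend(mapping):
        if len(mapping) == len(order):
            return dict(mapping)
        u = order[len(mapping)]
        for v in t_labels:
            if v in mapping.values() or t_labels[v] != p_labels[u]:
                continue
            mapping[u] = v
            # Check all pattern edges whose endpoints are both mapped.
            consistent = all(
                (mapping[a], mapping[b]) in t_edges
                for a, b in p_edges
                if a in mapping and b in mapping
            )
            if consistent:
                result = extend(mapping)
                if result is not None:
                    return result
            del mapping[u]
        return None

    return extend({})

# Hypothetical pattern: a small loop standing for an inlined library routine.
pattern_labels = {"a": "load", "b": "cmp", "c": "inc"}
pattern_edges = {("a", "b"), ("b", "c"), ("c", "a")}

# Hypothetical target function that contains the loop among other code.
target_labels = {0: "mov", 1: "load", 2: "cmp", 3: "inc", 4: "ret"}
target_edges = {(0, 1), (1, 2), (2, 3), (3, 1), (2, 4)}

print(find_embedding(pattern_labels, pattern_edges, target_labels, target_edges))
```

The filter rejects hopeless candidates before the exponential search starts, which is the role the abstract assigns to its filters; a reduced graph with fewer nodes would shrink the search space further.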
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Doctorate
[Year degree awarded]: 2015
[Classification number]: TP309
Qiu Jing (邱景); Research on Key Technologies of Binary Code Reverse Analysis for Software Security [D]; Harbin Institute of Technology; 2015