A Password Guessing Method Based on a Variable-Order Markov Model
Published: 2018-06-17 22:29
Topics: password guessing + Markov model; Source: Wuhan University, master's thesis, 2017
[Abstract]: The most popular form of user authentication today is the username-password pair. It is easy to understand, implement, and use, but passwords are often not strong enough, so password security has long been an active research topic. The main way to study password security is to attack password sets with different password guessing techniques. Common techniques include brute force, dictionary attacks, and attacks based on probabilistic password models. The last of these has been the research focus of recent years and gives the best cracking results of the three.
Probabilistic password models fall into two kinds. A template-based model splits a password into segments according to some structure, first computes the probability of each template structure, and from that derives the probability of a concrete password. A whole-string model treats the password as a single unit and computes its probability directly. There is a good deal of work on guessing with template-based models but comparatively little on whole-string models, where the main direction is to bring the Markov model from natural language processing into password probability calculation.
Most researchers use a Markov model of fixed order. If the order is too low, too little character history is used when computing the probability of the character at each position, and the result is inaccurate. If the order is too high, sparsity in the training data causes the high-order model to overfit. To address this, the thesis proposes BackOff, a method for implementing a variable-order Markov model. When computing the probability of a whole password, it selects the model order adaptively; that is, it chooses how much character history to use at each position. It does this by setting a threshold on occurrence counts, starting from the highest-order Markov model, and lowering the order step by step until the occurrence count of the n-gram exceeds the threshold.
The overall guessing procedure has three stages. First, training on a real password set yields the set of n-grams and their frequencies. Second, during guess generation, n-grams are concatenated with characters from the character space and the Markov chain is used to compute password probabilities. Third, enqueue and dequeue operations on a priority queue produce the guesses in descending order of probability; the guesses are matched against the test set to obtain the cracking rate at different numbers of guesses. In four groups of comparative experiments the method performed well: after 20 million guesses, its cracking rate was clearly higher than that of the traditional JTR (John the Ripper) tool, the PCFG method, and password guessing based on a fixed-order Markov model.
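The three stages described in the abstract (n-gram training, back-off order selection, priority-queue generation) can be illustrated with a minimal Python sketch. This is not the thesis implementation, and several details are assumptions that the abstract does not specify: the boundary symbols `^` and `$`, applying the threshold to the count of the history context, using plain relative frequencies without smoothing, the `max_len` cap, and all function names. The training passwords are made-up toy data.

```python
import heapq
from collections import defaultdict

START, END = "^", "$"  # hypothetical boundary symbols, assumed absent from passwords

def train(passwords, max_order):
    """Count next-character frequencies for every context of length 0..max_order."""
    counts = defaultdict(lambda: defaultdict(int))
    for pw in passwords:
        seq = START * max_order + pw + END
        for i in range(max_order, len(seq)):
            for k in range(max_order + 1):
                counts[seq[i - k:i]][seq[i]] += 1
    return counts

def next_char_probs(counts, history, max_order, threshold):
    """Start at the highest order and back off until the context count exceeds
    the threshold (assumption: the threshold applies to the context count)."""
    for k in range(max_order, -1, -1):
        nexts = counts.get(history[len(history) - k:], {})
        total = sum(nexts.values())
        if total > threshold or k == 0:
            return {c: n / total for c, n in nexts.items()}

def password_prob(counts, pw, max_order, threshold):
    """Markov-chain probability of a whole password, including the end symbol."""
    history, p = START * max_order, 1.0
    for ch in pw + END:
        p *= next_char_probs(counts, history, max_order, threshold).get(ch, 0.0)
        history += ch
    return p

def generate_guesses(counts, max_order, threshold, limit, max_len=16):
    """Best-first search over prefixes. A prefix is never less probable than its
    extensions, so finished passwords are dequeued in descending probability."""
    heap = [(-1.0, "", False)]  # (negated probability, prefix, finished?)
    guesses = []
    while heap and len(guesses) < limit:
        neg_p, prefix, done = heapq.heappop(heap)
        if done:
            guesses.append((prefix, -neg_p))
            continue
        history = START * max_order + prefix
        for ch, q in next_char_probs(counts, history, max_order, threshold).items():
            if ch == END:
                heapq.heappush(heap, (neg_p * q, prefix, True))
            elif len(prefix) < max_len:
                heapq.heappush(heap, (neg_p * q, prefix + ch, False))
    return guesses

def crack_rate(guesses, test_set):
    """Fraction of test passwords that appear among the guesses."""
    guessed = {g for g, _ in guesses}
    return sum(pw in guessed for pw in test_set) / len(test_set) if test_set else 0.0

# Toy usage with made-up data.
toy = ["password", "password1", "pass123", "123456", "123456", "abc123"]
model = train(toy, max_order=3)
top = generate_guesses(model, max_order=3, threshold=1, limit=5)
for guess, prob in top:
    print(f"{prob:.4f}  {guess}")
print("crack rate:", crack_rate(top, ["123456", "letmein"]))
```

Because the order at each position depends only on the preceding characters, each back-off step here still returns a proper distribution over the next character, so no discounting is needed in this simplified form. A full implementation would probably bound memory use in the queue, which this sketch does not attempt.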
[Degree-granting institution]: Wuhan University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP309