Research on Fine-Grained Classification and Serial Case Analysis Techniques for Criminal Cases
[Abstract]: With the rapid development of information technology, information systems in the public security field now hold huge volumes of data, most of it text, and traditional manual processing can no longer meet operational needs. More automated and intelligent text mining techniques are needed to improve the efficiency of case handling. Focusing on criminal case texts, this thesis studies two problems of general concern to criminal investigators: fine-grained case classification and serial case analysis. A two-level classification method, TLC-NBK, based on naive Bayes and keyword co-occurrence graphs is proposed. The method targets the characteristics of case texts: short length, low word frequency, and a hierarchical, unbalanced distribution of categories. First, building on the document frequency (DF) method, part-of-speech features are introduced and a two-factor evaluation algorithm is proposed for feature selection; naive Bayes classification is then carried out with a multivariate Bernoulli model oriented to unbalanced categories. On top of the first-level classifier, keyword co-occurrence vectors are constructed from the document set for the second-level case categories under each first-level category. Co-occurrence relations between keywords, rather than word frequency, are used to compute the weights, and an inverse class frequency factor is proposed to adjust the co-occurrence weights. Finally, a simple vector distance algorithm performs the fine-grained classification into second-level case categories. In addition, interference from domain synonyms in the classification results is eliminated using a synonym network. A density clustering method based on case features is also proposed to realize serial case analysis.
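The first-level classifier described above relies on a multivariate Bernoulli naive Bayes model. The following is a minimal sketch of the plain, textbook form of that model with Laplace smoothing. It is not the thesis's implementation: the adjustment for unbalanced categories and the two-factor feature selection are not specified in the abstract and are not reproduced here. The function names, the toy documents, and the class labels are all hypothetical.

```python
import math
from collections import defaultdict


def train_bernoulli_nb(docs, labels, vocabulary):
    """Fit a multivariate Bernoulli naive Bayes model with Laplace smoothing.

    docs: list of token lists; labels: parallel list of class names;
    vocabulary: the feature terms kept after feature selection (assumed given).
    """
    vocab = set(vocabulary)
    class_docs = defaultdict(int)                       # documents per class
    term_docs = defaultdict(lambda: defaultdict(int))   # docs containing term, per class
    for tokens, label in zip(docs, labels):
        class_docs[label] += 1
        for term in set(tokens) & vocab:
            term_docs[label][term] += 1
    total = len(docs)
    model = {}
    for label, n_c in class_docs.items():
        log_prior = math.log(n_c / total)
        # Smoothed estimate of P(term present | class)
        cond = {t: (term_docs[label][t] + 1) / (n_c + 2) for t in vocab}
        model[label] = (log_prior, cond)
    return model


def predict_bernoulli_nb(model, tokens):
    """Return the class with the highest log posterior for one document."""
    present = set(tokens)
    best_label, best_score = None, -math.inf
    for label, (log_prior, cond) in model.items():
        score = log_prior
        for term, p in cond.items():
            # The Bernoulli model scores absent terms as well as present ones.
            score += math.log(p) if term in present else math.log(1.0 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, for illustration only
docs = [
    ["wallet", "stolen", "bus"],
    ["phone", "stolen", "subway"],
    ["door", "pried", "house"],
    ["window", "pried", "house"],
]
labels = ["pickpocketing", "pickpocketing", "burglary", "burglary"]
vocabulary = ["stolen", "pried", "house", "bus"]
model = train_bernoulli_nb(docs, labels, vocabulary)
```

Unlike the multinomial variant, the Bernoulli model records only whether a term occurs in a document, which suits short texts in which most terms appear at most once.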
The method first extracts structured case features from unstructured case descriptions by combining rules and dictionaries. It then defines a formula for the feature similarity between case texts that jointly considers the influence of fine-grained case category, time, and location, with the weight of each dimension determined by the analytic hierarchy process (AHP). Finally, drawing on the classical density clustering algorithm OPTICS, a feature density clustering algorithm (OPTICS-FD) is proposed to cluster cases effectively and to assist criminal investigators in solving them. The two-factor evaluation algorithm, the two-level classifier, case feature extraction, and serial case clustering are each tested experimentally. The results show that, for criminal case text mining, the TLC-NBK method improves accuracy and recall by 7.53% and 12.99%, respectively, and the OPTICS-FD algorithm achieves a reduction rate of 66.52% and a recall of 91.25%.
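The AHP weighting step mentioned above can be illustrated with a short sketch. It derives dimension weights from a pairwise comparison matrix using the row geometric-mean approximation, then combines per-dimension similarities into one score. The comparison values, the three dimensions, and the function names are assumptions made for illustration; the abstract does not give the thesis's actual matrix or similarity formula.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method.

    pairwise[i][j] states how much more important dimension i is than j,
    with pairwise[j][i] == 1 / pairwise[i][j].
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def case_similarity(dimension_sims, weights):
    """Weighted sum of per-dimension similarities, each assumed to lie in [0, 1]."""
    return sum(w * s for w, s in zip(weights, dimension_sims))


# Hypothetical comparison matrix over (case features, time, location)
pairwise = [
    [1.0, 3.0, 2.0],
    [1 / 3, 1.0, 1 / 2],
    [1 / 2, 2.0, 1.0],
]
weights = ahp_weights(pairwise)

# Hypothetical per-dimension similarities between two cases
score = case_similarity([0.8, 0.5, 0.9], weights)
```

A density-based clustering algorithm in the OPTICS family could then take `1 - score` as the distance between two cases.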
[Degree-granting institution]: Huazhong University of Science and Technology
[Degree level]: Master's
[Year degree awarded]: 2016
[Classification number]: TP391.1; D918.2