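Research on Hot Topic Discovery Based on the DBSCAN Algorithm and Inter-Sentence Relations
Published: 2018-03-18 05:15
Topic: information users. Entry point: hot topics. Source: 《圖書情報工作》 (Library and Information Service), 2017, issue 12. Paper type: journal article.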
[Abstract]: [Purpose/Significance] In the era of big data, users are sometimes left helpless in the face of massive volumes of data. More and more researchers have therefore turned to algorithms for discovering hot topics on the Internet, which help users obtain hot topics quickly. [Method/Process] Building on the DBSCAN algorithm, the study optimizes the algorithm by dynamically adjusting its parameters to achieve hot topic discovery. A hot topic filtering model, constructed from an analysis of syntactic structure and inter-sentence relations, filters out ordinary topics that merely contain hot terms. [Results/Conclusions] Experiments use a news dataset drawn from mainstream websites, and the effectiveness of the algorithm is tested with evaluation metrics such as the false detection rate and the miss rate. The results show that the improved algorithm performs better and can give information users an efficient way to study network data scientifically.
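The abstract does not say how the DBSCAN parameters are adjusted, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes whitespace-tokenized texts, plain term-count vectors, cosine distance, and a hypothetical heuristic (`estimate_eps`) that derives the neighbourhood radius from the data as the median distance to each text's k-th nearest neighbour. All function names and the sample texts are illustrative.

```python
import math
from collections import Counter


def tf_vector(text):
    """Term-count vector for a whitespace-tokenized text (assumed tokenization)."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """1 - cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return max(0.0, 1.0 - dot / norm) if norm else 1.0


def estimate_eps(vectors, k):
    """Hypothetical heuristic: median distance to each point's k-th nearest neighbour."""
    kth = []
    for i, v in enumerate(vectors):
        dists = sorted(cosine_distance(v, w) for j, w in enumerate(vectors) if j != i)
        kth.append(dists[k - 1])
    kth.sort()
    return kth[len(kth) // 2]


def dbscan(vectors, eps, min_pts):
    """Standard DBSCAN; returns one cluster label per point, -1 for noise."""
    n = len(vectors)
    # A point's neighbourhood includes the point itself.
    neighbours = [
        [j for j in range(n) if cosine_distance(vectors[i], vectors[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours[i] if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                queue.extend(neighbours[j])  # j is a core point: keep expanding
    return labels


# Illustrative headlines only; not data from the paper.
docs = [
    "earthquake rescue teams arrive",
    "earthquake rescue teams continue search",
    "rescue teams search earthquake rubble",
    "stock market closes higher",
    "stock market closes lower today",
    "local bakery wins award",
]
vectors = [tf_vector(d) for d in docs]
eps = estimate_eps(vectors, k=1)
labels = dbscan(vectors, eps, min_pts=2)
```

Texts that share a label form a candidate topic, and texts labelled -1 are treated as noise. A filter based on syntactic structure and inter-sentence relations, as the abstract describes, would then be applied to the candidate topics; it is not sketched here because the abstract gives no detail on it.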
[Author affiliations]: Library of Changchun University of Science and Technology; Changchun Agricultural Information Center
[Classification number]: G254
Article ID: 1628151
Article link: http://sikaile.net/tushudanganlunwen/1628151.html