Research on Hot Topic Discovery Based on the DBSCAN Algorithm and Inter-Sentence Relations
Published: 2018-03-18 05:15
Topic: information users  Entry point: hot topics  Source: 《圖書情報工作》 (Library and Information Service), 2017, No. 12  Paper type: journal article
[Abstract]: [Purpose/Significance] In the big data era, users are sometimes at a loss when facing massive volumes of data. As a result, a growing number of scholars have turned their attention to algorithms for discovering hot topics on the Internet, which help users find hot topics quickly. [Method/Process] Building on the DBSCAN algorithm, the study optimizes the algorithm by dynamically adjusting its parameters in order to discover hot topics. A hot topic filtering model, built on the analysis of syntactic structure and inter-sentence relations, then filters out ordinary topics that merely contain hot terms. [Result/Conclusion] Experiments on a news data set from mainstream websites test the effectiveness of the algorithm with evaluation metrics such as the false detection rate and the miss rate. The results show that the improved algorithm performs better and can offer information users an efficient way to study network data scientifically.
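The abstract does not say how the DBSCAN parameters are adjusted. As a rough illustration only, the sketch below clusters document vectors with a plain DBSCAN and picks the radius from the data using the common k-distance heuristic (the value just below the largest jump in the sorted k-th nearest neighbour distances). The function names, the cosine distance, the toy vectors, and the heuristic itself are assumptions made for this example, not the method of the paper.

```python
import math

NOISE = -1

def cosine_distance(a, b):
    """Cosine distance between two equal-length term-weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 1.0
    return 1.0 - dot / (na * nb)

def estimate_eps(points, min_pts, dist=cosine_distance):
    """Assumed heuristic (not from the paper): take the k-distance value
    just below the largest jump in the sorted k-distance curve.
    Needs at least two points."""
    k = max(1, min_pts - 1)  # the point itself counts towards min_pts
    kth = []
    for i, p in enumerate(points):
        d = sorted(dist(p, q) for j, q in enumerate(points) if j != i)
        kth.append(d[min(k, len(d)) - 1])
    kth.sort()
    gaps = [kth[i + 1] - kth[i] for i in range(len(kth) - 1)]
    return kth[gaps.index(max(gaps))]

def dbscan(points, eps, min_pts, dist=cosine_distance):
    """Plain DBSCAN; returns one cluster id per point, NOISE for outliers."""
    labels = [None] * len(points)

    def neighbours(i):
        # Includes i itself, as in the usual DBSCAN definition.
        return [j for j in range(len(points))
                if dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbours(j)
            if len(nb) >= min_pts:  # j is a core point: keep expanding
                queue.extend(nb)
        cluster += 1
    return labels

# Toy term-weight vectors: two dense "topics" and one isolated document.
docs = [
    [1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [1.0, 0.1, 0.0],
    [0.0, 1.0, 0.0], [0.0, 0.9, 0.1], [0.1, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
eps = estimate_eps(docs, min_pts=3)
labels = dbscan(docs, eps, min_pts=3)
print(labels)
```

In this toy run the two dense groups form separate clusters and the isolated document is marked as noise. In a hot topic setting, the clusters would be candidate topics that a later filtering step could accept or reject.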
[Author affiliations]: Library of Changchun University of Science and Technology; Changchun Agricultural Information Center
[Classification number]: G254
Article ID: 1628151
Article link: http://sikaile.net/tushudanganlunwen/1628151.html