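An Empirical Analysis of the Applicability of Methods for Defining the High-Frequency Word Threshold in Word Frequency Analysis
Published: 2018-09-10 17:41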
[Abstract]: Word frequency analysis is one of the important analytical methods in bibliometrics, and determining the high-frequency word threshold is a necessary prerequisite for it. The choice of threshold not only determines the results of the word frequency analysis but also strongly influences the study as a whole. First, based on a survey of domestic (Chinese) literature from the past three years that applied word frequency analysis, this paper finds that the threshold selection methods commonly used in academia fall into four categories: custom selection, selection by the high/low-frequency word boundary formula, selection by the Price formula, and hybrid selection. Second, taking literature in the field of personal knowledge management as the research object, it computes the threshold with each of the first three methods, performs a cluster analysis of research hot spots in the field, and compares and validates the clustering results; on this basis it discusses how the choice of threshold affects the analysis results and whether that choice is reasonable. Finally, it points out the problems in how Chinese academia selects high-frequency word thresholds: strong subjectivity, unclear principles behind the methods, unclear applicability of improved methods, and the need for further study of the applicability of the high/low-frequency word boundary formula and the Price formula.
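As an illustration of the two formula-based methods named in the abstract, the following is a minimal Python sketch. The abstract itself does not state the formulas; the sketch assumes the forms commonly quoted in the bibliometrics literature, namely the Donohue high/low-frequency boundary T = (-1 + sqrt(1 + 8 * I1)) / 2, where I1 is the number of words that occur exactly once, and the Price formula M = 0.749 * sqrt(N_max), where N_max is the highest word frequency. The function names, the `>=` cut-off convention, and the toy keyword counts are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def donohue_threshold(freqs):
    """High/low-frequency boundary (assumed Donohue form).

    T = (-1 + sqrt(1 + 8 * I1)) / 2, where I1 is the number of
    words whose frequency is exactly 1.
    """
    i1 = sum(1 for f in freqs.values() if f == 1)
    return (-1 + math.sqrt(1 + 8 * i1)) / 2


def price_threshold(freqs):
    """Price-formula threshold (assumed form).

    M = 0.749 * sqrt(N_max), where N_max is the highest frequency.
    """
    n_max = max(freqs.values())
    return 0.749 * math.sqrt(n_max)


def high_frequency_words(freqs, threshold):
    """Keep words whose frequency is at least the threshold.

    Using >= is an assumption; some studies round the threshold
    or use a strict comparison instead.
    """
    return {w: f for w, f in freqs.items() if f >= threshold}


if __name__ == "__main__":
    # Toy keyword counts, invented for illustration only.
    freqs = Counter({
        "knowledge management": 36,
        "personal knowledge management": 25,
        "web 2.0": 9,
        "blog": 5,
        "learning": 3,
    })
    # Add ten keywords that each occur once (the low-frequency tail).
    freqs.update(f"rare keyword {i}" for i in range(10))

    for name, fn in (("Donohue", donohue_threshold), ("Price", price_threshold)):
        t = fn(freqs)
        print(name, round(t, 2), sorted(high_frequency_words(freqs, t)))
```

The two thresholds generally differ on the same data, which is the kind of sensitivity the paper examines by comparing clustering results across methods.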
[Author affiliation]: School of Information Science and Technology, Northeast Normal University
[Classification number]: G353.1
Article number: 2235153
Article link: http://sikaile.net/tushudanganlunwen/2235153.html