
Research on a Content-Based Method for Evaluating the Credibility of Agricultural Web Information

Published: 2018-12-16 07:38
[Abstract]: As network technology has spread, information technology has developed rapidly, and agriculture is gradually becoming informatized along with the rest of society. Farmers are the main actors in agriculture, yet in agricultural information services they often cannot judge whether information found online is true and reliable, because they generally have limited education and limited economic means. To address this problem, this thesis studies how to evaluate the credibility of agricultural web information. The main work is as follows.
(1) The traditional TF-IDF topic extraction method ignores where a word appears on a web page, so a TF-IDF method based on word position weights is proposed for extracting the topic of agricultural web information. Experiments show that the improved method extracts topics more accurately than traditional TF-IDF, with satisfactory extraction results.
(2) Search engines do not consider credibility when retrieving candidate pages, so a content-based method for evaluating the credibility of agricultural web information is proposed, built on an index system with four layers of credibility indicators. The first layer judges the authority of a page: because no standard yet exists for classifying and quantifying page authority, a custom table of website authority weights is defined, which distinguishes pages of differing authority well. The second layer judges timeliness: a new method builds a time decay function from the publication date of the content, which better reflects the effect of timeliness on the credibility of agricultural web information. The third layer judges relevance: the vector space model (VSM) is used to generate a term-frequency vector for each candidate page, and the degree of relevance between the page content and the keywords is computed. The fourth layer judges influence: combining page links and user behavior, the website's PR value, Page View value, and Time on Page value are introduced to quantify a page's influence.
(3) Different topics are set up to examine the relationship between the number of query words and topic relevance. The results show that selecting four query words gives candidate pages an average topic relevance of 77.4%, the best result.
(4) Three rankings are built to verify the credibility of candidate pages: the search engine's natural ranking, a ranking without the relevance indicator, and the ranking produced by the content-based method of this thesis. Credibility values under the natural ranking vary widely, and the ranking without the relevance indicator places some information unrelated to the topic near the top. The ranking from the proposed method selects pages that are both relevant to the topic and highly credible and presents them to users first, which indicates that the content-based evaluation method is effective and practical for evaluating the credibility of agricultural web information.
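The four-layer index system described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the actual authority table, decay function, or layer weights, so everything numeric here is an assumption made for illustration: the `AUTHORITY_WEIGHTS` categories and values, the exponential half-life form of `timeliness`, the 0 to 10 PR scale and equal averaging in `influence`, and the equal layer weights in `credibility` are all hypothetical. Only the relevance layer follows a standard definition, cosine similarity between term-frequency vectors in the vector space model.

```python
import math
from collections import Counter
from datetime import date

# Hypothetical authority weights per site category; the thesis defines its own table.
AUTHORITY_WEIGHTS = {"government": 1.0, "academic": 0.9, "commercial": 0.6, "personal": 0.3}


def timeliness(published, today, half_life_days=365.0):
    """Assumed decay: the score halves every `half_life_days` after publication."""
    age_days = max((today - published).days, 0)
    return 0.5 ** (age_days / half_life_days)


def relevance(page_terms, query_terms):
    """Cosine similarity between term-frequency vectors (vector space model)."""
    page_tf, query_tf = Counter(page_terms), Counter(query_terms)
    dot = sum(page_tf[t] * query_tf[t] for t in query_tf)
    page_norm = math.sqrt(sum(v * v for v in page_tf.values()))
    query_norm = math.sqrt(sum(v * v for v in query_tf.values()))
    norm = page_norm * query_norm
    return dot / norm if norm else 0.0


def influence(pr, page_views, time_on_page, max_page_views, max_time_on_page):
    """Assumed combination: mean of three signals, each clamped to [0, 1].

    PR is assumed to lie on a 0-10 scale; the other two signals are scaled
    by the maximum observed among the candidate pages.
    """
    signals = (pr / 10.0, page_views / max_page_views, time_on_page / max_time_on_page)
    return sum(min(max(s, 0.0), 1.0) for s in signals) / len(signals)


def credibility(authority, timeliness_score, relevance_score, influence_score,
                weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted sum of the four layer scores; equal weights are an assumption."""
    scores = (authority, timeliness_score, relevance_score, influence_score)
    return sum(w * s for w, s in zip(weights, scores))


# Example: score one hypothetical candidate page for the query "rice pest".
score = credibility(
    AUTHORITY_WEIGHTS["academic"],
    timeliness(date(2014, 6, 1), date(2015, 1, 1)),
    relevance(["rice", "pest", "control", "rice"], ["rice", "pest"]),
    influence(pr=6, page_views=1200, time_on_page=95,
              max_page_views=5000, max_time_on_page=300),
)
print(f"credibility score: {score:.3f}")
```

Sorting candidate pages by such a score in descending order would give a ranking in the spirit of the proposed method. Setting the relevance weight to zero would mimic the comparison ranking that omits the relevance indicator.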
[Degree-granting institution]: Hunan Agricultural University
[Degree level]: Master's
[Year degree conferred]: 2015
[Classification number]: S126


Article ID: 2382007


Link to this article: http://sikaile.net/kejilunwen/nykj/2382007.html

