
Research on a Hadoop-Based Evaluation System for Traditional Chinese Medicine Web Information Resources

Published: 2018-03-06 09:30

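  Topic: traditional Chinese medicine. Entry point: Web. Source: Shandong University of Traditional Chinese Medicine, doctoral dissertation, 2016. Document type: degree thesis.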


[Abstract]: With the development of computer and communication technology, the Internet has gradually penetrated every area of production and daily life and has become an important source of knowledge. People continually obtain information online to guide their work and lives, and modern society can no longer do without it. The Web is the part of the Internet associated with HTML, that is, information resource pages built on HTML. Traditional Chinese medicine (TCM) information resources on the Web grow every day, and existing resources are constantly changed and updated. The rapid development of information technology has made TCM-related data on the Web grow explosively, but the quality of this information is uneven, and at present there is no relatively complete method for evaluating the quality of TCM information resources objectively and for guiding people to correct, useful information among the large volume of resources. A method is therefore needed to evaluate objectively the TCM information resources that currently exist on the Web. Starting from the characteristics of TCM Web information resources and using Hadoop distributed computing technology, the thesis proposes building an evaluation index system for TCM Web information resources based on a data-assisted Delphi method and AHP (Analytic Hierarchy Process), and carries out an empirical study on TCM health service websites. The main results are as follows.

(1) Design of a TCM topic crawler (Chapter 3). TCM information resources on the Web grow quickly, are widely distributed, and change easily. Analyzing and evaluating them presupposes a cheap, fast, high-quality way of obtaining the information, so an automated approach is used: a web crawler collects TCM Web information automatically. Unlike the crawler of a general-purpose search engine, this crawler visits only websites whose topic is TCM, which avoids wasted crawl time and improves the accuracy of the crawl targets. Based on these requirements, the thesis sets the goals of a distributed, scalable, high-performance, high-quality TCM topic crawler, formulates the corresponding crawling strategy, and develops the crawler.

(2) Building the Hadoop platform for TCM information resources (Chapters 3 and 6). The crawled TCM topic pages cover a wide range and must be refreshed regularly. Page analysis and data mining on a single machine place very high demands on that machine's performance, and storage in a single-machine relational database cannot meet the requirements of high-performance computing. After the crawler fetches the pages, they are therefore stored in Hadoop's HDFS, so that later text mining and statistical analysis of the page content can run with high performance and low system overhead.

(3) Construction of the evaluation index system for TCM Web information resources (Chapters 4 and 5). Starting from the characteristics of TCM Web information resources, the thesis discusses the principles for evaluating them and constructs the evaluation index system. The system has four major parts: information content evaluation, website design evaluation, usability evaluation, and other evaluation. Each part is subdivided into specific second-level indicators, 24 in total, and the meaning and role of each of the 24 indicators is explained in detail. The thesis then analyzes AHP-based evaluation of TCM information resources: it builds the judgment matrices, determines the weight of each indicator in the system, and performs a consistency check. Comparing the weights establishes the relative importance of each indicator in the evaluation of TCM Web information resources.

(4) Implementation of data-analysis-based evaluation of TCM Web information resources (Chapter 6). Taking the evaluation of actual TCM websites as an example, the thesis starts from setting up the analysis environment and describes in detail the software and hardware requirements, the system architecture, and the construction of the Hadoop cluster. It explains the design and implementation of the relevant MapReduce algorithms, describes the concrete process of classifying and scoring the websites, and points out the improvements the websites should make on the basis of the evaluation.
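The AHP step named in the abstract (build a judgment matrix, derive indicator weights, run a consistency check) can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The abstract gives no matrix values, so the pairwise judgments in `JUDGMENT` are hypothetical, and the row geometric-mean method is one common approximation of the priority vector that the author may or may not have used. The random index values are Saaty's standard table.

```python
import math

# Saaty's random consistency index (RI) for matrix orders 1..9.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix):
    """Approximate the priority vector with the row geometric-mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """Return (lambda_max, CI, CR) for a pairwise comparison matrix."""
    n = len(matrix)
    # Averaging (A w)_i / w_i over all rows estimates the principal eigenvalue.
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1) if n > 1 else 0.0
    ri = RANDOM_INDEX[n]
    cr = ci / ri if ri > 0 else 0.0
    return lambda_max, ci, cr


# The four first-level parts come from the abstract; the pairwise judgments
# below are HYPOTHETICAL values chosen only to make the sketch runnable.
CRITERIA = ["information content", "website design", "usability", "other"]
JUDGMENT = [
    [1,     3,     2,     5],
    [1 / 3, 1,     1 / 2, 2],
    [1 / 2, 2,     1,     3],
    [1 / 5, 1 / 2, 1 / 3, 1],
]

weights = ahp_weights(JUDGMENT)
lambda_max, ci, cr = consistency_ratio(JUDGMENT, weights)

for name, w in zip(CRITERIA, weights):
    print(f"{name}: {w:.3f}")
print(f"lambda_max={lambda_max:.3f}  CI={ci:.3f}  CR={cr:.3f}")
# A CR below 0.10 is the conventional threshold for acceptable consistency.
print("consistent" if cr < 0.10 else "revise the pairwise judgments")
```

The same two functions could be applied to each group of second-level indicators, and a second-level weight multiplied by its parent's weight would give that indicator's overall weight.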

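[Degree-granting institution]: Shandong University of Traditional Chinese Medicine
[Degree level]: Doctoral
[Year degree awarded]: 2016
[Classification number]: TP393.09; R2-03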

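[Similar literature]

Related doctoral dissertations (top 1)

1 李學博 (Li Xuebo); Research on a Hadoop-Based Evaluation System for Traditional Chinese Medicine Web Information Resources [D]; Shandong University of Traditional Chinese Medicine; 2016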


Article ID: 1574269


Article link: http://sikaile.net/zhongyixuelunwen/1574269.html

