
Drug Side Effect Discovery Based on Review Mining

Posted: 2018-03-21 09:07

  Topic: drug side effect discovery. Approach: text mining. Source: Dalian University of Technology, 2014 master's thesis. Document type: degree thesis.


[Abstract]: As the harm caused by drug side effects grows, drug safety has drawn increasing attention and has become a focus of concern for both the medical community and the public, so discovering drug side effects has significant theoretical and practical value. The development of Web 2.0 technology has given rise to many health-related social networking sites where people share their medication experiences and review drugs. The user reviews on these sites are increasingly abundant, the side effect information they contain has begun to attract researchers' attention, and mining side effect information from user reviews is gradually emerging as a fast and effective mechanism for drug side effect discovery.
When mining drug side effects from user reviews, two difficulties arise: people may describe the same side effect in different ways, and the launch of new drugs and the differences among patients can give rise to new side effects. It is therefore important to identify new side effect names in reviews and to normalize them. To address this, Chapter 3 uses a conditional random field (CRF) model to recognize side effects in reviews, normalizes the recognized side effect names, and finally obtains the side effects of each drug. The experimental results show that the CRF model can recognize both known and new side effect names, and that normalization aggregates and merges side effect names, which aids side effect discovery. The effectiveness of the method is verified by comparing the mined known side effects with database records, and the work also yields a list of potential side effects for each drug, ranked by frequency of occurrence in the reviews.
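As a rough illustration of the normalize-then-rank step described for Chapter 3, the following Python sketch maps recognized mentions to a standard name through a lookup table and counts, per drug, how often each normalized side effect occurs. The synonym table, the sample mentions, and the names `SYNONYMS`, `normalize`, and `rank_side_effects` are hypothetical. The abstract does not say which normalization technique the thesis uses, so this is only one plausible reading.

```python
from collections import Counter, defaultdict

# Hypothetical synonym table: surface form -> standard side effect name.
SYNONYMS = {
    "headache": "headache",
    "head ache": "headache",
    "head pain": "headache",
    "nausea": "nausea",
    "nauseous": "nausea",
    "queasy": "nausea",
}


def normalize(mention):
    """Lowercase, collapse whitespace, then look up the standard name.

    Mentions missing from the table keep their cleaned form, so new
    side effect names are preserved rather than dropped.
    """
    key = " ".join(mention.lower().split())
    return SYNONYMS.get(key, key)


def rank_side_effects(mentions):
    """Count normalized side effects per drug.

    mentions: iterable of (drug, mention) pairs, e.g. the output of a recognizer.
    Returns {drug: [(name, count), ...]} sorted by descending count, then name.
    """
    counts = defaultdict(Counter)
    for drug, mention in mentions:
        counts[drug][normalize(mention)] += 1
    return {
        drug: sorted(c.items(), key=lambda kv: (-kv[1], kv[0]))
        for drug, c in counts.items()
    }


# Made-up mentions for a placeholder drug name.
sample = [
    ("drugA", "Head ache"),
    ("drugA", "headache"),
    ("drugA", "queasy"),
    ("drugA", "head  pain"),
    ("drugA", "dry mouth"),
]
ranking = rank_side_effects(sample)
```

Merging variants before counting is what lets a frequently reported side effect rise to the top of the list even when reviewers phrase it in several ways.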
Recognizing side effect names in user reviews is a basic but critical step in drug side effect discovery, but it is challenging because review text is grammatically irregular and side effect names are diverse. To address this, Chapter 4 implements a side effect entity recognition system that fuses different methods. The first method identifies side effect entities by bag-of-words matching between phrases in a sliding window and names in a dictionary, taking edit distance into account during matching. The second method uses a CRF model, applying forward selection to find the best feature set and determining experimentally the best-performing combination of word context features. Fusing the recognition results of the two methods yields a considerable improvement over either method alone, indicating that fusion can compensate for the weaknesses of a single method. Compared with side effect entity recognition methods reported in other literature, the method's recognition performance is comparable and possibly better, which supports the effectiveness of the proposed fusion approach.
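The dictionary-based method and the fusion step from Chapter 4 can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions: the tiny dictionary, the distance threshold `max_dist`, the greedy token pairing, and fusion by simple set union are all illustrative choices, since the abstract does not give the thesis's actual settings or fusion rule. The CRF output is a hard-coded stand-in, not the result of a trained model.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def tokens_match(window, entry, max_dist):
    """Bag-of-words match: word order is ignored, and each dictionary token
    must pair with a distinct window token within max_dist edits.
    The pairing is greedy, which is enough for a sketch."""
    if len(window) != len(entry):
        return False
    remaining = list(window)
    for tok in entry:
        hit = next((w for w in remaining if edit_distance(w, tok) <= max_dist), None)
        if hit is None:
            return False
        remaining.remove(hit)
    return True


def dictionary_spans(tokens, dictionary, max_dist=1):
    """Slide a window of each entry's length over the tokens and return the
    set of (start, end) token spans that match some dictionary entry."""
    spans = set()
    for entry in dictionary:
        n = len(entry)
        for start in range(len(tokens) - n + 1):
            if tokens_match(tokens[start:start + n], entry, max_dist):
                spans.add((start, start + n))
    return spans


def fuse(spans_a, spans_b):
    """Assumed fusion rule: keep every span found by either recognizer."""
    return spans_a | spans_b


# Made-up review sentence with a typo ("headach") and a reordered phrase ("mouth dry").
tokens = "i got a bad headach and mouth dry then felt woozy".split()
dictionary = [("headache",), ("dry", "mouth")]  # hypothetical dictionary entries
dict_spans = dictionary_spans(tokens, dictionary)
crf_spans = {(4, 5), (10, 11)}  # stand-in for CRF output, including "woozy"
fused = fuse(dict_spans, crf_spans)
```

In this toy case the dictionary matcher recovers the misspelled and reordered mentions, while the stand-in CRF span covers a term absent from the dictionary. That is the kind of complementarity the abstract credits for the improvement after fusion.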
[Degree-granting institution]: Dalian University of Technology
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TP311.13; R96


Article ID: 1643177



Article link: http://sikaile.net/yixuelunwen/yiyaoxuelunwen/1643177.html

