Statistical Learning Models for Analyzing the Effect of Protein Expression on Breast Cancer Cell Proliferation

Published: 2018-09-04 09:40
[Abstract]: As people come into contact with harmful substances more and more often in daily life, the incidence of cancer is gradually rising. In the era of big data, selecting the informative part of complex data has become very important, and statistical learning methods, which are well suited to extracting useful information, have become an important research topic. This thesis studies reverse phase protein array (RPPA) data and cell proliferation data measured on a set of MDA-MB-231 breast cancer cells from MD Anderson. These data are used to train linear regression, support vector machine (SVM), and random forest (RF) models in order to identify the key proteins that control breast cancer cell proliferation; these key proteins are then treated as potential targets for cancer drugs. Because the data are highly variable, the RPPA data are first preprocessed to reduce the effect of this variability on statistical power. The preprocessed RPPA data then serve as input and cell proliferation as output for training the linear regression, SVM, and RF models; for the linear regression model, a method combining principal component analysis (PCA) with linear regression is proposed and used. Finally, the results of the three models are compared to obtain a model that is both accurate and able to screen out the protein combinations with key influence. The results show that the linear regression model is highly accurate, that the SVM model can screen out the protein combinations that play a key role in breast cancer cell proliferation, and that RF performs very well on both counts. RF is then used to analyze the RPPA data, yielding 28 proteins with a relatively large effect on breast cancer cells; a literature search confirms that 21 of them strongly influence breast cancer cell proliferation.
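The combination of PCA with linear regression mentioned in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the thesis's actual pipeline: the data are synthetic, only the first principal component is kept, and every name (`standardize`, `first_principal_component`, `run_demo`, the four simulated "proteins") is a hypothetical chosen for illustration. The idea is to standardize the protein measurements, project them onto the leading principal component, and regress the proliferation readout on that score.

```python
import math
import random


def standardize(rows):
    """Center each column to mean 0 and scale it to unit sample variance."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) or 1.0
        for j in range(p)
    ]
    return [[(r[j] - means[j]) / stds[j] for j in range(p)] for r in rows]


def first_principal_component(z, iters=200):
    """Power iteration on Z^T Z; returns a unit-length loading vector.

    Assumes the starting vector is not orthogonal to the leading component.
    """
    n, p = len(z), len(z[0])
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        scores = [sum(row[j] * v[j] for j in range(p)) for row in z]      # Z v
        w = [sum(z[i][j] * scores[i] for i in range(n)) for j in range(p)]  # Z^T Z v
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x with a single predictor."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    return my - b * mx, b


def r_squared(x, y, a, b):
    """Coefficient of determination of the fitted line on (x, y)."""
    my = sum(y) / len(y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def run_demo(seed=0, n=60):
    """Synthetic stand-in data: three proteins track a latent factor, one is noise."""
    rng = random.Random(seed)
    weights = [1.0, 0.8, -0.6, 0.0]  # hypothetical effect of the latent factor
    rows, y = [], []
    for _ in range(n):
        t = rng.gauss(0.0, 1.0)
        rows.append([w * t + rng.gauss(0.0, 0.3) for w in weights])
        y.append(2.0 * t + rng.gauss(0.0, 0.5))  # simulated proliferation readout
    z = standardize(rows)
    loadings = first_principal_component(z)
    scores = [sum(row[j] * loadings[j] for j in range(len(loadings))) for row in z]
    a, b = fit_line(scores, y)
    return loadings, r_squared(scores, y, a, b)


loadings, r2 = run_demo()
print("PC1 loadings:", [round(v, 3) for v in loadings])
print("R^2 of regression on PC1:", round(r2, 3))
```

In this toy setting the noise-only column receives a small loading, which hints at how a principal-component regression can de-emphasize uninformative measurements. A real analysis would keep several components and validate on held-out data.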
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year degree conferred]: 2014
[Classification number]: R737.9; Q811.4


Article ID: 2221705


Article link: http://sikaile.net/yixuelunwen/swyx/2221705.html

