Research and Implementation of an SVM-Based Classification and Prediction System for Compound Mutagenicity
[Abstract]: As science and technology advance, more and more drugs are developed to fight disease, yet drug development consumes substantial material resources and manpower and has a long research and development cycle. Five ADMET properties (absorption, distribution, metabolism, excretion, and toxicity) must be considered throughout the process. Among the aspects of drug toxicity, mutagenicity is closely related to cancer. Mutagenicity is tested in the final phase of drug development, during animal and human trials. Many candidate drugs are abandoned at this stage because they cause too much harm to animals or humans, which wastes the investment made in the earlier stages. In recent years, pattern recognition technology has developed rapidly and has been applied in many fields, and bioinformatics and drug development are important research directions for it. The main function of this system is to predict and classify the mutagenicity of compounds with machine learning algorithms, and to use the classification model to analyze which compound features are related to mutagenicity. The system provides a large number of compounds and their feature attributes as training sets for the classification models, including the results of mutagenicity studies of these compounds by various research institutions. It offers users compound feature calculation, feature selection, data cleaning, classification model building, compound mutagenicity prediction, result analysis, and saving of result files. Researchers can use the predicted results to analyze the key features that affect the mutagenicity of compounds.
The system is developed in Java, uses the Spring MVC framework for its architecture, and stores compound features, personal information, and other data in a MySQL database. It implements a data processing module, a prediction and classification module, a result analysis module, a system management module, and a personal information module. In the data processing module, the system computes a 1446-dimensional feature descriptor for each compound from its SMILES string, handles missing values, normalizes the feature data, and then reduces the feature dimensionality with feature selection algorithms such as information gain, CFS, and Relief. In the prediction and classification module, a support vector machine (SVM) model is adopted, and the AdaBoost algorithm is used to iterate the SVM model and improve the prediction accuracy of the system. Verified with several forms of cross-validation and an independent test set, the system predicts the mutagenicity of compounds with an accuracy of 83.555%. Its functionality and performance meet user needs and achieve the expected results.
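The data processing steps described above (missing-value handling, normalization, and information-gain ranking) can be sketched in plain Java. The abstract does not say which imputation or scaling method the system uses, so this sketch assumes column-mean imputation and min-max scaling, and it computes information gain for a single binary split at a caller-supplied threshold. The class name `DescriptorPreprocessing`, all method names, and the toy data are hypothetical and are not taken from the thesis.

```java
/** Illustrative preprocessing sketch; names and strategies are assumptions, not the thesis code. */
public class DescriptorPreprocessing {

    /** Replaces NaN entries in each column with the mean of that column's observed values. */
    public static void imputeColumnMeans(double[][] x) {
        int cols = x[0].length;
        for (int j = 0; j < cols; j++) {
            double sum = 0.0;
            int count = 0;
            for (double[] row : x) {
                if (!Double.isNaN(row[j])) { sum += row[j]; count++; }
            }
            double mean = count > 0 ? sum / count : 0.0;
            for (double[] row : x) {
                if (Double.isNaN(row[j])) row[j] = mean;
            }
        }
    }

    /** Min-max scales each column to [0, 1]; constant columns are set to 0. */
    public static void minMaxNormalize(double[][] x) {
        int cols = x[0].length;
        for (int j = 0; j < cols; j++) {
            double min = Double.POSITIVE_INFINITY;
            double max = Double.NEGATIVE_INFINITY;
            for (double[] row : x) {
                min = Math.min(min, row[j]);
                max = Math.max(max, row[j]);
            }
            double range = max - min;
            for (double[] row : x) {
                row[j] = range > 0 ? (row[j] - min) / range : 0.0;
            }
        }
    }

    private static double log2(double v) {
        return Math.log(v) / Math.log(2);
    }

    /** Entropy in bits of a binary label set with pos positives among n samples. */
    public static double entropy(int pos, int n) {
        if (n == 0 || pos == 0 || pos == n) return 0.0;
        double p = (double) pos / n;
        return -(p * log2(p) + (1 - p) * log2(1 - p));
    }

    /** Information gain of column j for binary labels y (0 or 1), splitting at the given threshold. */
    public static double informationGain(double[][] x, int[] y, int j, double threshold) {
        int n = y.length, nLow = 0, posLow = 0, posAll = 0;
        for (int i = 0; i < n; i++) {
            posAll += y[i];
            if (x[i][j] <= threshold) { nLow++; posLow += y[i]; }
        }
        int nHigh = n - nLow;
        int posHigh = posAll - posLow;
        double conditional = ((double) nLow / n) * entropy(posLow, nLow)
                           + ((double) nHigh / n) * entropy(posHigh, nHigh);
        return entropy(posAll, n) - conditional;
    }

    public static void main(String[] args) {
        // Toy data: 4 compounds, 2 descriptors, one missing value; 1 = mutagenic (made-up labels).
        double[][] x = {{1.0, 5.0}, {2.0, 7.0}, {Double.NaN, 5.0}, {6.0, 7.0}};
        int[] y = {0, 0, 1, 1};
        imputeColumnMeans(x);
        minMaxNormalize(x);
        System.out.println("gain(descriptor 0) = " + informationGain(x, y, 0, 0.3));
        System.out.println("gain(descriptor 1) = " + informationGain(x, y, 1, 0.5));
    }
}
```

On this toy data the first descriptor separates the two classes perfectly at the chosen threshold (a gain of 1 bit), while the second carries no class information (a gain of 0), so a ranking by gain would keep the first and drop the second. A real pipeline would discretize each of the 1446 descriptors more carefully than a single fixed threshold allows.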
[Degree-granting institution]: Liaoning University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TQ460; TP181