Research on Quantitative Structure-Activity Relationship Modeling Methods for the Toxicity of Environmental Compounds
Topic: quantitative structure-activity relationship. Focus: carcinogenicity. Source: Harbin University of Science and Technology, doctoral dissertation, 2013
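【Abstract】: A great many compounds are present in air, soil, water and other environmental media, and characterizing their toxicity to humans, animals and plants, both qualitatively and quantitatively, is an urgent problem. The toxicity of these compounds is currently tested by animal experiments: cheap, fast in vitro assays are used for preliminary screening, and expensive, time-consuming in vivo assays are used for final confirmation. The foremost problem facing animal experiments is ethical, and it will grow as civilization advances and people better understand their relationship with the animals and plants that share the earth with them. In addition, the high cost in time and money of animal experiments, in vivo experiments especially, limits the number of compounds that can be tested. Quantitative structure-activity relationship (QSAR) techniques emerged and gradually developed to relieve this testing bottleneck. QSAR draws on mathematics and statistics, quantum mechanics, biology and computer science, and is a quantitative causal model relating a compound's molecular structure to its toxicity. It builds mathematical models on mathematical and statistical theory, uses computer science to implement that theory, uses quantum mechanics to obtain the molecular structure of compounds, and uses biology to obtain toxicity data and to understand mechanisms of toxicity. With the resulting model, a compound's toxicity value can be obtained directly from its molecular structure without animal experiments. The possibility that QSAR could replace animal experiments as a means of toxicity testing has already had a major influence on current toxicity testing, and it can be expected to have a far-reaching influence on the future direction of testing technology as well.
This dissertation takes the toxicity of environmental compounds as the testing target and QSAR as the testing method, and explores the possibility of replacing animal experiments with QSAR for toxicity testing. Three QSAR models were built: a carcinogenicity classification model, an estrogen receptor binding classification model, and a blood-brain barrier permeability classification model. The performance of all three was evaluated against values measured in animal experiments.
First, a carcinogenicity classification model for environmental compounds was built from the molecular structure data and long-term rodent carcinogenicity bioassay values of 1153 environmental compounds provided by the U.S. Environmental Protection Agency. Assuming that molecular structure descriptors are normally distributed and that toxicity class labels follow a binomial distribution, a logistic distribution function was obtained for the molecular structures and toxicity values of all 1153 compounds. A Laplace prior was used to modify the negative log-likelihood function so as to balance the competing demands of sparsity and goodness of fit in the logistic model. Cross-validation was used to select 9 molecular structure descriptors from the weight ranking of 729 descriptors as the structural data of the model. With maximization of the distance between carcinogenicity-negative and carcinogenicity-positive compounds as the optimization criterion, 797 compounds were selected as support vectors, a Gaussian kernel was chosen to measure the pairwise correlation between compounds, and a support vector machine constructed a hyperplane classifying the 1153 compounds as negative or positive. The model was evaluated against the long-term rodent carcinogenicity bioassay values of the 1153 compounds, and its classification accuracy on them was 66.86%.
Second, an estrogen receptor binding classification model for environmental compounds was built from the molecular structure data and rat uterine cytosol estrogen receptor competitive binding assay values of 278 environmental compounds provided by the U.S. Environmental Protection Agency. Symmetric uncertainty was constructed from entropy and used to measure both the pairwise redundancy among molecular structure descriptors and the causal relationship between each descriptor and estrogen receptor binding ability. An algorithm was designed to select, from the 729 descriptors of the 278 compounds, 8 descriptors with high causality and low redundancy as the structural data of the model. An 8-dimensional Cartesian feature space was constructed, Euclidean distance was used to measure the pairwise similarity among the 278 compounds, and a k-nearest-neighbor classifier let the 4 structurally most similar compounds vote on whether the estrogen receptor binding ability of a test compound is negative or positive. The model was evaluated against the rat uterine cytosol estrogen receptor competitive binding assay values of the 278 compounds, and its classification accuracy on them was 96.76%.
Finally, a blood-brain barrier permeability classification model for environmental compounds was built from the molecular structure data and in vivo blood-brain barrier permeability measurements of 80 environmental compounds provided by QSAR World. A complete graph over all 80 compounds was constructed, the dot product was used to compute the adjacency matrix, degree matrix and Laplacian matrix of the graph, singular value decomposition was used to obtain the eigenvalues and eigenvectors of the Laplacian matrix, and the spectrum of the complete graph was used to measure the goodness of each molecular structure descriptor. Cross-validation was used to select 9 descriptors from the goodness ranking of 729 descriptors as the structural data of the model. A Bayes classifier was constructed as the permeability classification model: the naive assumption converts the joint probability into independent probabilities, frequencies give the probabilities of the negative and positive classes, a normal distribution models the probability distribution of each descriptor, and maximum likelihood estimation gives the mean and variance of each normal distribution. The model was evaluated against the in vivo blood-brain barrier permeability measurements of 10 compounds, and its classification accuracy on these 10 compounds was 90.00%.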
[Abstract]: Large numbers of compounds are present in air, soil, water and other environmental media, and determining their toxicity to humans, animals and plants, qualitatively and quantitatively, is an urgent problem. These compounds are currently tested by animal experiments, with cheap and fast in vitro assays used for preliminary screening and expensive, time-consuming in vivo assays used for final testing. The biggest problem facing animal experiments is ethical, and it will become more pressing as civilization advances and as people gain a deeper understanding of their relationship with the animals and plants that share the earth with them. Beyond that, the high time cost and high monetary cost of animal experiments, especially in vivo experiments, limit the number of compounds that can be tested. The quantitative structure-activity relationship (QSAR) technique emerged and gradually developed to address this testing bottleneck. QSAR involves mathematics and statistics, quantum mechanics, biology and computer science, and is a quantitative causal model between the molecular structure of a compound and its toxicity. It establishes a mathematical model on the basis of mathematical and statistical theory, uses computer science as the tool to implement that theory, uses quantum mechanics as the tool to obtain the molecular structure of compounds, and uses biology as the tool to obtain toxicity data and to understand the mechanism of toxicity. With the model established, the toxicity value of a compound can be obtained directly from its molecular structure without animal experiments.
The possibility that QSAR could replace animal experiments as the means of testing compound toxicity has already had a significant impact on current toxicity testing, and it is foreseeable that QSAR will also have a far-reaching impact on the future direction of testing technology.
This dissertation takes the toxicity of environmental compounds as the testing target and QSAR as the testing method, and explores the possibility of using QSAR in place of animal experiments to test compound toxicity. Three QSAR models were established: a carcinogenicity classification model, an estrogen receptor binding ability classification model, and a blood-brain barrier permeability classification model. The performance of the three models was evaluated against values measured in animal experiments.
First, a carcinogenicity classification model for environmental compounds was established from the molecular structure data and long-term rodent carcinogenicity bioassay values of 1153 environmental compounds provided by the United States Environmental Protection Agency. Under the assumption that molecular structure descriptors follow a normal distribution and that toxicity class values follow a binomial distribution, the logistic distribution function of the molecular structures and toxicity values of all 1153 compounds was obtained. A Laplace prior was used to modify the negative log-likelihood function in order to balance the conflicting requirements of sparsity and goodness of fit in the logistic model. Cross-validation was used to select 9 molecular structure descriptors from the weight ranking of 729 descriptors as the structural data of the carcinogenicity classification model. Taking maximization of the distance between carcinogenicity-negative and carcinogenicity-positive compounds as the optimization condition, 797 compounds were selected as support vectors, a Gaussian kernel function was selected to measure the pairwise correlation between compounds, and a support vector machine was used to construct a hyperplane that classifies the 1153 compounds as negative or positive. The model was evaluated against the long-term rodent carcinogenicity bioassay values of the 1153 compounds, and its classification accuracy for the carcinogenicity of the 1153 compounds was 66.86%.
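The descriptor-weighting step above (a logistic model whose negative log-likelihood is modified by a Laplace prior) corresponds to an L1-penalized logistic regression, and the Gaussian kernel is the usual similarity measure for a kernel support vector machine. The following is a minimal sketch under stated assumptions, not the dissertation's implementation: the function names, the proximal-gradient solver, the toy data, and the values of `lam`, `step` and `gamma` are all hypothetical choices made for illustration, and the support vector machine training itself is not shown.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(v, t):
    # Proximal operator of t * |v|: shrinks v toward zero, clipping small values to exactly 0.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def fit_l1_logistic(X, y, lam=0.1, step=0.5, iters=500):
    """Minimise mean negative log-likelihood + lam * ||w||_1 by proximal gradient descent.

    The L1 term is what a Laplace prior on the weights contributes to the
    negative log-likelihood; lam trades sparsity against goodness of fit.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(iters):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b  # the intercept is not penalised
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad_w)]
    return w, b

def rank_descriptors(w):
    # Indices of descriptors with non-zero weight, largest absolute weight first.
    return sorted((j for j in range(len(w)) if w[j] != 0.0), key=lambda j: -abs(w[j]))

def gaussian_kernel(x, z, gamma=0.5):
    # k(x, z) = exp(-gamma * ||x - z||^2), a similarity between two compounds.
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

# Toy data (hypothetical): descriptor 0 separates the classes, descriptor 1 is noise.
X = [[-2, 1], [-1, -1], [-1, 1], [-2, -1], [1, 1], [2, -1], [2, 1], [1, -1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
weights, bias = fit_l1_logistic(X, y)
print(rank_descriptors(weights))
```

In this sketch the penalty drives the weight of the uninformative descriptor to exactly zero, which is the sparsity effect that makes a weight ranking usable for descriptor selection.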
Second, an estrogen receptor binding ability classification model for environmental compounds was established from the molecular structure data and rat uterine cytosol estrogen receptor competitive binding assay values of 278 environmental compounds provided by the United States Environmental Protection Agency. Entropy was used to construct the symmetric uncertainty, which was used to measure both the pairwise redundancy among molecular structure descriptors and the causal relationship between descriptors and estrogen receptor binding ability. An algorithm was designed to select 8 descriptors with high causality and low redundancy from the 729 molecular structure descriptors of the 278 compounds as the structural data of the classification model. An 8-dimensional Cartesian feature space was constructed, the Euclidean distance was used to measure the pairwise similarity among the 278 compounds, and k-nearest neighbors was applied so that the 4 structurally most similar compounds vote on whether the estrogen receptor binding ability of a test compound is negative or positive. The model was evaluated against the rat uterine cytosol estrogen receptor competitive binding assay values of the 278 compounds, and its classification accuracy for the estrogen receptor binding ability of the 278 compounds was 96.76%.
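The two building blocks named above, symmetric uncertainty and a 4-neighbor vote under Euclidean distance, can be sketched as follows. This is an illustrative sketch only: it assumes the standard definition SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)) on discretized descriptor values (how the dissertation discretizes descriptors is not stated), the function names and toy data are hypothetical, and the dissertation's own selection algorithm is not reproduced.

```python
import math
from collections import Counter

def entropy(values):
    # Shannon entropy (in bits) of a sequence of discrete values.
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def symmetric_uncertainty(x, y):
    # SU = 2 * I(X; Y) / (H(X) + H(Y)), ranging from 0 (independent) to 1 (fully dependent).
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0.0:
        return 0.0  # both variables are constant
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)

def knn_vote(train_X, train_y, query, k=4):
    # Majority vote among the k training compounds closest to the query in Euclidean distance.
    # On a tied vote, Counter.most_common keeps the label of the nearer neighbour first.
    nearest = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): a descriptor identical to the label versus one independent of it.
label = [0, 0, 1, 1]
print(symmetric_uncertainty([0, 0, 1, 1], label))
print(symmetric_uncertainty([0, 1, 0, 1], label))
```

The same measure serves both purposes described in the abstract: computed between a descriptor and the class label it scores relevance, and computed between two descriptors it scores redundancy.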
Finally, a blood-brain barrier permeability classification model for environmental compounds was established from the molecular structure data and in vivo blood-brain barrier permeability measurements of 80 environmental compounds provided by QSAR World. A complete graph of all 80 compounds was constructed, the dot product was used to compute the adjacency matrix, degree matrix and Laplacian matrix of the complete graph, singular value decomposition was used to obtain the eigenvalues and eigenvectors of the Laplacian matrix, and the spectrum of the complete graph was used to measure the goodness of the molecular structure descriptors. Cross-validation was used to select 9 molecular structure descriptors from the goodness ranking of 729 descriptors as the structural data of the classification model. A Bayes classifier was constructed as the blood-brain barrier permeability classification model: the naive assumption was used to convert the joint probability into independent probabilities, frequencies were used to compute the probabilities of negative and positive permeability, the normal distribution was used to construct the probability distribution of each molecular structure descriptor, and maximum likelihood estimation was used to obtain the mean and variance of each normal distribution. The model was evaluated against the in vivo blood-brain barrier permeability measurements of 10 compounds, and its classification accuracy for the permeability of these 10 compounds was 90.00%.
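The classifier described above has the shape of a Gaussian naive Bayes model: class priors from frequencies, one normal distribution per descriptor and class, and parameters from maximum likelihood. A minimal sketch follows, with hypothetical function names and toy data. The small constant added to each variance is an assumption introduced here to avoid division by zero, and the spectral descriptor-selection step is not shown.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Return {label: (prior, means, variances)} estimated by maximum likelihood."""
    by_class = defaultdict(list)
    for xi, yi in zip(X, y):
        by_class[yi].append(xi)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # The maximum-likelihood variance divides by n, not n - 1.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[label] = (n / len(X), means, variances)  # prior = class frequency
    return model

def predict(model, x):
    # Naive assumption: the joint log-likelihood is the sum of per-descriptor log-densities.
    def log_posterior(prior, means, variances):
        total = math.log(prior)
        for v, m, s2 in zip(x, means, variances):
            total += -0.5 * math.log(2.0 * math.pi * s2) - (v - m) ** 2 / (2.0 * s2)
        return total
    return max(model, key=lambda label: log_posterior(*model[label]))

# Toy data (hypothetical): two descriptors, two well-separated classes.
X = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [4.0, 5.0], [4.2, 4.8], [3.8, 5.2]]
y = [0, 0, 0, 1, 1, 1]
model = fit_gaussian_nb(X, y)
print(predict(model, [1.1, 2.1]), predict(model, [3.9, 4.9]))
```

Working in log space keeps the product of many small densities from underflowing, which matters once the number of descriptors grows beyond a handful.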
【Degree-granting institution】: Harbin University of Science and Technology
【Degree level】: Doctoral
【Year degree conferred】: 2013
【Classification number】: R114
Article ID: 1693797
Article link: http://sikaile.net/yixuelunwen/yufangyixuelunwen/1693797.html