Research on Optimization Methods for Retrieving Specific Semantic Data in Massive Data
Published: 2018-08-21 10:25
[Abstract]: Retrieving specific semantic data from massive data is an important direction in data mining. A certain error exists between the ontology information of the data and the semantic information it is converted into, so the correlation between a specific data semantic ontology and the information itself degrades further under the double interference of this error and the massive data. Traditional retrieval methods generally use a mapping approach and retrieve information according to the correlation of word-frequency information; as that correlation decreases, retrieval performance drops. A retrieval algorithm for specific semantic data is proposed, based on ontology-model word segmentation and Gaussian marginalization fusion. Based on the classification relationships among the elements within an ontology in a search engine, the term data are treated as a time slice and divided into blocks, and the retrieved word frequencies are partitioned by upper and lower envelopes and fused by Gaussian marginalization. This overcomes the influence of center-frequency variation on semantic mapping and linguistic-knowledge pairing between ontologies, and thereby improves the semantic retrieval algorithm for specific data under massive-data interference. Simulation results show that the improved algorithm overcomes cross-term interference in the semantic word frequencies and improves the precision of semantic retrieval of database data.
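[Affiliations]: College of Humanities and Communication, Shanghai Normal University; School of Computer, Electronics and Information, Guangxi University
[Funding]: Guangxi Scientific Research and Technology Development Program (桂科能1140008-3B); Guangxi Natural Science Foundation (2014GXNSFBA118274)
[CLC Number]: TP391.3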
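The abstract names three processing steps (blocking the term data as time slices, partitioning word frequencies by upper and lower envelopes, and Gaussian fusion) but gives no formulas. The following is only a minimal sketch of one possible reading, not the authors' algorithm. It assumes that the envelope of a block is its minimum and maximum and that fusion is a Gaussian-weighted mean centered on the block. The names `split_into_blocks`, `gaussian_weights`, `summarize_blocks`, and the parameters `block_size` and `sigma` are hypothetical.

```python
import math
from typing import List, NamedTuple


class BlockSummary(NamedTuple):
    lower: float  # lower envelope (assumption: block minimum)
    upper: float  # upper envelope (assumption: block maximum)
    fused: float  # Gaussian-weighted mean of the block (assumption)


def split_into_blocks(freqs: List[float], block_size: int) -> List[List[float]]:
    """Cut a term-frequency sequence into consecutive fixed-size blocks."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [freqs[i:i + block_size] for i in range(0, len(freqs), block_size)]


def gaussian_weights(n: int, sigma: float) -> List[float]:
    """Normalized Gaussian weights centered on the middle of n samples."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    center = (n - 1) / 2.0
    raw = [math.exp(-((i - center) ** 2) / (2.0 * sigma ** 2)) for i in range(n)]
    total = sum(raw)
    return [w / total for w in raw]


def summarize_blocks(
    freqs: List[float], block_size: int, sigma: float = 1.0
) -> List[BlockSummary]:
    """Return the envelope and the Gaussian-fused value of every block."""
    summaries = []
    for block in split_into_blocks(freqs, block_size):
        weights = gaussian_weights(len(block), sigma)
        fused = sum(w * x for w, x in zip(weights, block))
        summaries.append(BlockSummary(min(block), max(block), fused))
    return summaries


if __name__ == "__main__":
    # Hypothetical word-frequency sequence, two blocks of three samples each.
    for summary in summarize_blocks([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], block_size=3):
        print(summary)
```

In this reading, the Gaussian weights give less influence to samples near the block boundaries, so a shift in where the frequency peak falls within a block changes the fused value less than it would change a single-sample measurement. Whether this matches the paper's "Gaussian marginalization fusion" cannot be confirmed from the abstract alone.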
Article ID: 2195408
Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/2195408.html