Research on the Application of the Probabilistic Latent Semantic Analysis Model in a Search Engine Commercial Text Classification System
Published: 2018-07-27 20:21
[Abstract]: For a search engine's profitability, it matters a great deal whether the commercial advertisements it serves are relevant to the user's search intent. Traditional text classification methods solve part of the problem in a search engine's commercial text classification system. However, the abstractness, polysemy, and synonymy of semantics are pervasive, and how to define and compute semantics, and how to analyze them in context, remain the main problems search engines face today. Addressing the commercial needs of search engines, this thesis applies Probabilistic Latent Semantic Analysis (PLSA), a technique proposed in the academic community in recent years, and, guided by software engineering principles, designs and implements the probabilistic latent semantic computation module of a search engine commercial text classification system. Finally, in line with the business requirements of a commercial search engine, the author tested the module against relevant standards, demonstrating its effectiveness and practicality.
[Degree-granting institution]: Beijing Jiaotong University
[Degree level]: Master's
[Year degree conferred]: 2011
[Classification number]: TP391.1
Article No.: 2149029
Article link: http://sikaile.net/wenyilunwen/guanggaoshejilunwen/2149029.html