Research on Ranking Algorithms for Mathematical Search
Published: 2018-08-02 14:48
[Abstract]: The volume of mathematical information on the Web is growing steadily, and mathematical search has become a focus of attention. In recent years the problems of displaying and storing mathematical formulas in browsers have gradually been solved, which provides a good foundation for the research and development of search engines for mathematical formulas. Although mathematical formulas can be stored in web documents, searching for them on the web remains limited. Mathematical formulas have a complex two-dimensional structure and carry complex mathematical meaning: differently written formulas may have the same meaning, the same formula may have several representations, and a user's query formula may be a subformula of another formula. Traditional text retrieval systems are therefore inadequate for searching mathematical formulas. Mathematical formula retrieval systems that exist or are under development internationally have made gradual progress in index construction, but for ranking the returned result set most of them still apply the ranking algorithms of text search engines; ranking algorithms designed for mathematical formula search results have not been studied in depth. This thesis therefore studies the principles of existing text search engine ranking algorithms and, on that basis, combines the characteristics of mathematical formulas with the relationships between formulas (equivalence, algebraic relatedness, subformula, and so on) in an attempt to propose a ranking algorithm for mathematical formula search. The thesis combines a computer algebra system (CAS) with a mathematical formula search engine to mine the relationships between formulas. This not only provides a more reasonable and reliable relevance measure for computing the relevance between a query formula and a web page, but also improves the system's capacity for semantic retrieval of mathematical formulas.
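The idea of ranking results by the relationship between the query formula and a candidate formula can be illustrated with a minimal sketch. It is not the thesis's algorithm. The tuple-based formula representation, the `canonical`, `subformulas`, `relevance` and `rank` helpers, and the score values 1.0 / 0.5 / 0.0 are all assumptions made for illustration. A real system would call a computer algebra system for the equivalence test, whereas this sketch only reorders the arguments of commutative operators.

```python
from typing import Iterator, List, Tuple, Union

# Assumed representation: a formula is a symbol or number written as a string,
# or a tuple of the form (operator, arg1, arg2, ...).
Formula = Union[str, Tuple]

# Assumption: only these operators may have their arguments reordered.
COMMUTATIVE = {"+", "*"}


def canonical(f: Formula) -> Formula:
    """Return a normal form with the arguments of commutative operators sorted."""
    if isinstance(f, str):
        return f
    op, *args = f
    args = [canonical(a) for a in args]
    if op in COMMUTATIVE:
        args.sort(key=repr)
    return (op, *args)


def subformulas(f: Formula) -> Iterator[Formula]:
    """Yield f itself and every expression nested inside it."""
    yield f
    if not isinstance(f, str):
        for arg in f[1:]:
            yield from subformulas(arg)


def relevance(query: Formula, candidate: Formula) -> float:
    """Hypothetical score: 1.0 if equivalent, 0.5 if the query is a subformula, else 0.0."""
    q, c = canonical(query), canonical(candidate)
    if q == c:
        return 1.0
    if any(q == s for s in subformulas(c)):
        return 0.5
    return 0.0


def rank(query: Formula, candidates: List[Formula]) -> List[Formula]:
    """Order candidates by descending relevance to the query."""
    return sorted(candidates, key=lambda c: relevance(query, c), reverse=True)


query = ("+", "x", "y")
equivalent = ("+", "y", "x")               # same formula, different argument order
contains = ("*", "2", ("+", "y", "x"))     # query appears as a subformula
unrelated = ("-", "x", "y")
ranked = rank(query, [unrelated, contains, equivalent])
```

Here `ranked` places the equivalent formula first, the formula containing the query second, and the unrelated formula last. A text-based ranker would have no principled way to order these three, because `x + y` and `y + x` differ as strings.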
[Degree-granting institution]: Lanzhou University
[Degree level]: Master's
[Year degree conferred]: 2012
[Classification number]: TP391.3; O223
Article ID: 2159783
Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/2159783.html