A Study of the Distribution and Statistics of Tibetan Noun Phrase Structure Types

Published: 2018-03-05 17:10

Topic: noun phrases; Focus: phrase structure; Source: Northwest Minzu University, 2017 master's thesis; Type: degree thesis

[Abstract]: Big-data strategies and deep learning methods have become the mainstream techniques in Tibetan natural language processing. At present, the scarcity of knowledge resources and annotated corpora is holding back research on intelligent Tibetan language processing. In particular, lexical-semantic resources comparable to WordNet, HowNet and frame semantics, along with resources for syntactic structure annotation, semantic role labeling and discourse annotation, have not yet settled on a unified standard, so mainstream learning methods such as deep learning cannot be used for practical training. Resource construction has therefore become a basic and arduous task in Tibetan information processing. The study of noun phrases, verb phrases and adjective phrases is a core problem in building a syntactic treebank. Working within the framework of a Tibetan syntactic treebank, this thesis classifies and counts Tibetan noun phrases and their structures. Its aims are to test the accuracy of the classification of Tibetan phrase structures, to improve the efficiency of Tibetan phrase analysis, and to speed up construction of the Tibetan syntactic treebank. The thesis is organized into eight chapters. First, it discusses the background and current state of phrase research, and reviews the syntactic analysis theories for noun phrases in English and Chinese as well as the corpus data needed to build a noun phrase structure database. Second, it describes the concept of the noun phrase in English, Chinese and Tibetan, analyzes the structures that form noun phrases in Tibetan using real Tibetan text corpora, and classifies the noun phrases formed through modification by different parts of speech. From this classification it establishes a tag set for Tibetan noun phrases. Finally, using statistics on noun phrase structures in real Tibetan corpora, it builds a noun phrase structure database, a noun phrase part-of-speech tagging database, and noun phrase structure annotation software. The thesis as a whole relies on corpus-based evidence, comparative analysis, statistical analysis, manual annotation and manual proofreading, and it establishes a database of basic Tibetan noun phrase structures and a part-of-speech tagged corpus. In sum, this study of the distribution and statistics of Tibetan noun phrase structure types provides basic resources for Tibetan syntactic and semantic analysis and for treebank construction, and offers some theoretical and technical support for applied fields such as information retrieval, search engines, machine translation, text classification, pattern recognition, multimedia teaching and networking.
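The abstract does not give the counting procedure itself, so the following is only a minimal sketch of how a distribution of noun phrase structure types might be tallied from part-of-speech tagged phrases. Everything in it is an assumption for illustration: the function name `structure_distribution`, the placeholder tokens, and the tag names (`n`, `a`, `m`, `d`) are hypothetical and are not the tag set or the software described in the thesis.

```python
from collections import Counter

# Hypothetical toy data: each noun phrase is a list of (token, POS tag) pairs.
# Tokens are placeholders and the tag names are illustrative only; they are
# not the tag set defined in the thesis.
tagged_noun_phrases = [
    [("w1", "n"), ("w2", "a")],
    [("w3", "n"), ("w4", "m")],
    [("w5", "n"), ("w6", "a")],
    [("w7", "n"), ("w8", "a"), ("w9", "d")],
]


def structure_distribution(phrases):
    """Return (pattern, count, relative frequency) tuples, most frequent first.

    A phrase's structure pattern is its POS tag sequence joined with '+'.
    """
    patterns = Counter("+".join(tag for _, tag in phrase) for phrase in phrases)
    total = sum(patterns.values())
    return [(pattern, count, count / total)
            for pattern, count in patterns.most_common()]


if __name__ == "__main__":
    for pattern, count, share in structure_distribution(tagged_noun_phrases):
        print(f"{pattern}\t{count}\t{share:.2%}")
```

Treating the tag sequence as the structure type keeps the count independent of the individual words, which is one plausible way to compare how often each modification pattern occurs in a corpus.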
[Degree-granting institution]: Northwest Minzu University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: H214



Link to this article: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/1571152.html
