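Research on Intelligent Retrieval of Suspected Infringing Patents Based on Natural Language Processing
Source: Jiangsu University, master's thesis, 2017. Document type: degree thesis
Keywords: patent infringement; information extraction; word vectors; similarity calculation; natural language processing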
[Abstract]: Patent documents are the most effective carrier of technical information, covering more than 90% of the world's latest technological achievements, and they play a vital role in protecting intellectual property. As the number of patents keeps growing and patent infringement litigation becomes more frequent, patent infringement retrieval has become an active research topic in information science. Traditional patent infringement retrieval builds a search query, retrieves related patents from a patent retrieval system, and then relies on manual screening to pick out the patents that carry infringement risk. This is time-consuming and labor-intensive, and it is easily influenced by subjective factors. An intelligent retrieval algorithm that automatically retrieves suspected infringing patents therefore has practical significance. After introducing the foundations of patent infringement retrieval (infringement determination, text preprocessing, and similarity calculation), this thesis focuses on the core of a patent infringement retrieval system, namely the algorithm for detecting suspected infringing patents. It proposes solutions to two problems in current research: unreasonable feature selection and insufficient use of the information in the claims. The main work is as follows. (1) To address the weak expressive power of keyword features in Chinese patent infringement retrieval, a detection method based on computing the coverage of triple features is proposed. The method extracts patent claims into sets of triple features and combines word vectors with HowNet to compute the semantic similarity between triples. By improving the coverage algorithm for sets of patent technical features, it effectively improves the ability to identify suspected infringing patents. Experimental results show that the method achieves good retrieval performance and accuracy. (2) To address the instability of the dependency parser, which degrades triple feature extraction, and the low retrieval accuracy for method patents, a detection algorithm based on sentence similarity is proposed. The algorithm takes the sentence as the smallest unit of computation, structures the claims as a tree, and, following infringement determination rules, designs a tree matching algorithm that computes the degree of infringement between claim trees. Experimental comparison with existing infringement retrieval algorithms shows that the algorithm has certain advantages. (3) On the Java platform, using an object-oriented approach, an intelligent retrieval system for suspected infringing Chinese patents was designed and implemented, with database update, preprocessing, preliminary retrieval, and infringement detection functions. The infringement detection module implements the two detection methods proposed in this thesis, and the remaining modules also improve on traditional methods.
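The triple-feature coverage idea in item (1) can be illustrated with a minimal sketch. The abstract does not give the thesis's actual formulas, so everything below is an assumption made for illustration: the toy word vectors, the averaging of component similarities, the 0.8 threshold, and the function names are all hypothetical, and the HowNet component is omitted entirely. The sketch scores how many triples of a protected claim have a sufficiently similar triple in a candidate claim.

```python
import math

# Toy word vectors invented for this example; a real system would load
# trained embeddings (and, per the abstract, also consult HowNet).
TOY_VECTORS = {
    "motor": (0.9, 0.1, 0.0),
    "engine": (0.8, 0.2, 0.1),
    "drive": (0.1, 0.9, 0.0),
    "rotate": (0.2, 0.8, 0.1),
    "wheel": (0.0, 0.1, 0.9),
    "gear": (0.1, 0.2, 0.8),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def word_similarity(w1, w2, vectors):
    """1.0 for identical words, cosine for known pairs, 0.0 otherwise."""
    if w1 == w2:
        return 1.0
    if w1 not in vectors or w2 not in vectors:
        return 0.0
    return cosine(vectors[w1], vectors[w2])


def triple_similarity(t1, t2, vectors):
    """Average the similarities of the aligned triple components (assumed scheme)."""
    return sum(word_similarity(a, b, vectors) for a, b in zip(t1, t2)) / len(t1)


def coverage(protected, candidate, vectors, threshold=0.8):
    """Fraction of protected triples matched by some candidate triple."""
    if not protected:
        return 0.0
    covered = sum(
        1
        for p in protected
        if any(triple_similarity(p, c, vectors) >= threshold for c in candidate)
    )
    return covered / len(protected)


protected_claim = [("motor", "drive", "wheel"), ("sensor", "detect", "speed")]
candidate_claim = [("engine", "rotate", "gear")]
print(coverage(protected_claim, candidate_claim, TOY_VECTORS))
```

In this toy run, the first protected triple is matched by the candidate's only triple, while the second has no counterpart, so half of the protected features are covered. A higher coverage score would flag the candidate for closer manual review. The thesis's improved coverage algorithm may weight or combine features differently.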
[Degree-granting institution]: Jiangsu University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP391.1