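Research on an Optimization Method for Paper Plagiarism Checking Based on Writing Style Features
Topic: writing style features + plagiarism checking; Source: Fudan University, master's thesis, 2011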
[Abstract]: The rapid development of Internet technology and the growing wealth of online database resources have greatly helped scientific research. Reference materials needed for academic writing, such as academic papers, survey reports, and analytical data, can now be obtained conveniently; at the same time, plagiarism in papers has become correspondingly easier and more common. Finding and establishing effective means to prevent and curb plagiarism is urgent.
Since 2005, the author's research group has carried out extensive research and development on paper plagiarism checking through an industry-university-research collaboration model, completing the design and implementation of plagiarism checking based on word frequency and plagiarism checking based on relative unit density. The former discriminates well in cases of complete copying; the latter builds on it to also judge cases of partial copying, significantly improving the recall of the check results. However, both methods remain considerably limited in detecting plagiarism that alters the original text. We therefore introduce a comprehensive object of consideration, writing style features, on top of them to optimize the existing plagiarism checking methods.
The main work covers the following four aspects:
1. This thesis studies and compares mainstream domestic and international techniques related to writing style feature analysis, as well as semantic dictionary techniques, looking for technical approaches that suit a single paper and meet the needs of plagiarism checking.
2. It introduces the research group's earlier work: the design and implementation of a plagiarism checking algorithm based on word frequency statistics and a plagiarism checking application based on relative unit density. Along with the concrete progress achieved, it explains the current problems, limitations, and possible improvements of these two methods.
3. Building on the earlier work and drawing on related techniques at home and abroad, it proposes an optimization method for paper plagiarism checking based on writing style features, builds a preliminary semantic dictionary of writing style features, and describes the structure and overall workflow of the corresponding plagiarism checking system.
4. Through analysis of concrete application examples, it illustrates the application scenarios and effects of the optimization method and verifies the effectiveness of the new method.
The writing-style-based plagiarism checking method studied in this thesis supplements and optimizes the earlier work by analyzing and checking plagiarism that alters the original text. It introduces a new line of thinking to the plagiarism checking topic and lays a foundation for further in-depth research, so that more accurate and more complete plagiarism checking methods and systems can gradually be established, effectively combating and deterring the unhealthy trend of academic plagiarism.
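The abstract does not give the thesis's actual algorithms, so the following is only a minimal illustrative sketch of the two ideas it contrasts: a word-frequency comparison (which works when text is copied verbatim) and a small writing-style feature vector (which does not depend on the exact words used). Everything here is an assumption for illustration: the function names, the FUNCTION_WORDS list, the choice of cosine similarity, and the three style features are hypothetical and are not taken from the thesis, whose semantic dictionary and relative unit density measure are not described in the abstract.

```python
import math
import re
from collections import Counter

# Hypothetical function-word list, for illustration only; the thesis's
# writing-style semantic dictionary is not described in the abstract.
FUNCTION_WORDS = {"the", "of", "and", "to", "in", "that", "is", "however", "thus"}


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def word_frequency_similarity(a, b):
    """Cosine similarity between the word-count vectors of two texts."""
    counts_a, counts_b = Counter(tokenize(a)), Counter(tokenize(b))
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def style_features(text):
    """Return (avg sentence length, avg word length, function-word ratio)."""
    words = tokenize(text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return (0.0, 0.0, 0.0)
    avg_sentence_len = len(words) / len(sentences)
    avg_word_len = sum(len(w) for w in words) / len(words)
    function_ratio = sum(w in FUNCTION_WORDS for w in words) / len(words)
    return (avg_sentence_len, avg_word_len, function_ratio)


if __name__ == "__main__":
    source = "The method counts words. It works well on copied text."
    reworded = "This approach tallies terms. It performs nicely on duplicated passages."
    # Word overlap drops once the wording is changed ...
    print(word_frequency_similarity(source, reworded))
    # ... while coarse style features can still be compared passage by passage.
    print(style_features(source))
    print(style_features(reworded))
```

In a sketch like this, a passage whose style vector differs sharply from the rest of a paper could be flagged for closer inspection even when its word overlap with any known source is low; how the thesis actually combines the two signals is not stated in the abstract.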
[Degree-granting institution]: Fudan University
[Degree level]: Master's
[Year degree granted]: 2011
[Classification number]: TP391.1