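Data Analysis Based on the Game Virtual Currency Market
Published: 2018-06-11 13:50
Topics: game virtual currency + domain-specific text retrieval; Source: University of Electronic Science and Technology of China, master's thesis, 2013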
[Abstract]: According to a report by the U.S. market research firm ABI Research, the global online game market will exceed USD 29 billion in 2015 [1]. Game virtual currency is the core commodity of this industry chain, and every entity in the chain urgently needs tools for understanding the market in order to obtain supply-and-demand statistics and real-time information. The large online game market is accompanied by massive amounts of web data, yet research on natural language processing techniques for this specific domain (including text representation, synonym handling, feature word selection, text retrieval, text classification, and Web information extraction) remains scarce.
To address these problems, this thesis constructs a virtual specialized search engine to obtain a result set related to the online game domain as the initial object of study. Drawing on the characteristics of online trading in game virtual currency, it classifies the initial result set with a suitable classification method to obtain the set of web pages that carry virtual currency trading information. It then collects and analyzes virtual currency trade orders from that page set, including redundancy checking and status updating. The main contributions are:
1. A vector space model is built to process web page text, and a feature word selection method and a synonym handling method that incorporate domain characteristics are proposed to compute and reduce the dimensionality of the vector space.
2. A virtual specialized search engine is built on top of several general-purpose search engines to obtain the set of web pages related to the online game domain, which serves as the initial object of study.
3. Building on the K-nearest-neighbor (KNN) text classification method, a modified KNN classification method is proposed for classifying the page set. Based on an analysis of the training corpus, the method uses cosine similarity to measure how similar a new text is to each known category. It is simple to implement and highly accurate, retraining on the training texts is cheap, and both the time and space complexity of the computation stay within a linear function of the training set size.
4. DOM-based Web information extraction is used to extract order information; it is simple and efficient, and the collection is stable and reliable. The basic ideas of genetic algorithms are applied to detect status changes in order information collected over multiple passes. This approach offers global search optimization and efficient parallel computation, and it is self-organizing, self-adaptive, and self-learning, which helps ensure that order collection is both efficient and accurate.
5. A game virtual currency data application platform is built to provide supply-and-demand statistics services and real-time information services.
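The abstract does not spell out the modified KNN method, so the following is only a minimal sketch of one plausible reading of "cosine similarity between a new text and a known category": each category is represented by the sum of its training vectors, and a new text is assigned to the category with the highest cosine score. The function names, the whitespace tokenizer, and the toy corpus are all assumptions made for illustration; real Chinese web pages would need a word segmenter and the feature word selection and synonym handling the thesis describes.

```python
import math
from collections import Counter, defaultdict


def to_vector(text):
    """Bag-of-words term-frequency vector (whitespace tokenization is an assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def train(labeled_texts):
    """Sum the vectors of each category's training texts into one category vector."""
    categories = defaultdict(Counter)
    for text, label in labeled_texts:
        categories[label].update(to_vector(text))
    return dict(categories)


def classify(text, categories):
    """Return the best-matching category for a new text, plus all cosine scores."""
    vec = to_vector(text)
    scores = {label: cosine(vec, cat_vec) for label, cat_vec in categories.items()}
    return max(scores, key=scores.get), scores


# Hypothetical toy corpus: "trade" pages carry order information, "other" pages do not.
training = [
    ("sell gold 1000 coins price server delivery", "trade"),
    ("buy gold cheap coins order price", "trade"),
    ("game review new patch story quest", "other"),
    ("guide quest boss strategy patch", "other"),
]
model = train(training)
label, scores = classify("cheap gold coins order", model)
```

In this sketch, training is a single pass over the corpus, and adding a new labeled text only updates one category vector, which is consistent with the low retraining cost and linear complexity the abstract claims. Whether the thesis builds its category representation this way is not stated in the source.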
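[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Master's
[Year degree conferred]: 2013
[Classification number]: TP391.1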
Article ID: 2005494
Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/2005494.html