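Research and Application of Automated Analysis Techniques for Massive Email Data

Published: 2018-03-02 13:33

  Topic: search  Focus: massive data  Source: University of Electronic Science and Technology of China, 2014 master's thesis  Type: degree thesis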


[Abstract]: The large amount of information carried in email has made it an important target for data mining and big data analysis, and many users now need to exploit and analyze that information. Converting raw mail files into mail metadata quickly and efficiently, and building an automated platform on which massive mail data can be analyzed and used, lays a sound foundation for that work. This thesis studies the key technologies involved in the automated analysis of massive email and designs and implements a massive email automated analysis system. First, to meet the two main requirements of massive content and automation, a fast mail import module is built that omits as little of the information in the mail files as possible. The module parses and classifies the meta-information in each message, with the aim of maximizing import efficiency, reducing data volume, improving the user experience, and keeping the information complete. This addresses the problems of processing speed and of hardware and software utilization at massive mail volumes, and provides good data conditions for further data mining and analysis. Second, by studying users' actual work in depth, the thesis identifies the workflow and management characteristics of manual analysis and integrates the manual analysis workflow into the system, which removes unnecessary manual work, lowers the program's runtime overhead, and raises the degree to which the manual analysis stage is computerized. Next, once mail metadata, mail text, and analysis results have been stored in the database, the thesis implements indexing and retrieval over this information, improving the ability to find information of interest quickly in a massive body of mail. On this basis, the system implements automated classification and tagging of mail, which raises the overall level of automation.
The thesis then designs functions for compiling statistics on and exporting the information of interest, completing the path from decomposition, classification, indexing, and statistics to re-integration. Finally, to match the concrete procedures and needs of information management in the actual working environment, a role-based information management system is built, further computerizing the work as a whole. The thesis runs statistical tests and comparisons on the system's operation after deployment, and analyzes and explains the problems that the results reveal. The statistics and comparison data indicate that the system basically meets users' needs and can serve real work. The thesis closes by summarizing where the system needs improvement and by offering some ideas for extending its functions and for future research.
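The abstract does not say how the import and retrieval modules are implemented. As a rough illustration of the two steps it names (turning a raw mail file into metadata, then indexing that metadata for retrieval), the following minimal sketch uses only Python's standard `email` package and an in-memory inverted index. The function names `extract_metadata`, `build_index`, and `search`, the record fields, and the tokenizer are assumptions made for this example, not the thesis's design.

```python
import re
from collections import defaultdict
from email import message_from_string, policy
from email.utils import getaddresses, parsedate_to_datetime


def extract_metadata(raw: str) -> dict:
    """Parse one raw RFC 5322 message into a flat metadata record (hypothetical schema)."""
    msg = message_from_string(raw, policy=policy.default)
    body_part = msg.get_body(preferencelist=("plain",))
    body = body_part.get_content() if body_part is not None else ""
    date_header = msg["Date"]
    message_id = msg["Message-ID"]
    return {
        "message_id": str(message_id) if message_id is not None else None,
        "senders": [addr for _, addr in getaddresses([str(msg.get("From", ""))])],
        "recipients": [addr for _, addr in getaddresses([str(msg.get("To", ""))])],
        "subject": str(msg.get("Subject", "")),
        # Missing Date headers are kept as None rather than guessed.
        "date": parsedate_to_datetime(str(date_header)).isoformat() if date_header else None,
        "body": body,
    }


def tokenize(text: str) -> list:
    """Lowercase ASCII word tokenizer; real mail would need language-aware segmentation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(records: list) -> dict:
    """Map each token in subject or body to the set of record positions containing it."""
    index = defaultdict(set)
    for doc_id, record in enumerate(records):
        for token in tokenize(record["subject"] + " " + record["body"]):
            index[token].add(doc_id)
    return index


def search(index: dict, query: str) -> set:
    """Return the positions of records that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result
```

In a sketch like this, import is a pure function from raw text to a record, so many files could be parsed in parallel before a single indexing pass. A system working at the scale the abstract describes would presumably persist both the records and the index rather than hold them in memory.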
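[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TP393.098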



