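Design and Implementation of a Distributed Advertising Delivery Engine
Published: 2018-07-10 02:31
Topic: targeted advertising + search engine. Source: Beijing University of Posts and Telecommunications, 2012 master's thesis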
[Abstract]: With the rapid growth of the Internet advertising industry, targeted advertising has developed quickly as a new online advertising model, and it has attracted wide attention for being precise, timely, and efficient. Targeted advertising places ads on web pages that are related either to the page content or to the behavioral characteristics of the user; by targeting mode it divides into content-targeted advertising and behavior-targeted advertising. Compared with traditional print and television media, the rich interaction available on the Internet gives online advertising room to optimize ad placement that traditional media cannot match. The multimedia interaction and state-keeping capabilities of PC terminals provide a good platform for targeted delivery.

The foundation of targeted advertising is the vertical search engine. A vertical search engine is a specialized search engine for one industry, a subdivision and extension of general search: it provides information and related services of value for a specific field, a specific group of users, or a specific need. An advertising vertical search engine is a specialized search engine for the bid-based advertising industry. It searches over structured ad information and returns relevant ads to users who are querying for products. The biggest difference from a general web search engine is structured information extraction: unstructured web page data is converted into specific structured records. Web search treats the web page as its smallest unit, whereas vertical search treats the structured record as its smallest unit. These records are stored in a database and processed further (for example, deduplicated and classified), then segmented into words and indexed so that user needs can be served through search.

An advertising vertical search engine usually has to handle massive data: the engine typically stores tens of millions to hundreds of millions of ad records. Once index size and query volume grow past a certain point, index query and update efficiency gradually falls, server load rises, and the overall utilization of the search engine declines. Given these difficulties, and those of storing data at this scale, a well-designed distributed architecture becomes a key factor in whether a vertical search engine can keep growing.

This thesis focuses on two questions: how to push the most relevant ads to users, and how to serve hundreds of millions of ad deliveries per day. It uses text content matching, long-term user behavior analysis, short-term user behavior analysis, association recommendation, and a fusion mode that combines them to select the ads a user is most likely to prefer. A multi-node distributed computing system handles real-time request processing under very high concurrency, and a multi-level data processing architecture meets the need to aggregate statistics over massive data. The result is a service architecture that scales as data volume and traffic increase.
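The abstract does not give implementation details, so the following is only a minimal sketch of the general idea it describes: structured ad records are indexed by term, the index is split across several nodes, and a query is answered by asking every node and merging the partial results. All names here (`Ad`, `Shard`, `ShardedAdIndex`, the `bid` field, sharding by `ad_id % num_shards`) are illustrative assumptions rather than the thesis's design, and whitespace splitting stands in for the word segmentation step that real Chinese ad text would need.

```python
import heapq
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Ad:
    """A structured ad record (hypothetical fields)."""
    ad_id: int
    title: str
    bid: float  # assumed bid price, used here only to break ties


class Shard:
    """One index node holding an inverted index over its share of the ads."""

    def __init__(self):
        self.ads = {}
        self.postings = defaultdict(set)  # term -> set of ad ids

    def add(self, ad):
        self.ads[ad.ad_id] = ad
        # Whitespace tokenization is a stand-in for real word segmentation.
        for term in ad.title.lower().split():
            self.postings[term].add(ad.ad_id)

    def search(self, terms, k):
        # Score each ad by the number of query terms it matches.
        hits = defaultdict(int)
        for term in terms:
            for ad_id in self.postings.get(term, ()):
                hits[ad_id] += 1
        scored = [(n, self.ads[i].bid, i) for i, n in hits.items()]
        return heapq.nlargest(k, scored)


class ShardedAdIndex:
    """Routes writes to one shard and fans queries out to all shards."""

    def __init__(self, num_shards):
        self.shards = [Shard() for _ in range(num_shards)]

    def add(self, ad):
        self.shards[ad.ad_id % len(self.shards)].add(ad)

    def search(self, query, k=3):
        terms = query.lower().split()
        partial = []
        # In a real deployment these would be parallel remote calls.
        for shard in self.shards:
            partial.extend(shard.search(terms, k))
        # Merge the per-shard top-k lists into a global top-k.
        return [ad_id for _, _, ad_id in heapq.nlargest(k, partial)]


if __name__ == "__main__":
    index = ShardedAdIndex(num_shards=2)
    index.add(Ad(1, "red running shoes", 0.5))
    index.add(Ad(2, "blue running jacket", 0.8))
    index.add(Ad(3, "red wine", 0.3))
    print(index.search("red running", k=2))
```

Because each shard returns only its own top `k`, the merge step handles at most `k` entries per shard, which is what lets this pattern add nodes as the ad corpus grows. Behavior-based signals such as the long-term and short-term user analysis mentioned in the abstract would enter as extra terms in the score; they are left out of this sketch.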
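[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree awarded]: 2012
[Classification number]: TP311.52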
Article ID: 2111682
Article link: http://sikaile.net/kejilunwen/sousuoyinqinglunwen/2111682.html