Design and Implementation of a Tendering and Bidding Information Monitoring System
Keywords: tender information monitoring; winning-bid information monitoring; MySQL; spider crawler; ACE middleware. Source: Jilin University, 2014 master's thesis. Document type: degree thesis
[Abstract]: As the domestic economy has developed and the Bidding Law has been implemented, the government has continued to strengthen transparency and anti-corruption efforts, which has driven substantial growth in the bidding market. As the market economy matures, more and more enterprises also choose their partners through public tendering.
The number of enterprises involved in tendering and bidding keeps growing, and so does the number of platforms that publish tender information: provinces, cities, counties, and districts all publish tender information openly. The market faces an asymmetry between supply and demand. Tender publishers want capable enterprises to respond, while potential bidders want to overcome regional barriers, learn immediately of the tender projects in which they have a competitive advantage, and join the public bidding in time. Faced with a vast surplus of information, filtering out the tender notices that an enterprise cares about is not easy. As a result, tender publishers cannot find enough high-quality bidders, and capable enterprises that want to bid struggle to find, in time, the projects where they have an advantage. The market needs a highly automated, intelligent bidding information monitoring platform to address this serious information asymmetry.
The system monitors, in real time, more than 2,000 online platforms that publish tender and winning-bid information. It automatically extracts the structured content of each tender notice and matches it against the topics that customers follow, so that users can find high-quality bidding projects through the platform as soon as they appear.
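The matching step can be pictured with a minimal sketch, assuming a customer's interests are just a set of keywords plus an optional set of regions. The `Subscription` class, the `matches` and `route` functions, and the notice dictionary keys are all hypothetical; the abstract does not describe the actual matching rules.

```python
from dataclasses import dataclass, field


@dataclass
class Subscription:
    """Hypothetical customer interest profile (not taken from the thesis)."""
    customer: str
    keywords: set
    regions: set = field(default_factory=set)  # empty set means "any region"


def matches(sub, notice):
    """True if the notice mentions any keyword and lies in an allowed region."""
    text = (notice["title"] + " " + notice["content"]).lower()
    keyword_hit = any(k.lower() in text for k in sub.keywords)
    region_ok = not sub.regions or notice["region"] in sub.regions
    return keyword_hit and region_ok


def route(notices, subs):
    """Map each customer to the titles of the notices matching their profile."""
    result = {s.customer: [] for s in subs}
    for notice in notices:
        for sub in subs:
            if matches(sub, notice):
                result[sub.customer].append(notice["title"])
    return result
```

A production matcher would more likely query a search index than scan every notice for every customer; the nested loop here only illustrates the idea.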
The system automatically analyzes and identifies the content of each notice, including: publication region (macro-region / province / municipality / prefecture-level city), notice title, download address of the tender documents, notice snapshot, procurement method, tendering agency, tender content, project budget, issue date, bid-opening time, and source. The structured data are stored in a structured database in real time, so that customers can use them online at any time from a variety of terminals (browse, search, export).
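As an illustration of what one structured record and its storage might look like, here is a minimal sketch. The field names mirror the list above but are invented for this example, as are the `notice` table and the helper functions. The thesis stores data in MySQL; Python's built-in `sqlite3` is swapped in here only so the sketch runs without a database server.

```python
import sqlite3
from dataclasses import dataclass, astuple, fields


@dataclass
class TenderNotice:
    """One structured notice; field names are illustrative assumptions."""
    region: str
    title: str
    file_url: str
    snapshot: str
    procurement_method: str
    agency: str
    content: str
    budget: str
    issued_at: str
    opens_at: str
    source: str


COLUMNS = [f.name for f in fields(TenderNotice)]


def create_store():
    """Create an in-memory table with one TEXT column per record field."""
    conn = sqlite3.connect(":memory:")
    column_sql = ", ".join(name + " TEXT" for name in COLUMNS)
    conn.execute(f"CREATE TABLE notice ({column_sql})")
    return conn


def save(conn, notice):
    """Insert one notice; astuple() yields values in field-declaration order."""
    placeholders = ", ".join("?" for _ in COLUMNS)
    conn.execute(f"INSERT INTO notice VALUES ({placeholders})", astuple(notice))
    conn.commit()


def search(conn, keyword):
    """Return titles containing the keyword, sorted alphabetically."""
    rows = conn.execute(
        "SELECT title FROM notice WHERE title LIKE ? ORDER BY title",
        (f"%{keyword}%",),
    )
    return [row[0] for row in rows]
```

Storing every field as text keeps the sketch short; a real schema would presumably use date and numeric column types for the times and the budget.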
In addition to monitoring tender information, the system provides statistical analysis of tender information by industry, region, and similar dimensions, as well as monitoring and statistical analysis services for winning-bid information.
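A minimal sketch of such per-dimension statistics, assuming each notice is a dictionary that carries `region` and `industry` keys. Both key names and both helper functions are hypothetical; the abstract does not say which statistics the system actually computes.

```python
from collections import Counter


def count_by(notices, key):
    """Count notices per distinct value of `key`, e.g. 'region' or 'industry'."""
    return Counter(notice[key] for notice in notices)


def cross_tab(notices, row_key, col_key):
    """Count notices per (row value, column value) pair, e.g. region x industry."""
    return Counter((notice[row_key], notice[col_key]) for notice in notices)
```

For example, `count_by(notices, "region").most_common(5)` would list the five regions with the most notices.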
The system uses a general-purpose spider framework with a multithreaded model. The design is simple and stable, and it supports either multi-process deployment on a single server or distributed deployment across multiple servers. Crawling the data of several thousand tender sites nationwide completes within half an hour, and system performance is stable, which is sufficient for crawling domestic tender notices. In the next upgraded version, we hope to bring competitive intelligence and foreign tender notices into the monitoring scope. The spider framework and algorithm will be upgraded for that purpose, and we are considering a cloud architecture with a Hadoop-based spider cluster deployment.
BMS Spider, the bidding monitoring spider subsystem, is an event-driven, pipelined, multithreaded spider system. It uses ACE patterns such as stream and Task, together with the ACE Socket Wrapper Facade, to perform targeted crawling, filtering, identification, indexing, and storage for thousands of tender publishing sites. The data are stored in the tender database, from which the front-end User Platform (the platform through which users consume tender notices) retrieves them.
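The pipelined, multithreaded structure can be sketched as a chain of worker threads joined by queues. This is only an analogy written with Python's standard `threading` and `queue` modules, not the ACE C++ API the thesis uses; the `Stage` class, the `STOP` sentinel, and `run_pipeline` are names invented for the illustration.

```python
import queue
import threading

STOP = object()  # sentinel that tells a stage to shut down


class Stage(threading.Thread):
    """One pipeline stage: take an item, apply `work`, pass the result on."""

    def __init__(self, work, inbox, outbox):
        super().__init__(daemon=True)
        self.work = work
        self.inbox = inbox
        self.outbox = outbox

    def run(self):
        while True:
            item = self.inbox.get()
            if item is STOP:
                self.outbox.put(STOP)  # propagate shutdown downstream
                return
            result = self.work(item)
            if result is not None:     # None means "filtered out"
                self.outbox.put(result)


def run_pipeline(items, steps):
    """Run items through one thread per step and collect the final outputs."""
    queues = [queue.Queue() for _ in range(len(steps) + 1)]
    stages = [Stage(step, queues[i], queues[i + 1]) for i, step in enumerate(steps)]
    for stage in stages:
        stage.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(STOP)
    results = []
    while (item := queues[-1].get()) is not STOP:
        results.append(item)
    return results
```

With one thread per stage and FIFO queues, items leave the pipeline in the order they entered, so a slow stage delays the stages after it without reordering anything.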
After startup, the system reads the seed URL list from the tender-site dictionary and pushes it onto the queue of tender URLs waiting to be crawled. The spider's URL-reading thread takes a URL from this queue, performs DNS resolution, and crawls the URL through a page-download thread. A successfully fetched page is HTML code. After character-encoding handling and HTML parsing, invalid information such as navigation, advertisements, and copyright notices is filtered out, and the tender links within the seed URL page are identified. Links that match the rules enter the tender URL list store, and a tender-URL link-emitting thread replenishes the pending-URL queue in real time according to the queue's length. Once a tender URL has been crawled successfully, its DOM nodes are annotated following the process illustrated in Figure 3.11, a filtering algorithm removes invalid information, and the structured tender data are analyzed and extracted. The extracted structured records are stored in the MySQL tender content database for use by clients.
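The crawl loop described above can be sketched in a few dozen lines. This is a simplified, single-threaded illustration under stated assumptions, not the thesis implementation: the `fetch` callable stands in for DNS resolution plus page download, `rule` is a hypothetical regular expression for recognizing notice links, and `low_water` is an invented threshold that controls when the pending queue is topped up from the discovered-link backlog.

```python
import re
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(base_url, html, rule):
    """Return absolute URLs of links on the page that match `rule`."""
    parser = LinkParser()
    parser.feed(html)
    absolute = (urljoin(base_url, href) for href in parser.links)
    return [url for url in absolute if rule.search(url)]


def crawl(seeds, fetch, rule, low_water=2):
    """Fetch seed pages, discover notice links, and fetch those notices."""
    seed_set = set(seeds)
    pending = deque(seeds)   # URLs waiting to be fetched
    backlog = deque()        # discovered notice URLs not yet queued
    seen = set(seeds)
    pages = {}               # notice URL -> fetched HTML
    while pending:
        url = pending.popleft()
        html = fetch(url)    # returns None when the download fails
        if html is not None:
            if url in seed_set:
                for link in extract_links(url, html, rule):
                    if link not in seen:
                        seen.add(link)
                        backlog.append(link)
            else:
                pages[url] = html
        # Top up the pending queue whenever it runs short.
        while backlog and len(pending) < low_water:
            pending.append(backlog.popleft())
    return pages
```

Passing `fetch` in as a parameter keeps the sketch testable without network access; a dictionary's `get` method works as a stand-in for a real downloader.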
[Degree-granting institution]: Jilin University
[Degree level]: Master's
[Year degree awarded]: 2014
[Classification number]: TP274
[Thesis record]: Jing Lifang (景麗芳); Design and Implementation of a Tendering and Bidding Information Monitoring System [D]; Jilin University; 2014
Article ID: 1555728
Article link: http://sikaile.net/wenyilunwen/guanggaoshejilunwen/1555728.html