Research on the Storage and Retrieval of Massive Audio Fingerprint Data
Published: 2018-04-19 01:05
Topic: audio fingerprint + massive data; Source: Tianjin University, master's thesis, 2014
[Abstract]: With the arrival of the big data era, massive volumes of multimedia data, including images, audio, and video, urgently need to be managed effectively and made searchable for users in a convenient and fast way. With advances in pattern recognition, machine learning, and cloud computing, content-based multimedia retrieval has emerged. With this technology, information retrieval no longer depends on tags and keywords attached to the data, search results are more accurate, and searching is more convenient.

Audio data is an important component of multimedia, and its volume is also growing rapidly. The problem people face is no longer a lack of multimedia information but how to find the information they need within massive amounts of data. How to retrieve massive audio collections quickly and effectively has become an important topic in information retrieval research in both academia and industry.

Audio fingerprint retrieval is an information retrieval method based on audio content. It extracts a digital feature, called an audio fingerprint, from an unknown audio clip, then searches a massive audio fingerprint database prepared in advance and computes fingerprint similarity to obtain detailed information about the audio. This method addresses the incomplete or incorrect text annotations that affect traditional keyword-based audio search, and it also removes the difficulty users face when they do not know which keywords to search for.

Audio fingerprint extraction and matching algorithms have produced substantial results in the laboratory and have been applied in some products, but the data sets they handle are relatively small. When applied to large-scale data sets, they run into performance bottlenecks as well as concurrency and scalability problems.

Building on an in-depth study of audio fingerprint extraction and matching algorithms, this thesis designs, implements, and optimizes the storage and retrieval of massive audio fingerprint data. It first proposes a hash-based storage structure for audio fingerprints, then proposes two distributed hashing solutions, and demonstrates the effectiveness of the proposed methods through experiments. On this basis, it further proposes a serialized distributed storage scheme for massive audio fingerprint data, whose effectiveness is again demonstrated through experiments.

The storage structure and the distributed storage and retrieval schemes designed in this thesis offer multi-level concurrency, high performance, fault tolerance, and easy scalability. They have practical value for building massive audio fingerprint retrieval systems and are significant for promoting the application of audio fingerprint retrieval technology in society.
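The abstract does not give the details of the hash-based storage structure or of the two distributed hashing solutions. As a rough illustration of the general idea only, the sketch below builds a toy inverted index that maps each sub-fingerprint hash to the tracks and frame offsets where it occurs, spreads the keys over several in-memory "shards" by simple modulo partitioning, and identifies a query clip by voting on consistent offset differences. The class name `ShardedFingerprintIndex`, the integer sub-fingerprints, the modulo partitioning, and the voting rule are all assumptions made for this example, not the thesis's design.

```python
from collections import Counter, defaultdict


class ShardedFingerprintIndex:
    """Toy inverted index: sub-fingerprint hash -> [(track_id, frame_offset)]."""

    def __init__(self, num_shards=4):
        # Each dict stands in for one storage node (hypothetical layout).
        self.shards = [defaultdict(list) for _ in range(num_shards)]

    def _shard_for(self, sub_fp):
        # Modulo partitioning is an assumption, not the thesis's scheme.
        return self.shards[sub_fp % len(self.shards)]

    def add_track(self, track_id, sub_fps):
        # Record every sub-fingerprint together with its position in the track.
        for offset, sub_fp in enumerate(sub_fps):
            self._shard_for(sub_fp)[sub_fp].append((track_id, offset))

    def query(self, sub_fps):
        """Return (track_id, start_offset, votes) for the best match, or None."""
        votes = Counter()
        for q_offset, sub_fp in enumerate(sub_fps):
            # .get avoids creating empty entries for unknown hashes.
            for track_id, offset in self._shard_for(sub_fp).get(sub_fp, []):
                # A true match yields the same (track, offset difference) repeatedly.
                votes[(track_id, offset - q_offset)] += 1
        if not votes:
            return None
        (track_id, start), count = votes.most_common(1)[0]
        return track_id, start, count


index = ShardedFingerprintIndex(num_shards=2)
index.add_track("song_a", [11, 42, 7, 93, 58, 20])
index.add_track("song_b", [5, 42, 64, 7, 31, 77])
print(index.query([7, 93, 58]))  # ('song_a', 2, 3)
```

In this sketch every lookup touches exactly one shard, so lookups for different sub-fingerprints could in principle be served by different nodes in parallel. A real system would also need to tolerate bit errors in the extracted fingerprints, which exact-match hashing as shown here does not.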
[Degree-granting institution]: Tianjin University
[Degree level]: Master's
[Year degree conferred]: 2014
[Classification number]: TP333; TP391.3
Article ID: 1770880
Article link: http://sikaile.net/kejilunwen/jisuanjikexuelunwen/1770880.html