Content-Based Audio Humming Recognition and Retrieval System
Published: 2018-10-04 22:17
[Abstract]: In this era of digitized audio and video, multimedia such as digital film and television, digital music, and digital animation has become a large part of everyday life. In databases, multimedia files (songs, for example) are indexed by title, author, singer, and similar text fields, yet people usually remember a song's melody far better than they remember those fields. As multimedia databases grow, the volume of text indexes (titles, authors, etc.) grows with them, and no one can remember them all. Content-based querying therefore becomes both important and necessary. This thesis describes the development of a humming recognition system for digital audio and the related theoretical research, discusses in detail the key technologies in each part of audio humming recognition, and implements a humming recognition system that can be used for demonstration (DEMO).

Development was carried out on two platforms: the PC and Altera's DE2 embedded platform. We first implemented humming recognition over 20 songs on both the PC and the DE2 board, ran extensive experiments and parameter tuning, and addressed feature extraction, noise removal, and feature-value recognition, ultimately obtaining a relatively high recognition rate and good running time on the DE2 board. Subsequent work was done mainly on the PC, where we built an effective partial humming recognition system on some 30-odd pieces of music and studied a normalization algorithm for the base pitch and an improved dynamic time warping (DTW) algorithm. Based on the prior condition of "head-tail proximity", we propose a partial-matching recognition algorithm that runs DTW twice, once forward and once in reverse, and we analyze its time complexity, effectiveness, and compatibility in depth.

The results are fairly satisfactory. On the PC platform, with 52 music segments, the partial-matching algorithm reaches a search success rate of about 85%, a large improvement over the 48% recognition rate obtained without partial-matching support. The forward-reverse DTW method also costs little in time: its running time is only about 1.5 times that of whole-melody matching. It remains fully compatible with whole-melody matching and meets practical requirements.
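The abstract does not give the details of the forward-reverse DTW method, so the following is only a minimal sketch of one plausible reading, not the thesis's algorithm. It assumes that a hummed fragment starts at the beginning of a song or stops at its end, that melodies are lists of pitch values already normalized to a common key, and that the local cost is the absolute pitch difference. The names `open_end_dtw`, `partial_match_cost`, and `best_match` are hypothetical.

```python
from math import inf


def open_end_dtw(query, reference):
    """Cost of aligning ALL of `query` to SOME prefix of `reference`.

    Standard DTW recurrence with the start pinned at (first, first) and
    the end left free along the reference axis. Both inputs must be
    non-empty lists of numbers (assumed: normalized pitch values).
    """
    m = len(reference)
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # virtual origin cell, reachable only by the first step
    for q in query:
        cur = [inf] * (m + 1)
        for j in range(1, m + 1):
            local = abs(q - reference[j - 1])
            cur[j] = local + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    # Free end point: the query may stop anywhere in the reference.
    return min(prev[1:])


def partial_match_cost(query, reference):
    """Two DTW passes: forward (fragment starts at the song's start) and
    reverse (fragment stops at the song's end); keep the cheaper one."""
    forward = open_end_dtw(query, reference)
    backward = open_end_dtw(query[::-1], reference[::-1])
    return min(forward, backward)


def best_match(query, songs):
    """Return the name of the song in `songs` (name -> melody) with the
    lowest partial-match cost."""
    return min(songs, key=lambda name: partial_match_cost(query, songs[name]))


if __name__ == "__main__":
    songs = {
        "song_a": [60, 62, 64, 65, 67, 69, 71, 72],
        "song_b": [72, 71, 69, 67, 65, 64, 62, 60],
    }
    # A fragment taken from the end of song_a is found by the reverse pass.
    print(best_match([69, 71, 72], songs))  # prints "song_a"
```

In this naive form the two passes each fill a complete cost table, so the work is roughly twice that of a single pass. The abstract reports a running time of only about 1.5 times that of whole-melody matching for its own method, which suggests optimizations that this sketch does not attempt to reproduce.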
[Degree-granting institution]: Shanghai Jiao Tong University
[Degree level]: Master's
[Year degree granted]: 2008
[Classification number]: TP391.42
Article ID: 2252086
[Similar literature]
Related master's theses: top 1
1 Chen Xu; Content-Based Audio Humming Recognition and Retrieval System [D]; Shanghai Jiao Tong University; 2008
Article link: http://sikaile.net/wenyilunwen/dongmansheji/2252086.html