Classification-Guided Joint 3D Head Pose Estimation and Facial Landmark Localization
Published: 2019-01-01 18:23
[Abstract]: The Internet has ushered in the digital era, and using machine learning and deep learning to extract high-level knowledge from large volumes of data for human-computer interaction has become a major research focus. The key to human-computer interaction lies first in recognizing the features of specific parts of the human body according to different interaction requirements; biometrics emerged as a fast, user-friendly means of identification. Mature biometric technologies include iris, fingerprint, speech, gait, and face recognition. Among these, the face is an important biometric trait: because it is easy to capture and non-invasive, it is readily accepted by subjects, which has driven steady progress in this field. Estimating head pose and locating facial landmarks such as the eye corners, nose tip, mouth, and chin are key problems in face analysis. Both problems can already be solved reasonably well on 2D images, but most image-based methods are sensitive to illumination and handle large head rotations and occlusion poorly. With the falling manufacturing cost of 3D scanners, the steadily improving precision of scanned data, and the rich geometric information inherent in depth data, more and more researchers are applying depth information to face analysis. Head pose estimation and facial landmark localization are usually studied as two independent problems, yet the head pose estimate provides useful spatial-transformation information for landmark localization, while the configuration of the landmarks in turn reflects the head pose vector; how to optimize the two jointly is therefore a central question of this thesis. This thesis proposes a classification-guided method for joint 3D head pose estimation and facial landmark localization. First, "classification-guided" means that the head pose space is partitioned into several classes and the landmark localization algorithm is run separately within each class. Within a single pose class, the missing regions of the head point cloud are relatively consistent, which considerably improves the stability of the localization algorithm. Second, the thesis introduces the notion of joint estimation: within a cascaded random-forest regression framework, the head pose estimate is combined with a landmark-annotated face template to provide a good initial value for the cascade, and the landmark result of each cascade stage in turn refines the head pose vector. Finally, the thesis presents a 3D face database containing different identities, expressions, and head poses, with ground-truth head pose vectors and landmark shape vectors. Extensive experiments demonstrate the effectiveness and efficiency of the method: it achieves more accurate results than existing methods on the two widely used 3D databases BIWI and B3D(AC)^2. The method also generalizes to other tasks involving pose estimation and landmark localization.
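The classification-then-cascade pipeline described in the abstract can be sketched as follows. This is a minimal illustrative sketch on synthetic data, not the thesis's implementation: the features, pose classes, landmark template, and residual-regression stages are all hypothetical stand-ins, using scikit-learn's random forests in place of the thesis's cascaded random-forest framework.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-ins: each "scan" is a feature vector; pose is a 3-vector
# (yaw, pitch, roll); shape is a flattened vector of K landmark coordinates.
n, d, K = 300, 32, 5
X = rng.normal(size=(n, d))
pose = X[:, :3] @ rng.normal(size=(3, 3))            # hypothetical pose targets
shape = np.tanh(X @ rng.normal(size=(d, 3 * K)))     # hypothetical landmark targets

# Step 1 (classification-guided): bin the pose space into classes (here simply
# by yaw sign) and train a classifier to route each scan to its pose class.
pose_class = (pose[:, 0] > 0).astype(int)
router = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, pose_class)

# Step 2 (joint cascade): per pose class, a small cascade of random-forest
# regressors. Each stage regresses the residual between the current shape
# estimate and the target, mimicking cascaded shape regression initialized
# from a pose-aligned landmark template.
template = shape.mean(axis=0)                        # stand-in for the template
cascades = {}
for c in (0, 1):
    idx = pose_class == c
    est, stages = np.tile(template, (idx.sum(), 1)), []
    for _ in range(3):
        reg = RandomForestRegressor(n_estimators=50, random_state=0)
        reg.fit(X[idx], shape[idx] - est)            # learn the residual update
        est = est + reg.predict(X[idx])
        stages.append(reg)
    cascades[c] = stages

def predict_shape(x):
    """Route a scan to its pose class, then run that class's cascade."""
    c = router.predict(x.reshape(1, -1))[0]
    est = template.copy()
    for reg in cascades[c]:
        est = est + reg.predict(x.reshape(1, -1))[0]
    return est

err = np.mean(np.abs(predict_shape(X[0]) - shape[0]))
```

The routing step plays the role of the pose-space classification, so each cascade only ever sees point clouds with a consistent missing-data pattern; the thesis additionally feeds the landmark estimates back to refine the pose vector at every stage, which this sketch omits.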
[Degree-granting institution]: University of Science and Technology of China
[Degree level]: Master's
[Year awarded]: 2017
[CLC number]: TP391.41
Article ID: 2397943
Link: http://sikaile.net/kejilunwen/ruanjiangongchenglunwen/2397943.html