
Expression-Preserving Face Frontalization Improves Visually Assisted Speech Processing

Authors:
Kang, Zhiqi; Sadeghi, Mostafa; Horaud, Radu; Alameda-Pineda, Xavier
Affiliations:
Univ Grenoble Alpes; Inria Nancy Grand Est
Keywords:
Student's t-distribution; 3D; MIXTURE; Variational auto-encoders; ALIGNMENT; DATABASE; Face frontalization; Robust point registration; MODEL; Bayesian filtering; Lip reading; Audio-visual speech enhancement
Journal:
International Journal of Computer Vision
ISSN:
0920-5691
Year, volume, issue:
2023, Volume 131, Issue 5
Pages:
1122-1140
Abstract:
Face frontalization consists of synthesizing a frontal view from a profile one. This paper proposes a frontalization method that preserves non-rigid facial deformations, i.e., facial expressions. It is shown that expression-preserving frontalization boosts the performance of visually assisted speech processing. The method alternates between the estimation of (i) the rigid transformation (scale, rotation, and translation) and (ii) the non-rigid deformation between an arbitrarily viewed face and a face model. The method has two important merits: it can deal with non-Gaussian errors in the data and it incorporates a dynamical face deformation model. For that purpose, we use the Student's t-distribution in combination with a Bayesian filter in order to account for both rigid head motions and time-varying facial deformations, e.g., those caused by speech production. The zero-mean normalized cross-correlation score is used to evaluate the ability of the method to preserve facial expressions. The method is thoroughly evaluated and compared with several state-of-the-art methods, based either on traditional geometric models or on deep learning. Moreover, we show that the method, when incorporated into speech processing pipelines, improves word recognition rates and speech intelligibility scores by a considerable margin.
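
For readers unfamiliar with the evaluation metric named in the abstract, the following is a minimal sketch of the standard zero-mean normalized cross-correlation (ZNCC) between two equally sized arrays, such as two image patches. The abstract does not say which image regions are compared or how scores are aggregated, so the function name `zncc`, the `eps` threshold, and the choice to return 0.0 for constant inputs are assumptions made for illustration only, not the authors' implementation.

```python
import numpy as np


def zncc(a, b, eps=1e-12):
    """Zero-mean normalized cross-correlation of two same-sized arrays.

    Returns a value in [-1, 1]; 1 means the inputs are identical up to
    a positive gain and an offset. Hypothetical helper for illustration.
    """
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    if a.shape != b.shape:
        raise ValueError("inputs must have the same number of elements")

    # Remove the mean of each input (the "zero-mean" part).
    a = a - a.mean()
    b = b - b.mean()

    # Normalize by the product of the centered norms.
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom < eps:
        # Assumption: a constant patch carries no structure, so score it 0.
        return 0.0
    return float(np.dot(a, b) / denom)
```

Because both inputs are centered and normalized, the score is unaffected by brightness offsets and positive contrast changes, which is why it is a common choice for comparing image content across differently exposed views.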