PROBLEM TO BE SOLVED: To enable a lecturer to grasp the state of an audience more easily.

SOLUTION: A video acquisition unit 11 receives video captured of the audience. A face detection unit 12 detects the human faces included in the video. A facial expression measurement unit 13, a line-of-sight measurement unit 14, and a nod measurement unit 15 measure the facial expression, line of sight, and nodding state of each detected face. An individual state estimation unit 165 estimates the state of each face on the basis of the measured facial expression, line of sight, and nodding state. A group state estimation unit 166 estimates the state of the audience as a whole on the basis of the states of the individual faces. A consolidated avatar generation unit 17 generates an avatar on the basis of the state of the audience as a whole. An avatar presentation device 3 displays the generated avatar.

SELECTED DRAWING: Figure 2

COPYRIGHT: (C)2020, JPO&INPIT
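
The flow from per-face measurement to a single consolidated avatar can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not specify the estimation rules, so the state labels, the 0.5 threshold, the majority-vote aggregation, and every name below (`FaceMeasurement`, `estimate_individual_state`, `estimate_group_state`, `generate_avatar`) are hypothetical and chosen only for illustration. Face detection and the measurement units themselves are assumed to have already produced the inputs.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class FaceMeasurement:
    """Measurements for one detected face (fields and ranges are assumptions)."""
    smile: float          # facial expression score in [0, 1]
    gaze_on_stage: bool   # line of sight directed at the lecturer
    nodding: bool         # nod detected for this face


def estimate_individual_state(m: FaceMeasurement) -> str:
    """Role of the individual state estimation unit: one label per face.

    The rules here are illustrative placeholders, not taken from the source.
    """
    if not m.gaze_on_stage:
        return "distracted"
    if m.nodding or m.smile >= 0.5:
        return "engaged"
    return "neutral"


def estimate_group_state(states: list[str]) -> str:
    """Role of the group state estimation unit: most common individual state."""
    if not states:
        return "unknown"
    return Counter(states).most_common(1)[0][0]


def generate_avatar(group_state: str) -> dict:
    """Role of the consolidated avatar generation unit.

    Returns a hypothetical avatar description that a display device could render.
    """
    expressions = {
        "engaged": "smiling_and_nodding",
        "neutral": "calm",
        "distracted": "looking_away",
    }
    return {"state": group_state,
            "expression": expressions.get(group_state, "blank")}


# Example: three detected faces, two of which appear engaged.
faces = [
    FaceMeasurement(smile=0.8, gaze_on_stage=True, nodding=False),
    FaceMeasurement(smile=0.1, gaze_on_stage=True, nodding=True),
    FaceMeasurement(smile=0.9, gaze_on_stage=False, nodding=False),
]
individual_states = [estimate_individual_state(f) for f in faces]
avatar = generate_avatar(estimate_group_state(individual_states))
```

A majority vote is only one possible aggregation; a weighted average of per-face scores would serve the same role in the pipeline.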