PROBLEM TO BE SOLVED: To provide a device that allows articulation training free of the McGurk effect. SOLUTION: The video/sound recording system of the present invention includes: a microphone for collecting sounds; storing means for storing a character or picture as an articulation target for training; a camera for photographing a patient; a display for displaying an image; a speaker for reproducing sound; and a control unit that performs control so that the character or picture read from the storing means is displayed on a display screen 12a of the display 12, the image of the patient photographed by the camera is not displayed, and sounds are collected by the microphone. When the collected sounds include a desired articulation, the image of the patient, including his/her mouth shape, is displayed and the collected sounds are reproduced from the speaker.
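The control sequence described above can be sketched as a small state holder. This is only an illustrative sketch, not the claimed implementation: the class `TrainingSession`, its methods, and the `is_desired_articulation` callback are hypothetical names, and the abstract does not say how (or by whom) the desired articulation is judged, so that decision is passed in as an assumed function. Hardware actions are recorded as log strings rather than performed.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class TrainingSession:
    """Hypothetical model of the control unit's display/playback logic."""

    prompt: str  # character or picture read from the storing means
    # Assumption: some judge (automatic or a therapist's input) decides
    # whether the collected sound contains the desired articulation.
    is_desired_articulation: Callable[[bytes], bool]
    patient_image_visible: bool = False
    log: List[str] = field(default_factory=list)

    def start(self) -> None:
        # Show only the prompt; the camera image is captured but not shown,
        # so the patient does not see a mouth shape while speaking.
        self.patient_image_visible = False
        self.log.append(f"display prompt: {self.prompt}")
        self.log.append("collect sound (patient image hidden)")

    def on_sound_collected(self, sound: bytes) -> bool:
        # Feedback is given only when the desired articulation is present.
        if not self.is_desired_articulation(sound):
            self.log.append("keep patient image hidden")
            return False
        self.patient_image_visible = True
        self.log.append("display patient image including mouth shape")
        self.log.append("reproduce collected sound from speaker")
        return True
```

In this sketch, calling `start()` and then `on_sound_collected()` with each recorded utterance leaves the patient image hidden until the callback accepts an utterance, which mirrors the ordering given in the SOLUTION paragraph.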