A tomographic-image pickup unit is controlled to capture a tomographic image in response to a signal input from a signal input unit, and a display unit is then controlled to display the captured tomographic image. While the tomographic image is displayed on the display unit, an eyeground-image pickup unit is controlled to capture a two-dimensional image in response to a signal input from the signal input unit. As a result, the user can perform imaging more easily, and the time burden on the subject is reduced.
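
The control sequence described above can be illustrated with a minimal sketch. It is not the disclosed implementation: the class `ImagingController`, its methods, and the string placeholders standing in for images are all hypothetical names introduced here for illustration, and the hardware units are replaced by stub methods. The sketch assumes that every signal received after the tomographic image is on screen triggers a two-dimensional capture.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ImagingController:
    """Hypothetical controller; every name here is an illustrative assumption."""

    displayed: Optional[str] = None  # image currently shown on the display unit
    log: List[str] = field(default_factory=list)  # captures in order of acquisition

    def capture_tomographic(self) -> str:
        # Stand-in for driving the tomographic-image pickup unit.
        return "tomographic-image"

    def capture_two_dimensional(self) -> str:
        # Stand-in for driving the eyeground-image pickup unit.
        return "two-dimensional-image"

    def on_signal(self) -> str:
        """Handle one signal from the signal input unit."""
        if self.displayed is None:
            # Nothing is displayed yet: capture the tomographic image
            # and put it on the display unit.
            image = self.capture_tomographic()
            self.displayed = image
        else:
            # The tomographic image is already displayed: the same input
            # now triggers the two-dimensional capture.
            image = self.capture_two_dimensional()
        self.log.append(image)
        return image
```

In this sketch, calling `on_signal()` twice on a fresh `ImagingController` first captures and displays the tomographic image and then captures the two-dimensional image, so a single input device drives both steps depending on what is currently displayed.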