To obtain an OCT image from which speckle noise is effectively removed, an eye fundus observation device (1) divides low-coherence light (LO) into signal light (LS) and reference light (LR), generates coherent light (LC) by superimposing the signal light (LS) that has traveled via the eye fundus (Ef) on the reference light (LR) that has traveled via a reference light path, detects the coherent light (LC), and forms a tomographic image of the eye fundus (Ef) on the basis of the detection result. A scan control unit (212) controls the motions of galvanometer mirrors (43, 44) via a scan drive unit (70) so that the signal light (LS) is scanned repeatedly, a predetermined number of times, along a plurality of nearby scan lines (R1-R3), thereby forming a plurality of tomographic images. An averaging unit (231) forms an averaged tomographic image by aligning the tomographic images with one another and averaging the pixel values at each pixel position.
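The align-then-average step attributed to the averaging unit (231) can be illustrated with a minimal sketch. The abstract does not say how the alignment is performed, so the exhaustive integer-shift search below is an assumption chosen for brevity, and the names `overlap_error`, `best_shift`, `average_aligned`, and `max_shift` are hypothetical. Images are plain lists of rows of pixel values, all of the same size.

```python
from itertools import product


def overlap_error(ref, img, dy, dx):
    """Mean squared difference between ref[y][x] and img[y + dy][x + dx] over the overlap."""
    h, w = len(ref), len(ref[0])
    total, count = 0.0, 0
    for y in range(h):
        sy = y + dy
        if not 0 <= sy < h:
            continue
        for x in range(w):
            sx = x + dx
            if 0 <= sx < w:
                diff = ref[y][x] - img[sy][sx]
                total += diff * diff
                count += 1
    return total / count if count else float("inf")


def best_shift(ref, img, max_shift=2):
    """Assumed alignment method: try every integer shift up to max_shift pixels.

    Ties are broken in favour of the smallest shift, so an image that already
    matches the reference is left in place.
    """
    candidates = product(range(-max_shift, max_shift + 1), repeat=2)
    return min(
        candidates,
        key=lambda s: (overlap_error(ref, img, s[0], s[1]), abs(s[0]) + abs(s[1])),
    )


def average_aligned(images, max_shift=2):
    """Align every image to images[0], then average the values at each pixel position."""
    ref = images[0]
    h, w = len(ref), len(ref[0])
    sums = [[0.0] * w for _ in range(h)]
    counts = [[0] * w for _ in range(h)]
    for img in images:
        # The reference image itself is never shifted.
        dy, dx = (0, 0) if img is ref else best_shift(ref, img, max_shift)
        for y in range(h):
            for x in range(w):
                sy, sx = y + dy, x + dx
                if 0 <= sy < h and 0 <= sx < w:
                    sums[y][x] += img[sy][sx]
                    counts[y][x] += 1
    # counts is at least 1 everywhere because the reference covers every pixel.
    return [[sums[y][x] / counts[y][x] for x in range(w)] for y in range(h)]
```

In this sketch a pixel position that some shifted image does not cover is averaged over the remaining images only. Because speckle differs between scans along nearby scan lines while the underlying structure is largely the same, averaging the aligned images tends to suppress the speckle. A practical system would likely need sub-pixel or more robust registration than this brute-force search.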