A three-dimensional tomographic image (B) is formed from a plurality of two-dimensional tomographic images obtained by scanning an ocular fundus. For each tomographic image, the contour of a given 2D region (M1, M2, M3, M4) is determined. The volume of a given 3D region is then calculated by correcting either the area of each 2D region bounded by its determined contour, or the accumulated value of those areas, with an image correction coefficient that depends on the diopter of the subject's eye. This removes the influence of diopter correction from the calculated volume, so that subjects' eyes of different diopters can be compared quantitatively.
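The volume calculation described above can be illustrated with a minimal sketch. It rests on several assumptions that the text does not state: each contour is a closed polygon of (x, y) vertices, the tomographic images are evenly spaced, and the image correction coefficient is a single multiplicative factor applied to area. The names `contour_area`, `corrected_volume`, `slice_spacing`, and `correction_coeff` are hypothetical, and how the coefficient is derived from the diopter is left out because the text does not specify it.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def contour_area(contour: Sequence[Point]) -> float:
    """Area enclosed by a closed polygonal contour (shoelace formula)."""
    n = len(contour)
    twice_area = 0.0
    for i in range(n):
        x0, y0 = contour[i]
        x1, y1 = contour[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def corrected_volume(
    contours: Sequence[Sequence[Point]],
    slice_spacing: float,
    correction_coeff: float,
) -> float:
    """Sum the corrected 2D areas and multiply by the slice spacing.

    Assumption: correction_coeff is a multiplicative area factor that the
    caller has already chosen for the diopter of the subject's eye.
    """
    corrected_areas = [correction_coeff * contour_area(c) for c in contours]
    return sum(corrected_areas) * slice_spacing


# Three identical unit-square contours, one per tomographic image.
unit_square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
volume = corrected_volume([unit_square] * 3, slice_spacing=0.5, correction_coeff=1.2)
```

Under this multiplicative assumption, correcting each area before summation and correcting the accumulated area afterwards give the same volume, which matches the two alternatives named in the text.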