An image processing apparatus includes a display control unit and a specifying unit. The display control unit causes a display unit to display a subtraction image between a plurality of fundus images of an eye, the fundus images corresponding to a plurality of three-dimensional tomographic images obtained by capturing images of the eye at different times. The specifying unit specifies, based on a position specified on the displayed subtraction image, a plurality of two-dimensional tomographic images, in the plurality of three-dimensional tomographic images, to be displayed on the display unit.
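The flow described above can be illustrated with a minimal sketch. It is not the disclosed implementation, and it rests on several assumptions that the abstract does not state: each three-dimensional tomographic image is a NumPy array indexed as (depth, y, x), the fundus image is obtained by averaging the volume along depth, the volumes are already registered to each other, and the two-dimensional tomographic image chosen for a position is the cross-section at that position's y coordinate. All function names are hypothetical.

```python
import numpy as np


def fundus_projection(volume):
    """Collapse a (depth, y, x) volume along depth into a 2-D fundus-like image.

    Averaging along depth is an assumption; the abstract does not say how the
    fundus images are obtained.
    """
    return volume.mean(axis=0)


def subtraction_image(earlier_volume, later_volume):
    """Difference of the two projections (later minus earlier).

    Assumes both volumes have the same shape and are already registered.
    """
    if earlier_volume.shape != later_volume.shape:
        raise ValueError("volumes must have the same shape")
    return fundus_projection(later_volume) - fundus_projection(earlier_volume)


def select_tomograms(volumes, position):
    """Return one (depth, x) cross-section per volume for a specified position.

    `position` is an (x, y) pixel coordinate on the subtraction image. Taking
    the cross-section at row y is an assumption made for illustration.
    """
    x, y = position
    selected = []
    for volume in volumes:
        _, height, width = volume.shape
        if not (0 <= y < height and 0 <= x < width):
            raise IndexError("position lies outside the subtraction image")
        selected.append(volume[:, y, :])
    return selected


# Synthetic data: the later volume differs from the earlier one in one column.
earlier = np.zeros((4, 3, 5))
later = np.zeros((4, 3, 5))
later[:, 1, 2] = 2.0

difference = subtraction_image(earlier, later)

# Stand-in for a user click: pick the pixel with the largest change.
row, col = np.unravel_index(np.argmax(difference), difference.shape)
tomograms = select_tomograms([earlier, later], (int(col), int(row)))
```

In an apparatus of this kind the position would come from a pointing device on the displayed subtraction image rather than from `argmax`; the sketch uses the maximum only so that it runs without a user interface.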