In an image processing apparatus, an extracting unit extracts the same region of interest from each of a plurality of pieces of three-dimensional image data corresponding to mutually different time phases. A position determining unit determines, on the basis of feature points included in the pieces of three-dimensional image data, a position at which the regions of interest extracted from the respective pieces of three-dimensional image data are to be superimposed so that they lie in substantially the same position of a subject. A display controlling unit then assigns a mutually different display format to each of the extracted regions of interest and causes a superimposed image to be displayed, in which the regions of interest are superimposed on one another in the position determined by the position determining unit.
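
The three units described above can be illustrated with a minimal sketch in Python using NumPy. It is not the apparatus's actual implementation; every function name here (`extract_roi`, `estimate_offset`, `shift_mask`, `superimpose`) is hypothetical. It assumes that the region of interest can be found by simple thresholding, that the feature points are already paired between the two time phases, that the alignment is a pure integer translation, and that "mutually different display formats" means different color channels.

```python
import numpy as np

def extract_roi(volume, threshold):
    """Hypothetical extracting unit: voxels at or above a threshold form the ROI."""
    return volume >= threshold

def estimate_offset(points_ref, points_mov):
    """Hypothetical position determining unit.

    Assumes paired feature points and a translation-only model: the offset is
    the mean displacement from the moving points to the reference points,
    rounded to whole voxels.
    """
    diff = np.asarray(points_ref, dtype=float) - np.asarray(points_mov, dtype=float)
    return np.rint(diff.mean(axis=0)).astype(int)

def shift_mask(mask, offset):
    """Translate a boolean mask by an integer offset; voxels leaving the grid are dropped."""
    out = np.zeros_like(mask)
    src, dst = [], []
    for size, d in zip(mask.shape, offset):
        d = max(-size, min(size, int(d)))  # clamp so the slices stay the same length
        if d >= 0:
            src.append(slice(0, size - d))
            dst.append(slice(d, size))
        else:
            src.append(slice(-d, size))
            dst.append(slice(0, size + d))
    out[tuple(dst)] = mask[tuple(src)]
    return out

def superimpose(roi_ref, roi_mov_aligned):
    """Hypothetical display controlling unit: one color channel per time phase."""
    rgb = np.zeros(roi_ref.shape + (3,), dtype=np.uint8)
    rgb[roi_ref, 0] = 255          # reference phase shown in red
    rgb[roi_mov_aligned, 1] = 255  # other phase shown in green
    return rgb

# Synthetic data: the same cubic structure, displaced by (1, 0, 2) voxels in phase B.
phase_a = np.zeros((8, 8, 8))
phase_a[2:5, 2:5, 2:5] = 1.0
phase_b = np.zeros((8, 8, 8))
phase_b[3:6, 2:5, 4:7] = 1.0

# Paired feature points (assumed to be given, e.g. two corners of the structure).
points_a = [(2, 2, 2), (4, 4, 4)]
points_b = [(3, 2, 4), (5, 4, 6)]

offset = estimate_offset(points_a, points_b)
roi_a = extract_roi(phase_a, 0.5)
roi_b = shift_mask(extract_roi(phase_b, 0.5), offset)
overlay = superimpose(roi_a, roi_b)
```

In this sketch, voxels where both phases coincide carry both the red and the green channel, while voxels occupied in only one phase keep a single color, so any residual difference between time phases remains visible in the superimposed result. A real system would likely need a more general transform (rotation, scaling, or non-rigid deformation) than the translation assumed here.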