An endoscope image processing apparatus includes a region-of-interest detection apparatus configured to sequentially receive observation images obtained by performing image pickup of an object and perform processing for detecting a region of interest for each of the observation images, and a processor. The processor calculates an appearance time period as an elapsed time period from a time when the region of interest appears within the observation image when the region-of-interest detection apparatus detects the region of interest, and starts emphasis processing for emphasizing a position of the region of interest existing within the observation image at a timing at which the appearance time period reaches a predetermined time period.