An image processing device includes a processor comprising hardware, wherein the processor is configured to execute: acquiring intraluminal images; generating, for each of the intraluminal images, lesion information by estimating a visual point with respect to a lesion region extracted from that intraluminal image and analyzing a three-dimensional structure of the lesion region, the lesion information indicating which of a top portion, a rising portion, and a marginal protruding portion of the lesion region appears in the image; and extracting, based on the lesion information, a target image satisfying a prescribed condition from the intraluminal images.
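The final extraction step can be illustrated with a minimal Python sketch. It assumes the lesion information has already been generated for each image; the visual point estimation and three-dimensional structure analysis are not modeled here. The names `LesionPart`, `LesionInfo`, `extract_target_images`, and `shows_top`, the image identifiers, and the example condition (the image shows the top portion) are all hypothetical and chosen only for illustration; the abstract does not define the prescribed condition.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable, FrozenSet, Iterable, List


class LesionPart(Enum):
    """The three portions named in the abstract."""
    TOP = auto()
    RISING = auto()
    MARGINAL_PROTRUSION = auto()


@dataclass(frozen=True)
class LesionInfo:
    """Hypothetical per-image record of which lesion portions are indicated."""
    image_id: str
    parts: FrozenSet[LesionPart]


def extract_target_images(
    infos: Iterable[LesionInfo],
    condition: Callable[[LesionInfo], bool],
) -> List[str]:
    """Return the ids of images whose lesion information satisfies the condition."""
    return [info.image_id for info in infos if condition(info)]


def shows_top(info: LesionInfo) -> bool:
    """Example condition (an assumption): the top portion is indicated."""
    return LesionPart.TOP in info.parts


# Made-up lesion information for three intraluminal images.
infos = [
    LesionInfo("img_001", frozenset({LesionPart.TOP})),
    LesionInfo("img_002", frozenset({LesionPart.RISING})),
    LesionInfo("img_003", frozenset({LesionPart.TOP, LesionPart.MARGINAL_PROTRUSION})),
]

targets = extract_target_images(infos, shows_top)
```

Passing the condition as a function keeps the extraction step independent of any particular criterion, so a different prescribed condition could be substituted without changing the filter itself.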