An image processing device includes an image sequence acquisition section (200) that acquires an input image sequence that includes first to N-th images, and a processing section (100) that performs an image summarization process that deletes some of the first to N-th images to generate a summary image sequence, the processing section (100) selecting an s-th (s is an integer that satisfies 0≤s≤N+1) image to be a provisional summary image, selecting a t-th (t is an integer that satisfies 0≤t≤s-1) image to be a provisional preceding summary image, selecting a u-th (u is an integer that satisfies t