The image processing device includes: a first image acquisition section that acquires a first image, the first image being an image that includes an object image including information within a wavelength band of white light; a second image acquisition section that acquires a second image, the second image being an image that includes an object image including information within a specific wavelength band; a candidate attention area detection section that detects a candidate attention area based on a feature quantity of each pixel within the second image, the candidate attention area being a candidate for an attention area; a reliability calculation section that calculates reliability that indicates a likelihood that the candidate attention area is the attention area; and a display mode setting section that performs a display mode setting process that sets a display mode of an output image corresponding to the reliability calculated by the reliability calculation section.
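The flow of data between the sections described above may be illustrated by the following minimal sketch. It is not the disclosed implementation. Every function name, the threshold values, the use of the raw pixel value as the feature quantity, the use of the covered area fraction as the reliability, and the highlighting of pixels as the display mode are assumptions introduced solely for illustration. Images are represented as nested lists of values in the range 0 to 1.

```python
from dataclasses import dataclass
from typing import List, Tuple

Image = List[List[float]]  # hypothetical single-channel image, values in [0, 1]


@dataclass
class CandidateArea:
    pixels: List[Tuple[int, int]]  # (row, column) positions in the second image


def detect_candidate_area(second_image: Image, threshold: float = 0.5) -> CandidateArea:
    """Collect pixels whose feature quantity meets a threshold.

    Assumption: the feature quantity is the pixel value itself.
    """
    pixels = [
        (y, x)
        for y, row in enumerate(second_image)
        for x, value in enumerate(row)
        if value >= threshold
    ]
    return CandidateArea(pixels)


def calculate_reliability(candidate: CandidateArea, second_image: Image) -> float:
    """Return a likelihood score in [0, 1] for the candidate area.

    Assumption: the score is the fraction of the image covered by the candidate.
    """
    total = sum(len(row) for row in second_image)
    return len(candidate.pixels) / total if total else 0.0


def set_display_mode(
    first_image: Image,
    candidate: CandidateArea,
    reliability: float,
    min_reliability: float = 0.2,
) -> Image:
    """Build the output image from the first (white light) image.

    Assumption: when the reliability is high enough, candidate pixels are
    highlighted by setting them to 1.0; otherwise the first image is
    output unchanged.
    """
    output = [row[:] for row in first_image]  # copy so the input is not modified
    if reliability >= min_reliability:
        for y, x in candidate.pixels:
            output[y][x] = 1.0
    return output


# Usage with two tiny 2x2 images of the same scene.
first = [[0.1, 0.1], [0.1, 0.1]]   # white light image
second = [[0.9, 0.0], [0.0, 0.0]]  # specific wavelength band image
area = detect_candidate_area(second)
score = calculate_reliability(area, second)
result = set_display_mode(first, area, score)
```

In this sketch the candidate attention area is detected only from the second image, while the output image is derived from the first image, so that the display mode changes with the reliability without discarding the white light image.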