A base image includes a B image signal in which a ductal structure is brighter than a mucous membrane and capillary vessels are darker than the mucous membrane. The B image signal is subjected to a frequency filtering process that extracts the frequency components containing the ductal structure and the capillary vessels. This generates a structure-extracted image signal in which pixel values of the ductal structure are positive and pixel values of the capillary vessels are negative. Based on the structure-extracted image signal, a display controlling image is generated for enhancing the display of the ductal structure and suppressing the display of the capillary vessels. The base image is combined with the display controlling image to obtain a display-controlled image in which the display of the ductal structure is enhanced and the display of the capillary vessels is suppressed.
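The processing chain described above can be illustrated with a minimal sketch. The text does not specify the filter or the combining rule, so everything concrete here is an assumption made for illustration: the B image signal is taken to be channel index 2 of an 8-bit RGB array, the frequency filtering is a difference-of-Gaussians band-pass applied via the FFT (with circular boundary handling), the display controlling image adds the positive part of the structure signal and subtracts the negative part, and the combination is a per-channel addition. The function names and the parameters `sigma_low`, `sigma_high`, `enhance_gain`, and `suppress_gain` are hypothetical.

```python
import numpy as np


def band_pass(signal, sigma_low=1.0, sigma_high=4.0):
    """Assumed frequency filter: difference of two Gaussians, applied via the FFT.

    The transfer function is zero at DC, so the flat mucous-membrane level is
    removed. Structures brighter than their surroundings come out positive and
    darker ones come out negative.
    """
    h, w = signal.shape
    fy = np.fft.fftfreq(h)[:, None]  # vertical frequencies, cycles per pixel
    fx = np.fft.fftfreq(w)[None, :]  # horizontal frequencies, cycles per pixel
    r2 = fx ** 2 + fy ** 2
    # Fourier transform of a unit-area spatial Gaussian with standard deviation sigma.
    g_low = np.exp(-2.0 * np.pi ** 2 * sigma_low ** 2 * r2)
    g_high = np.exp(-2.0 * np.pi ** 2 * sigma_high ** 2 * r2)
    transfer = g_low - g_high
    return np.real(np.fft.ifft2(np.fft.fft2(signal) * transfer))


def display_controlling_image(structure, enhance_gain=1.0, suppress_gain=1.0):
    """Assumed rule: brighten positive (ductal) pixels further and lift
    negative (capillary) pixels back toward the mucous-membrane level."""
    positive = np.maximum(structure, 0.0)  # ductal structure component
    negative = np.minimum(structure, 0.0)  # capillary vessel component
    return enhance_gain * positive - suppress_gain * negative


def display_controlled_image(base_rgb, enhance_gain=1.0, suppress_gain=1.0):
    """Combine the base image with the display controlling image.

    Assumes an (H, W, 3) uint8 array with the B image signal at index 2 and
    an additive combination applied equally to every channel.
    """
    b_signal = base_rgb[..., 2].astype(float)
    structure = band_pass(b_signal)
    control = display_controlling_image(structure, enhance_gain, suppress_gain)
    combined = base_rgb.astype(float) + control[..., None]
    return np.clip(combined, 0, 255).astype(np.uint8)
```

Calling `display_controlled_image(base_rgb)` on such an array returns an array of the same shape and type. In this sketch the dark side lobes that the band-pass produces next to a bright structure are also negative, so they are lifted together with the capillary vessels; a practical implementation would likely need a more selective rule.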