An image processing apparatus performs positioning between a plurality of spectral band images obtained by capturing images inside a lumen using a plurality of rays of light having wavelength bands different from one another. The image processing apparatus includes: a spectral band image acquisition unit configured to acquire the spectral band images; a spatial frequency component extraction unit configured to extract feature data for each spatial frequency band from each pixel in at least one spectral band image of the spectral band images; a weight calculation unit configured to calculate weights for each spatial frequency band given to the at least one spectral band image, based on the feature data for each spatial frequency band extracted from each pixel in the at least one spectral band image; and a positioning unit configured to perform positioning between the spectral band images based on the weights for each spatial frequency band.
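
By way of illustration only, the flow through the extraction, weight calculation, and positioning units may be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the difference-of-Gaussians band decomposition, the weighting rule (mean absolute band amplitude, normalized to sum to one), the exhaustive integer-translation search, and all names (`gaussian_lowpass`, `band_features`, `band_weights`, `register`, `sigmas`, `max_shift`) are hypothetical choices that do not appear in the abstract. Two synthetic arrays stand in for the acquired spectral band images.

```python
import numpy as np

def gaussian_lowpass(img, sigma):
    """Gaussian low-pass filter applied in the Fourier domain (circular boundary)."""
    fy = np.fft.fftfreq(img.shape[0])[:, None]
    fx = np.fft.fftfreq(img.shape[1])[None, :]
    # Fourier transform of a Gaussian with standard deviation sigma (in pixels).
    transfer = np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))
    return np.real(np.fft.ifft2(np.fft.fft2(img) * transfer))

def band_features(img, sigmas=(1.0, 2.0, 4.0)):
    """Per-pixel feature data for each spatial frequency band.

    Assumption: bands are differences of successive Gaussian low-pass levels.
    """
    levels = [img] + [gaussian_lowpass(img, s) for s in sigmas]
    return [levels[i] - levels[i + 1] for i in range(len(sigmas))]

def band_weights(features):
    """One weight per band, derived from the per-pixel feature data.

    Assumption: weight is the mean absolute amplitude, normalized to sum to 1.
    """
    energy = np.array([np.mean(np.abs(f)) for f in features])
    return energy / energy.sum()

def register(template, target, sigmas=(1.0, 2.0, 4.0), max_shift=5):
    """Find the integer (dy, dx) roll of `target` that best matches `template`.

    The cost is the band-weighted mean squared difference of the band features.
    """
    t_feats = band_features(template, sigmas)
    weights = band_weights(t_feats)  # weights come from the template image only
    g_feats = band_features(target, sigmas)
    best, best_cost = (0, 0), np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            cost = sum(
                w * np.mean((tf - np.roll(gf, (dy, dx), axis=(0, 1))) ** 2)
                for w, tf, gf in zip(weights, t_feats, g_feats)
            )
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best, weights

# Synthetic stand-ins for two spectral band images: the second is a shifted copy.
rng = np.random.default_rng(0)
template = rng.random((32, 32))
target = np.roll(template, (3, -2), axis=(0, 1))
shift, weights = register(template, target)
print(shift, weights)  # shift is the roll that undoes the (3, -2) displacement
```

In this sketch the weights are computed from a single image (the template), mirroring the abstract's "at least one spectral band image"; real spectral band images differ in content as well as position, so a practical weighting rule would likely need to favor bands whose structure is shared across wavelengths.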