Provided is an image processing apparatus capable of accurate positioning between a plurality of spectral band images acquired by a sequential lighting endoscope. An image processing apparatus 1 performs positioning between a plurality of spectral band images obtained by capturing images inside a lumen using a plurality of rays of light having wavelength bands different from one another. The image processing apparatus 1 includes: a spectral band image acquisition unit 110 that acquires the plurality of spectral band images; a spatial frequency component extraction unit 120 that extracts feature data for each spatial frequency band from each pixel in at least one spectral band image among the plurality of spectral band images; a weight calculation unit 130 that calculates, based on the feature data extracted from each pixel, weights for each spatial frequency band to be given to the at least one spectral band image; and a positioning unit 140 that performs positioning between the plurality of spectral band images based on the weights.
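
The flow through units 120, 130, and 140 can be illustrated with a minimal sketch. It is not the claimed implementation, and every concrete choice in it is an assumption made for illustration: the function names (`band_images`, `band_weights`, `register_translation`), the difference-of-Gaussians band split, the mean absolute response used as feature data, the normalised weights, and the exhaustive integer-translation search with circular boundaries. The abstract does not specify any of these.

```python
import numpy as np

def gaussian_lowpass(img, sigma):
    """Blur img with a Gaussian of std sigma (pixels) via the FFT (circular boundary)."""
    fy = np.fft.fftfreq(img.shape[0])[:, None]
    fx = np.fft.fftfreq(img.shape[1])[None, :]
    transfer = np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))
    return np.real(np.fft.ifft2(np.fft.fft2(img) * transfer))

def band_images(img, sigmas=(1.0, 2.0, 4.0)):
    """Assumed stand-in for unit 120: split img into band-pass images, fine to coarse."""
    img = img.astype(float)
    levels = [img] + [gaussian_lowpass(img, s) for s in sigmas]
    return [levels[i] - levels[i + 1] for i in range(len(sigmas))]

def band_weights(bands):
    """Assumed stand-in for unit 130: mean absolute response per band, normalised to sum to 1."""
    feats = np.array([np.mean(np.abs(b)) for b in bands])
    total = feats.sum()
    if total == 0:
        # Flat image: no band carries structure, so fall back to equal weights.
        return np.full(len(bands), 1.0 / len(bands))
    return feats / total

def register_translation(template, target, max_shift=5, sigmas=(1.0, 2.0, 4.0)):
    """Assumed stand-in for unit 140.

    Returns the integer (dy, dx) for which np.roll(target, (dy, dx), axis=(0, 1))
    best matches template under a band-weighted mean squared difference.
    """
    t_bands = band_images(template, sigmas)
    g_bands = band_images(target, sigmas)
    weights = band_weights(t_bands)  # weights come from the template image only
    best, best_cost = (0, 0), np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            cost = sum(
                w * np.mean((tb - np.roll(gb, (dy, dx), axis=(0, 1))) ** 2)
                for w, tb, gb in zip(weights, t_bands, g_bands)
            )
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    template = rng.random((32, 32))                    # synthetic "template" band image
    target = np.roll(template, (3, -2), axis=(0, 1))   # second band image, displaced
    print(register_translation(template, target))
```

In this sketch the weights are computed from a single template image, mirroring the wording "at least one spectral band image", and the same weights score every candidate displacement. Real endoscopic images would call for sub-pixel or non-rigid alignment and non-circular boundary handling, which the sketch leaves out.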