An image processing apparatus generates one image by using at least one frame from each of a plurality of moving images, the moving images being obtained by imaging a plurality of different regions of an eye at different times. The apparatus includes a deciding unit configured to decide the at least one frame in each of the plurality of moving images so that the regions actually captured by the plurality of moving images across the plurality of regions are included, and an image generating unit configured to generate the one image by using the at least one frame decided from each of the plurality of moving images.
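
The two units described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the apparatus's actual method: the names `coverage`, `decide_frame`, and `generate_image` are hypothetical, each frame is modeled as a pixel grid plus a boolean mask marking the pixels that were actually captured, the deciding step simply picks the frame with the largest captured area, and each moving image is assumed to have a known offset in the output image.

```python
from typing import List, Optional, Tuple

Pixels = List[List[int]]      # grayscale values, row-major
Mask = List[List[bool]]       # True where the pixel was actually captured
Frame = Tuple[Pixels, Mask]   # hypothetical frame model (assumption)


def coverage(mask: Mask) -> int:
    """Count the pixels that were actually captured in a frame."""
    return sum(1 for row in mask for captured in row if captured)


def decide_frame(video: List[Frame]) -> int:
    """Deciding step (assumed criterion): index of the frame with the
    largest captured area. Ties resolve to the earliest frame."""
    return max(range(len(video)), key=lambda i: coverage(video[i][1]))


def generate_image(
    videos: List[List[Frame]],
    offsets: List[Tuple[int, int]],
    shape: Tuple[int, int],
) -> List[List[Optional[int]]]:
    """Generating step: paste the decided frame of each moving image onto
    one canvas at that moving image's (row, col) offset. Pixels never
    captured by any decided frame stay None."""
    rows, cols = shape
    canvas: List[List[Optional[int]]] = [[None] * cols for _ in range(rows)]
    for video, (r0, c0) in zip(videos, offsets):
        pixels, mask = video[decide_frame(video)]
        for r, mask_row in enumerate(mask):
            for c, captured in enumerate(mask_row):
                if captured:
                    canvas[r0 + r][c0 + c] = pixels[r][c]
    return canvas
```

A caller would pass one list of frames per imaged region together with that region's offset. A real implementation would likely also register the frames against each other and weigh image quality, which this sketch omits.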