An information processing apparatus acquires, for each of a plurality of three-dimensional images, information about a position where a two-dimensional image included in that three-dimensional image is present. Based on an instruction about a position of a two-dimensional image to be displayed at a display unit, the apparatus identifies a three-dimensional image to be a target of the instruction. Based on information about the position specified by the instruction and on the acquired information about the position where the two-dimensional image is present for each of the plurality of three-dimensional images, the apparatus then identifies a two-dimensional image that is included in the identified three-dimensional image and that is to be displayed at the display unit.
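The three operations described above (acquiring position information, identifying the target three-dimensional image, and identifying the two-dimensional image to display) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than part of the disclosure: the names `Instruction`, `acquire_slice_positions`, `identify_target_volume`, and `identify_slice` are hypothetical, slice positions are reduced to a single coordinate in millimetres, the instruction is assumed to name its target image directly, and nearest-position matching is one plausible selection rule among several.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    """Hypothetical user instruction: which 3D image and which position."""
    volume_id: str      # assumed identifier of the 3D image being operated on
    position_mm: float  # assumed requested slice position along one axis


def acquire_slice_positions(volumes):
    """For each 3D image, collect the position of every 2D image it contains."""
    return {vid: list(positions) for vid, positions in volumes.items()}


def identify_target_volume(instruction, slice_positions):
    """Identify the 3D image that the instruction targets."""
    if instruction.volume_id not in slice_positions:
        raise KeyError(f"unknown 3D image: {instruction.volume_id}")
    return instruction.volume_id


def identify_slice(instruction, slice_positions):
    """Return (volume id, slice index) of the 2D image to display.

    Assumption: the slice whose position is closest to the instructed
    position is chosen; on a tie, the lower index wins.
    """
    target = identify_target_volume(instruction, slice_positions)
    positions = slice_positions[target]
    if not positions:
        raise ValueError(f"3D image {target} contains no 2D images")
    index = min(
        range(len(positions)),
        key=lambda i: abs(positions[i] - instruction.position_mm),
    )
    return target, index


# Two hypothetical 3D images with different slice spacing.
volumes = {
    "ct_2020": [0.0, 5.0, 10.0, 15.0],
    "ct_2021": [2.0, 4.0, 6.0, 8.0, 10.0],
}
table = acquire_slice_positions(volumes)
print(identify_slice(Instruction("ct_2021", 7.1), table))  # ('ct_2021', 3)
```

Keeping the position table for all three-dimensional images in one mapping means the same instructed position can be resolved against any of them, which is useful when the images have different slice spacing.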