An endoscopic image processing apparatus is configured to create a three-dimensional shape model of an object by processing an endoscopic image group of an inside of the object, and includes a processor. The processor estimates a self-position of an image pickup device based on the endoscopic image group, and calculates a first displacement amount corresponding to a displacement amount of the image pickup device based on a result of the estimation of the self-position. The processor also calculates a second displacement amount corresponding to a displacement amount in a direction parallel to a longitudinal axis direction of an insertion portion inserted into the object, based on a detection signal outputted from an insertion/removal state detection device that detects an insertion/removal state of the insertion portion. The processor generates scale information in which the first displacement amount and the second displacement amount are associated with each other.
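The association between the two displacement amounts can be illustrated with a minimal Python sketch. It is not the claimed implementation. The names `ScaleInfo`, `first_displacement`, `second_displacement`, and `make_scale_info` are hypothetical. The sketch also assumes that the estimated self-positions are 3-D points in arbitrary (unitless) coordinates, that the detection signal has already been converted to an insertion length in millimetres, and that the association is stored as a pair plus a ratio. The abstract states only that the two amounts are associated with each other.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ScaleInfo:
    """Hypothetical record pairing the two displacement amounts."""
    first_displacement: float          # from estimated self-positions (arbitrary units)
    second_displacement_mm: float      # from the insertion/removal detector (assumed mm)
    mm_per_unit: Optional[float]       # assumed ratio; None when the first amount is ~0


def first_displacement(p0: Sequence[float], p1: Sequence[float]) -> float:
    """Euclidean distance between two estimated self-positions."""
    return math.dist(p0, p1)


def second_displacement(length0_mm: float, length1_mm: float) -> float:
    """Displacement along the longitudinal axis, taken here as the change
    in insertion length reported by the detector (an assumption)."""
    return abs(length1_mm - length0_mm)


def make_scale_info(p0, p1, length0_mm, length1_mm, eps: float = 1e-9) -> ScaleInfo:
    """Associate the two displacement amounts measured over the same interval."""
    d1 = first_displacement(p0, p1)
    d2 = second_displacement(length0_mm, length1_mm)
    ratio = d2 / d1 if d1 > eps else None  # avoid dividing by a near-zero motion
    return ScaleInfo(d1, d2, ratio)


# Example: the estimated position moves 2.0 units while the insertion
# length changes from 100 mm to 130 mm.
info = make_scale_info((0.0, 0.0, 0.0), (0.0, 0.0, 2.0), 100.0, 130.0)
```

Under these assumptions, a ratio such as `info.mm_per_unit` could be used to convert a unitless three-dimensional shape model into physical units. How the scale information is actually applied is not stated in the abstract.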