In a three-dimensional image forming device for forming a phantom three-dimensional image from images of an inner face of a tubular structure to be observed, an imaging device whose optical axis extends in the axial direction of the tubular structure is moved through the structure under prescribed lighting conditions, and luminance information is obtained for the pixels corresponding to a prescribed range of each frame image. The relative distance in the depth direction between each point on the inner face and an objective lens is calculated from the luminance information, the pixels corresponding to the prescribed range of each frame image are arranged so as to reflect the relative distance, and the arranged pixels are combined over a plurality of the frame images to form a three-dimensional image of the inner face of the tubular structure.
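
The processing steps described above can be illustrated with a minimal Python sketch. It is not the claimed implementation, and several details are assumptions that the abstract does not state: the "prescribed range" is taken to be a circle of pixels centred on the optical axis, the luminance-to-distance relation is taken to be an inverse-square falloff (luminance proportional to 1/d², so d proportional to 1/sqrt(luminance)), the relative distance is used directly as the radial coordinate of each point, and the imaging device is assumed to advance by a constant `step` between frames. All function and parameter names (`ring_luminance`, `relative_distance`, `build_surface`, `center`, `radius`, `step`) are hypothetical.

```python
import numpy as np


def ring_luminance(frame, center, radius, n_samples=360):
    """Sample luminance along a circle (the assumed 'prescribed range').

    frame is a 2-D grayscale array; center is (row, col) of the optical axis.
    """
    cy, cx = center
    theta = np.linspace(0.0, 2.0 * np.pi, n_samples, endpoint=False)
    ys = np.clip(np.round(cy + radius * np.sin(theta)).astype(int), 0, frame.shape[0] - 1)
    xs = np.clip(np.round(cx + radius * np.cos(theta)).astype(int), 0, frame.shape[1] - 1)
    return theta, frame[ys, xs].astype(float)


def relative_distance(luminance, eps=1e-6):
    """Assumed inverse-square model: brighter pixels are closer to the lens.

    Only relative values are meaningful; the overall scale is arbitrary.
    """
    return 1.0 / np.sqrt(np.maximum(luminance, eps))


def build_surface(frames, center, radius, step=1.0, n_samples=360):
    """Combine one ring per frame into an (n_frames, n_samples, 3) point array."""
    rings = []
    for k, frame in enumerate(frames):
        theta, lum = ring_luminance(frame, center, radius, n_samples)
        d = relative_distance(lum)
        # Assumption: the relative distance is used as the radial coordinate.
        x = d * np.cos(theta)
        y = d * np.sin(theta)
        # Assumption: the imaging device advances by a constant step per frame.
        z = np.full_like(d, k * step)
        rings.append(np.stack([x, y, z], axis=-1))
    return np.stack(rings, axis=0)
```

Stacking one ring per frame along the axial coordinate is what turns the per-frame depth estimates into a surface. A real device would also need calibrated lighting and a measured luminance-to-distance relation, neither of which is modelled here.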