Provided is an image processing apparatus capable of displaying, with high visual perceptibility to a user, an image that provides guidance in moving an instrument to a target part in a subject. The image processing apparatus may be an ultrasound diagnostic apparatus including: a three-dimensional image analyzer that determines a target position indicating a three-dimensional position of the target part, based on a three-dimensional image including the target part; a position information acquirer that acquires an instrument position indicating a three-dimensional position of the instrument; a display state determiner that selects one display state from at least two display states, based on a positional relationship between the target part and the instrument; an assist image generator that generates an assist image for the selected display state by using the target position and the instrument position; and a display controller that performs control for outputting the assist image generated by the assist image generator to a display device.
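The flow from positions to a selected display state can be illustrated with a minimal sketch. The abstract states only that the display state is selected from at least two states based on the positional relationship between the target part and the instrument; the distance-threshold rule, the state names `OVERVIEW` and `CLOSE_UP`, the 20 mm default, and every function and class name below are hypothetical choices made for illustration, not details taken from the apparatus described.

```python
import math
from dataclasses import dataclass
from enum import Enum
from typing import Tuple

# A three-dimensional position, assumed here to be in millimetres.
Point3D = Tuple[float, float, float]


class DisplayState(Enum):
    """Two hypothetical display states; the abstract requires at least two."""
    OVERVIEW = "overview"   # assumed: used while the instrument is far from the target
    CLOSE_UP = "close_up"   # assumed: used once the instrument is near the target


@dataclass
class AssistImage:
    """Placeholder for the assist image: it records what would be rendered."""
    state: DisplayState
    target_position: Point3D
    instrument_position: Point3D
    distance_mm: float


def select_display_state(target: Point3D, instrument: Point3D,
                         threshold_mm: float = 20.0) -> DisplayState:
    """Select a display state from the positional relationship.

    Assumption: the relationship is reduced to Euclidean distance and
    compared against a single threshold.
    """
    distance = math.dist(target, instrument)
    if distance < threshold_mm:
        return DisplayState.CLOSE_UP
    return DisplayState.OVERVIEW


def generate_assist_image(target: Point3D, instrument: Point3D,
                          threshold_mm: float = 20.0) -> AssistImage:
    """Build the assist image description for the selected display state."""
    state = select_display_state(target, instrument, threshold_mm)
    return AssistImage(state, target, instrument, math.dist(target, instrument))


if __name__ == "__main__":
    # Target at the origin, instrument tip 50 mm away along the z axis.
    print(generate_assist_image((0.0, 0.0, 0.0), (0.0, 0.0, 50.0)))
```

In this sketch the display controller is reduced to a `print` call. An actual apparatus would render the assist image on a display device, and could base the selection on criteria other than distance, such as the orientation of the instrument relative to the target part.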