An endoscope apparatus includes a processor. The processor controls a focus position of an objective optical system, acquires images sequentially captured by an image sensor, and combines N frames of the captured images into one frame of a depth of field extended image. The processor controls the focus position such that the focus positions at the timings when the respective N frames are captured differ from each other. The N frames that the processor combines into the depth of field extended image are either frames captured while the quantity of light emission of the illumination light is controlled to be constant, or frames that have undergone a correction process that makes image brightness constant.
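The combining step can be illustrated with a minimal sketch. It is not the apparatus's actual method: the abstract does not say how the correction process or the combination is performed, so this example assumes a simple gain correction that matches each frame's mean brightness to the first frame, followed by a per-pixel selection of the frame with the highest local contrast (absolute Laplacian). The function names `equalize_brightness`, `sharpness`, and `combine_edof` are hypothetical, and images are plain lists of rows of grayscale values.

```python
def mean_brightness(img):
    """Average pixel value of a grayscale image (list of rows)."""
    total = sum(sum(row) for row in img)
    return total / (len(img) * len(img[0]))


def equalize_brightness(frames):
    """Assumed correction process: scale every frame to the mean
    brightness of the first frame so brightness is constant across frames."""
    target = mean_brightness(frames[0])
    corrected = []
    for img in frames:
        m = mean_brightness(img)
        gain = target / m if m > 0 else 1.0
        corrected.append([[p * gain for p in row] for row in img])
    return corrected


def sharpness(img, y, x):
    """Absolute 4-neighbour Laplacian at (y, x); borders are clamped."""
    h, w = len(img), len(img[0])
    c = img[y][x]
    up = img[max(y - 1, 0)][x]
    down = img[min(y + 1, h - 1)][x]
    left = img[y][max(x - 1, 0)]
    right = img[y][min(x + 1, w - 1)]
    return abs(4 * c - up - down - left - right)


def combine_edof(frames):
    """Combine N frames, each captured at a different focus position,
    into one frame by keeping the sharpest pixel at every location."""
    frames = equalize_brightness(frames)
    h, w = len(frames[0]), len(frames[0][0])
    result = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Pick the frame with the highest local contrast at this pixel.
            best = max(frames, key=lambda f: sharpness(f, y, x))
            result[y][x] = best[y][x]
    return result


# Usage: two 1x6 frames, one in focus on the left half, one on the right.
near_focus = [[0, 100, 0, 50, 50, 50]]
far_focus = [[50, 50, 50, 0, 100, 0]]
combined = combine_edof([near_focus, far_focus])
```

The sketch equalizes brightness before comparing sharpness because a frame that is merely brighter would otherwise show larger pixel differences and be selected regardless of focus. That is one plausible reading of why the abstract requires constant illumination or a brightness correction before combining.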