A conventional method that calculates the amount of eyeball movement between acquired images by extracting characteristic images of the fundus and comparing them is excellent in precision, reproducibility and stability, but requires time for image processing. This problem can be solved by a tracking apparatus including: a fundus imaging apparatus for acquiring fundus images; and a measurement unit that extracts a characteristic image from a first fundus image captured by the fundus imaging apparatus, detects the characteristic image in a second fundus image that is different from the first fundus image, and measures a position change between the fundus images from the coordinates of the extracted characteristic image and the detected characteristic image in the respective fundus images, wherein the region of the second fundus image that is searched for the characteristic image is determined so that it includes the region from which the characteristic image was extracted in the first fundus image and is broader than the range of movement of the characteristic image resulting from movements of the eyeball within the measurement time.
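The search-region rule described above can be illustrated with a minimal sketch in Python. It is not the apparatus's actual implementation. The function names (`search_region`, `match_in_region`, `measure_shift`, `make_image`), the pixel margin `max_shift`, and the use of a sum of squared differences (SSD) as the matching score are all assumptions made for illustration; the text does not specify a matching criterion. The sketch only shows the idea that the second image is searched in a window that contains the original location of the characteristic image, padded by a margin larger than the expected eye movement, rather than over the whole image.

```python
def search_region(top, left, h, w, max_shift, img_h, img_w):
    """Window around the template's original position, padded by max_shift
    pixels on every side and clipped to the image bounds.
    max_shift is a hypothetical margin, assumed to exceed the largest
    displacement the eye can produce within the measurement time."""
    r0 = max(0, top - max_shift)
    c0 = max(0, left - max_shift)
    r1 = min(img_h, top + h + max_shift)
    c1 = min(img_w, left + w + max_shift)
    return r0, c0, r1, c1


def match_in_region(image, template, region):
    """Return (row, col) of the best match inside the region.
    SSD is an assumed matching score, not one named in the text."""
    r0, c0, r1, c1 = region
    h, w = len(template), len(template[0])
    best_score, best_pos = None, None
    for r in range(r0, r1 - h + 1):
        for c in range(c0, c1 - w + 1):
            ssd = 0
            for i in range(h):
                for j in range(w):
                    d = image[r + i][c + j] - template[i][j]
                    ssd += d * d
            if best_score is None or ssd < best_score:
                best_score, best_pos = ssd, (r, c)
    return best_pos


def measure_shift(first, second, top, left, h, w, max_shift):
    """Extract the characteristic image from the first image, detect it in
    the restricted region of the second image, and return (d_row, d_col)."""
    template = [row[left:left + w] for row in first[top:top + h]]
    region = search_region(top, left, h, w, max_shift,
                           len(second), len(second[0]))
    row, col = match_in_region(second, template, region)
    return row - top, col - left


def make_image(size, patch, top, left):
    """Synthetic test image: zeros with one distinctive patch."""
    img = [[0] * size for _ in range(size)]
    for i, patch_row in enumerate(patch):
        for j, value in enumerate(patch_row):
            img[top + i][left + j] = value
    return img


patch = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
first = make_image(12, patch, 4, 5)    # feature at row 4, col 5
second = make_image(12, patch, 5, 7)   # feature moved by (+1, +2)
shift = measure_shift(first, second, 4, 5, 3, 3, max_shift=3)
print(shift)  # (1, 2)
```

In this toy case the 12 x 12 image is searched over a 9 x 9 window instead of its full extent. The saving grows with image size, since the cost depends on the margin and not on the image dimensions.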