A control unit 30 includes: an image storage unit 31 comprising a first image storage unit 32 that stores multiple templates created on the basis of an image including a specific site of a subject, and a second image storage unit 33 that stores multiple positive images created on the basis of an image including the specific site of the subject; a learning unit 34 that creates a discriminator by machine learning on the basis of the multiple positive images; a position selection unit 35 that, using multiple images, each including the specific site of the subject, acquired at a predetermined frame rate, selects a region including the specific site by machine learning using the discriminator; and a position detection unit 36 that detects the position of the specific site by performing template matching, using the multiple templates, on the region selected by the position selection unit 35.
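The two-stage flow described above (coarse region selection by a discriminator, then template matching restricted to that region) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the source: the function names `select_region` and `detect_position`, the sliding-window search, the use of normalized cross-correlation as the matching score, and the synthetic frame. In particular, `score_fn` is only a placeholder for the trained discriminator of the learning unit 34; here it is a simple mean-brightness score rather than a learned model.

```python
import numpy as np

def ncc(patch, template):
    """Normalized cross-correlation between two equally sized arrays."""
    p = patch - patch.mean()
    t = template - template.mean()
    denom = np.sqrt((p * p).sum() * (t * t).sum())
    return float((p * t).sum() / denom) if denom > 0 else 0.0

def select_region(frame, score_fn, win, stride):
    """Stage 1 (hypothetical): slide a window over the frame and keep the
    window that the discriminator stand-in `score_fn` scores highest."""
    best, best_yx = -np.inf, (0, 0)
    h, w = frame.shape
    for y in range(0, h - win + 1, stride):
        for x in range(0, w - win + 1, stride):
            s = score_fn(frame[y:y + win, x:x + win])
            if s > best:
                best, best_yx = s, (y, x)
    return best_yx

def detect_position(frame, region_yx, win, templates):
    """Stage 2 (hypothetical): match every template, but only inside the
    selected region; return the best top-left corner in frame coordinates."""
    y0, x0 = region_yx
    region = frame[y0:y0 + win, x0:x0 + win]
    best, best_yx = -np.inf, (y0, x0)
    for tpl in templates:
        th, tw = tpl.shape
        for y in range(win - th + 1):
            for x in range(win - tw + 1):
                s = ncc(region[y:y + th, x:x + tw], tpl)
                if s > best:
                    best, best_yx = s, (y0 + y, x0 + x)
    return best_yx, best

# Synthetic frame: weak noise plus one bright blob standing in for the
# "specific site", planted with its top-left corner at (30, 22).
rng = np.random.default_rng(0)
frame = 0.01 * rng.random((64, 64))
g = np.exp(-((np.arange(8) - 3.5) ** 2) / 4.0)
blob = np.outer(g, g)
frame[30:38, 22:30] += blob

# Two templates: the blob itself and a slightly wider variant.
g_wide = np.exp(-((np.arange(8) - 3.5) ** 2) / 6.0)
templates = [blob, np.outer(g_wide, g_wide)]

# Placeholder discriminator: mean brightness of the window (an assumption).
region = select_region(frame, lambda w: float(w.mean()), win=16, stride=8)
pos, score = detect_position(frame, region, 16, templates)
print(region, pos, round(score, 3))
```

Restricting the exhaustive template search to one selected window is what keeps the second stage cheap. For images collected at a predetermined frame rate, the same two calls would simply be repeated for each incoming frame.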