A controller has a function that: polygonizes three-dimensional volume data generated by a modality, converting it into polygon data; divides the polygon data into a plurality of clusters; calculates, for each cluster, an L2 norm vector of spherical harmonics as a feature vector based on the polygon data constituting that cluster; identifies whether or not each cluster is a target, based on the calculated feature vectors and learning data; and displays on a screen at least an image of a cluster identified as the target.