This invention is directed to a technique for extracting, from a case database, a plurality of definite case data similar to an input case. A data search apparatus which extracts definite case data from a case database includes an input acceptance unit for accepting input case data including at least medical image data, a derivation unit for deriving a similarity between the input case data and each of a plurality of definite case data stored in the case database, a classification unit for classifying the plurality of definite case data into a plurality of diagnosis groups based on definite diagnosis information included in each of the definite case data, and an extraction unit for extracting, based on the derived similarity, a predetermined number or more of definite case data from each of the plurality of diagnosis groups.
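
The derivation, classification, and extraction steps described above can be illustrated with a minimal sketch. It is not the apparatus itself, only one possible reading under stated assumptions: each case is reduced to a hypothetical feature vector computed from its medical image, cosine similarity stands in for the unspecified similarity measure, and the names `DefiniteCase`, `similarity`, `extract_similar_cases`, and `min_per_group` are invented for illustration. The sketch returns the predetermined number of most similar cases from each diagnosis group, or fewer when a group holds fewer cases.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DefiniteCase:
    case_id: str
    diagnosis: str         # definite diagnosis information
    features: List[float]  # hypothetical feature vector derived from the image data


def similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity; an assumed stand-in for the similarity measure."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def extract_similar_cases(
    query_features: List[float],
    database: List[DefiniteCase],
    min_per_group: int,
) -> Dict[str, List[DefiniteCase]]:
    # Derivation + classification: score every stored case against the
    # input case and bucket it by its definite diagnosis.
    groups = defaultdict(list)
    for case in database:
        score = similarity(query_features, case.features)
        groups[case.diagnosis].append((score, case))

    # Extraction: keep the top-scoring cases from each diagnosis group.
    result = {}
    for diagnosis, scored in groups.items():
        scored.sort(key=lambda pair: pair[0], reverse=True)
        result[diagnosis] = [case for _, case in scored[:min_per_group]]
    return result
```

Because extraction is done per diagnosis group rather than over a single global ranking, a diagnosis whose cases are only moderately similar to the input is still represented in the result, which appears to be the point of classifying before extracting.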