A similarity analysis method, an apparatus, and a system, where the method includes: acquiring file fingerprint information of a file to be analyzed; sending an analysis request that carries the file fingerprint information to at least two metadata servers (MDSs); and selecting at least one group according to an analysis result returned by each MDS, where the analysis result includes a group number and a similarity of at least one group that is found by the MDS and has the highest similarity with the file fingerprint information. The MDS responsible for a selected group then locally queries for duplicate data blocks in that group. Hence, each MDS needs to query only the file fingerprint information sets of the groups that the MDS itself is responsible for, which reduces the amount of data retrieved and the time spent waiting to read, write, and lock a database file.
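The flow described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the class `MDS`, the functions `analyze`, `find_duplicates`, and `select_group`, the in-memory dictionaries standing in for each server's database, and the similarity measure (the number of shared fingerprints) are all assumptions made for illustration. The source does not specify how similarity is computed or how requests are transported.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class MDS:
    """Hypothetical metadata server that stores only the groups it is responsible for."""
    name: str
    # group number -> set of block fingerprints stored for that group
    groups: Dict[int, Set[str]] = field(default_factory=dict)

    def analyze(self, fingerprints: Set[str]) -> Optional[Tuple[int, int]]:
        """Return (group_number, similarity) for the most similar local group."""
        best: Optional[Tuple[int, int]] = None
        for group_number, stored in sorted(self.groups.items()):
            # Assumed similarity metric: count of fingerprints in common.
            similarity = len(fingerprints & stored)
            if best is None or similarity > best[1]:
                best = (group_number, similarity)
        return best

    def find_duplicates(self, group_number: int, fingerprints: Set[str]) -> Set[str]:
        """Locally look up which fingerprints already exist in the selected group."""
        return fingerprints & self.groups[group_number]


def select_group(servers: List[MDS], fingerprints: Set[str]) -> Optional[Tuple[MDS, int, int]]:
    """Send the request to every MDS and keep the group with the highest similarity."""
    best: Optional[Tuple[MDS, int, int]] = None
    for mds in servers:
        result = mds.analyze(fingerprints)
        if result is None:
            continue  # this MDS holds no groups
        group_number, similarity = result
        if best is None or similarity > best[2]:
            best = (mds, group_number, similarity)
    return best


# Two servers, each searching only its own groups.
mds_a = MDS("mds-a", {1: {"a", "b", "c"}, 2: {"x", "y"}})
mds_b = MDS("mds-b", {3: {"b", "c", "d", "e"}, 4: {"z"}})
file_fingerprints = {"b", "c", "d", "q"}

owner, group_number, similarity = select_group([mds_a, mds_b], file_fingerprints)
duplicates = owner.find_duplicates(group_number, file_fingerprints)
print(owner.name, group_number, similarity, sorted(duplicates))
```

In this sketch the requests are issued one after another for simplicity; because each server scans only its own groups, the requests could equally be sent in parallel, which is where the reduction in retrieval and lock waiting time would come from.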