A system and method for automated localization, enumeration, and diagnosis of teeth and tooth conditions. The system detects a condition for at least one defined, localized, and enumerated tooth structure within a cropped image from a full mouth series based on a pixel-level prediction, wherein said condition is detected by at least one of detecting or segmenting the condition on at least one of the enumerated tooth structures within the cropped image using a 2-D R-CNN.
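
As an illustration only, the step of attributing a pixel-level prediction to an enumerated tooth structure might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the R-CNN itself is not shown, and the binary mask, the box format, the tooth numbers, the function name `teeth_with_condition`, and the `min_pixels` threshold are all hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max), with the max edges exclusive.
Box = Tuple[int, int, int, int]


def teeth_with_condition(
    mask: List[List[int]],
    tooth_boxes: Dict[int, Box],
    min_pixels: int = 1,
) -> Dict[int, int]:
    """Count condition pixels inside each enumerated tooth box.

    mask is a binary pixel-level prediction for one cropped image
    (1 = condition predicted at that pixel). Returns a mapping of
    tooth number to pixel count for teeth with at least min_pixels hits.
    """
    hits: Dict[int, int] = {}
    for tooth_number, (x0, y0, x1, y1) in tooth_boxes.items():
        count = sum(mask[y][x] for y in range(y0, y1) for x in range(x0, x1))
        if count >= min_pixels:
            hits[tooth_number] = count
    return hits


# Toy 4x6 cropped image standing in for a segmentation output.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
# Hypothetical enumerated teeth and their localized boxes in the crop.
tooth_boxes = {3: (0, 0, 3, 4), 4: (3, 0, 6, 4)}

result = teeth_with_condition(mask, tooth_boxes)
print(result)
```

In this toy input only the tooth labeled 3 overlaps the predicted pixels, so it is the only tooth reported. Raising `min_pixels` would suppress small, possibly spurious detections.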