In some embodiments, the system is programmed to build multiple digital models from multiple training sets, each digital model for recognizing plant diseases having symptoms of similar sizes. Each digital model can be implemented with a deep learning architecture that classifies an image into one of several classes. For each training set, the system is thus programmed to collect images showing symptoms of one or more plant diseases, the symptoms having similar sizes. These images are then assigned to multiple disease classes. For a first one of the training sets, which is used to build a first one of the digital models, the system is programmed to also include images that correspond to a healthy condition and images of symptoms having other sizes. These additional images are then assigned to a no-disease class and a catch-all class, respectively. Given a new image from a user device, the system is programmed to first apply the first digital model. For the portions of the new image that are classified into the catch-all class, the system is programmed to then apply another one of the digital models. Finally, the system is programmed to transmit classification data to the user device indicating how each portion of the new image is classified into a class corresponding to a plant disease or no plant disease.
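
The two-stage classification flow described above can be illustrated with a minimal sketch, offered only as one possible reading under stated assumptions. Each digital model is assumed to be a callable that maps an image portion to a class label. The label strings `no_disease` and `catch_all`, the function name `classify_portions`, and the stub models are hypothetical and do not come from the description. A real embodiment would use trained deep learning classifiers in place of the stubs.

```python
from typing import Callable, List, Sequence

# Hypothetical label names; the description names the classes but not their encoding.
NO_DISEASE = "no_disease"
CATCH_ALL = "catch_all"

# Assumption: a digital model is any callable mapping an image portion to a label.
Classifier = Callable[[object], str]


def classify_portions(
    portions: Sequence[object],
    first_model: Classifier,
    second_model: Classifier,
) -> List[str]:
    """Apply the first model to every portion; re-classify catch-all portions."""
    results: List[str] = []
    for portion in portions:
        label = first_model(portion)
        if label == CATCH_ALL:
            # Symptoms of another size: defer to the other digital model.
            label = second_model(portion)
        results.append(label)
    return results


# Stub models standing in for trained classifiers (illustration only).
def stub_first_model(portion: object) -> str:
    if portion == "small_lesion":
        return "disease_A"
    if portion == "healthy":
        return NO_DISEASE
    return CATCH_ALL


def stub_second_model(portion: object) -> str:
    return "disease_B" if portion == "large_lesion" else NO_DISEASE


labels = classify_portions(
    ["small_lesion", "healthy", "large_lesion"],
    stub_first_model,
    stub_second_model,
)
print(labels)  # ['disease_A', 'no_disease', 'disease_B']
```

In this sketch only the portions that the first model assigns to the catch-all class reach the second model, so the resulting list holds one final label per portion, which corresponds to the classification data transmitted to the user device.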