Methods are provided for improving the selection of breeding individuals as part of a breeding program. An optimized estimation data set is constructed by selecting, from a candidate set, candidates for phenotyping for which genotypic information is also available, adding them to the estimation data set, and then evaluating the accuracy of the genomic estimated breeding values for each candidate (i.e., the genomic prediction accuracy). The optimized estimation data set is then used as a model to determine genomic estimated breeding values of breeding individuals based purely on genotypic information.
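
As an illustration of the construction described above, the following Python sketch greedily adds candidates to an estimation data set and scores each trial set by the mean expected accuracy of the candidates' genomic estimated breeding values. It is a minimal sketch under assumptions that are not stated in the source: a ridge-regression (RR-BLUP-style) marker model with no fixed effects, a known variance ratio `lam`, greedy forward selection as the search strategy, phenotypes already adjusted for the mean, and simulated genotypes and phenotypes. The function names `expected_accuracy`, `greedy_estimation_set`, and `fit_marker_effects` are hypothetical.

```python
import numpy as np


def expected_accuracy(Z_est, X_cand, lam):
    """Expected accuracy of the genomic estimated breeding value of each candidate.

    Assumed model: y = Z u + e, u ~ N(0, s_u^2 I), e ~ N(0, s_e^2 I),
    lam = s_e^2 / s_u^2. Needs genotypes only, no phenotypes.
    Assumes no candidate has an all-zero genotype row.
    """
    p = X_cand.shape[1]
    C = np.linalg.inv(Z_est.T @ Z_est + lam * np.eye(p))
    # Prediction error variance relative to the breeding value variance.
    pev = lam * np.einsum("ij,jk,ik->i", X_cand, C, X_cand)
    var_g = np.einsum("ij,ij->i", X_cand, X_cand)
    reliability = np.clip(1.0 - pev / var_g, 0.0, 1.0)
    return np.sqrt(reliability)


def greedy_estimation_set(X_cand, n_select, lam=1.0):
    """Greedy forward selection (an assumed search strategy).

    At each step, add the candidate whose inclusion gives the highest
    mean expected accuracy over all candidates.
    """
    selected = []
    remaining = list(range(X_cand.shape[0]))
    best_score = 0.0
    for _ in range(n_select):
        best_idx, best_score = None, -np.inf
        for i in remaining:
            trial = selected + [i]
            score = expected_accuracy(X_cand[trial], X_cand, lam).mean()
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        remaining.remove(best_idx)
    return selected, best_score


def fit_marker_effects(Z_est, y_est, lam):
    """Ridge-regression marker effects from the phenotyped estimation set."""
    p = Z_est.shape[1]
    return np.linalg.solve(Z_est.T @ Z_est + lam * np.eye(p), Z_est.T @ y_est)


# Simulated data for illustration only: 30 candidates, 20 markers coded 0/1/2.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(30, 20)).astype(float)

# Step 1: choose which candidates to phenotype.
chosen, mean_acc = greedy_estimation_set(X, n_select=5, lam=1.0)

# Step 2: phenotype the chosen candidates (simulated here) and fit the model.
y = X[chosen] @ rng.normal(size=20) + rng.normal(size=5)
u_hat = fit_marker_effects(X[chosen], y, lam=1.0)

# Step 3: estimate breeding values from genotypes alone.
gebv = X @ u_hat
```

Under this assumed model the expected accuracy depends only on genotypes, so candidates can be ranked for phenotyping before any phenotype is collected. Other accuracy criteria or search strategies could be substituted without changing the overall flow.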