A method of building a machine learning pipeline for predicting the efficacy of anti-epilepsy drug treatment regimens is provided. The method includes providing electronic health records data; constructing a patient cohort from the electronic health records data by selecting patients based on a defined target variable indicating anti-epilepsy drug treatment regimen efficacy; constructing a set of features found in or derived from the electronic health records data; electronically processing the patient cohort to identify a subset of the features that are predictive of anti-epilepsy drug treatment regimen efficacy, for inclusion in predictive computerized models configured for generating predictions representative of efficacy for a plurality of anti-epilepsy drug treatment regimens; and training the predictive computerized models to generate predictions representative of efficacy for the plurality of anti-epilepsy drug treatment regimens for the patients based on the defined target variable.
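The steps recited above (cohort construction, feature construction, feature selection, and model training) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the method itself: the toy records, the field names (`age`, `prior_seizures`, `comorbid_migraine`, `regimen`, `outcome`), the binary target encoding (1 = regimen judged effective), the rate-gap selection rule, and the smoothed success-rate table that stands in for a real predictive model.

```python
from collections import defaultdict

# Hypothetical toy EHR extract; all field names and values are illustrative.
RECORDS = [
    {"id": 1, "age": 25, "prior_seizures": 2, "comorbid_migraine": 0, "regimen": "A", "outcome": 1},
    {"id": 2, "age": 31, "prior_seizures": 1, "comorbid_migraine": 1, "regimen": "A", "outcome": 1},
    {"id": 3, "age": 52, "prior_seizures": 8, "comorbid_migraine": 0, "regimen": "A", "outcome": 0},
    {"id": 4, "age": 47, "prior_seizures": 6, "comorbid_migraine": 1, "regimen": "A", "outcome": 0},
    {"id": 5, "age": 29, "prior_seizures": 1, "comorbid_migraine": 1, "regimen": "B", "outcome": 1},
    {"id": 6, "age": 60, "prior_seizures": 9, "comorbid_migraine": 0, "regimen": "B", "outcome": 1},
    {"id": 7, "age": 38, "prior_seizures": 7, "comorbid_migraine": 1, "regimen": "B", "outcome": 0},
    {"id": 8, "age": 44, "prior_seizures": 3, "comorbid_migraine": 0, "regimen": "B", "outcome": 1},
    {"id": 9, "age": 50, "prior_seizures": 2, "comorbid_migraine": 1, "regimen": "A", "outcome": None},
]


def build_cohort(records, target="outcome"):
    """Cohort construction: keep patients whose target variable is defined."""
    return [r for r in records if r.get(target) in (0, 1)]


def build_features(record):
    """Feature construction: binary features found in or derived from a record."""
    return {
        "age_over_40": int(record["age"] > 40),
        "many_prior_seizures": int(record["prior_seizures"] >= 5),
        "comorbid_migraine": int(record["comorbid_migraine"]),
    }


def select_features(cohort, target="outcome", min_gap=0.5):
    """Feature selection: keep features whose presence shifts the observed
    efficacy rate by at least min_gap (a deliberately simple stand-in rule)."""
    selected = []
    for name in sorted(build_features(cohort[0])):
        present = [r[target] for r in cohort if build_features(r)[name] == 1]
        absent = [r[target] for r in cohort if build_features(r)[name] == 0]
        if not present or not absent:
            continue  # feature is constant in this cohort, so it carries no signal
        gap = abs(sum(present) / len(present) - sum(absent) / len(absent))
        if gap >= min_gap:
            selected.append(name)
    return selected


def train(cohort, selected, target="outcome"):
    """Training: count successes and totals per (regimen, feature profile)."""
    counts = defaultdict(lambda: [0, 0])
    for r in cohort:
        feats = build_features(r)
        key = (r["regimen"], tuple(feats[name] for name in selected))
        counts[key][0] += r[target]
        counts[key][1] += 1
    return dict(counts)


def predict(model, selected, record, regimens):
    """Prediction: Laplace-smoothed efficacy estimate for each candidate regimen."""
    feats = build_features(record)
    profile = tuple(feats[name] for name in selected)
    scores = {}
    for regimen in regimens:
        successes, total = model.get((regimen, profile), (0, 0))
        scores[regimen] = (successes + 1) / (total + 2)
    return scores


cohort = build_cohort(RECORDS)
selected = select_features(cohort)
model = train(cohort, selected)
new_patient = {"age": 58, "prior_seizures": 10, "comorbid_migraine": 0}
scores = predict(model, selected, new_patient, ["A", "B"])
print(selected)  # ['many_prior_seizures']
print(scores)    # {'A': 0.25, 'B': 0.5}
```

In this toy data, patient 9 is excluded from the cohort because the target is undefined, only `many_prior_seizures` passes the selection threshold, and the per-regimen scores give a ranking of candidate regimens for a new patient. A practical pipeline would replace the count table with a trained classifier and would validate both the selected features and the predictions on held-out patients.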