Stephanie Lanius, Erina Ghosh, Emma Holdrich Schwager, Larry James Eshelman
Application No.:
US16840856
Publication No.:
US20200320391A1
Filing date:
2020.04.06
Country (region) of application:
US
Year:
2020
Agent:
Abstract:
A method for training a baseline risk model, including: pre-processing input data by normalizing continuous variable inputs and producing one-hot input features for categorical variables; providing definitions for clean input data and dirty input data based upon various input data related to a patient condition; segmenting the input data into clean input data and dirty input data, wherein the clean input data includes a first subset and a second subset, where the first subset and the second subset include all of the clean input data and are disjoint; training a machine learning model using the first subset of the clean data; and evaluating the performance of the trained machine learning model using the second subset of the clean input data and the dirty input data.
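The steps in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the record fields (`heart_rate`, `unit`, `intervention`), the rule that a record is "clean" when no intervention is flagged, the synthetic data, and the choice of a small logistic regression are all hypothetical and do not come from the patent text.

```python
import math
import random


def zscore(values):
    """Normalize a continuous column to zero mean and unit variance."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values)) or 1.0
    return [(v - mean) / std for v in values]


def one_hot(values):
    """Return one-hot rows for a categorical column (categories sorted)."""
    categories = sorted(set(values))
    return [[1.0 if v == c else 0.0 for c in categories] for v in values]


def is_clean(record):
    """Hypothetical definition: clean means no intervention was recorded."""
    return not record["intervention"]


def split_disjoint(indices, train_fraction=0.5, seed=0):
    """Split indices into two disjoint subsets that together cover all of them."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clip to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.1, epochs=100):
    """Fit a logistic regression by stochastic gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def accuracy(model, X, y):
    """Fraction of rows whose thresholded prediction matches the label."""
    w, b = model
    hits = sum(
        (sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5) == bool(yi)
        for xi, yi in zip(X, y)
    )
    return hits / len(y)


def run_demo(n=200, seed=0):
    rng = random.Random(seed)
    # Synthetic records; field names and the label rule are illustrative only.
    records = [{"heart_rate": rng.gauss(80, 15),
                "unit": rng.choice(["ICU", "ward"]),
                "intervention": rng.random() < 0.3} for _ in range(n)]
    labels = [1 if r["heart_rate"] > 85 else 0 for r in records]

    # Pre-process: normalized continuous input plus one-hot categorical input.
    hr = zscore([r["heart_rate"] for r in records])
    units = one_hot([r["unit"] for r in records])
    X = [[h] + u for h, u in zip(hr, units)]

    # Segment into clean and dirty, then split clean into two disjoint subsets.
    clean = [i for i, r in enumerate(records) if is_clean(r)]
    dirty = [i for i, r in enumerate(records) if not is_clean(r)]
    train_idx, test_idx = split_disjoint(clean)

    # Train on the first clean subset; evaluate on the second subset and on dirty data.
    model = train_logistic([X[i] for i in train_idx], [labels[i] for i in train_idx])
    return {
        "train_idx": train_idx, "test_idx": test_idx, "clean": clean, "dirty": dirty,
        "clean_accuracy": accuracy(model, [X[i] for i in test_idx],
                                   [labels[i] for i in test_idx]),
        "dirty_accuracy": accuracy(model, [X[i] for i in dirty],
                                   [labels[i] for i in dirty]),
    }
```

Calling `run_demo()` returns the index sets together with separate accuracy figures for the held-out clean subset and the dirty data. For brevity the sketch computes normalization statistics over all records; a stricter variant would fit them on the training subset only.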